| 4 4 2 3 2 28 28 28 28 3 28 5 5 5 5 5 5 1 4 5 5 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 57 57 30 30 31 31 31 25 25 25 18 18 7 25 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 | // SPDX-License-Identifier: GPL-2.0-only /* * This contains encryption functions for per-file encryption. * * Copyright (C) 2015, Google, Inc. * Copyright (C) 2015, Motorola Mobility * * Written by Michael Halcrow, 2014. * * Filename encryption additions * Uday Savagaonkar, 2014 * Encryption policy handling additions * Ildar Muslukhov, 2014 * Add fscrypt_pullback_bio_page() * Jaegeuk Kim, 2015. * * This has not yet undergone a rigorous security audit. * * The usage of AES-XTS should conform to recommendations in NIST * Special Publication 800-38E and IEEE P1619/D16. */ #include <linux/pagemap.h> #include <linux/mempool.h> #include <linux/module.h> #include <linux/scatterlist.h> #include <linux/ratelimit.h> #include <crypto/skcipher.h> #include "fscrypt_private.h" static unsigned int num_prealloc_crypto_pages = 32; module_param(num_prealloc_crypto_pages, uint, 0444); MODULE_PARM_DESC(num_prealloc_crypto_pages, "Number of crypto pages to preallocate"); static mempool_t *fscrypt_bounce_page_pool = NULL; static struct workqueue_struct *fscrypt_read_workqueue; static DEFINE_MUTEX(fscrypt_init_mutex); struct kmem_cache *fscrypt_inode_info_cachep; void fscrypt_enqueue_decrypt_work(struct work_struct *work) { queue_work(fscrypt_read_workqueue, work); } EXPORT_SYMBOL(fscrypt_enqueue_decrypt_work); struct page *fscrypt_alloc_bounce_page(gfp_t gfp_flags) { if (WARN_ON_ONCE(!fscrypt_bounce_page_pool)) { /* * Oops, the filesystem called a function that uses the bounce * page pool, but it didn't set needs_bounce_pages. */ return NULL; } return mempool_alloc(fscrypt_bounce_page_pool, gfp_flags); } /** * fscrypt_free_bounce_page() - free a ciphertext bounce page * @bounce_page: the bounce page to free, or NULL * * Free a bounce page that was allocated by fscrypt_encrypt_pagecache_blocks(), * or by fscrypt_alloc_bounce_page() directly. */ void fscrypt_free_bounce_page(struct page *bounce_page) { if (!bounce_page) return; set_page_private(bounce_page, (unsigned long)NULL); ClearPagePrivate(bounce_page); mempool_free(bounce_page, fscrypt_bounce_page_pool); } EXPORT_SYMBOL(fscrypt_free_bounce_page); /* * Generate the IV for the given data unit index within the given file. * For filenames encryption, index == 0. * * Keep this in sync with fscrypt_limit_io_blocks(). fscrypt_limit_io_blocks() * needs to know about any IV generation methods where the low bits of IV don't * simply contain the data unit index (e.g., IV_INO_LBLK_32). */ void fscrypt_generate_iv(union fscrypt_iv *iv, u64 index, const struct fscrypt_inode_info *ci) { u8 flags = fscrypt_policy_flags(&ci->ci_policy); memset(iv, 0, ci->ci_mode->ivsize); if (flags & FSCRYPT_POLICY_FLAG_IV_INO_LBLK_64) { WARN_ON_ONCE(index > U32_MAX); WARN_ON_ONCE(ci->ci_inode->i_ino > U32_MAX); index |= (u64)ci->ci_inode->i_ino << 32; } else if (flags & FSCRYPT_POLICY_FLAG_IV_INO_LBLK_32) { WARN_ON_ONCE(index > U32_MAX); index = (u32)(ci->ci_hashed_ino + index); } else if (flags & FSCRYPT_POLICY_FLAG_DIRECT_KEY) { memcpy(iv->nonce, ci->ci_nonce, FSCRYPT_FILE_NONCE_SIZE); } iv->index = cpu_to_le64(index); } /* Encrypt or decrypt a single "data unit" of file contents. */ int fscrypt_crypt_data_unit(const struct fscrypt_inode_info *ci, fscrypt_direction_t rw, u64 index, struct page *src_page, struct page *dest_page, unsigned int len, unsigned int offs, gfp_t gfp_flags) { union fscrypt_iv iv; struct skcipher_request *req = NULL; DECLARE_CRYPTO_WAIT(wait); struct scatterlist dst, src; struct crypto_skcipher *tfm = ci->ci_enc_key.tfm; int res = 0; if (WARN_ON_ONCE(len <= 0)) return -EINVAL; if (WARN_ON_ONCE(len % FSCRYPT_CONTENTS_ALIGNMENT != 0)) return -EINVAL; fscrypt_generate_iv(&iv, index, ci); req = skcipher_request_alloc(tfm, gfp_flags); if (!req) return -ENOMEM; skcipher_request_set_callback( req, CRYPTO_TFM_REQ_MAY_BACKLOG | CRYPTO_TFM_REQ_MAY_SLEEP, crypto_req_done, &wait); sg_init_table(&dst, 1); sg_set_page(&dst, dest_page, len, offs); sg_init_table(&src, 1); sg_set_page(&src, src_page, len, offs); skcipher_request_set_crypt(req, &src, &dst, len, &iv); if (rw == FS_DECRYPT) res = crypto_wait_req(crypto_skcipher_decrypt(req), &wait); else res = crypto_wait_req(crypto_skcipher_encrypt(req), &wait); skcipher_request_free(req); if (res) { fscrypt_err(ci->ci_inode, "%scryption failed for data unit %llu: %d", (rw == FS_DECRYPT ? "De" : "En"), index, res); return res; } return 0; } /** * fscrypt_encrypt_pagecache_blocks() - Encrypt data from a pagecache page * @page: the locked pagecache page containing the data to encrypt * @len: size of the data to encrypt, in bytes * @offs: offset within @page of the data to encrypt, in bytes * @gfp_flags: memory allocation flags; see details below * * This allocates a new bounce page and encrypts the given data into it. The * length and offset of the data must be aligned to the file's crypto data unit * size. Alignment to the filesystem block size fulfills this requirement, as * the filesystem block size is always a multiple of the data unit size. * * In the bounce page, the ciphertext data will be located at the same offset at * which the plaintext data was located in the source page. Any other parts of * the bounce page will be left uninitialized. * * This is for use by the filesystem's ->writepages() method. * * The bounce page allocation is mempool-backed, so it will always succeed when * @gfp_flags includes __GFP_DIRECT_RECLAIM, e.g. when it's GFP_NOFS. However, * only the first page of each bio can be allocated this way. To prevent * deadlocks, for any additional pages a mask like GFP_NOWAIT must be used. * * Return: the new encrypted bounce page on success; an ERR_PTR() on failure */ struct page *fscrypt_encrypt_pagecache_blocks(struct page *page, unsigned int len, unsigned int offs, gfp_t gfp_flags) { const struct inode *inode = page->mapping->host; const struct fscrypt_inode_info *ci = inode->i_crypt_info; const unsigned int du_bits = ci->ci_data_unit_bits; const unsigned int du_size = 1U << du_bits; struct page *ciphertext_page; u64 index = ((u64)page->index << (PAGE_SHIFT - du_bits)) + (offs >> du_bits); unsigned int i; int err; if (WARN_ON_ONCE(!PageLocked(page))) return ERR_PTR(-EINVAL); if (WARN_ON_ONCE(len <= 0 || !IS_ALIGNED(len | offs, du_size))) return ERR_PTR(-EINVAL); ciphertext_page = fscrypt_alloc_bounce_page(gfp_flags); if (!ciphertext_page) return ERR_PTR(-ENOMEM); for (i = offs; i < offs + len; i += du_size, index++) { err = fscrypt_crypt_data_unit(ci, FS_ENCRYPT, index, page, ciphertext_page, du_size, i, gfp_flags); if (err) { fscrypt_free_bounce_page(ciphertext_page); return ERR_PTR(err); } } SetPagePrivate(ciphertext_page); set_page_private(ciphertext_page, (unsigned long)page); return ciphertext_page; } EXPORT_SYMBOL(fscrypt_encrypt_pagecache_blocks); /** * fscrypt_encrypt_block_inplace() - Encrypt a filesystem block in-place * @inode: The inode to which this block belongs * @page: The page containing the block to encrypt * @len: Size of block to encrypt. This must be a multiple of * FSCRYPT_CONTENTS_ALIGNMENT. * @offs: Byte offset within @page at which the block to encrypt begins * @lblk_num: Filesystem logical block number of the block, i.e. the 0-based * number of the block within the file * @gfp_flags: Memory allocation flags * * Encrypt a possibly-compressed filesystem block that is located in an * arbitrary page, not necessarily in the original pagecache page. The @inode * and @lblk_num must be specified, as they can't be determined from @page. * * This is not compatible with fscrypt_operations::supports_subblock_data_units. * * Return: 0 on success; -errno on failure */ int fscrypt_encrypt_block_inplace(const struct inode *inode, struct page *page, unsigned int len, unsigned int offs, u64 lblk_num, gfp_t gfp_flags) { if (WARN_ON_ONCE(inode->i_sb->s_cop->supports_subblock_data_units)) return -EOPNOTSUPP; return fscrypt_crypt_data_unit(inode->i_crypt_info, FS_ENCRYPT, lblk_num, page, page, len, offs, gfp_flags); } EXPORT_SYMBOL(fscrypt_encrypt_block_inplace); /** * fscrypt_decrypt_pagecache_blocks() - Decrypt data from a pagecache folio * @folio: the pagecache folio containing the data to decrypt * @len: size of the data to decrypt, in bytes * @offs: offset within @folio of the data to decrypt, in bytes * * Decrypt data that has just been read from an encrypted file. The data must * be located in a pagecache folio that is still locked and not yet uptodate. * The length and offset of the data must be aligned to the file's crypto data * unit size. Alignment to the filesystem block size fulfills this requirement, * as the filesystem block size is always a multiple of the data unit size. * * Return: 0 on success; -errno on failure */ int fscrypt_decrypt_pagecache_blocks(struct folio *folio, size_t len, size_t offs) { const struct inode *inode = folio->mapping->host; const struct fscrypt_inode_info *ci = inode->i_crypt_info; const unsigned int du_bits = ci->ci_data_unit_bits; const unsigned int du_size = 1U << du_bits; u64 index = ((u64)folio->index << (PAGE_SHIFT - du_bits)) + (offs >> du_bits); size_t i; int err; if (WARN_ON_ONCE(!folio_test_locked(folio))) return -EINVAL; if (WARN_ON_ONCE(len <= 0 || !IS_ALIGNED(len | offs, du_size))) return -EINVAL; for (i = offs; i < offs + len; i += du_size, index++) { struct page *page = folio_page(folio, i >> PAGE_SHIFT); err = fscrypt_crypt_data_unit(ci, FS_DECRYPT, index, page, page, du_size, i & ~PAGE_MASK, GFP_NOFS); if (err) return err; } return 0; } EXPORT_SYMBOL(fscrypt_decrypt_pagecache_blocks); /** * fscrypt_decrypt_block_inplace() - Decrypt a filesystem block in-place * @inode: The inode to which this block belongs * @page: The page containing the block to decrypt * @len: Size of block to decrypt. This must be a multiple of * FSCRYPT_CONTENTS_ALIGNMENT. * @offs: Byte offset within @page at which the block to decrypt begins * @lblk_num: Filesystem logical block number of the block, i.e. the 0-based * number of the block within the file * * Decrypt a possibly-compressed filesystem block that is located in an * arbitrary page, not necessarily in the original pagecache page. The @inode * and @lblk_num must be specified, as they can't be determined from @page. * * This is not compatible with fscrypt_operations::supports_subblock_data_units. * * Return: 0 on success; -errno on failure */ int fscrypt_decrypt_block_inplace(const struct inode *inode, struct page *page, unsigned int len, unsigned int offs, u64 lblk_num) { if (WARN_ON_ONCE(inode->i_sb->s_cop->supports_subblock_data_units)) return -EOPNOTSUPP; return fscrypt_crypt_data_unit(inode->i_crypt_info, FS_DECRYPT, lblk_num, page, page, len, offs, GFP_NOFS); } EXPORT_SYMBOL(fscrypt_decrypt_block_inplace); /** * fscrypt_initialize() - allocate major buffers for fs encryption. * @sb: the filesystem superblock * * We only call this when we start accessing encrypted files, since it * results in memory getting allocated that wouldn't otherwise be used. * * Return: 0 on success; -errno on failure */ int fscrypt_initialize(struct super_block *sb) { int err = 0; mempool_t *pool; /* pairs with smp_store_release() below */ if (likely(smp_load_acquire(&fscrypt_bounce_page_pool))) return 0; /* No need to allocate a bounce page pool if this FS won't use it. */ if (!sb->s_cop->needs_bounce_pages) return 0; mutex_lock(&fscrypt_init_mutex); if (fscrypt_bounce_page_pool) goto out_unlock; err = -ENOMEM; pool = mempool_create_page_pool(num_prealloc_crypto_pages, 0); if (!pool) goto out_unlock; /* pairs with smp_load_acquire() above */ smp_store_release(&fscrypt_bounce_page_pool, pool); err = 0; out_unlock: mutex_unlock(&fscrypt_init_mutex); return err; } void fscrypt_msg(const struct inode *inode, const char *level, const char *fmt, ...) { static DEFINE_RATELIMIT_STATE(rs, DEFAULT_RATELIMIT_INTERVAL, DEFAULT_RATELIMIT_BURST); struct va_format vaf; va_list args; if (!__ratelimit(&rs)) return; va_start(args, fmt); vaf.fmt = fmt; vaf.va = &args; if (inode && inode->i_ino) printk("%sfscrypt (%s, inode %lu): %pV\n", level, inode->i_sb->s_id, inode->i_ino, &vaf); else if (inode) printk("%sfscrypt (%s): %pV\n", level, inode->i_sb->s_id, &vaf); else printk("%sfscrypt: %pV\n", level, &vaf); va_end(args); } /** * fscrypt_init() - Set up for fs encryption. * * Return: 0 on success; -errno on failure */ static int __init fscrypt_init(void) { int err = -ENOMEM; /* * Use an unbound workqueue to allow bios to be decrypted in parallel * even when they happen to complete on the same CPU. This sacrifices * locality, but it's worthwhile since decryption is CPU-intensive. * * Also use a high-priority workqueue to prioritize decryption work, * which blocks reads from completing, over regular application tasks. */ fscrypt_read_workqueue = alloc_workqueue("fscrypt_read_queue", WQ_UNBOUND | WQ_HIGHPRI, num_online_cpus()); if (!fscrypt_read_workqueue) goto fail; fscrypt_inode_info_cachep = KMEM_CACHE(fscrypt_inode_info, SLAB_RECLAIM_ACCOUNT); if (!fscrypt_inode_info_cachep) goto fail_free_queue; err = fscrypt_init_keyring(); if (err) goto fail_free_inode_info; return 0; fail_free_inode_info: kmem_cache_destroy(fscrypt_inode_info_cachep); fail_free_queue: destroy_workqueue(fscrypt_read_workqueue); fail: return err; } late_initcall(fscrypt_init) |
| 5 3 2 3 5 1 4 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 | // SPDX-License-Identifier: GPL-2.0-or-later /* * ChaCha and XChaCha stream ciphers, including ChaCha20 (RFC7539) * * Copyright (C) 2015 Martin Willi * Copyright (C) 2018 Google LLC */ #include <asm/unaligned.h> #include <crypto/algapi.h> #include <crypto/internal/chacha.h> #include <crypto/internal/skcipher.h> #include <linux/module.h> static int chacha_stream_xor(struct skcipher_request *req, const struct chacha_ctx *ctx, const u8 *iv) { struct skcipher_walk walk; u32 state[16]; int err; err = skcipher_walk_virt(&walk, req, false); chacha_init_generic(state, ctx->key, iv); while (walk.nbytes > 0) { unsigned int nbytes = walk.nbytes; if (nbytes < walk.total) nbytes = round_down(nbytes, CHACHA_BLOCK_SIZE); chacha_crypt_generic(state, walk.dst.virt.addr, walk.src.virt.addr, nbytes, ctx->nrounds); err = skcipher_walk_done(&walk, walk.nbytes - nbytes); } return err; } static int crypto_chacha_crypt(struct skcipher_request *req) { struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); struct chacha_ctx *ctx = crypto_skcipher_ctx(tfm); return chacha_stream_xor(req, ctx, req->iv); } static int crypto_xchacha_crypt(struct skcipher_request *req) { struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); struct chacha_ctx *ctx = crypto_skcipher_ctx(tfm); struct chacha_ctx subctx; u32 state[16]; u8 real_iv[16]; /* Compute the subkey given the original key and first 128 nonce bits */ chacha_init_generic(state, ctx->key, req->iv); hchacha_block_generic(state, subctx.key, ctx->nrounds); subctx.nrounds = ctx->nrounds; /* Build the real IV */ memcpy(&real_iv[0], req->iv + 24, 8); /* stream position */ memcpy(&real_iv[8], req->iv + 16, 8); /* remaining 64 nonce bits */ /* Generate the stream and XOR it with the data */ return chacha_stream_xor(req, &subctx, real_iv); } static struct skcipher_alg algs[] = { { .base.cra_name = "chacha20", .base.cra_driver_name = "chacha20-generic", .base.cra_priority = 100, .base.cra_blocksize = 1, .base.cra_ctxsize = sizeof(struct chacha_ctx), .base.cra_module = THIS_MODULE, .min_keysize = CHACHA_KEY_SIZE, .max_keysize = CHACHA_KEY_SIZE, .ivsize = CHACHA_IV_SIZE, .chunksize = CHACHA_BLOCK_SIZE, .setkey = chacha20_setkey, .encrypt = crypto_chacha_crypt, .decrypt = crypto_chacha_crypt, }, { .base.cra_name = "xchacha20", .base.cra_driver_name = "xchacha20-generic", .base.cra_priority = 100, .base.cra_blocksize = 1, .base.cra_ctxsize = sizeof(struct chacha_ctx), .base.cra_module = THIS_MODULE, .min_keysize = CHACHA_KEY_SIZE, .max_keysize = CHACHA_KEY_SIZE, .ivsize = XCHACHA_IV_SIZE, .chunksize = CHACHA_BLOCK_SIZE, .setkey = chacha20_setkey, .encrypt = crypto_xchacha_crypt, .decrypt = crypto_xchacha_crypt, }, { .base.cra_name = "xchacha12", .base.cra_driver_name = "xchacha12-generic", .base.cra_priority = 100, .base.cra_blocksize = 1, .base.cra_ctxsize = sizeof(struct chacha_ctx), .base.cra_module = THIS_MODULE, .min_keysize = CHACHA_KEY_SIZE, .max_keysize = CHACHA_KEY_SIZE, .ivsize = XCHACHA_IV_SIZE, .chunksize = CHACHA_BLOCK_SIZE, .setkey = chacha12_setkey, .encrypt = crypto_xchacha_crypt, .decrypt = crypto_xchacha_crypt, } }; static int __init chacha_generic_mod_init(void) { return crypto_register_skciphers(algs, ARRAY_SIZE(algs)); } static void __exit chacha_generic_mod_fini(void) { crypto_unregister_skciphers(algs, ARRAY_SIZE(algs)); } subsys_initcall(chacha_generic_mod_init); module_exit(chacha_generic_mod_fini); MODULE_LICENSE("GPL"); MODULE_AUTHOR("Martin Willi <martin@strongswan.org>"); MODULE_DESCRIPTION("ChaCha and XChaCha stream ciphers (generic)"); MODULE_ALIAS_CRYPTO("chacha20"); MODULE_ALIAS_CRYPTO("chacha20-generic"); MODULE_ALIAS_CRYPTO("xchacha20"); MODULE_ALIAS_CRYPTO("xchacha20-generic"); MODULE_ALIAS_CRYPTO("xchacha12"); MODULE_ALIAS_CRYPTO("xchacha12-generic"); |
| 3 3 3 3 3 3 3 19 21 3 2 19 17 3 21 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 | // SPDX-License-Identifier: GPL-2.0-or-later /* * IP Payload Compression Protocol (IPComp) - RFC3173. * * Copyright (c) 2003 James Morris <jmorris@intercode.com.au> * * Todo: * - Tunable compression parameters. * - Compression stats. * - Adaptive compression. */ #include <linux/module.h> #include <linux/err.h> #include <linux/rtnetlink.h> #include <net/ip.h> #include <net/xfrm.h> #include <net/icmp.h> #include <net/ipcomp.h> #include <net/protocol.h> #include <net/sock.h> static int ipcomp4_err(struct sk_buff *skb, u32 info) { struct net *net = dev_net(skb->dev); __be32 spi; const struct iphdr *iph = (const struct iphdr *)skb->data; struct ip_comp_hdr *ipch = (struct ip_comp_hdr *)(skb->data+(iph->ihl<<2)); struct xfrm_state *x; switch (icmp_hdr(skb)->type) { case ICMP_DEST_UNREACH: if (icmp_hdr(skb)->code != ICMP_FRAG_NEEDED) return 0; break; case ICMP_REDIRECT: break; default: return 0; } spi = htonl(ntohs(ipch->cpi)); x = xfrm_state_lookup(net, skb->mark, (const xfrm_address_t *)&iph->daddr, spi, IPPROTO_COMP, AF_INET); if (!x) return 0; if (icmp_hdr(skb)->type == ICMP_DEST_UNREACH) ipv4_update_pmtu(skb, net, info, 0, IPPROTO_COMP); else ipv4_redirect(skb, net, 0, IPPROTO_COMP); xfrm_state_put(x); return 0; } /* We always hold one tunnel user reference to indicate a tunnel */ static struct xfrm_state *ipcomp_tunnel_create(struct xfrm_state *x) { struct net *net = xs_net(x); struct xfrm_state *t; t = xfrm_state_alloc(net); if (!t) goto out; t->id.proto = IPPROTO_IPIP; t->id.spi = x->props.saddr.a4; t->id.daddr.a4 = x->id.daddr.a4; memcpy(&t->sel, &x->sel, sizeof(t->sel)); t->props.family = AF_INET; t->props.mode = x->props.mode; t->props.saddr.a4 = x->props.saddr.a4; t->props.flags = x->props.flags; t->props.extra_flags = x->props.extra_flags; memcpy(&t->mark, &x->mark, sizeof(t->mark)); t->if_id = x->if_id; if (xfrm_init_state(t)) goto error; atomic_set(&t->tunnel_users, 1); out: return t; error: t->km.state = XFRM_STATE_DEAD; xfrm_state_put(t); t = NULL; goto out; } /* * Must be protected by xfrm_cfg_mutex. State and tunnel user references are * always incremented on success. */ static int ipcomp_tunnel_attach(struct xfrm_state *x) { struct net *net = xs_net(x); int err = 0; struct xfrm_state *t; u32 mark = x->mark.v & x->mark.m; t = xfrm_state_lookup(net, mark, (xfrm_address_t *)&x->id.daddr.a4, x->props.saddr.a4, IPPROTO_IPIP, AF_INET); if (!t) { t = ipcomp_tunnel_create(x); if (!t) { err = -EINVAL; goto out; } xfrm_state_insert(t); xfrm_state_hold(t); } x->tunnel = t; atomic_inc(&t->tunnel_users); out: return err; } static int ipcomp4_init_state(struct xfrm_state *x, struct netlink_ext_ack *extack) { int err = -EINVAL; x->props.header_len = 0; switch (x->props.mode) { case XFRM_MODE_TRANSPORT: break; case XFRM_MODE_TUNNEL: x->props.header_len += sizeof(struct iphdr); break; default: NL_SET_ERR_MSG(extack, "Unsupported XFRM mode for IPcomp"); goto out; } err = ipcomp_init_state(x, extack); if (err) goto out; if (x->props.mode == XFRM_MODE_TUNNEL) { err = ipcomp_tunnel_attach(x); if (err) { NL_SET_ERR_MSG(extack, "Kernel error: failed to initialize the associated state"); goto out; } } err = 0; out: return err; } static int ipcomp4_rcv_cb(struct sk_buff *skb, int err) { return 0; } static const struct xfrm_type ipcomp_type = { .owner = THIS_MODULE, .proto = IPPROTO_COMP, .init_state = ipcomp4_init_state, .destructor = ipcomp_destroy, .input = ipcomp_input, .output = ipcomp_output }; static struct xfrm4_protocol ipcomp4_protocol = { .handler = xfrm4_rcv, .input_handler = xfrm_input, .cb_handler = ipcomp4_rcv_cb, .err_handler = ipcomp4_err, .priority = 0, }; static int __init ipcomp4_init(void) { if (xfrm_register_type(&ipcomp_type, AF_INET) < 0) { pr_info("%s: can't add xfrm type\n", __func__); return -EAGAIN; } if (xfrm4_protocol_register(&ipcomp4_protocol, IPPROTO_COMP) < 0) { pr_info("%s: can't add protocol\n", __func__); xfrm_unregister_type(&ipcomp_type, AF_INET); return -EAGAIN; } return 0; } static void __exit ipcomp4_fini(void) { if (xfrm4_protocol_deregister(&ipcomp4_protocol, IPPROTO_COMP) < 0) pr_info("%s: can't remove protocol\n", __func__); xfrm_unregister_type(&ipcomp_type, AF_INET); } module_init(ipcomp4_init); module_exit(ipcomp4_fini); MODULE_LICENSE("GPL"); MODULE_DESCRIPTION("IP Payload Compression Protocol (IPComp/IPv4) - RFC3173"); MODULE_AUTHOR("James Morris <jmorris@intercode.com.au>"); MODULE_ALIAS_XFRM_TYPE(AF_INET, XFRM_PROTO_COMP); |
| 460 217 10 217 242 241 55 229 229 227 219 229 8 8 8 1 229 180 180 180 11 180 131 229 229 11 11 11 11 11 229 219 229 229 11 228 229 227 10 219 460 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 | // SPDX-License-Identifier: GPL-2.0-or-later /* SCTP kernel implementation * Copyright (c) 1999-2000 Cisco, Inc. * Copyright (c) 1999-2001 Motorola, Inc. * Copyright (c) 2002 International Business Machines, Corp. * * This file is part of the SCTP kernel implementation * * These functions are the methods for accessing the SCTP inqueue. * * An SCTP inqueue is a queue into which you push SCTP packets * (which might be bundles or fragments of chunks) and out of which you * pop SCTP whole chunks. * * Please send any bug reports or fixes you make to the * email address(es): * lksctp developers <linux-sctp@vger.kernel.org> * * Written or modified by: * La Monte H.P. Yarroll <piggy@acm.org> * Karl Knutson <karl@athena.chicago.il.us> */ #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt #include <net/sctp/sctp.h> #include <net/sctp/sm.h> #include <linux/interrupt.h> #include <linux/slab.h> /* Initialize an SCTP inqueue. */ void sctp_inq_init(struct sctp_inq *queue) { INIT_LIST_HEAD(&queue->in_chunk_list); queue->in_progress = NULL; /* Create a task for delivering data. */ INIT_WORK(&queue->immediate, NULL); } /* Properly release the chunk which is being worked on. */ static inline void sctp_inq_chunk_free(struct sctp_chunk *chunk) { if (chunk->head_skb) chunk->skb = chunk->head_skb; sctp_chunk_free(chunk); } /* Release the memory associated with an SCTP inqueue. */ void sctp_inq_free(struct sctp_inq *queue) { struct sctp_chunk *chunk, *tmp; /* Empty the queue. */ list_for_each_entry_safe(chunk, tmp, &queue->in_chunk_list, list) { list_del_init(&chunk->list); sctp_chunk_free(chunk); } /* If there is a packet which is currently being worked on, * free it as well. */ if (queue->in_progress) { sctp_inq_chunk_free(queue->in_progress); queue->in_progress = NULL; } } /* Put a new packet in an SCTP inqueue. * We assume that packet->sctp_hdr is set and in host byte order. */ void sctp_inq_push(struct sctp_inq *q, struct sctp_chunk *chunk) { /* Directly call the packet handling routine. */ if (chunk->rcvr->dead) { sctp_chunk_free(chunk); return; } /* We are now calling this either from the soft interrupt * or from the backlog processing. * Eventually, we should clean up inqueue to not rely * on the BH related data structures. */ list_add_tail(&chunk->list, &q->in_chunk_list); if (chunk->asoc) chunk->asoc->stats.ipackets++; q->immediate.func(&q->immediate); } /* Peek at the next chunk on the inqeue. */ struct sctp_chunkhdr *sctp_inq_peek(struct sctp_inq *queue) { struct sctp_chunk *chunk; struct sctp_chunkhdr *ch = NULL; chunk = queue->in_progress; /* If there is no more chunks in this packet, say so */ if (chunk->singleton || chunk->end_of_packet || chunk->pdiscard) return NULL; ch = (struct sctp_chunkhdr *)chunk->chunk_end; return ch; } /* Extract a chunk from an SCTP inqueue. * * WARNING: If you need to put the chunk on another queue, you need to * make a shallow copy (clone) of it. */ struct sctp_chunk *sctp_inq_pop(struct sctp_inq *queue) { struct sctp_chunk *chunk; struct sctp_chunkhdr *ch = NULL; /* The assumption is that we are safe to process the chunks * at this time. */ chunk = queue->in_progress; if (chunk) { /* There is a packet that we have been working on. * Any post processing work to do before we move on? */ if (chunk->singleton || chunk->end_of_packet || chunk->pdiscard) { if (chunk->head_skb == chunk->skb) { chunk->skb = skb_shinfo(chunk->skb)->frag_list; goto new_skb; } if (chunk->skb->next) { chunk->skb = chunk->skb->next; goto new_skb; } sctp_inq_chunk_free(chunk); chunk = queue->in_progress = NULL; } else { /* Nothing to do. Next chunk in the packet, please. */ ch = (struct sctp_chunkhdr *)chunk->chunk_end; /* Force chunk->skb->data to chunk->chunk_end. */ skb_pull(chunk->skb, chunk->chunk_end - chunk->skb->data); /* We are guaranteed to pull a SCTP header. */ } } /* Do we need to take the next packet out of the queue to process? */ if (!chunk) { struct list_head *entry; next_chunk: /* Is the queue empty? */ entry = sctp_list_dequeue(&queue->in_chunk_list); if (!entry) return NULL; chunk = list_entry(entry, struct sctp_chunk, list); if (skb_is_gso(chunk->skb) && skb_is_gso_sctp(chunk->skb)) { /* GSO-marked skbs but without frags, handle * them normally */ if (skb_shinfo(chunk->skb)->frag_list) chunk->head_skb = chunk->skb; /* skbs with "cover letter" */ if (chunk->head_skb && chunk->skb->data_len == chunk->skb->len) chunk->skb = skb_shinfo(chunk->skb)->frag_list; if (WARN_ON(!chunk->skb)) { __SCTP_INC_STATS(dev_net(chunk->skb->dev), SCTP_MIB_IN_PKT_DISCARDS); sctp_chunk_free(chunk); goto next_chunk; } } if (chunk->asoc) sock_rps_save_rxhash(chunk->asoc->base.sk, chunk->skb); queue->in_progress = chunk; new_skb: /* This is the first chunk in the packet. */ ch = (struct sctp_chunkhdr *)chunk->skb->data; chunk->singleton = 1; chunk->data_accepted = 0; chunk->pdiscard = 0; chunk->auth = 0; chunk->has_asconf = 0; chunk->end_of_packet = 0; if (chunk->head_skb) { struct sctp_input_cb *cb = SCTP_INPUT_CB(chunk->skb), *head_cb = SCTP_INPUT_CB(chunk->head_skb); cb->chunk = head_cb->chunk; cb->af = head_cb->af; } } chunk->chunk_hdr = ch; chunk->chunk_end = ((__u8 *)ch) + SCTP_PAD4(ntohs(ch->length)); skb_pull(chunk->skb, sizeof(*ch)); chunk->subh.v = NULL; /* Subheader is no longer valid. */ if (chunk->chunk_end + sizeof(*ch) <= skb_tail_pointer(chunk->skb)) { /* This is not a singleton */ chunk->singleton = 0; } else if (chunk->chunk_end > skb_tail_pointer(chunk->skb)) { /* Discard inside state machine. */ chunk->pdiscard = 1; chunk->chunk_end = skb_tail_pointer(chunk->skb); } else { /* We are at the end of the packet, so mark the chunk * in case we need to send a SACK. */ chunk->end_of_packet = 1; } pr_debug("+++sctp_inq_pop+++ chunk:%p[%s], length:%d, skb->len:%d\n", chunk, sctp_cname(SCTP_ST_CHUNK(chunk->chunk_hdr->type)), ntohs(chunk->chunk_hdr->length), chunk->skb->len); return chunk; } /* Set a top-half handler. * * Originally, we the top-half handler was scheduled as a BH. We now * call the handler directly in sctp_inq_push() at a time that * we know we are lock safe. * The intent is that this routine will pull stuff out of the * inqueue and process it. */ void sctp_inq_set_th_handler(struct sctp_inq *q, work_func_t callback) { INIT_WORK(&q->immediate, callback); } |
| 144 139 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 | /* SPDX-License-Identifier: GPL-2.0-or-later */ /* SCTP kernel implementation * (C) Copyright IBM Corp. 2001, 2004 * Copyright (c) 1999-2000 Cisco, Inc. * Copyright (c) 1999-2001 Motorola, Inc. * Copyright (c) 2001 Intel Corp. * * This file is part of the SCTP kernel implementation * * These are the definitions needed for the tsnmap type. The tsnmap is used * to track out of order TSNs received. * * Please send any bug reports or fixes you make to the * email address(es): * lksctp developers <linux-sctp@vger.kernel.org> * * Written or modified by: * Jon Grimm <jgrimm@us.ibm.com> * La Monte H.P. Yarroll <piggy@acm.org> * Karl Knutson <karl@athena.chicago.il.us> * Sridhar Samudrala <sri@us.ibm.com> */ #include <net/sctp/constants.h> #ifndef __sctp_tsnmap_h__ #define __sctp_tsnmap_h__ /* RFC 2960 12.2 Parameters necessary per association (i.e. the TCB) * Mapping An array of bits or bytes indicating which out of * Array order TSN's have been received (relative to the * Last Rcvd TSN). If no gaps exist, i.e. no out of * order packets have been received, this array * will be set to all zero. This structure may be * in the form of a circular buffer or bit array. */ struct sctp_tsnmap { /* This array counts the number of chunks with each TSN. * It points at one of the two buffers with which we will * ping-pong between. */ unsigned long *tsn_map; /* This is the TSN at tsn_map[0]. */ __u32 base_tsn; /* Last Rcvd : This is the last TSN received in * TSN : sequence. This value is set initially by * : taking the peer's Initial TSN, received in * : the INIT or INIT ACK chunk, and subtracting * : one from it. * * Throughout most of the specification this is called the * "Cumulative TSN ACK Point". In this case, we * ignore the advice in 12.2 in favour of the term * used in the bulk of the text. */ __u32 cumulative_tsn_ack_point; /* This is the highest TSN we've marked. */ __u32 max_tsn_seen; /* This is the minimum number of TSNs we can track. This corresponds * to the size of tsn_map. Note: the overflow_map allows us to * potentially track more than this quantity. */ __u16 len; /* Data chunks pending receipt. used by SCTP_STATUS sockopt */ __u16 pending_data; /* Record duplicate TSNs here. We clear this after * every SACK. Store up to SCTP_MAX_DUP_TSNS worth of * information. */ __u16 num_dup_tsns; __be32 dup_tsns[SCTP_MAX_DUP_TSNS]; }; struct sctp_tsnmap_iter { __u32 start; }; /* Initialize a block of memory as a tsnmap. */ struct sctp_tsnmap *sctp_tsnmap_init(struct sctp_tsnmap *, __u16 len, __u32 initial_tsn, gfp_t gfp); void sctp_tsnmap_free(struct sctp_tsnmap *map); /* Test the tracking state of this TSN. * Returns: * 0 if the TSN has not yet been seen * >0 if the TSN has been seen (duplicate) * <0 if the TSN is invalid (too large to track) */ int sctp_tsnmap_check(const struct sctp_tsnmap *, __u32 tsn); /* Mark this TSN as seen. */ int sctp_tsnmap_mark(struct sctp_tsnmap *, __u32 tsn, struct sctp_transport *trans); /* Mark this TSN and all lower as seen. */ void sctp_tsnmap_skip(struct sctp_tsnmap *map, __u32 tsn); /* Retrieve the Cumulative TSN ACK Point. */ static inline __u32 sctp_tsnmap_get_ctsn(const struct sctp_tsnmap *map) { return map->cumulative_tsn_ack_point; } /* Retrieve the highest TSN we've seen. */ static inline __u32 sctp_tsnmap_get_max_tsn_seen(const struct sctp_tsnmap *map) { return map->max_tsn_seen; } /* How many duplicate TSNs are stored? */ static inline __u16 sctp_tsnmap_num_dups(struct sctp_tsnmap *map) { return map->num_dup_tsns; } /* Return pointer to duplicate tsn array as needed by SACK. */ static inline __be32 *sctp_tsnmap_get_dups(struct sctp_tsnmap *map) { map->num_dup_tsns = 0; return map->dup_tsns; } /* How many gap ack blocks do we have recorded? */ __u16 sctp_tsnmap_num_gabs(struct sctp_tsnmap *map, struct sctp_gap_ack_block *gabs); /* Refresh the count on pending data. */ __u16 sctp_tsnmap_pending(struct sctp_tsnmap *map); /* Is there a gap in the TSN map? */ static inline int sctp_tsnmap_has_gap(const struct sctp_tsnmap *map) { return map->cumulative_tsn_ack_point != map->max_tsn_seen; } /* Mark a duplicate TSN. Note: limit the storage of duplicate TSN * information. */ static inline void sctp_tsnmap_mark_dup(struct sctp_tsnmap *map, __u32 tsn) { if (map->num_dup_tsns < SCTP_MAX_DUP_TSNS) map->dup_tsns[map->num_dup_tsns++] = htonl(tsn); } /* Renege a TSN that was seen. */ void sctp_tsnmap_renege(struct sctp_tsnmap *, __u32 tsn); /* Is there a gap in the TSN map? */ int sctp_tsnmap_has_gap(const struct sctp_tsnmap *); #endif /* __sctp_tsnmap_h__ */ |
| 295 295 293 295 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080 1081 1082 1083 1084 1085 1086 1087 1088 1089 1090 1091 1092 1093 1094 1095 1096 1097 1098 1099 1100 1101 1102 1103 1104 1105 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125 1126 1127 1128 1129 1130 1131 1132 1133 1134 1135 1136 1137 1138 1139 1140 1141 1142 1143 1144 1145 1146 1147 1148 1149 1150 1151 1152 1153 1154 1155 1156 1157 1158 1159 1160 1161 1162 1163 1164 1165 1166 1167 1168 1169 1170 1171 1172 1173 1174 1175 1176 1177 1178 1179 1180 1181 1182 1183 1184 1185 1186 1187 1188 1189 1190 1191 1192 1193 1194 1195 1196 1197 1198 1199 1200 1201 1202 1203 1204 1205 1206 1207 1208 1209 1210 1211 1212 1213 1214 1215 1216 1217 1218 1219 1220 1221 1222 1223 1224 1225 1226 1227 1228 1229 1230 1231 1232 1233 1234 1235 1236 1237 1238 1239 1240 1241 1242 1243 1244 1245 1246 1247 1248 1249 1250 1251 1252 1253 1254 1255 1256 1257 1258 1259 1260 1261 1262 1263 1264 1265 1266 1267 1268 1269 1270 1271 1272 1273 1274 1275 1276 1277 1278 1279 1280 1281 1282 1283 1284 1285 1286 1287 1288 1289 1290 1291 1292 1293 1294 1295 1296 1297 1298 1299 1300 1301 1302 1303 1304 1305 1306 1307 1308 1309 1310 1311 1312 1313 1314 1315 1316 1317 1318 1319 1320 1321 1322 1323 1324 1325 1326 1327 1328 1329 1330 1331 1332 1333 1334 1335 1336 1337 1338 1339 1340 1341 1342 1343 1344 1345 1346 1347 1348 1349 1350 1351 1352 1353 1354 1355 1356 1357 1358 1359 1360 1361 1362 1363 1364 1365 1366 1367 1368 1369 1370 1371 1372 1373 1374 1375 1376 1377 1378 1379 1380 1381 1382 1383 1384 1385 1386 1387 1388 1389 1390 1391 1392 1393 1394 1395 1396 1397 1398 1399 1400 1401 1402 1403 1404 1405 1406 1407 1408 1409 1410 1411 1412 1413 1414 1415 1416 1417 1418 1419 1420 1421 1422 1423 1424 1425 1426 1427 1428 1429 1430 1431 1432 1433 1434 1435 1436 1437 1438 1439 1440 1441 1442 1443 1444 1445 1446 1447 1448 1449 1450 1451 1452 1453 1454 1455 1456 1457 1458 1459 1460 1461 1462 1463 1464 1465 1466 1467 1468 1469 1470 1471 1472 1473 1474 1475 1476 1477 1478 1479 1480 1481 1482 1483 1484 1485 1486 1487 1488 1489 1490 1491 1492 1493 1494 1495 1496 1497 1498 1499 1500 1501 1502 1503 1504 1505 1506 1507 1508 1509 1510 1511 1512 1513 1514 1515 1516 1517 1518 1519 1520 1521 1522 1523 1524 1525 1526 1527 1528 1529 1530 1531 1532 1533 1534 1535 1536 1537 1538 1539 1540 1541 1542 1543 1544 1545 1546 1547 1548 1549 1550 1551 1552 1553 1554 1555 1556 1557 1558 1559 1560 1561 1562 1563 1564 1565 1566 1567 1568 1569 1570 1571 1572 1573 1574 1575 1576 1577 1578 1579 1580 1581 1582 1583 1584 1585 1586 1587 1588 1589 1590 1591 1592 1593 1594 1595 1596 1597 1598 1599 1600 1601 1602 1603 1604 1605 1606 1607 1608 1609 1610 1611 1612 1613 1614 1615 1616 1617 1618 1619 1620 1621 1622 1623 1624 1625 1626 1627 1628 1629 1630 1631 1632 1633 1634 1635 1636 1637 1638 1639 1640 1641 1642 1643 1644 1645 1646 1647 1648 1649 1650 1651 1652 1653 1654 1655 1656 1657 1658 1659 1660 1661 1662 1663 1664 1665 1666 1667 1668 1669 1670 1671 1672 1673 1674 1675 1676 1677 1678 1679 1680 1681 1682 1683 1684 1685 1686 1687 1688 1689 1690 1691 1692 1693 1694 1695 1696 1697 1698 1699 1700 1701 1702 1703 1704 1705 1706 1707 1708 1709 1710 1711 1712 1713 1714 1715 1716 1717 1718 1719 1720 1721 1722 1723 1724 1725 1726 1727 1728 1729 1730 1731 1732 1733 1734 1735 1736 1737 1738 1739 1740 1741 1742 1743 1744 1745 1746 1747 1748 1749 1750 1751 1752 1753 1754 1755 1756 1757 1758 1759 1760 1761 1762 1763 1764 1765 1766 1767 1768 1769 1770 1771 1772 1773 1774 1775 1776 1777 1778 1779 1780 1781 1782 1783 1784 1785 1786 1787 1788 1789 1790 1791 1792 1793 1794 1795 1796 1797 1798 1799 1800 1801 1802 1803 1804 1805 1806 1807 1808 1809 1810 1811 1812 1813 1814 1815 1816 1817 1818 1819 1820 1821 1822 1823 1824 1825 1826 1827 1828 1829 1830 1831 1832 1833 1834 1835 1836 1837 1838 1839 1840 1841 1842 1843 1844 1845 1846 1847 1848 1849 1850 1851 1852 1853 1854 1855 1856 1857 1858 1859 1860 1861 1862 1863 1864 1865 1866 1867 1868 1869 1870 1871 1872 1873 1874 1875 1876 1877 1878 1879 1880 1881 1882 1883 1884 1885 1886 1887 1888 1889 1890 1891 1892 1893 1894 1895 1896 1897 1898 1899 1900 1901 1902 1903 1904 1905 1906 1907 1908 1909 1910 1911 1912 1913 1914 1915 1916 1917 1918 1919 1920 1921 1922 1923 1924 1925 1926 1927 1928 1929 1930 1931 1932 1933 1934 1935 1936 1937 1938 1939 1940 1941 1942 1943 1944 1945 1946 1947 1948 1949 1950 1951 1952 1953 1954 1955 1956 1957 1958 1959 1960 1961 1962 1963 1964 1965 1966 1967 1968 1969 1970 1971 1972 1973 1974 1975 1976 1977 1978 1979 1980 1981 1982 1983 1984 1985 1986 1987 1988 1989 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020 2021 2022 2023 2024 2025 2026 2027 2028 2029 2030 2031 2032 2033 2034 2035 2036 2037 2038 2039 2040 2041 2042 2043 2044 2045 2046 2047 2048 2049 2050 2051 2052 2053 2054 2055 2056 2057 2058 2059 2060 2061 2062 2063 2064 2065 2066 2067 | // SPDX-License-Identifier: GPL-2.0-only /* * fs/dax.c - Direct Access filesystem code * Copyright (c) 2013-2014 Intel Corporation * Author: Matthew Wilcox <matthew.r.wilcox@intel.com> * Author: Ross Zwisler <ross.zwisler@linux.intel.com> */ #include <linux/atomic.h> #include <linux/blkdev.h> #include <linux/buffer_head.h> #include <linux/dax.h> #include <linux/fs.h> #include <linux/highmem.h> #include <linux/memcontrol.h> #include <linux/mm.h> #include <linux/mutex.h> #include <linux/pagevec.h> #include <linux/sched.h> #include <linux/sched/signal.h> #include <linux/uio.h> #include <linux/vmstat.h> #include <linux/pfn_t.h> #include <linux/sizes.h> #include <linux/mmu_notifier.h> #include <linux/iomap.h> #include <linux/rmap.h> #include <asm/pgalloc.h> #define CREATE_TRACE_POINTS #include <trace/events/fs_dax.h> /* We choose 4096 entries - same as per-zone page wait tables */ #define DAX_WAIT_TABLE_BITS 12 #define DAX_WAIT_TABLE_ENTRIES (1 << DAX_WAIT_TABLE_BITS) /* The 'colour' (ie low bits) within a PMD of a page offset. */ #define PG_PMD_COLOUR ((PMD_SIZE >> PAGE_SHIFT) - 1) #define PG_PMD_NR (PMD_SIZE >> PAGE_SHIFT) static wait_queue_head_t wait_table[DAX_WAIT_TABLE_ENTRIES]; static int __init init_dax_wait_table(void) { int i; for (i = 0; i < DAX_WAIT_TABLE_ENTRIES; i++) init_waitqueue_head(wait_table + i); return 0; } fs_initcall(init_dax_wait_table); /* * DAX pagecache entries use XArray value entries so they can't be mistaken * for pages. We use one bit for locking, one bit for the entry size (PMD) * and two more to tell us if the entry is a zero page or an empty entry that * is just used for locking. In total four special bits. * * If the PMD bit isn't set the entry has size PAGE_SIZE, and if the ZERO_PAGE * and EMPTY bits aren't set the entry is a normal DAX entry with a filesystem * block allocation. */ #define DAX_SHIFT (4) #define DAX_LOCKED (1UL << 0) #define DAX_PMD (1UL << 1) #define DAX_ZERO_PAGE (1UL << 2) #define DAX_EMPTY (1UL << 3) static unsigned long dax_to_pfn(void *entry) { return xa_to_value(entry) >> DAX_SHIFT; } static void *dax_make_entry(pfn_t pfn, unsigned long flags) { return xa_mk_value(flags | (pfn_t_to_pfn(pfn) << DAX_SHIFT)); } static bool dax_is_locked(void *entry) { return xa_to_value(entry) & DAX_LOCKED; } static unsigned int dax_entry_order(void *entry) { if (xa_to_value(entry) & DAX_PMD) return PMD_ORDER; return 0; } static unsigned long dax_is_pmd_entry(void *entry) { return xa_to_value(entry) & DAX_PMD; } static bool dax_is_pte_entry(void *entry) { return !(xa_to_value(entry) & DAX_PMD); } static int dax_is_zero_entry(void *entry) { return xa_to_value(entry) & DAX_ZERO_PAGE; } static int dax_is_empty_entry(void *entry) { return xa_to_value(entry) & DAX_EMPTY; } /* * true if the entry that was found is of a smaller order than the entry * we were looking for */ static bool dax_is_conflict(void *entry) { return entry == XA_RETRY_ENTRY; } /* * DAX page cache entry locking */ struct exceptional_entry_key { struct xarray *xa; pgoff_t entry_start; }; struct wait_exceptional_entry_queue { wait_queue_entry_t wait; struct exceptional_entry_key key; }; /** * enum dax_wake_mode: waitqueue wakeup behaviour * @WAKE_ALL: wake all waiters in the waitqueue * @WAKE_NEXT: wake only the first waiter in the waitqueue */ enum dax_wake_mode { WAKE_ALL, WAKE_NEXT, }; static wait_queue_head_t *dax_entry_waitqueue(struct xa_state *xas, void *entry, struct exceptional_entry_key *key) { unsigned long hash; unsigned long index = xas->xa_index; /* * If 'entry' is a PMD, align the 'index' that we use for the wait * queue to the start of that PMD. This ensures that all offsets in * the range covered by the PMD map to the same bit lock. */ if (dax_is_pmd_entry(entry)) index &= ~PG_PMD_COLOUR; key->xa = xas->xa; key->entry_start = index; hash = hash_long((unsigned long)xas->xa ^ index, DAX_WAIT_TABLE_BITS); return wait_table + hash; } static int wake_exceptional_entry_func(wait_queue_entry_t *wait, unsigned int mode, int sync, void *keyp) { struct exceptional_entry_key *key = keyp; struct wait_exceptional_entry_queue *ewait = container_of(wait, struct wait_exceptional_entry_queue, wait); if (key->xa != ewait->key.xa || key->entry_start != ewait->key.entry_start) return 0; return autoremove_wake_function(wait, mode, sync, NULL); } /* * @entry may no longer be the entry at the index in the mapping. * The important information it's conveying is whether the entry at * this index used to be a PMD entry. */ static void dax_wake_entry(struct xa_state *xas, void *entry, enum dax_wake_mode mode) { struct exceptional_entry_key key; wait_queue_head_t *wq; wq = dax_entry_waitqueue(xas, entry, &key); /* * Checking for locked entry and prepare_to_wait_exclusive() happens * under the i_pages lock, ditto for entry handling in our callers. * So at this point all tasks that could have seen our entry locked * must be in the waitqueue and the following check will see them. */ if (waitqueue_active(wq)) __wake_up(wq, TASK_NORMAL, mode == WAKE_ALL ? 0 : 1, &key); } /* * Look up entry in page cache, wait for it to become unlocked if it * is a DAX entry and return it. The caller must subsequently call * put_unlocked_entry() if it did not lock the entry or dax_unlock_entry() * if it did. The entry returned may have a larger order than @order. * If @order is larger than the order of the entry found in i_pages, this * function returns a dax_is_conflict entry. * * Must be called with the i_pages lock held. */ static void *get_unlocked_entry(struct xa_state *xas, unsigned int order) { void *entry; struct wait_exceptional_entry_queue ewait; wait_queue_head_t *wq; init_wait(&ewait.wait); ewait.wait.func = wake_exceptional_entry_func; for (;;) { entry = xas_find_conflict(xas); if (!entry || WARN_ON_ONCE(!xa_is_value(entry))) return entry; if (dax_entry_order(entry) < order) return XA_RETRY_ENTRY; if (!dax_is_locked(entry)) return entry; wq = dax_entry_waitqueue(xas, entry, &ewait.key); prepare_to_wait_exclusive(wq, &ewait.wait, TASK_UNINTERRUPTIBLE); xas_unlock_irq(xas); xas_reset(xas); schedule(); finish_wait(wq, &ewait.wait); xas_lock_irq(xas); } } /* * The only thing keeping the address space around is the i_pages lock * (it's cycled in clear_inode() after removing the entries from i_pages) * After we call xas_unlock_irq(), we cannot touch xas->xa. */ static void wait_entry_unlocked(struct xa_state *xas, void *entry) { struct wait_exceptional_entry_queue ewait; wait_queue_head_t *wq; init_wait(&ewait.wait); ewait.wait.func = wake_exceptional_entry_func; wq = dax_entry_waitqueue(xas, entry, &ewait.key); /* * Unlike get_unlocked_entry() there is no guarantee that this * path ever successfully retrieves an unlocked entry before an * inode dies. Perform a non-exclusive wait in case this path * never successfully performs its own wake up. */ prepare_to_wait(wq, &ewait.wait, TASK_UNINTERRUPTIBLE); xas_unlock_irq(xas); schedule(); finish_wait(wq, &ewait.wait); } static void put_unlocked_entry(struct xa_state *xas, void *entry, enum dax_wake_mode mode) { if (entry && !dax_is_conflict(entry)) dax_wake_entry(xas, entry, mode); } /* * We used the xa_state to get the entry, but then we locked the entry and * dropped the xa_lock, so we know the xa_state is stale and must be reset * before use. */ static void dax_unlock_entry(struct xa_state *xas, void *entry) { void *old; BUG_ON(dax_is_locked(entry)); xas_reset(xas); xas_lock_irq(xas); old = xas_store(xas, entry); xas_unlock_irq(xas); BUG_ON(!dax_is_locked(old)); dax_wake_entry(xas, entry, WAKE_NEXT); } /* * Return: The entry stored at this location before it was locked. */ static void *dax_lock_entry(struct xa_state *xas, void *entry) { unsigned long v = xa_to_value(entry); return xas_store(xas, xa_mk_value(v | DAX_LOCKED)); } static unsigned long dax_entry_size(void *entry) { if (dax_is_zero_entry(entry)) return 0; else if (dax_is_empty_entry(entry)) return 0; else if (dax_is_pmd_entry(entry)) return PMD_SIZE; else return PAGE_SIZE; } static unsigned long dax_end_pfn(void *entry) { return dax_to_pfn(entry) + dax_entry_size(entry) / PAGE_SIZE; } /* * Iterate through all mapped pfns represented by an entry, i.e. skip * 'empty' and 'zero' entries. */ #define for_each_mapped_pfn(entry, pfn) \ for (pfn = dax_to_pfn(entry); \ pfn < dax_end_pfn(entry); pfn++) static inline bool dax_page_is_shared(struct page *page) { return page->mapping == PAGE_MAPPING_DAX_SHARED; } /* * Set the page->mapping with PAGE_MAPPING_DAX_SHARED flag, increase the * refcount. */ static inline void dax_page_share_get(struct page *page) { if (page->mapping != PAGE_MAPPING_DAX_SHARED) { /* * Reset the index if the page was already mapped * regularly before. */ if (page->mapping) page->share = 1; page->mapping = PAGE_MAPPING_DAX_SHARED; } page->share++; } static inline unsigned long dax_page_share_put(struct page *page) { return --page->share; } /* * When it is called in dax_insert_entry(), the shared flag will indicate that * whether this entry is shared by multiple files. If so, set the page->mapping * PAGE_MAPPING_DAX_SHARED, and use page->share as refcount. */ static void dax_associate_entry(void *entry, struct address_space *mapping, struct vm_area_struct *vma, unsigned long address, bool shared) { unsigned long size = dax_entry_size(entry), pfn, index; int i = 0; if (IS_ENABLED(CONFIG_FS_DAX_LIMITED)) return; index = linear_page_index(vma, address & ~(size - 1)); for_each_mapped_pfn(entry, pfn) { struct page *page = pfn_to_page(pfn); if (shared) { dax_page_share_get(page); } else { WARN_ON_ONCE(page->mapping); page->mapping = mapping; page->index = index + i++; } } } static void dax_disassociate_entry(void *entry, struct address_space *mapping, bool trunc) { unsigned long pfn; if (IS_ENABLED(CONFIG_FS_DAX_LIMITED)) return; for_each_mapped_pfn(entry, pfn) { struct page *page = pfn_to_page(pfn); WARN_ON_ONCE(trunc && page_ref_count(page) > 1); if (dax_page_is_shared(page)) { /* keep the shared flag if this page is still shared */ if (dax_page_share_put(page) > 0) continue; } else WARN_ON_ONCE(page->mapping && page->mapping != mapping); page->mapping = NULL; page->index = 0; } } static struct page *dax_busy_page(void *entry) { unsigned long pfn; for_each_mapped_pfn(entry, pfn) { struct page *page = pfn_to_page(pfn); if (page_ref_count(page) > 1) return page; } return NULL; } /** * dax_lock_folio - Lock the DAX entry corresponding to a folio * @folio: The folio whose entry we want to lock * * Context: Process context. * Return: A cookie to pass to dax_unlock_folio() or 0 if the entry could * not be locked. */ dax_entry_t dax_lock_folio(struct folio *folio) { XA_STATE(xas, NULL, 0); void *entry; /* Ensure folio->mapping isn't freed while we look at it */ rcu_read_lock(); for (;;) { struct address_space *mapping = READ_ONCE(folio->mapping); entry = NULL; if (!mapping || !dax_mapping(mapping)) break; /* * In the device-dax case there's no need to lock, a * struct dev_pagemap pin is sufficient to keep the * inode alive, and we assume we have dev_pagemap pin * otherwise we would not have a valid pfn_to_page() * translation. */ entry = (void *)~0UL; if (S_ISCHR(mapping->host->i_mode)) break; xas.xa = &mapping->i_pages; xas_lock_irq(&xas); if (mapping != folio->mapping) { xas_unlock_irq(&xas); continue; } xas_set(&xas, folio->index); entry = xas_load(&xas); if (dax_is_locked(entry)) { rcu_read_unlock(); wait_entry_unlocked(&xas, entry); rcu_read_lock(); continue; } dax_lock_entry(&xas, entry); xas_unlock_irq(&xas); break; } rcu_read_unlock(); return (dax_entry_t)entry; } void dax_unlock_folio(struct folio *folio, dax_entry_t cookie) { struct address_space *mapping = folio->mapping; XA_STATE(xas, &mapping->i_pages, folio->index); if (S_ISCHR(mapping->host->i_mode)) return; dax_unlock_entry(&xas, (void *)cookie); } /* * dax_lock_mapping_entry - Lock the DAX entry corresponding to a mapping * @mapping: the file's mapping whose entry we want to lock * @index: the offset within this file * @page: output the dax page corresponding to this dax entry * * Return: A cookie to pass to dax_unlock_mapping_entry() or 0 if the entry * could not be locked. */ dax_entry_t dax_lock_mapping_entry(struct address_space *mapping, pgoff_t index, struct page **page) { XA_STATE(xas, NULL, 0); void *entry; rcu_read_lock(); for (;;) { entry = NULL; if (!dax_mapping(mapping)) break; xas.xa = &mapping->i_pages; xas_lock_irq(&xas); xas_set(&xas, index); entry = xas_load(&xas); if (dax_is_locked(entry)) { rcu_read_unlock(); wait_entry_unlocked(&xas, entry); rcu_read_lock(); continue; } if (!entry || dax_is_zero_entry(entry) || dax_is_empty_entry(entry)) { /* * Because we are looking for entry from file's mapping * and index, so the entry may not be inserted for now, * or even a zero/empty entry. We don't think this is * an error case. So, return a special value and do * not output @page. */ entry = (void *)~0UL; } else { *page = pfn_to_page(dax_to_pfn(entry)); dax_lock_entry(&xas, entry); } xas_unlock_irq(&xas); break; } rcu_read_unlock(); return (dax_entry_t)entry; } void dax_unlock_mapping_entry(struct address_space *mapping, pgoff_t index, dax_entry_t cookie) { XA_STATE(xas, &mapping->i_pages, index); if (cookie == ~0UL) return; dax_unlock_entry(&xas, (void *)cookie); } /* * Find page cache entry at given index. If it is a DAX entry, return it * with the entry locked. If the page cache doesn't contain an entry at * that index, add a locked empty entry. * * When requesting an entry with size DAX_PMD, grab_mapping_entry() will * either return that locked entry or will return VM_FAULT_FALLBACK. * This will happen if there are any PTE entries within the PMD range * that we are requesting. * * We always favor PTE entries over PMD entries. There isn't a flow where we * evict PTE entries in order to 'upgrade' them to a PMD entry. A PMD * insertion will fail if it finds any PTE entries already in the tree, and a * PTE insertion will cause an existing PMD entry to be unmapped and * downgraded to PTE entries. This happens for both PMD zero pages as * well as PMD empty entries. * * The exception to this downgrade path is for PMD entries that have * real storage backing them. We will leave these real PMD entries in * the tree, and PTE writes will simply dirty the entire PMD entry. * * Note: Unlike filemap_fault() we don't honor FAULT_FLAG_RETRY flags. For * persistent memory the benefit is doubtful. We can add that later if we can * show it helps. * * On error, this function does not return an ERR_PTR. Instead it returns * a VM_FAULT code, encoded as an xarray internal entry. The ERR_PTR values * overlap with xarray value entries. */ static void *grab_mapping_entry(struct xa_state *xas, struct address_space *mapping, unsigned int order) { unsigned long index = xas->xa_index; bool pmd_downgrade; /* splitting PMD entry into PTE entries? */ void *entry; retry: pmd_downgrade = false; xas_lock_irq(xas); entry = get_unlocked_entry(xas, order); if (entry) { if (dax_is_conflict(entry)) goto fallback; if (!xa_is_value(entry)) { xas_set_err(xas, -EIO); goto out_unlock; } if (order == 0) { if (dax_is_pmd_entry(entry) && (dax_is_zero_entry(entry) || dax_is_empty_entry(entry))) { pmd_downgrade = true; } } } if (pmd_downgrade) { /* * Make sure 'entry' remains valid while we drop * the i_pages lock. */ dax_lock_entry(xas, entry); /* * Besides huge zero pages the only other thing that gets * downgraded are empty entries which don't need to be * unmapped. */ if (dax_is_zero_entry(entry)) { xas_unlock_irq(xas); unmap_mapping_pages(mapping, xas->xa_index & ~PG_PMD_COLOUR, PG_PMD_NR, false); xas_reset(xas); xas_lock_irq(xas); } dax_disassociate_entry(entry, mapping, false); xas_store(xas, NULL); /* undo the PMD join */ dax_wake_entry(xas, entry, WAKE_ALL); mapping->nrpages -= PG_PMD_NR; entry = NULL; xas_set(xas, index); } if (entry) { dax_lock_entry(xas, entry); } else { unsigned long flags = DAX_EMPTY; if (order > 0) flags |= DAX_PMD; entry = dax_make_entry(pfn_to_pfn_t(0), flags); dax_lock_entry(xas, entry); if (xas_error(xas)) goto out_unlock; mapping->nrpages += 1UL << order; } out_unlock: xas_unlock_irq(xas); if (xas_nomem(xas, mapping_gfp_mask(mapping) & ~__GFP_HIGHMEM)) goto retry; if (xas->xa_node == XA_ERROR(-ENOMEM)) return xa_mk_internal(VM_FAULT_OOM); if (xas_error(xas)) return xa_mk_internal(VM_FAULT_SIGBUS); return entry; fallback: xas_unlock_irq(xas); return xa_mk_internal(VM_FAULT_FALLBACK); } /** * dax_layout_busy_page_range - find first pinned page in @mapping * @mapping: address space to scan for a page with ref count > 1 * @start: Starting offset. Page containing 'start' is included. * @end: End offset. Page containing 'end' is included. If 'end' is LLONG_MAX, * pages from 'start' till the end of file are included. * * DAX requires ZONE_DEVICE mapped pages. These pages are never * 'onlined' to the page allocator so they are considered idle when * page->count == 1. A filesystem uses this interface to determine if * any page in the mapping is busy, i.e. for DMA, or other * get_user_pages() usages. * * It is expected that the filesystem is holding locks to block the * establishment of new mappings in this address_space. I.e. it expects * to be able to run unmap_mapping_range() and subsequently not race * mapping_mapped() becoming true. */ struct page *dax_layout_busy_page_range(struct address_space *mapping, loff_t start, loff_t end) { void *entry; unsigned int scanned = 0; struct page *page = NULL; pgoff_t start_idx = start >> PAGE_SHIFT; pgoff_t end_idx; XA_STATE(xas, &mapping->i_pages, start_idx); /* * In the 'limited' case get_user_pages() for dax is disabled. */ if (IS_ENABLED(CONFIG_FS_DAX_LIMITED)) return NULL; if (!dax_mapping(mapping) || !mapping_mapped(mapping)) return NULL; /* If end == LLONG_MAX, all pages from start to till end of file */ if (end == LLONG_MAX) end_idx = ULONG_MAX; else end_idx = end >> PAGE_SHIFT; /* * If we race get_user_pages_fast() here either we'll see the * elevated page count in the iteration and wait, or * get_user_pages_fast() will see that the page it took a reference * against is no longer mapped in the page tables and bail to the * get_user_pages() slow path. The slow path is protected by * pte_lock() and pmd_lock(). New references are not taken without * holding those locks, and unmap_mapping_pages() will not zero the * pte or pmd without holding the respective lock, so we are * guaranteed to either see new references or prevent new * references from being established. */ unmap_mapping_pages(mapping, start_idx, end_idx - start_idx + 1, 0); xas_lock_irq(&xas); xas_for_each(&xas, entry, end_idx) { if (WARN_ON_ONCE(!xa_is_value(entry))) continue; if (unlikely(dax_is_locked(entry))) entry = get_unlocked_entry(&xas, 0); if (entry) page = dax_busy_page(entry); put_unlocked_entry(&xas, entry, WAKE_NEXT); if (page) break; if (++scanned % XA_CHECK_SCHED) continue; xas_pause(&xas); xas_unlock_irq(&xas); cond_resched(); xas_lock_irq(&xas); } xas_unlock_irq(&xas); return page; } EXPORT_SYMBOL_GPL(dax_layout_busy_page_range); struct page *dax_layout_busy_page(struct address_space *mapping) { return dax_layout_busy_page_range(mapping, 0, LLONG_MAX); } EXPORT_SYMBOL_GPL(dax_layout_busy_page); static int __dax_invalidate_entry(struct address_space *mapping, pgoff_t index, bool trunc) { XA_STATE(xas, &mapping->i_pages, index); int ret = 0; void *entry; xas_lock_irq(&xas); entry = get_unlocked_entry(&xas, 0); if (!entry || WARN_ON_ONCE(!xa_is_value(entry))) goto out; if (!trunc && (xas_get_mark(&xas, PAGECACHE_TAG_DIRTY) || xas_get_mark(&xas, PAGECACHE_TAG_TOWRITE))) goto out; dax_disassociate_entry(entry, mapping, trunc); xas_store(&xas, NULL); mapping->nrpages -= 1UL << dax_entry_order(entry); ret = 1; out: put_unlocked_entry(&xas, entry, WAKE_ALL); xas_unlock_irq(&xas); return ret; } static int __dax_clear_dirty_range(struct address_space *mapping, pgoff_t start, pgoff_t end) { XA_STATE(xas, &mapping->i_pages, start); unsigned int scanned = 0; void *entry; xas_lock_irq(&xas); xas_for_each(&xas, entry, end) { entry = get_unlocked_entry(&xas, 0); xas_clear_mark(&xas, PAGECACHE_TAG_DIRTY); xas_clear_mark(&xas, PAGECACHE_TAG_TOWRITE); put_unlocked_entry(&xas, entry, WAKE_NEXT); if (++scanned % XA_CHECK_SCHED) continue; xas_pause(&xas); xas_unlock_irq(&xas); cond_resched(); xas_lock_irq(&xas); } xas_unlock_irq(&xas); return 0; } /* * Delete DAX entry at @index from @mapping. Wait for it * to be unlocked before deleting it. */ int dax_delete_mapping_entry(struct address_space *mapping, pgoff_t index) { int ret = __dax_invalidate_entry(mapping, index, true); /* * This gets called from truncate / punch_hole path. As such, the caller * must hold locks protecting against concurrent modifications of the * page cache (usually fs-private i_mmap_sem for writing). Since the * caller has seen a DAX entry for this index, we better find it * at that index as well... */ WARN_ON_ONCE(!ret); return ret; } /* * Invalidate DAX entry if it is clean. */ int dax_invalidate_mapping_entry_sync(struct address_space *mapping, pgoff_t index) { return __dax_invalidate_entry(mapping, index, false); } static pgoff_t dax_iomap_pgoff(const struct iomap *iomap, loff_t pos) { return PHYS_PFN(iomap->addr + (pos & PAGE_MASK) - iomap->offset); } static int copy_cow_page_dax(struct vm_fault *vmf, const struct iomap_iter *iter) { pgoff_t pgoff = dax_iomap_pgoff(&iter->iomap, iter->pos); void *vto, *kaddr; long rc; int id; id = dax_read_lock(); rc = dax_direct_access(iter->iomap.dax_dev, pgoff, 1, DAX_ACCESS, &kaddr, NULL); if (rc < 0) { dax_read_unlock(id); return rc; } vto = kmap_atomic(vmf->cow_page); copy_user_page(vto, kaddr, vmf->address, vmf->cow_page); kunmap_atomic(vto); dax_read_unlock(id); return 0; } /* * MAP_SYNC on a dax mapping guarantees dirty metadata is * flushed on write-faults (non-cow), but not read-faults. */ static bool dax_fault_is_synchronous(const struct iomap_iter *iter, struct vm_area_struct *vma) { return (iter->flags & IOMAP_WRITE) && (vma->vm_flags & VM_SYNC) && (iter->iomap.flags & IOMAP_F_DIRTY); } /* * By this point grab_mapping_entry() has ensured that we have a locked entry * of the appropriate size so we don't have to worry about downgrading PMDs to * PTEs. If we happen to be trying to insert a PTE and there is a PMD * already in the tree, we will skip the insertion and just dirty the PMD as * appropriate. */ static void *dax_insert_entry(struct xa_state *xas, struct vm_fault *vmf, const struct iomap_iter *iter, void *entry, pfn_t pfn, unsigned long flags) { struct address_space *mapping = vmf->vma->vm_file->f_mapping; void *new_entry = dax_make_entry(pfn, flags); bool write = iter->flags & IOMAP_WRITE; bool dirty = write && !dax_fault_is_synchronous(iter, vmf->vma); bool shared = iter->iomap.flags & IOMAP_F_SHARED; if (dirty) __mark_inode_dirty(mapping->host, I_DIRTY_PAGES); if (shared || (dax_is_zero_entry(entry) && !(flags & DAX_ZERO_PAGE))) { unsigned long index = xas->xa_index; /* we are replacing a zero page with block mapping */ if (dax_is_pmd_entry(entry)) unmap_mapping_pages(mapping, index & ~PG_PMD_COLOUR, PG_PMD_NR, false); else /* pte entry */ unmap_mapping_pages(mapping, index, 1, false); } xas_reset(xas); xas_lock_irq(xas); if (shared || dax_is_zero_entry(entry) || dax_is_empty_entry(entry)) { void *old; dax_disassociate_entry(entry, mapping, false); dax_associate_entry(new_entry, mapping, vmf->vma, vmf->address, shared); /* * Only swap our new entry into the page cache if the current * entry is a zero page or an empty entry. If a normal PTE or * PMD entry is already in the cache, we leave it alone. This * means that if we are trying to insert a PTE and the * existing entry is a PMD, we will just leave the PMD in the * tree and dirty it if necessary. */ old = dax_lock_entry(xas, new_entry); WARN_ON_ONCE(old != xa_mk_value(xa_to_value(entry) | DAX_LOCKED)); entry = new_entry; } else { xas_load(xas); /* Walk the xa_state */ } if (dirty) xas_set_mark(xas, PAGECACHE_TAG_DIRTY); if (write && shared) xas_set_mark(xas, PAGECACHE_TAG_TOWRITE); xas_unlock_irq(xas); return entry; } static int dax_writeback_one(struct xa_state *xas, struct dax_device *dax_dev, struct address_space *mapping, void *entry) { unsigned long pfn, index, count, end; long ret = 0; struct vm_area_struct *vma; /* * A page got tagged dirty in DAX mapping? Something is seriously * wrong. */ if (WARN_ON(!xa_is_value(entry))) return -EIO; if (unlikely(dax_is_locked(entry))) { void *old_entry = entry; entry = get_unlocked_entry(xas, 0); /* Entry got punched out / reallocated? */ if (!entry || WARN_ON_ONCE(!xa_is_value(entry))) goto put_unlocked; /* * Entry got reallocated elsewhere? No need to writeback. * We have to compare pfns as we must not bail out due to * difference in lockbit or entry type. */ if (dax_to_pfn(old_entry) != dax_to_pfn(entry)) goto put_unlocked; if (WARN_ON_ONCE(dax_is_empty_entry(entry) || dax_is_zero_entry(entry))) { ret = -EIO; goto put_unlocked; } /* Another fsync thread may have already done this entry */ if (!xas_get_mark(xas, PAGECACHE_TAG_TOWRITE)) goto put_unlocked; } /* Lock the entry to serialize with page faults */ dax_lock_entry(xas, entry); /* * We can clear the tag now but we have to be careful so that concurrent * dax_writeback_one() calls for the same index cannot finish before we * actually flush the caches. This is achieved as the calls will look * at the entry only under the i_pages lock and once they do that * they will see the entry locked and wait for it to unlock. */ xas_clear_mark(xas, PAGECACHE_TAG_TOWRITE); xas_unlock_irq(xas); /* * If dax_writeback_mapping_range() was given a wbc->range_start * in the middle of a PMD, the 'index' we use needs to be * aligned to the start of the PMD. * This allows us to flush for PMD_SIZE and not have to worry about * partial PMD writebacks. */ pfn = dax_to_pfn(entry); count = 1UL << dax_entry_order(entry); index = xas->xa_index & ~(count - 1); end = index + count - 1; /* Walk all mappings of a given index of a file and writeprotect them */ i_mmap_lock_read(mapping); vma_interval_tree_foreach(vma, &mapping->i_mmap, index, end) { pfn_mkclean_range(pfn, count, index, vma); cond_resched(); } i_mmap_unlock_read(mapping); dax_flush(dax_dev, page_address(pfn_to_page(pfn)), count * PAGE_SIZE); /* * After we have flushed the cache, we can clear the dirty tag. There * cannot be new dirty data in the pfn after the flush has completed as * the pfn mappings are writeprotected and fault waits for mapping * entry lock. */ xas_reset(xas); xas_lock_irq(xas); xas_store(xas, entry); xas_clear_mark(xas, PAGECACHE_TAG_DIRTY); dax_wake_entry(xas, entry, WAKE_NEXT); trace_dax_writeback_one(mapping->host, index, count); return ret; put_unlocked: put_unlocked_entry(xas, entry, WAKE_NEXT); return ret; } /* * Flush the mapping to the persistent domain within the byte range of [start, * end]. This is required by data integrity operations to ensure file data is * on persistent storage prior to completion of the operation. */ int dax_writeback_mapping_range(struct address_space *mapping, struct dax_device *dax_dev, struct writeback_control *wbc) { XA_STATE(xas, &mapping->i_pages, wbc->range_start >> PAGE_SHIFT); struct inode *inode = mapping->host; pgoff_t end_index = wbc->range_end >> PAGE_SHIFT; void *entry; int ret = 0; unsigned int scanned = 0; if (WARN_ON_ONCE(inode->i_blkbits != PAGE_SHIFT)) return -EIO; if (mapping_empty(mapping) || wbc->sync_mode != WB_SYNC_ALL) return 0; trace_dax_writeback_range(inode, xas.xa_index, end_index); tag_pages_for_writeback(mapping, xas.xa_index, end_index); xas_lock_irq(&xas); xas_for_each_marked(&xas, entry, end_index, PAGECACHE_TAG_TOWRITE) { ret = dax_writeback_one(&xas, dax_dev, mapping, entry); if (ret < 0) { mapping_set_error(mapping, ret); break; } if (++scanned % XA_CHECK_SCHED) continue; xas_pause(&xas); xas_unlock_irq(&xas); cond_resched(); xas_lock_irq(&xas); } xas_unlock_irq(&xas); trace_dax_writeback_range_done(inode, xas.xa_index, end_index); return ret; } EXPORT_SYMBOL_GPL(dax_writeback_mapping_range); static int dax_iomap_direct_access(const struct iomap *iomap, loff_t pos, size_t size, void **kaddr, pfn_t *pfnp) { pgoff_t pgoff = dax_iomap_pgoff(iomap, pos); int id, rc = 0; long length; id = dax_read_lock(); length = dax_direct_access(iomap->dax_dev, pgoff, PHYS_PFN(size), DAX_ACCESS, kaddr, pfnp); if (length < 0) { rc = length; goto out; } if (!pfnp) goto out_check_addr; rc = -EINVAL; if (PFN_PHYS(length) < size) goto out; if (pfn_t_to_pfn(*pfnp) & (PHYS_PFN(size)-1)) goto out; /* For larger pages we need devmap */ if (length > 1 && !pfn_t_devmap(*pfnp)) goto out; rc = 0; out_check_addr: if (!kaddr) goto out; if (!*kaddr) rc = -EFAULT; out: dax_read_unlock(id); return rc; } /** * dax_iomap_copy_around - Prepare for an unaligned write to a shared/cow page * by copying the data before and after the range to be written. * @pos: address to do copy from. * @length: size of copy operation. * @align_size: aligned w.r.t align_size (either PMD_SIZE or PAGE_SIZE) * @srcmap: iomap srcmap * @daddr: destination address to copy to. * * This can be called from two places. Either during DAX write fault (page * aligned), to copy the length size data to daddr. Or, while doing normal DAX * write operation, dax_iomap_iter() might call this to do the copy of either * start or end unaligned address. In the latter case the rest of the copy of * aligned ranges is taken care by dax_iomap_iter() itself. * If the srcmap contains invalid data, such as HOLE and UNWRITTEN, zero the * area to make sure no old data remains. */ static int dax_iomap_copy_around(loff_t pos, uint64_t length, size_t align_size, const struct iomap *srcmap, void *daddr) { loff_t head_off = pos & (align_size - 1); size_t size = ALIGN(head_off + length, align_size); loff_t end = pos + length; loff_t pg_end = round_up(end, align_size); /* copy_all is usually in page fault case */ bool copy_all = head_off == 0 && end == pg_end; /* zero the edges if srcmap is a HOLE or IOMAP_UNWRITTEN */ bool zero_edge = srcmap->flags & IOMAP_F_SHARED || srcmap->type == IOMAP_UNWRITTEN; void *saddr = NULL; int ret = 0; if (!zero_edge) { ret = dax_iomap_direct_access(srcmap, pos, size, &saddr, NULL); if (ret) return dax_mem2blk_err(ret); } if (copy_all) { if (zero_edge) memset(daddr, 0, size); else ret = copy_mc_to_kernel(daddr, saddr, length); goto out; } /* Copy the head part of the range */ if (head_off) { if (zero_edge) memset(daddr, 0, head_off); else { ret = copy_mc_to_kernel(daddr, saddr, head_off); if (ret) return -EIO; } } /* Copy the tail part of the range */ if (end < pg_end) { loff_t tail_off = head_off + length; loff_t tail_len = pg_end - end; if (zero_edge) memset(daddr + tail_off, 0, tail_len); else { ret = copy_mc_to_kernel(daddr + tail_off, saddr + tail_off, tail_len); if (ret) return -EIO; } } out: if (zero_edge) dax_flush(srcmap->dax_dev, daddr, size); return ret ? -EIO : 0; } /* * The user has performed a load from a hole in the file. Allocating a new * page in the file would cause excessive storage usage for workloads with * sparse files. Instead we insert a read-only mapping of the 4k zero page. * If this page is ever written to we will re-fault and change the mapping to * point to real DAX storage instead. */ static vm_fault_t dax_load_hole(struct xa_state *xas, struct vm_fault *vmf, const struct iomap_iter *iter, void **entry) { struct inode *inode = iter->inode; unsigned long vaddr = vmf->address; pfn_t pfn = pfn_to_pfn_t(my_zero_pfn(vaddr)); vm_fault_t ret; *entry = dax_insert_entry(xas, vmf, iter, *entry, pfn, DAX_ZERO_PAGE); ret = vmf_insert_mixed(vmf->vma, vaddr, pfn); trace_dax_load_hole(inode, vmf, ret); return ret; } #ifdef CONFIG_FS_DAX_PMD static vm_fault_t dax_pmd_load_hole(struct xa_state *xas, struct vm_fault *vmf, const struct iomap_iter *iter, void **entry) { struct address_space *mapping = vmf->vma->vm_file->f_mapping; unsigned long pmd_addr = vmf->address & PMD_MASK; struct vm_area_struct *vma = vmf->vma; struct inode *inode = mapping->host; pgtable_t pgtable = NULL; struct folio *zero_folio; spinlock_t *ptl; pmd_t pmd_entry; pfn_t pfn; zero_folio = mm_get_huge_zero_folio(vmf->vma->vm_mm); if (unlikely(!zero_folio)) goto fallback; pfn = page_to_pfn_t(&zero_folio->page); *entry = dax_insert_entry(xas, vmf, iter, *entry, pfn, DAX_PMD | DAX_ZERO_PAGE); if (arch_needs_pgtable_deposit()) { pgtable = pte_alloc_one(vma->vm_mm); if (!pgtable) return VM_FAULT_OOM; } ptl = pmd_lock(vmf->vma->vm_mm, vmf->pmd); if (!pmd_none(*(vmf->pmd))) { spin_unlock(ptl); goto fallback; } if (pgtable) { pgtable_trans_huge_deposit(vma->vm_mm, vmf->pmd, pgtable); mm_inc_nr_ptes(vma->vm_mm); } pmd_entry = mk_pmd(&zero_folio->page, vmf->vma->vm_page_prot); pmd_entry = pmd_mkhuge(pmd_entry); set_pmd_at(vmf->vma->vm_mm, pmd_addr, vmf->pmd, pmd_entry); spin_unlock(ptl); trace_dax_pmd_load_hole(inode, vmf, zero_folio, *entry); return VM_FAULT_NOPAGE; fallback: if (pgtable) pte_free(vma->vm_mm, pgtable); trace_dax_pmd_load_hole_fallback(inode, vmf, zero_folio, *entry); return VM_FAULT_FALLBACK; } #else static vm_fault_t dax_pmd_load_hole(struct xa_state *xas, struct vm_fault *vmf, const struct iomap_iter *iter, void **entry) { return VM_FAULT_FALLBACK; } #endif /* CONFIG_FS_DAX_PMD */ static s64 dax_unshare_iter(struct iomap_iter *iter) { struct iomap *iomap = &iter->iomap; const struct iomap *srcmap = iomap_iter_srcmap(iter); loff_t pos = iter->pos; loff_t length = iomap_length(iter); int id = 0; s64 ret = 0; void *daddr = NULL, *saddr = NULL; /* don't bother with blocks that are not shared to start with */ if (!(iomap->flags & IOMAP_F_SHARED)) return length; id = dax_read_lock(); ret = dax_iomap_direct_access(iomap, pos, length, &daddr, NULL); if (ret < 0) goto out_unlock; /* zero the distance if srcmap is HOLE or UNWRITTEN */ if (srcmap->flags & IOMAP_F_SHARED || srcmap->type == IOMAP_UNWRITTEN) { memset(daddr, 0, length); dax_flush(iomap->dax_dev, daddr, length); ret = length; goto out_unlock; } ret = dax_iomap_direct_access(srcmap, pos, length, &saddr, NULL); if (ret < 0) goto out_unlock; if (copy_mc_to_kernel(daddr, saddr, length) == 0) ret = length; else ret = -EIO; out_unlock: dax_read_unlock(id); return dax_mem2blk_err(ret); } int dax_file_unshare(struct inode *inode, loff_t pos, loff_t len, const struct iomap_ops *ops) { struct iomap_iter iter = { .inode = inode, .pos = pos, .len = len, .flags = IOMAP_WRITE | IOMAP_UNSHARE | IOMAP_DAX, }; int ret; while ((ret = iomap_iter(&iter, ops)) > 0) iter.processed = dax_unshare_iter(&iter); return ret; } EXPORT_SYMBOL_GPL(dax_file_unshare); static int dax_memzero(struct iomap_iter *iter, loff_t pos, size_t size) { const struct iomap *iomap = &iter->iomap; const struct iomap *srcmap = iomap_iter_srcmap(iter); unsigned offset = offset_in_page(pos); pgoff_t pgoff = dax_iomap_pgoff(iomap, pos); void *kaddr; long ret; ret = dax_direct_access(iomap->dax_dev, pgoff, 1, DAX_ACCESS, &kaddr, NULL); if (ret < 0) return dax_mem2blk_err(ret); memset(kaddr + offset, 0, size); if (iomap->flags & IOMAP_F_SHARED) ret = dax_iomap_copy_around(pos, size, PAGE_SIZE, srcmap, kaddr); else dax_flush(iomap->dax_dev, kaddr + offset, size); return ret; } static s64 dax_zero_iter(struct iomap_iter *iter, bool *did_zero) { const struct iomap *iomap = &iter->iomap; const struct iomap *srcmap = iomap_iter_srcmap(iter); loff_t pos = iter->pos; u64 length = iomap_length(iter); s64 written = 0; /* already zeroed? we're done. */ if (srcmap->type == IOMAP_HOLE || srcmap->type == IOMAP_UNWRITTEN) return length; /* * invalidate the pages whose sharing state is to be changed * because of CoW. */ if (iomap->flags & IOMAP_F_SHARED) invalidate_inode_pages2_range(iter->inode->i_mapping, pos >> PAGE_SHIFT, (pos + length - 1) >> PAGE_SHIFT); do { unsigned offset = offset_in_page(pos); unsigned size = min_t(u64, PAGE_SIZE - offset, length); pgoff_t pgoff = dax_iomap_pgoff(iomap, pos); long rc; int id; id = dax_read_lock(); if (IS_ALIGNED(pos, PAGE_SIZE) && size == PAGE_SIZE) rc = dax_zero_page_range(iomap->dax_dev, pgoff, 1); else rc = dax_memzero(iter, pos, size); dax_read_unlock(id); if (rc < 0) return rc; pos += size; length -= size; written += size; } while (length > 0); if (did_zero) *did_zero = true; return written; } int dax_zero_range(struct inode *inode, loff_t pos, loff_t len, bool *did_zero, const struct iomap_ops *ops) { struct iomap_iter iter = { .inode = inode, .pos = pos, .len = len, .flags = IOMAP_DAX | IOMAP_ZERO, }; int ret; while ((ret = iomap_iter(&iter, ops)) > 0) iter.processed = dax_zero_iter(&iter, did_zero); return ret; } EXPORT_SYMBOL_GPL(dax_zero_range); int dax_truncate_page(struct inode *inode, loff_t pos, bool *did_zero, const struct iomap_ops *ops) { unsigned int blocksize = i_blocksize(inode); unsigned int off = pos & (blocksize - 1); /* Block boundary? Nothing to do */ if (!off) return 0; return dax_zero_range(inode, pos, blocksize - off, did_zero, ops); } EXPORT_SYMBOL_GPL(dax_truncate_page); static loff_t dax_iomap_iter(const struct iomap_iter *iomi, struct iov_iter *iter) { const struct iomap *iomap = &iomi->iomap; const struct iomap *srcmap = iomap_iter_srcmap(iomi); loff_t length = iomap_length(iomi); loff_t pos = iomi->pos; struct dax_device *dax_dev = iomap->dax_dev; loff_t end = pos + length, done = 0; bool write = iov_iter_rw(iter) == WRITE; bool cow = write && iomap->flags & IOMAP_F_SHARED; ssize_t ret = 0; size_t xfer; int id; if (!write) { end = min(end, i_size_read(iomi->inode)); if (pos >= end) return 0; if (iomap->type == IOMAP_HOLE || iomap->type == IOMAP_UNWRITTEN) return iov_iter_zero(min(length, end - pos), iter); } /* * In DAX mode, enforce either pure overwrites of written extents, or * writes to unwritten extents as part of a copy-on-write operation. */ if (WARN_ON_ONCE(iomap->type != IOMAP_MAPPED && !(iomap->flags & IOMAP_F_SHARED))) return -EIO; /* * Write can allocate block for an area which has a hole page mapped * into page tables. We have to tear down these mappings so that data * written by write(2) is visible in mmap. */ if (iomap->flags & IOMAP_F_NEW || cow) { /* * Filesystem allows CoW on non-shared extents. The src extents * may have been mmapped with dirty mark before. To be able to * invalidate its dax entries, we need to clear the dirty mark * in advance. */ if (cow) __dax_clear_dirty_range(iomi->inode->i_mapping, pos >> PAGE_SHIFT, (end - 1) >> PAGE_SHIFT); invalidate_inode_pages2_range(iomi->inode->i_mapping, pos >> PAGE_SHIFT, (end - 1) >> PAGE_SHIFT); } id = dax_read_lock(); while (pos < end) { unsigned offset = pos & (PAGE_SIZE - 1); const size_t size = ALIGN(length + offset, PAGE_SIZE); pgoff_t pgoff = dax_iomap_pgoff(iomap, pos); ssize_t map_len; bool recovery = false; void *kaddr; if (fatal_signal_pending(current)) { ret = -EINTR; break; } map_len = dax_direct_access(dax_dev, pgoff, PHYS_PFN(size), DAX_ACCESS, &kaddr, NULL); if (map_len == -EHWPOISON && iov_iter_rw(iter) == WRITE) { map_len = dax_direct_access(dax_dev, pgoff, PHYS_PFN(size), DAX_RECOVERY_WRITE, &kaddr, NULL); if (map_len > 0) recovery = true; } if (map_len < 0) { ret = dax_mem2blk_err(map_len); break; } if (cow) { ret = dax_iomap_copy_around(pos, length, PAGE_SIZE, srcmap, kaddr); if (ret) break; } map_len = PFN_PHYS(map_len); kaddr += offset; map_len -= offset; if (map_len > end - pos) map_len = end - pos; if (recovery) xfer = dax_recovery_write(dax_dev, pgoff, kaddr, map_len, iter); else if (write) xfer = dax_copy_from_iter(dax_dev, pgoff, kaddr, map_len, iter); else xfer = dax_copy_to_iter(dax_dev, pgoff, kaddr, map_len, iter); pos += xfer; length -= xfer; done += xfer; if (xfer == 0) ret = -EFAULT; if (xfer < map_len) break; } dax_read_unlock(id); return done ? done : ret; } /** * dax_iomap_rw - Perform I/O to a DAX file * @iocb: The control block for this I/O * @iter: The addresses to do I/O from or to * @ops: iomap ops passed from the file system * * This function performs read and write operations to directly mapped * persistent memory. The callers needs to take care of read/write exclusion * and evicting any page cache pages in the region under I/O. */ ssize_t dax_iomap_rw(struct kiocb *iocb, struct iov_iter *iter, const struct iomap_ops *ops) { struct iomap_iter iomi = { .inode = iocb->ki_filp->f_mapping->host, .pos = iocb->ki_pos, .len = iov_iter_count(iter), .flags = IOMAP_DAX, }; loff_t done = 0; int ret; if (!iomi.len) return 0; if (iov_iter_rw(iter) == WRITE) { lockdep_assert_held_write(&iomi.inode->i_rwsem); iomi.flags |= IOMAP_WRITE; } else { lockdep_assert_held(&iomi.inode->i_rwsem); } if (iocb->ki_flags & IOCB_NOWAIT) iomi.flags |= IOMAP_NOWAIT; while ((ret = iomap_iter(&iomi, ops)) > 0) iomi.processed = dax_iomap_iter(&iomi, iter); done = iomi.pos - iocb->ki_pos; iocb->ki_pos = iomi.pos; return done ? done : ret; } EXPORT_SYMBOL_GPL(dax_iomap_rw); static vm_fault_t dax_fault_return(int error) { if (error == 0) return VM_FAULT_NOPAGE; return vmf_error(error); } /* * When handling a synchronous page fault and the inode need a fsync, we can * insert the PTE/PMD into page tables only after that fsync happened. Skip * insertion for now and return the pfn so that caller can insert it after the * fsync is done. */ static vm_fault_t dax_fault_synchronous_pfnp(pfn_t *pfnp, pfn_t pfn) { if (WARN_ON_ONCE(!pfnp)) return VM_FAULT_SIGBUS; *pfnp = pfn; return VM_FAULT_NEEDDSYNC; } static vm_fault_t dax_fault_cow_page(struct vm_fault *vmf, const struct iomap_iter *iter) { vm_fault_t ret; int error = 0; switch (iter->iomap.type) { case IOMAP_HOLE: case IOMAP_UNWRITTEN: clear_user_highpage(vmf->cow_page, vmf->address); break; case IOMAP_MAPPED: error = copy_cow_page_dax(vmf, iter); break; default: WARN_ON_ONCE(1); error = -EIO; break; } if (error) return dax_fault_return(error); __SetPageUptodate(vmf->cow_page); ret = finish_fault(vmf); if (!ret) return VM_FAULT_DONE_COW; return ret; } /** * dax_fault_iter - Common actor to handle pfn insertion in PTE/PMD fault. * @vmf: vm fault instance * @iter: iomap iter * @pfnp: pfn to be returned * @xas: the dax mapping tree of a file * @entry: an unlocked dax entry to be inserted * @pmd: distinguish whether it is a pmd fault */ static vm_fault_t dax_fault_iter(struct vm_fault *vmf, const struct iomap_iter *iter, pfn_t *pfnp, struct xa_state *xas, void **entry, bool pmd) { const struct iomap *iomap = &iter->iomap; const struct iomap *srcmap = iomap_iter_srcmap(iter); size_t size = pmd ? PMD_SIZE : PAGE_SIZE; loff_t pos = (loff_t)xas->xa_index << PAGE_SHIFT; bool write = iter->flags & IOMAP_WRITE; unsigned long entry_flags = pmd ? DAX_PMD : 0; int err = 0; pfn_t pfn; void *kaddr; if (!pmd && vmf->cow_page) return dax_fault_cow_page(vmf, iter); /* if we are reading UNWRITTEN and HOLE, return a hole. */ if (!write && (iomap->type == IOMAP_UNWRITTEN || iomap->type == IOMAP_HOLE)) { if (!pmd) return dax_load_hole(xas, vmf, iter, entry); return dax_pmd_load_hole(xas, vmf, iter, entry); } if (iomap->type != IOMAP_MAPPED && !(iomap->flags & IOMAP_F_SHARED)) { WARN_ON_ONCE(1); return pmd ? VM_FAULT_FALLBACK : VM_FAULT_SIGBUS; } err = dax_iomap_direct_access(iomap, pos, size, &kaddr, &pfn); if (err) return pmd ? VM_FAULT_FALLBACK : dax_fault_return(err); *entry = dax_insert_entry(xas, vmf, iter, *entry, pfn, entry_flags); if (write && iomap->flags & IOMAP_F_SHARED) { err = dax_iomap_copy_around(pos, size, size, srcmap, kaddr); if (err) return dax_fault_return(err); } if (dax_fault_is_synchronous(iter, vmf->vma)) return dax_fault_synchronous_pfnp(pfnp, pfn); /* insert PMD pfn */ if (pmd) return vmf_insert_pfn_pmd(vmf, pfn, write); /* insert PTE pfn */ if (write) return vmf_insert_mixed_mkwrite(vmf->vma, vmf->address, pfn); return vmf_insert_mixed(vmf->vma, vmf->address, pfn); } static vm_fault_t dax_iomap_pte_fault(struct vm_fault *vmf, pfn_t *pfnp, int *iomap_errp, const struct iomap_ops *ops) { struct address_space *mapping = vmf->vma->vm_file->f_mapping; XA_STATE(xas, &mapping->i_pages, vmf->pgoff); struct iomap_iter iter = { .inode = mapping->host, .pos = (loff_t)vmf->pgoff << PAGE_SHIFT, .len = PAGE_SIZE, .flags = IOMAP_DAX | IOMAP_FAULT, }; vm_fault_t ret = 0; void *entry; int error; trace_dax_pte_fault(iter.inode, vmf, ret); /* * Check whether offset isn't beyond end of file now. Caller is supposed * to hold locks serializing us with truncate / punch hole so this is * a reliable test. */ if (iter.pos >= i_size_read(iter.inode)) { ret = VM_FAULT_SIGBUS; goto out; } if ((vmf->flags & FAULT_FLAG_WRITE) && !vmf->cow_page) iter.flags |= IOMAP_WRITE; entry = grab_mapping_entry(&xas, mapping, 0); if (xa_is_internal(entry)) { ret = xa_to_internal(entry); goto out; } /* * It is possible, particularly with mixed reads & writes to private * mappings, that we have raced with a PMD fault that overlaps with * the PTE we need to set up. If so just return and the fault will be * retried. */ if (pmd_trans_huge(*vmf->pmd) || pmd_devmap(*vmf->pmd)) { ret = VM_FAULT_NOPAGE; goto unlock_entry; } while ((error = iomap_iter(&iter, ops)) > 0) { if (WARN_ON_ONCE(iomap_length(&iter) < PAGE_SIZE)) { iter.processed = -EIO; /* fs corruption? */ continue; } ret = dax_fault_iter(vmf, &iter, pfnp, &xas, &entry, false); if (ret != VM_FAULT_SIGBUS && (iter.iomap.flags & IOMAP_F_NEW)) { count_vm_event(PGMAJFAULT); count_memcg_event_mm(vmf->vma->vm_mm, PGMAJFAULT); ret |= VM_FAULT_MAJOR; } if (!(ret & VM_FAULT_ERROR)) iter.processed = PAGE_SIZE; } if (iomap_errp) *iomap_errp = error; if (!ret && error) ret = dax_fault_return(error); unlock_entry: dax_unlock_entry(&xas, entry); out: trace_dax_pte_fault_done(iter.inode, vmf, ret); return ret; } #ifdef CONFIG_FS_DAX_PMD static bool dax_fault_check_fallback(struct vm_fault *vmf, struct xa_state *xas, pgoff_t max_pgoff) { unsigned long pmd_addr = vmf->address & PMD_MASK; bool write = vmf->flags & FAULT_FLAG_WRITE; /* * Make sure that the faulting address's PMD offset (color) matches * the PMD offset from the start of the file. This is necessary so * that a PMD range in the page table overlaps exactly with a PMD * range in the page cache. */ if ((vmf->pgoff & PG_PMD_COLOUR) != ((vmf->address >> PAGE_SHIFT) & PG_PMD_COLOUR)) return true; /* Fall back to PTEs if we're going to COW */ if (write && !(vmf->vma->vm_flags & VM_SHARED)) return true; /* If the PMD would extend outside the VMA */ if (pmd_addr < vmf->vma->vm_start) return true; if ((pmd_addr + PMD_SIZE) > vmf->vma->vm_end) return true; /* If the PMD would extend beyond the file size */ if ((xas->xa_index | PG_PMD_COLOUR) >= max_pgoff) return true; return false; } static vm_fault_t dax_iomap_pmd_fault(struct vm_fault *vmf, pfn_t *pfnp, const struct iomap_ops *ops) { struct address_space *mapping = vmf->vma->vm_file->f_mapping; XA_STATE_ORDER(xas, &mapping->i_pages, vmf->pgoff, PMD_ORDER); struct iomap_iter iter = { .inode = mapping->host, .len = PMD_SIZE, .flags = IOMAP_DAX | IOMAP_FAULT, }; vm_fault_t ret = VM_FAULT_FALLBACK; pgoff_t max_pgoff; void *entry; if (vmf->flags & FAULT_FLAG_WRITE) iter.flags |= IOMAP_WRITE; /* * Check whether offset isn't beyond end of file now. Caller is * supposed to hold locks serializing us with truncate / punch hole so * this is a reliable test. */ max_pgoff = DIV_ROUND_UP(i_size_read(iter.inode), PAGE_SIZE); trace_dax_pmd_fault(iter.inode, vmf, max_pgoff, 0); if (xas.xa_index >= max_pgoff) { ret = VM_FAULT_SIGBUS; goto out; } if (dax_fault_check_fallback(vmf, &xas, max_pgoff)) goto fallback; /* * grab_mapping_entry() will make sure we get an empty PMD entry, * a zero PMD entry or a DAX PMD. If it can't (because a PTE * entry is already in the array, for instance), it will return * VM_FAULT_FALLBACK. */ entry = grab_mapping_entry(&xas, mapping, PMD_ORDER); if (xa_is_internal(entry)) { ret = xa_to_internal(entry); goto fallback; } /* * It is possible, particularly with mixed reads & writes to private * mappings, that we have raced with a PTE fault that overlaps with * the PMD we need to set up. If so just return and the fault will be * retried. */ if (!pmd_none(*vmf->pmd) && !pmd_trans_huge(*vmf->pmd) && !pmd_devmap(*vmf->pmd)) { ret = 0; goto unlock_entry; } iter.pos = (loff_t)xas.xa_index << PAGE_SHIFT; while (iomap_iter(&iter, ops) > 0) { if (iomap_length(&iter) < PMD_SIZE) continue; /* actually breaks out of the loop */ ret = dax_fault_iter(vmf, &iter, pfnp, &xas, &entry, true); if (ret != VM_FAULT_FALLBACK) iter.processed = PMD_SIZE; } unlock_entry: dax_unlock_entry(&xas, entry); fallback: if (ret == VM_FAULT_FALLBACK) { split_huge_pmd(vmf->vma, vmf->pmd, vmf->address); count_vm_event(THP_FAULT_FALLBACK); } out: trace_dax_pmd_fault_done(iter.inode, vmf, max_pgoff, ret); return ret; } #else static vm_fault_t dax_iomap_pmd_fault(struct vm_fault *vmf, pfn_t *pfnp, const struct iomap_ops *ops) { return VM_FAULT_FALLBACK; } #endif /* CONFIG_FS_DAX_PMD */ /** * dax_iomap_fault - handle a page fault on a DAX file * @vmf: The description of the fault * @order: Order of the page to fault in * @pfnp: PFN to insert for synchronous faults if fsync is required * @iomap_errp: Storage for detailed error code in case of error * @ops: Iomap ops passed from the file system * * When a page fault occurs, filesystems may call this helper in * their fault handler for DAX files. dax_iomap_fault() assumes the caller * has done all the necessary locking for page fault to proceed * successfully. */ vm_fault_t dax_iomap_fault(struct vm_fault *vmf, unsigned int order, pfn_t *pfnp, int *iomap_errp, const struct iomap_ops *ops) { if (order == 0) return dax_iomap_pte_fault(vmf, pfnp, iomap_errp, ops); else if (order == PMD_ORDER) return dax_iomap_pmd_fault(vmf, pfnp, ops); else return VM_FAULT_FALLBACK; } EXPORT_SYMBOL_GPL(dax_iomap_fault); /* * dax_insert_pfn_mkwrite - insert PTE or PMD entry into page tables * @vmf: The description of the fault * @pfn: PFN to insert * @order: Order of entry to insert. * * This function inserts a writeable PTE or PMD entry into the page tables * for an mmaped DAX file. It also marks the page cache entry as dirty. */ static vm_fault_t dax_insert_pfn_mkwrite(struct vm_fault *vmf, pfn_t pfn, unsigned int order) { struct address_space *mapping = vmf->vma->vm_file->f_mapping; XA_STATE_ORDER(xas, &mapping->i_pages, vmf->pgoff, order); void *entry; vm_fault_t ret; xas_lock_irq(&xas); entry = get_unlocked_entry(&xas, order); /* Did we race with someone splitting entry or so? */ if (!entry || dax_is_conflict(entry) || (order == 0 && !dax_is_pte_entry(entry))) { put_unlocked_entry(&xas, entry, WAKE_NEXT); xas_unlock_irq(&xas); trace_dax_insert_pfn_mkwrite_no_entry(mapping->host, vmf, VM_FAULT_NOPAGE); return VM_FAULT_NOPAGE; } xas_set_mark(&xas, PAGECACHE_TAG_DIRTY); dax_lock_entry(&xas, entry); xas_unlock_irq(&xas); if (order == 0) ret = vmf_insert_mixed_mkwrite(vmf->vma, vmf->address, pfn); #ifdef CONFIG_FS_DAX_PMD else if (order == PMD_ORDER) ret = vmf_insert_pfn_pmd(vmf, pfn, FAULT_FLAG_WRITE); #endif else ret = VM_FAULT_FALLBACK; dax_unlock_entry(&xas, entry); trace_dax_insert_pfn_mkwrite(mapping->host, vmf, ret); return ret; } /** * dax_finish_sync_fault - finish synchronous page fault * @vmf: The description of the fault * @order: Order of entry to be inserted * @pfn: PFN to insert * * This function ensures that the file range touched by the page fault is * stored persistently on the media and handles inserting of appropriate page * table entry. */ vm_fault_t dax_finish_sync_fault(struct vm_fault *vmf, unsigned int order, pfn_t pfn) { int err; loff_t start = ((loff_t)vmf->pgoff) << PAGE_SHIFT; size_t len = PAGE_SIZE << order; err = vfs_fsync_range(vmf->vma->vm_file, start, start + len - 1, 1); if (err) return VM_FAULT_SIGBUS; return dax_insert_pfn_mkwrite(vmf, pfn, order); } EXPORT_SYMBOL_GPL(dax_finish_sync_fault); static loff_t dax_range_compare_iter(struct iomap_iter *it_src, struct iomap_iter *it_dest, u64 len, bool *same) { const struct iomap *smap = &it_src->iomap; const struct iomap *dmap = &it_dest->iomap; loff_t pos1 = it_src->pos, pos2 = it_dest->pos; void *saddr, *daddr; int id, ret; len = min(len, min(smap->length, dmap->length)); if (smap->type == IOMAP_HOLE && dmap->type == IOMAP_HOLE) { *same = true; return len; } if (smap->type == IOMAP_HOLE || dmap->type == IOMAP_HOLE) { *same = false; return 0; } id = dax_read_lock(); ret = dax_iomap_direct_access(smap, pos1, ALIGN(pos1 + len, PAGE_SIZE), &saddr, NULL); if (ret < 0) goto out_unlock; ret = dax_iomap_direct_access(dmap, pos2, ALIGN(pos2 + len, PAGE_SIZE), &daddr, NULL); if (ret < 0) goto out_unlock; *same = !memcmp(saddr, daddr, len); if (!*same) len = 0; dax_read_unlock(id); return len; out_unlock: dax_read_unlock(id); return -EIO; } int dax_dedupe_file_range_compare(struct inode *src, loff_t srcoff, struct inode *dst, loff_t dstoff, loff_t len, bool *same, const struct iomap_ops *ops) { struct iomap_iter src_iter = { .inode = src, .pos = srcoff, .len = len, .flags = IOMAP_DAX, }; struct iomap_iter dst_iter = { .inode = dst, .pos = dstoff, .len = len, .flags = IOMAP_DAX, }; int ret, compared = 0; while ((ret = iomap_iter(&src_iter, ops)) > 0 && (ret = iomap_iter(&dst_iter, ops)) > 0) { compared = dax_range_compare_iter(&src_iter, &dst_iter, min(src_iter.len, dst_iter.len), same); if (compared < 0) return ret; src_iter.processed = dst_iter.processed = compared; } return ret; } int dax_remap_file_range_prep(struct file *file_in, loff_t pos_in, struct file *file_out, loff_t pos_out, loff_t *len, unsigned int remap_flags, const struct iomap_ops *ops) { return __generic_remap_file_range_prep(file_in, pos_in, file_out, pos_out, len, remap_flags, ops); } EXPORT_SYMBOL_GPL(dax_remap_file_range_prep); |
| 8 8 6 8 8 8 8 3 3 6 8 8 6 6 6 8 9 8 6 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 | // SPDX-License-Identifier: GPL-2.0-or-later /* Direct I/O support. * * Copyright (C) 2023 Red Hat, Inc. All Rights Reserved. * Written by David Howells (dhowells@redhat.com) */ #include <linux/export.h> #include <linux/fs.h> #include <linux/mm.h> #include <linux/pagemap.h> #include <linux/slab.h> #include <linux/uio.h> #include <linux/sched/mm.h> #include <linux/task_io_accounting_ops.h> #include <linux/netfs.h> #include "internal.h" /** * netfs_unbuffered_read_iter_locked - Perform an unbuffered or direct I/O read * @iocb: The I/O control descriptor describing the read * @iter: The output buffer (also specifies read length) * * Perform an unbuffered I/O or direct I/O from the file in @iocb to the * output buffer. No use is made of the pagecache. * * The caller must hold any appropriate locks. */ ssize_t netfs_unbuffered_read_iter_locked(struct kiocb *iocb, struct iov_iter *iter) { struct netfs_io_request *rreq; ssize_t ret; size_t orig_count = iov_iter_count(iter); bool async = !is_sync_kiocb(iocb); _enter(""); if (!orig_count) return 0; /* Don't update atime */ ret = kiocb_write_and_wait(iocb, orig_count); if (ret < 0) return ret; file_accessed(iocb->ki_filp); rreq = netfs_alloc_request(iocb->ki_filp->f_mapping, iocb->ki_filp, iocb->ki_pos, orig_count, NETFS_DIO_READ); if (IS_ERR(rreq)) return PTR_ERR(rreq); netfs_stat(&netfs_n_rh_dio_read); trace_netfs_read(rreq, rreq->start, rreq->len, netfs_read_trace_dio_read); /* If this is an async op, we have to keep track of the destination * buffer for ourselves as the caller's iterator will be trashed when * we return. * * In such a case, extract an iterator to represent as much of the the * output buffer as we can manage. Note that the extraction might not * be able to allocate a sufficiently large bvec array and may shorten * the request. */ if (user_backed_iter(iter)) { ret = netfs_extract_user_iter(iter, rreq->len, &rreq->iter, 0); if (ret < 0) goto out; rreq->direct_bv = (struct bio_vec *)rreq->iter.bvec; rreq->direct_bv_count = ret; rreq->direct_bv_unpin = iov_iter_extract_will_pin(iter); rreq->len = iov_iter_count(&rreq->iter); } else { rreq->iter = *iter; rreq->len = orig_count; rreq->direct_bv_unpin = false; iov_iter_advance(iter, orig_count); } // TODO: Set up bounce buffer if needed if (async) rreq->iocb = iocb; ret = netfs_begin_read(rreq, is_sync_kiocb(iocb)); if (ret < 0) goto out; /* May be -EIOCBQUEUED */ if (!async) { // TODO: Copy from bounce buffer iocb->ki_pos += rreq->transferred; ret = rreq->transferred; } out: netfs_put_request(rreq, false, netfs_rreq_trace_put_return); if (ret > 0) orig_count -= ret; if (ret != -EIOCBQUEUED) iov_iter_revert(iter, orig_count - iov_iter_count(iter)); return ret; } EXPORT_SYMBOL(netfs_unbuffered_read_iter_locked); /** * netfs_unbuffered_read_iter - Perform an unbuffered or direct I/O read * @iocb: The I/O control descriptor describing the read * @iter: The output buffer (also specifies read length) * * Perform an unbuffered I/O or direct I/O from the file in @iocb to the * output buffer. No use is made of the pagecache. */ ssize_t netfs_unbuffered_read_iter(struct kiocb *iocb, struct iov_iter *iter) { struct inode *inode = file_inode(iocb->ki_filp); ssize_t ret; if (!iter->count) return 0; /* Don't update atime */ ret = netfs_start_io_direct(inode); if (ret == 0) { ret = netfs_unbuffered_read_iter_locked(iocb, iter); netfs_end_io_direct(inode); } return ret; } EXPORT_SYMBOL(netfs_unbuffered_read_iter); |
| 10 10 5191 16 16 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 | // SPDX-License-Identifier: GPL-2.0 #include <linux/kernel.h> #include <linux/of.h> #include <linux/of_device.h> #include <linux/of_address.h> #include <linux/of_iommu.h> #include <linux/of_reserved_mem.h> #include <linux/dma-direct.h> /* for bus_dma_region */ #include <linux/dma-map-ops.h> #include <linux/init.h> #include <linux/mod_devicetable.h> #include <linux/slab.h> #include <linux/platform_device.h> #include <asm/errno.h> #include "of_private.h" /** * of_match_device - Tell if a struct device matches an of_device_id list * @matches: array of of device match structures to search in * @dev: the of device structure to match against * * Used by a driver to check whether an platform_device present in the * system is in its list of supported devices. */ const struct of_device_id *of_match_device(const struct of_device_id *matches, const struct device *dev) { if (!matches || !dev->of_node || dev->of_node_reused) return NULL; return of_match_node(matches, dev->of_node); } EXPORT_SYMBOL(of_match_device); static void of_dma_set_restricted_buffer(struct device *dev, struct device_node *np) { struct device_node *node, *of_node = dev->of_node; int count, i; if (!IS_ENABLED(CONFIG_DMA_RESTRICTED_POOL)) return; count = of_property_count_elems_of_size(of_node, "memory-region", sizeof(u32)); /* * If dev->of_node doesn't exist or doesn't contain memory-region, try * the OF node having DMA configuration. */ if (count <= 0) { of_node = np; count = of_property_count_elems_of_size( of_node, "memory-region", sizeof(u32)); } for (i = 0; i < count; i++) { node = of_parse_phandle(of_node, "memory-region", i); /* * There might be multiple memory regions, but only one * restricted-dma-pool region is allowed. */ if (of_device_is_compatible(node, "restricted-dma-pool") && of_device_is_available(node)) { of_node_put(node); break; } of_node_put(node); } /* * Attempt to initialize a restricted-dma-pool region if one was found. * Note that count can hold a negative error code. */ if (i < count && of_reserved_mem_device_init_by_idx(dev, of_node, i)) dev_warn(dev, "failed to initialise \"restricted-dma-pool\" memory node\n"); } /** * of_dma_configure_id - Setup DMA configuration * @dev: Device to apply DMA configuration * @np: Pointer to OF node having DMA configuration * @force_dma: Whether device is to be set up by of_dma_configure() even if * DMA capability is not explicitly described by firmware. * @id: Optional const pointer value input id * * Try to get devices's DMA configuration from DT and update it * accordingly. * * If platform code needs to use its own special DMA configuration, it * can use a platform bus notifier and handle BUS_NOTIFY_ADD_DEVICE events * to fix up DMA configuration. */ int of_dma_configure_id(struct device *dev, struct device_node *np, bool force_dma, const u32 *id) { const struct bus_dma_region *map = NULL; struct device_node *bus_np; u64 mask, end = 0; bool coherent, set_map = false; int ret; if (np == dev->of_node) bus_np = __of_get_dma_parent(np); else bus_np = of_node_get(np); ret = of_dma_get_range(bus_np, &map); of_node_put(bus_np); if (ret < 0) { /* * For legacy reasons, we have to assume some devices need * DMA configuration regardless of whether "dma-ranges" is * correctly specified or not. */ if (!force_dma) return ret == -ENODEV ? 0 : ret; } else { /* Determine the overall bounds of all DMA regions */ end = dma_range_map_max(map); set_map = true; } /* * If @dev is expected to be DMA-capable then the bus code that created * it should have initialised its dma_mask pointer by this point. For * now, we'll continue the legacy behaviour of coercing it to the * coherent mask if not, but we'll no longer do so quietly. */ if (!dev->dma_mask) { dev_warn(dev, "DMA mask not set\n"); dev->dma_mask = &dev->coherent_dma_mask; } if (!end && dev->coherent_dma_mask) end = dev->coherent_dma_mask; else if (!end) end = (1ULL << 32) - 1; /* * Limit coherent and dma mask based on size and default mask * set by the driver. */ mask = DMA_BIT_MASK(ilog2(end) + 1); dev->coherent_dma_mask &= mask; *dev->dma_mask &= mask; /* ...but only set bus limit and range map if we found valid dma-ranges earlier */ if (set_map) { dev->bus_dma_limit = end; dev->dma_range_map = map; } coherent = of_dma_is_coherent(np); dev_dbg(dev, "device is%sdma coherent\n", coherent ? " " : " not "); ret = of_iommu_configure(dev, np, id); if (ret == -EPROBE_DEFER) { /* Don't touch range map if it wasn't set from a valid dma-ranges */ if (set_map) dev->dma_range_map = NULL; kfree(map); return -EPROBE_DEFER; } /* Take all other IOMMU errors to mean we'll just carry on without it */ dev_dbg(dev, "device is%sbehind an iommu\n", !ret ? " " : " not "); arch_setup_dma_ops(dev, coherent); if (ret) of_dma_set_restricted_buffer(dev, np); return 0; } EXPORT_SYMBOL_GPL(of_dma_configure_id); const void *of_device_get_match_data(const struct device *dev) { const struct of_device_id *match; match = of_match_device(dev->driver->of_match_table, dev); if (!match) return NULL; return match->data; } EXPORT_SYMBOL(of_device_get_match_data); /** * of_device_modalias - Fill buffer with newline terminated modalias string * @dev: Calling device * @str: Modalias string * @len: Size of @str */ ssize_t of_device_modalias(struct device *dev, char *str, ssize_t len) { ssize_t sl; if (!dev || !dev->of_node || dev->of_node_reused) return -ENODEV; sl = of_modalias(dev->of_node, str, len - 2); if (sl < 0) return sl; if (sl > len - 2) return -ENOMEM; str[sl++] = '\n'; str[sl] = 0; return sl; } EXPORT_SYMBOL_GPL(of_device_modalias); /** * of_device_uevent - Display OF related uevent information * @dev: Device to display the uevent information for * @env: Kernel object's userspace event reference to fill up */ void of_device_uevent(const struct device *dev, struct kobj_uevent_env *env) { const char *compat, *type; struct alias_prop *app; struct property *p; int seen = 0; if ((!dev) || (!dev->of_node)) return; add_uevent_var(env, "OF_NAME=%pOFn", dev->of_node); add_uevent_var(env, "OF_FULLNAME=%pOF", dev->of_node); type = of_node_get_device_type(dev->of_node); if (type) add_uevent_var(env, "OF_TYPE=%s", type); /* Since the compatible field can contain pretty much anything * it's not really legal to split it out with commas. We split it * up using a number of environment variables instead. */ of_property_for_each_string(dev->of_node, "compatible", p, compat) { add_uevent_var(env, "OF_COMPATIBLE_%d=%s", seen, compat); seen++; } add_uevent_var(env, "OF_COMPATIBLE_N=%d", seen); seen = 0; mutex_lock(&of_mutex); list_for_each_entry(app, &aliases_lookup, link) { if (dev->of_node == app->np) { add_uevent_var(env, "OF_ALIAS_%d=%s", seen, app->alias); seen++; } } mutex_unlock(&of_mutex); } EXPORT_SYMBOL_GPL(of_device_uevent); int of_device_uevent_modalias(const struct device *dev, struct kobj_uevent_env *env) { int sl; if ((!dev) || (!dev->of_node) || dev->of_node_reused) return -ENODEV; /* Devicetree modalias is tricky, we add it in 2 steps */ if (add_uevent_var(env, "MODALIAS=")) return -ENOMEM; sl = of_modalias(dev->of_node, &env->buf[env->buflen-1], sizeof(env->buf) - env->buflen); if (sl < 0) return sl; if (sl >= (sizeof(env->buf) - env->buflen)) return -ENOMEM; env->buflen += sl; return 0; } EXPORT_SYMBOL_GPL(of_device_uevent_modalias); /** * of_device_make_bus_id - Use the device node data to assign a unique name * @dev: pointer to device structure that is linked to a device tree node * * This routine will first try using the translated bus address to * derive a unique name. If it cannot, then it will prepend names from * parent nodes until a unique name can be derived. */ void of_device_make_bus_id(struct device *dev) { struct device_node *node = dev->of_node; const __be32 *reg; u64 addr; u32 mask; /* Construct the name, using parent nodes if necessary to ensure uniqueness */ while (node->parent) { /* * If the address can be translated, then that is as much * uniqueness as we need. Make it the first component and return */ reg = of_get_property(node, "reg", NULL); if (reg && (addr = of_translate_address(node, reg)) != OF_BAD_ADDR) { if (!of_property_read_u32(node, "mask", &mask)) dev_set_name(dev, dev_name(dev) ? "%llx.%x.%pOFn:%s" : "%llx.%x.%pOFn", addr, ffs(mask) - 1, node, dev_name(dev)); else dev_set_name(dev, dev_name(dev) ? "%llx.%pOFn:%s" : "%llx.%pOFn", addr, node, dev_name(dev)); return; } /* format arguments only used if dev_name() resolves to NULL */ dev_set_name(dev, dev_name(dev) ? "%s:%s" : "%s", kbasename(node->full_name), dev_name(dev)); node = node->parent; } } EXPORT_SYMBOL_GPL(of_device_make_bus_id); |
| 501 1 499 500 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 | // SPDX-License-Identifier: GPL-2.0-only /* * Copyright (C) 2011 IBM Corporation * * Author: * Mimi Zohar <zohar@us.ibm.com> */ #include <linux/xattr.h> #include <linux/evm.h> int posix_xattr_acl(const char *xattr) { int xattr_len = strlen(xattr); if ((strlen(XATTR_NAME_POSIX_ACL_ACCESS) == xattr_len) && (strncmp(XATTR_NAME_POSIX_ACL_ACCESS, xattr, xattr_len) == 0)) return 1; if ((strlen(XATTR_NAME_POSIX_ACL_DEFAULT) == xattr_len) && (strncmp(XATTR_NAME_POSIX_ACL_DEFAULT, xattr, xattr_len) == 0)) return 1; return 0; } |
| 22 5 2 2 2 34 5 3 1 1 1 4 4 4 4 2 3 3 1 3 22 22 22 23 23 23 23 23 23 23 23 23 23 23 22 23 23 23 22 26 25 25 23 25 21 25 25 25 25 25 25 25 25 25 25 2 25 25 25 25 25 26 26 26 26 26 26 26 25 22 20 22 22 22 25 23 25 25 25 25 25 21 25 25 24 25 24 25 25 21 21 1 1 21 21 21 21 21 21 21 21 21 21 1 21 21 21 21 21 2 4 20 25 25 25 20 1 21 4 25 25 4 21 21 21 4 4 4 4 4 4 4 4 4 2 2 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 2 2 2 2 4 4 4 4 4 4 4 2 4 4 4 4 4 2 2 2 2 4 2 4 2 2 2 2 1 1 1 455 455 17 17 1 17 17 17 17 1 1 1 1 1 1 1 25 25 2 2 2 2 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080 1081 1082 1083 1084 1085 1086 1087 1088 1089 1090 1091 1092 1093 1094 1095 1096 1097 1098 1099 1100 1101 1102 1103 1104 1105 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125 1126 1127 1128 1129 1130 1131 1132 1133 1134 1135 1136 1137 1138 1139 1140 1141 1142 1143 1144 1145 1146 1147 1148 1149 1150 1151 1152 1153 1154 1155 1156 1157 1158 1159 1160 1161 1162 1163 1164 1165 1166 1167 1168 1169 1170 1171 1172 1173 1174 1175 1176 1177 1178 1179 1180 1181 1182 1183 1184 1185 1186 1187 1188 1189 1190 1191 1192 1193 1194 1195 1196 1197 1198 1199 1200 1201 1202 1203 1204 1205 1206 1207 1208 1209 1210 1211 1212 1213 1214 1215 1216 1217 1218 1219 1220 1221 1222 1223 1224 1225 1226 1227 1228 1229 1230 1231 1232 1233 1234 1235 1236 1237 1238 1239 1240 1241 1242 1243 1244 1245 1246 1247 1248 1249 1250 1251 1252 1253 1254 1255 1256 1257 1258 1259 1260 1261 1262 1263 1264 1265 1266 1267 1268 1269 1270 1271 1272 1273 1274 1275 1276 1277 1278 1279 1280 1281 1282 1283 1284 1285 1286 1287 1288 1289 1290 1291 1292 1293 1294 1295 1296 1297 1298 1299 1300 1301 1302 1303 1304 1305 1306 1307 1308 1309 1310 1311 1312 1313 1314 1315 1316 1317 1318 1319 1320 1321 1322 1323 1324 1325 1326 1327 1328 1329 1330 1331 1332 1333 1334 1335 1336 1337 1338 1339 1340 1341 1342 1343 1344 1345 1346 1347 1348 1349 1350 1351 1352 1353 1354 1355 1356 1357 1358 1359 1360 1361 1362 1363 1364 1365 1366 1367 1368 1369 1370 1371 1372 1373 1374 1375 1376 1377 1378 1379 1380 1381 1382 1383 1384 1385 1386 1387 1388 1389 1390 1391 1392 1393 1394 1395 1396 1397 1398 1399 1400 1401 1402 1403 1404 1405 1406 1407 1408 1409 1410 1411 1412 1413 1414 1415 1416 1417 1418 1419 1420 1421 1422 1423 1424 1425 1426 1427 1428 1429 1430 1431 1432 1433 1434 1435 1436 1437 1438 1439 1440 1441 1442 1443 1444 1445 1446 1447 1448 1449 1450 1451 1452 1453 1454 1455 1456 1457 1458 1459 1460 1461 1462 1463 1464 1465 1466 1467 1468 1469 1470 1471 1472 1473 1474 1475 1476 1477 1478 1479 1480 1481 1482 1483 1484 1485 1486 1487 1488 1489 1490 1491 1492 1493 1494 1495 1496 1497 1498 1499 1500 1501 1502 1503 1504 1505 1506 1507 1508 1509 1510 1511 1512 1513 1514 1515 1516 1517 1518 1519 1520 1521 1522 1523 1524 1525 1526 1527 1528 1529 1530 1531 1532 1533 1534 1535 1536 1537 1538 1539 1540 1541 1542 1543 1544 1545 1546 1547 1548 1549 1550 1551 1552 1553 1554 1555 1556 1557 1558 1559 1560 1561 1562 1563 1564 1565 1566 1567 1568 1569 1570 1571 1572 1573 1574 1575 1576 1577 1578 1579 1580 1581 1582 1583 1584 1585 1586 1587 1588 1589 1590 1591 1592 1593 1594 1595 1596 1597 1598 1599 1600 1601 1602 1603 1604 1605 1606 1607 1608 1609 1610 1611 1612 1613 1614 1615 1616 1617 1618 1619 1620 1621 1622 1623 1624 1625 1626 1627 1628 1629 1630 1631 1632 1633 1634 1635 1636 1637 1638 1639 1640 1641 1642 1643 1644 1645 1646 1647 1648 1649 1650 1651 1652 1653 1654 1655 1656 1657 1658 1659 1660 1661 1662 1663 1664 1665 1666 1667 1668 1669 1670 1671 1672 1673 1674 1675 1676 1677 1678 1679 1680 1681 1682 1683 1684 1685 1686 1687 1688 1689 1690 1691 1692 1693 1694 1695 1696 1697 1698 1699 1700 1701 1702 1703 1704 1705 1706 1707 1708 1709 1710 1711 1712 1713 1714 1715 1716 1717 1718 1719 1720 1721 1722 1723 1724 1725 1726 1727 1728 1729 1730 1731 1732 1733 1734 1735 1736 1737 1738 1739 1740 1741 1742 1743 1744 1745 1746 1747 1748 1749 1750 1751 1752 1753 1754 1755 1756 1757 1758 1759 1760 1761 1762 1763 1764 1765 1766 1767 1768 1769 1770 1771 1772 1773 1774 1775 1776 1777 1778 1779 1780 1781 1782 1783 1784 1785 1786 1787 1788 1789 1790 1791 1792 1793 1794 1795 1796 1797 1798 1799 1800 1801 1802 1803 1804 1805 1806 1807 1808 1809 1810 1811 1812 1813 1814 1815 1816 1817 1818 1819 1820 1821 1822 1823 1824 1825 1826 1827 1828 1829 1830 1831 1832 1833 1834 1835 1836 1837 1838 1839 1840 1841 1842 1843 1844 1845 1846 1847 1848 1849 1850 1851 1852 1853 1854 1855 1856 1857 1858 1859 1860 1861 1862 1863 1864 1865 1866 1867 1868 1869 1870 1871 1872 1873 1874 1875 1876 1877 1878 1879 1880 1881 1882 1883 1884 1885 1886 1887 1888 1889 1890 1891 1892 1893 1894 1895 1896 1897 1898 1899 1900 1901 1902 1903 1904 1905 1906 1907 1908 1909 1910 1911 1912 1913 1914 1915 1916 1917 1918 1919 1920 1921 1922 1923 1924 1925 1926 1927 1928 1929 1930 1931 1932 1933 1934 1935 1936 1937 1938 1939 1940 1941 1942 1943 1944 1945 1946 1947 1948 1949 1950 1951 1952 1953 1954 1955 1956 1957 1958 1959 1960 1961 1962 1963 1964 1965 1966 1967 1968 1969 1970 1971 1972 1973 1974 1975 1976 1977 1978 1979 1980 1981 1982 1983 1984 1985 1986 1987 1988 1989 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020 2021 2022 2023 2024 2025 2026 2027 2028 2029 2030 2031 2032 2033 2034 2035 2036 2037 2038 2039 2040 2041 2042 2043 2044 2045 2046 2047 2048 2049 2050 2051 2052 2053 2054 2055 2056 2057 2058 2059 2060 2061 2062 2063 2064 2065 2066 2067 2068 2069 2070 2071 2072 2073 2074 2075 2076 2077 2078 2079 2080 2081 2082 2083 2084 2085 2086 2087 2088 2089 2090 2091 2092 2093 2094 2095 2096 2097 2098 2099 2100 2101 2102 2103 2104 2105 2106 2107 2108 2109 2110 2111 2112 2113 2114 2115 2116 2117 2118 2119 2120 2121 2122 2123 2124 2125 2126 2127 2128 2129 2130 2131 2132 2133 2134 2135 2136 2137 2138 2139 2140 2141 2142 2143 2144 2145 2146 2147 2148 2149 2150 2151 2152 2153 2154 2155 2156 2157 2158 2159 2160 2161 2162 2163 2164 2165 2166 2167 2168 2169 2170 2171 2172 2173 2174 2175 2176 2177 2178 2179 2180 2181 2182 2183 2184 2185 2186 2187 2188 2189 2190 2191 2192 2193 2194 2195 2196 2197 2198 2199 2200 2201 2202 2203 2204 2205 2206 2207 2208 2209 2210 2211 2212 2213 2214 2215 2216 2217 2218 2219 2220 2221 2222 2223 2224 2225 2226 2227 2228 2229 2230 2231 2232 2233 2234 2235 2236 2237 2238 2239 2240 2241 2242 2243 2244 2245 2246 2247 2248 2249 2250 2251 2252 2253 2254 2255 2256 2257 2258 2259 2260 2261 2262 2263 2264 2265 2266 2267 2268 2269 2270 2271 2272 2273 2274 2275 2276 2277 2278 2279 2280 2281 2282 2283 2284 2285 2286 2287 2288 2289 2290 2291 2292 2293 2294 2295 2296 2297 2298 2299 2300 2301 2302 2303 2304 2305 2306 2307 2308 2309 2310 2311 2312 2313 2314 2315 2316 2317 2318 2319 2320 2321 2322 2323 2324 2325 2326 2327 2328 2329 2330 2331 2332 2333 2334 2335 2336 2337 2338 2339 2340 2341 2342 2343 2344 2345 2346 2347 2348 2349 2350 2351 2352 2353 2354 2355 2356 2357 2358 2359 2360 2361 2362 2363 2364 2365 2366 2367 2368 2369 2370 2371 2372 2373 2374 2375 2376 2377 2378 2379 2380 2381 2382 2383 2384 2385 2386 2387 2388 2389 2390 2391 2392 2393 2394 2395 2396 2397 2398 2399 2400 2401 2402 2403 2404 2405 2406 2407 2408 2409 2410 2411 2412 2413 2414 2415 2416 2417 2418 2419 2420 2421 2422 2423 2424 2425 2426 2427 2428 2429 2430 2431 2432 2433 2434 2435 2436 2437 2438 2439 2440 2441 2442 2443 2444 2445 2446 2447 2448 2449 2450 2451 2452 2453 2454 2455 2456 2457 2458 2459 2460 2461 2462 2463 2464 2465 2466 2467 2468 2469 2470 2471 2472 2473 2474 2475 2476 2477 2478 2479 2480 2481 2482 2483 2484 2485 2486 2487 2488 2489 2490 2491 2492 2493 2494 2495 2496 2497 2498 2499 2500 2501 2502 2503 2504 2505 2506 2507 2508 2509 2510 2511 2512 2513 2514 2515 2516 2517 2518 2519 2520 2521 2522 2523 2524 2525 2526 2527 2528 2529 2530 2531 2532 2533 2534 2535 2536 2537 2538 2539 2540 2541 2542 2543 2544 2545 2546 2547 2548 2549 2550 2551 2552 2553 2554 2555 2556 2557 2558 2559 2560 2561 2562 2563 2564 2565 2566 2567 2568 2569 2570 2571 2572 2573 2574 2575 2576 2577 2578 2579 2580 2581 2582 2583 2584 2585 2586 2587 2588 2589 2590 2591 2592 2593 2594 2595 2596 2597 2598 2599 2600 2601 2602 2603 2604 2605 2606 2607 2608 2609 2610 2611 2612 2613 2614 2615 2616 2617 2618 2619 2620 2621 2622 2623 2624 2625 2626 2627 2628 2629 2630 2631 2632 2633 2634 2635 2636 2637 2638 2639 2640 2641 2642 2643 2644 2645 2646 2647 2648 2649 2650 2651 2652 2653 2654 2655 2656 2657 2658 2659 2660 2661 2662 2663 2664 2665 2666 2667 2668 2669 2670 2671 2672 2673 2674 2675 2676 2677 2678 2679 2680 2681 2682 2683 2684 2685 2686 2687 2688 2689 2690 2691 2692 2693 2694 2695 2696 2697 2698 2699 2700 2701 2702 2703 2704 2705 2706 2707 2708 2709 2710 2711 2712 2713 2714 2715 2716 2717 2718 2719 2720 2721 2722 2723 2724 2725 2726 2727 2728 2729 2730 2731 2732 2733 2734 2735 2736 2737 2738 2739 2740 2741 2742 2743 2744 2745 2746 2747 2748 2749 2750 2751 2752 2753 2754 2755 2756 2757 2758 2759 2760 2761 2762 2763 2764 2765 2766 2767 2768 2769 2770 2771 2772 2773 2774 2775 2776 2777 2778 2779 2780 2781 2782 2783 2784 2785 2786 2787 2788 2789 2790 2791 2792 2793 2794 2795 2796 2797 2798 2799 2800 2801 2802 2803 2804 2805 2806 2807 2808 2809 2810 2811 2812 2813 2814 2815 2816 2817 2818 2819 2820 2821 2822 2823 2824 2825 2826 2827 2828 2829 2830 2831 2832 2833 2834 2835 2836 2837 2838 2839 2840 2841 2842 2843 2844 2845 2846 2847 2848 2849 2850 2851 2852 2853 2854 2855 2856 2857 2858 2859 2860 2861 2862 2863 2864 2865 2866 2867 2868 2869 2870 2871 2872 2873 2874 2875 2876 2877 2878 2879 2880 2881 2882 2883 2884 2885 2886 2887 2888 2889 2890 2891 2892 2893 2894 2895 2896 2897 2898 2899 2900 2901 2902 2903 2904 2905 2906 2907 2908 2909 2910 2911 2912 2913 2914 2915 2916 2917 2918 2919 2920 2921 2922 2923 2924 2925 2926 2927 2928 2929 2930 2931 2932 2933 2934 2935 2936 2937 2938 2939 2940 2941 2942 2943 2944 2945 2946 2947 2948 2949 2950 2951 2952 2953 2954 2955 2956 2957 2958 2959 2960 2961 2962 2963 2964 2965 2966 2967 2968 2969 2970 2971 2972 2973 2974 2975 2976 2977 2978 2979 2980 2981 2982 2983 2984 2985 2986 2987 2988 2989 2990 2991 2992 2993 2994 2995 2996 2997 2998 2999 3000 3001 3002 3003 3004 3005 3006 3007 3008 3009 3010 3011 3012 3013 3014 3015 3016 3017 3018 3019 3020 3021 3022 3023 3024 3025 3026 3027 3028 3029 3030 3031 3032 3033 3034 3035 3036 3037 3038 3039 3040 3041 3042 3043 3044 3045 3046 3047 3048 3049 3050 3051 3052 3053 3054 3055 3056 3057 3058 3059 3060 3061 3062 3063 3064 3065 3066 3067 3068 3069 3070 3071 3072 3073 3074 3075 3076 3077 3078 3079 3080 3081 3082 3083 3084 3085 3086 3087 3088 3089 3090 3091 3092 3093 3094 3095 3096 3097 3098 3099 3100 3101 3102 3103 3104 3105 3106 3107 3108 3109 3110 3111 3112 3113 3114 3115 3116 3117 3118 3119 3120 3121 3122 3123 3124 3125 3126 3127 3128 3129 3130 3131 3132 3133 3134 3135 3136 3137 3138 3139 3140 3141 3142 3143 3144 3145 3146 3147 3148 3149 3150 3151 3152 3153 3154 3155 3156 3157 3158 3159 3160 3161 3162 3163 3164 3165 3166 3167 3168 3169 3170 3171 3172 3173 3174 3175 3176 3177 3178 3179 3180 3181 3182 3183 3184 3185 3186 3187 3188 3189 3190 3191 3192 3193 3194 3195 3196 3197 3198 3199 3200 3201 3202 3203 3204 3205 3206 3207 3208 3209 3210 3211 3212 3213 3214 3215 3216 3217 3218 3219 3220 3221 3222 3223 3224 3225 3226 3227 3228 3229 3230 3231 3232 3233 3234 3235 3236 3237 3238 3239 3240 3241 3242 3243 3244 3245 3246 3247 3248 3249 3250 3251 3252 3253 3254 3255 3256 3257 3258 3259 3260 3261 3262 3263 3264 3265 3266 3267 3268 3269 3270 3271 3272 3273 3274 3275 3276 3277 3278 3279 3280 3281 3282 3283 3284 3285 3286 3287 3288 3289 3290 3291 3292 3293 3294 3295 3296 3297 3298 3299 3300 3301 3302 3303 3304 3305 3306 3307 3308 3309 3310 3311 3312 3313 3314 3315 3316 3317 3318 3319 3320 3321 3322 3323 3324 3325 3326 3327 3328 3329 3330 3331 3332 3333 3334 3335 3336 3337 3338 3339 3340 3341 3342 3343 3344 3345 3346 3347 3348 3349 3350 3351 3352 3353 3354 3355 3356 3357 3358 3359 3360 3361 3362 3363 3364 3365 3366 3367 3368 3369 3370 3371 3372 3373 3374 3375 3376 3377 3378 3379 3380 3381 3382 3383 3384 3385 3386 3387 3388 3389 3390 3391 3392 3393 3394 3395 3396 3397 3398 3399 3400 3401 3402 3403 3404 3405 3406 3407 3408 3409 3410 3411 3412 3413 3414 3415 3416 3417 3418 3419 3420 3421 3422 3423 3424 3425 3426 3427 3428 3429 3430 3431 3432 3433 3434 3435 3436 3437 3438 3439 3440 3441 3442 3443 3444 3445 3446 3447 3448 3449 3450 3451 3452 3453 3454 3455 3456 3457 3458 3459 3460 3461 3462 3463 3464 3465 3466 3467 3468 3469 3470 3471 3472 3473 3474 3475 3476 3477 3478 3479 3480 3481 3482 3483 3484 3485 3486 3487 3488 3489 3490 3491 3492 3493 3494 3495 3496 3497 3498 3499 3500 3501 3502 3503 3504 3505 3506 3507 3508 3509 3510 3511 3512 3513 3514 3515 3516 3517 3518 3519 3520 3521 3522 3523 3524 3525 3526 3527 3528 3529 3530 3531 3532 3533 3534 3535 3536 3537 3538 3539 3540 3541 3542 3543 3544 3545 3546 3547 3548 3549 3550 3551 3552 3553 3554 3555 3556 3557 3558 3559 3560 3561 3562 3563 3564 3565 3566 3567 3568 3569 3570 3571 3572 3573 3574 3575 3576 3577 3578 3579 3580 3581 3582 3583 3584 3585 3586 3587 3588 3589 3590 3591 3592 3593 3594 3595 3596 3597 3598 3599 3600 3601 3602 3603 3604 3605 3606 3607 3608 3609 3610 3611 3612 3613 3614 3615 3616 3617 3618 3619 3620 3621 3622 3623 3624 3625 3626 3627 3628 3629 3630 3631 3632 3633 3634 3635 3636 3637 3638 3639 3640 3641 3642 3643 3644 3645 3646 3647 3648 3649 3650 3651 3652 3653 3654 3655 3656 3657 3658 3659 3660 3661 3662 3663 3664 3665 3666 3667 3668 3669 3670 3671 3672 3673 3674 3675 3676 3677 3678 3679 3680 3681 3682 3683 3684 3685 3686 3687 3688 3689 3690 3691 3692 3693 3694 3695 3696 3697 3698 3699 3700 3701 3702 3703 3704 3705 3706 3707 3708 3709 3710 3711 3712 3713 3714 3715 3716 3717 3718 3719 3720 3721 3722 3723 3724 3725 3726 3727 3728 3729 3730 3731 3732 3733 3734 3735 3736 3737 3738 3739 3740 3741 3742 3743 3744 3745 3746 3747 3748 3749 3750 3751 3752 3753 3754 3755 3756 3757 3758 3759 3760 3761 3762 3763 3764 3765 3766 3767 3768 3769 3770 3771 3772 3773 3774 3775 3776 3777 3778 3779 3780 3781 3782 3783 3784 3785 3786 3787 3788 3789 3790 3791 3792 3793 3794 3795 3796 3797 3798 3799 3800 3801 3802 3803 3804 3805 3806 3807 3808 3809 3810 3811 3812 3813 3814 3815 3816 3817 3818 3819 3820 3821 3822 3823 3824 3825 3826 3827 3828 3829 3830 3831 3832 3833 3834 3835 3836 3837 3838 3839 3840 3841 3842 3843 3844 3845 3846 3847 3848 3849 3850 3851 3852 3853 3854 3855 3856 3857 3858 3859 3860 3861 3862 3863 3864 3865 3866 3867 3868 3869 3870 3871 3872 3873 3874 3875 3876 3877 3878 3879 3880 3881 3882 3883 3884 3885 3886 3887 3888 3889 3890 3891 3892 3893 3894 3895 3896 3897 3898 3899 3900 3901 3902 3903 3904 3905 3906 3907 3908 3909 3910 3911 3912 3913 3914 3915 3916 3917 3918 3919 3920 3921 3922 3923 3924 3925 3926 3927 3928 3929 3930 3931 3932 3933 3934 3935 3936 3937 3938 3939 3940 3941 3942 3943 3944 3945 3946 3947 3948 3949 3950 3951 3952 3953 3954 3955 3956 3957 3958 3959 3960 3961 3962 3963 3964 3965 3966 3967 3968 3969 3970 3971 3972 3973 3974 3975 3976 3977 3978 3979 3980 3981 3982 3983 3984 3985 3986 3987 3988 3989 3990 3991 3992 3993 3994 3995 3996 3997 3998 3999 4000 4001 4002 4003 4004 4005 4006 4007 4008 4009 4010 4011 4012 4013 4014 4015 4016 4017 4018 4019 4020 4021 4022 4023 4024 4025 4026 4027 4028 4029 4030 4031 4032 4033 4034 4035 4036 4037 4038 4039 4040 4041 4042 4043 4044 4045 4046 4047 4048 4049 4050 4051 4052 4053 4054 4055 4056 4057 4058 4059 4060 4061 4062 4063 4064 4065 4066 4067 4068 4069 4070 4071 4072 4073 4074 4075 4076 4077 4078 4079 4080 4081 4082 4083 4084 4085 4086 4087 4088 4089 4090 4091 4092 4093 4094 4095 4096 4097 4098 4099 4100 4101 4102 4103 4104 4105 4106 4107 4108 4109 4110 4111 4112 4113 4114 4115 4116 4117 4118 4119 4120 4121 4122 4123 4124 4125 4126 4127 4128 4129 4130 4131 4132 4133 4134 4135 4136 4137 4138 4139 4140 4141 4142 4143 4144 4145 4146 4147 4148 4149 4150 4151 4152 4153 4154 4155 4156 4157 4158 4159 4160 4161 4162 4163 4164 4165 4166 4167 4168 4169 4170 4171 4172 4173 4174 4175 4176 4177 4178 4179 4180 4181 4182 4183 4184 4185 4186 4187 4188 4189 4190 4191 4192 4193 4194 4195 4196 4197 4198 4199 4200 4201 4202 4203 4204 4205 4206 4207 4208 4209 4210 4211 4212 4213 4214 4215 4216 4217 4218 4219 4220 4221 4222 4223 4224 4225 4226 4227 4228 4229 4230 4231 4232 4233 4234 4235 4236 4237 4238 4239 4240 4241 4242 4243 4244 4245 4246 4247 4248 4249 4250 4251 4252 4253 4254 4255 4256 4257 4258 4259 4260 4261 4262 4263 4264 4265 4266 4267 4268 4269 4270 4271 4272 4273 4274 4275 4276 4277 4278 4279 4280 4281 4282 4283 4284 4285 4286 4287 4288 4289 4290 4291 4292 4293 4294 4295 4296 4297 4298 4299 4300 4301 4302 4303 4304 4305 4306 4307 4308 4309 4310 4311 4312 4313 4314 4315 4316 4317 4318 4319 4320 4321 4322 4323 4324 4325 4326 4327 4328 4329 4330 4331 4332 4333 4334 4335 4336 4337 4338 4339 4340 4341 4342 4343 4344 4345 4346 4347 4348 4349 4350 4351 4352 4353 4354 4355 4356 4357 4358 4359 4360 4361 4362 4363 4364 4365 4366 4367 4368 4369 4370 4371 4372 4373 4374 4375 4376 4377 4378 4379 4380 4381 4382 4383 4384 4385 4386 4387 4388 4389 4390 4391 4392 4393 4394 4395 4396 4397 4398 4399 | /* * Copyright 2002-2005, Instant802 Networks, Inc. * Copyright 2005-2006, Devicescape Software, Inc. * Copyright 2007 Johannes Berg <johannes@sipsolutions.net> * Copyright 2008-2011 Luis R. Rodriguez <mcgrof@qca.qualcomm.com> * Copyright 2013-2014 Intel Mobile Communications GmbH * Copyright 2017 Intel Deutschland GmbH * Copyright (C) 2018 - 2024 Intel Corporation * * Permission to use, copy, modify, and/or distribute this software for any * purpose with or without fee is hereby granted, provided that the above * copyright notice and this permission notice appear in all copies. * * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. */ /** * DOC: Wireless regulatory infrastructure * * The usual implementation is for a driver to read a device EEPROM to * determine which regulatory domain it should be operating under, then * looking up the allowable channels in a driver-local table and finally * registering those channels in the wiphy structure. * * Another set of compliance enforcement is for drivers to use their * own compliance limits which can be stored on the EEPROM. The host * driver or firmware may ensure these are used. * * In addition to all this we provide an extra layer of regulatory * conformance. For drivers which do not have any regulatory * information CRDA provides the complete regulatory solution. * For others it provides a community effort on further restrictions * to enhance compliance. * * Note: When number of rules --> infinity we will not be able to * index on alpha2 any more, instead we'll probably have to * rely on some SHA1 checksum of the regdomain for example. * */ #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt #include <linux/kernel.h> #include <linux/export.h> #include <linux/slab.h> #include <linux/list.h> #include <linux/ctype.h> #include <linux/nl80211.h> #include <linux/platform_device.h> #include <linux/verification.h> #include <linux/moduleparam.h> #include <linux/firmware.h> #include <linux/units.h> #include <net/cfg80211.h> #include "core.h" #include "reg.h" #include "rdev-ops.h" #include "nl80211.h" /* * Grace period we give before making sure all current interfaces reside on * channels allowed by the current regulatory domain. */ #define REG_ENFORCE_GRACE_MS 60000 /** * enum reg_request_treatment - regulatory request treatment * * @REG_REQ_OK: continue processing the regulatory request * @REG_REQ_IGNORE: ignore the regulatory request * @REG_REQ_INTERSECT: the regulatory domain resulting from this request should * be intersected with the current one. * @REG_REQ_ALREADY_SET: the regulatory request will not change the current * regulatory settings, and no further processing is required. */ enum reg_request_treatment { REG_REQ_OK, REG_REQ_IGNORE, REG_REQ_INTERSECT, REG_REQ_ALREADY_SET, }; static struct regulatory_request core_request_world = { .initiator = NL80211_REGDOM_SET_BY_CORE, .alpha2[0] = '0', .alpha2[1] = '0', .intersect = false, .processed = true, .country_ie_env = ENVIRON_ANY, }; /* * Receipt of information from last regulatory request, * protected by RTNL (and can be accessed with RCU protection) */ static struct regulatory_request __rcu *last_request = (void __force __rcu *)&core_request_world; /* To trigger userspace events and load firmware */ static struct platform_device *reg_pdev; /* * Central wireless core regulatory domains, we only need two, * the current one and a world regulatory domain in case we have no * information to give us an alpha2. * (protected by RTNL, can be read under RCU) */ const struct ieee80211_regdomain __rcu *cfg80211_regdomain; /* * Number of devices that registered to the core * that support cellular base station regulatory hints * (protected by RTNL) */ static int reg_num_devs_support_basehint; /* * State variable indicating if the platform on which the devices * are attached is operating in an indoor environment. The state variable * is relevant for all registered devices. */ static bool reg_is_indoor; static DEFINE_SPINLOCK(reg_indoor_lock); /* Used to track the userspace process controlling the indoor setting */ static u32 reg_is_indoor_portid; static void restore_regulatory_settings(bool reset_user, bool cached); static void print_regdomain(const struct ieee80211_regdomain *rd); static void reg_process_hint(struct regulatory_request *reg_request); static const struct ieee80211_regdomain *get_cfg80211_regdom(void) { return rcu_dereference_rtnl(cfg80211_regdomain); } /* * Returns the regulatory domain associated with the wiphy. * * Requires any of RTNL, wiphy mutex or RCU protection. */ const struct ieee80211_regdomain *get_wiphy_regdom(struct wiphy *wiphy) { return rcu_dereference_check(wiphy->regd, lockdep_is_held(&wiphy->mtx) || lockdep_rtnl_is_held()); } EXPORT_SYMBOL(get_wiphy_regdom); static const char *reg_dfs_region_str(enum nl80211_dfs_regions dfs_region) { switch (dfs_region) { case NL80211_DFS_UNSET: return "unset"; case NL80211_DFS_FCC: return "FCC"; case NL80211_DFS_ETSI: return "ETSI"; case NL80211_DFS_JP: return "JP"; } return "Unknown"; } enum nl80211_dfs_regions reg_get_dfs_region(struct wiphy *wiphy) { const struct ieee80211_regdomain *regd = NULL; const struct ieee80211_regdomain *wiphy_regd = NULL; enum nl80211_dfs_regions dfs_region; rcu_read_lock(); regd = get_cfg80211_regdom(); dfs_region = regd->dfs_region; if (!wiphy) goto out; wiphy_regd = get_wiphy_regdom(wiphy); if (!wiphy_regd) goto out; if (wiphy->regulatory_flags & REGULATORY_WIPHY_SELF_MANAGED) { dfs_region = wiphy_regd->dfs_region; goto out; } if (wiphy_regd->dfs_region == regd->dfs_region) goto out; pr_debug("%s: device specific dfs_region (%s) disagrees with cfg80211's central dfs_region (%s)\n", dev_name(&wiphy->dev), reg_dfs_region_str(wiphy_regd->dfs_region), reg_dfs_region_str(regd->dfs_region)); out: rcu_read_unlock(); return dfs_region; } static void rcu_free_regdom(const struct ieee80211_regdomain *r) { if (!r) return; kfree_rcu((struct ieee80211_regdomain *)r, rcu_head); } static struct regulatory_request *get_last_request(void) { return rcu_dereference_rtnl(last_request); } /* Used to queue up regulatory hints */ static LIST_HEAD(reg_requests_list); static DEFINE_SPINLOCK(reg_requests_lock); /* Used to queue up beacon hints for review */ static LIST_HEAD(reg_pending_beacons); static DEFINE_SPINLOCK(reg_pending_beacons_lock); /* Used to keep track of processed beacon hints */ static LIST_HEAD(reg_beacon_list); struct reg_beacon { struct list_head list; struct ieee80211_channel chan; }; static void reg_check_chans_work(struct work_struct *work); static DECLARE_DELAYED_WORK(reg_check_chans, reg_check_chans_work); static void reg_todo(struct work_struct *work); static DECLARE_WORK(reg_work, reg_todo); /* We keep a static world regulatory domain in case of the absence of CRDA */ static const struct ieee80211_regdomain world_regdom = { .n_reg_rules = 8, .alpha2 = "00", .reg_rules = { /* IEEE 802.11b/g, channels 1..11 */ REG_RULE(2412-10, 2462+10, 40, 6, 20, 0), /* IEEE 802.11b/g, channels 12..13. */ REG_RULE(2467-10, 2472+10, 20, 6, 20, NL80211_RRF_NO_IR | NL80211_RRF_AUTO_BW), /* IEEE 802.11 channel 14 - Only JP enables * this and for 802.11b only */ REG_RULE(2484-10, 2484+10, 20, 6, 20, NL80211_RRF_NO_IR | NL80211_RRF_NO_OFDM), /* IEEE 802.11a, channel 36..48 */ REG_RULE(5180-10, 5240+10, 80, 6, 20, NL80211_RRF_NO_IR | NL80211_RRF_AUTO_BW), /* IEEE 802.11a, channel 52..64 - DFS required */ REG_RULE(5260-10, 5320+10, 80, 6, 20, NL80211_RRF_NO_IR | NL80211_RRF_AUTO_BW | NL80211_RRF_DFS), /* IEEE 802.11a, channel 100..144 - DFS required */ REG_RULE(5500-10, 5720+10, 160, 6, 20, NL80211_RRF_NO_IR | NL80211_RRF_DFS), /* IEEE 802.11a, channel 149..165 */ REG_RULE(5745-10, 5825+10, 80, 6, 20, NL80211_RRF_NO_IR), /* IEEE 802.11ad (60GHz), channels 1..3 */ REG_RULE(56160+2160*1-1080, 56160+2160*3+1080, 2160, 0, 0, 0), } }; /* protected by RTNL */ static const struct ieee80211_regdomain *cfg80211_world_regdom = &world_regdom; static char *ieee80211_regdom = "00"; static char user_alpha2[2]; static const struct ieee80211_regdomain *cfg80211_user_regdom; module_param(ieee80211_regdom, charp, 0444); MODULE_PARM_DESC(ieee80211_regdom, "IEEE 802.11 regulatory domain code"); static void reg_free_request(struct regulatory_request *request) { if (request == &core_request_world) return; if (request != get_last_request()) kfree(request); } static void reg_free_last_request(void) { struct regulatory_request *lr = get_last_request(); if (lr != &core_request_world && lr) kfree_rcu(lr, rcu_head); } static void reg_update_last_request(struct regulatory_request *request) { struct regulatory_request *lr; lr = get_last_request(); if (lr == request) return; reg_free_last_request(); rcu_assign_pointer(last_request, request); } static void reset_regdomains(bool full_reset, const struct ieee80211_regdomain *new_regdom) { const struct ieee80211_regdomain *r; ASSERT_RTNL(); r = get_cfg80211_regdom(); /* avoid freeing static information or freeing something twice */ if (r == cfg80211_world_regdom) r = NULL; if (cfg80211_world_regdom == &world_regdom) cfg80211_world_regdom = NULL; if (r == &world_regdom) r = NULL; rcu_free_regdom(r); rcu_free_regdom(cfg80211_world_regdom); cfg80211_world_regdom = &world_regdom; rcu_assign_pointer(cfg80211_regdomain, new_regdom); if (!full_reset) return; reg_update_last_request(&core_request_world); } /* * Dynamic world regulatory domain requested by the wireless * core upon initialization */ static void update_world_regdomain(const struct ieee80211_regdomain *rd) { struct regulatory_request *lr; lr = get_last_request(); WARN_ON(!lr); reset_regdomains(false, rd); cfg80211_world_regdom = rd; } bool is_world_regdom(const char *alpha2) { if (!alpha2) return false; return alpha2[0] == '0' && alpha2[1] == '0'; } static bool is_alpha2_set(const char *alpha2) { if (!alpha2) return false; return alpha2[0] && alpha2[1]; } static bool is_unknown_alpha2(const char *alpha2) { if (!alpha2) return false; /* * Special case where regulatory domain was built by driver * but a specific alpha2 cannot be determined */ return alpha2[0] == '9' && alpha2[1] == '9'; } static bool is_intersected_alpha2(const char *alpha2) { if (!alpha2) return false; /* * Special case where regulatory domain is the * result of an intersection between two regulatory domain * structures */ return alpha2[0] == '9' && alpha2[1] == '8'; } static bool is_an_alpha2(const char *alpha2) { if (!alpha2) return false; return isalpha(alpha2[0]) && isalpha(alpha2[1]); } static bool alpha2_equal(const char *alpha2_x, const char *alpha2_y) { if (!alpha2_x || !alpha2_y) return false; return alpha2_x[0] == alpha2_y[0] && alpha2_x[1] == alpha2_y[1]; } static bool regdom_changes(const char *alpha2) { const struct ieee80211_regdomain *r = get_cfg80211_regdom(); if (!r) return true; return !alpha2_equal(r->alpha2, alpha2); } /* * The NL80211_REGDOM_SET_BY_USER regdom alpha2 is cached, this lets * you know if a valid regulatory hint with NL80211_REGDOM_SET_BY_USER * has ever been issued. */ static bool is_user_regdom_saved(void) { if (user_alpha2[0] == '9' && user_alpha2[1] == '7') return false; /* This would indicate a mistake on the design */ if (WARN(!is_world_regdom(user_alpha2) && !is_an_alpha2(user_alpha2), "Unexpected user alpha2: %c%c\n", user_alpha2[0], user_alpha2[1])) return false; return true; } static const struct ieee80211_regdomain * reg_copy_regd(const struct ieee80211_regdomain *src_regd) { struct ieee80211_regdomain *regd; unsigned int i; regd = kzalloc(struct_size(regd, reg_rules, src_regd->n_reg_rules), GFP_KERNEL); if (!regd) return ERR_PTR(-ENOMEM); memcpy(regd, src_regd, sizeof(struct ieee80211_regdomain)); for (i = 0; i < src_regd->n_reg_rules; i++) memcpy(®d->reg_rules[i], &src_regd->reg_rules[i], sizeof(struct ieee80211_reg_rule)); return regd; } static void cfg80211_save_user_regdom(const struct ieee80211_regdomain *rd) { ASSERT_RTNL(); if (!IS_ERR(cfg80211_user_regdom)) kfree(cfg80211_user_regdom); cfg80211_user_regdom = reg_copy_regd(rd); } struct reg_regdb_apply_request { struct list_head list; const struct ieee80211_regdomain *regdom; }; static LIST_HEAD(reg_regdb_apply_list); static DEFINE_MUTEX(reg_regdb_apply_mutex); static void reg_regdb_apply(struct work_struct *work) { struct reg_regdb_apply_request *request; rtnl_lock(); mutex_lock(®_regdb_apply_mutex); while (!list_empty(®_regdb_apply_list)) { request = list_first_entry(®_regdb_apply_list, struct reg_regdb_apply_request, list); list_del(&request->list); set_regdom(request->regdom, REGD_SOURCE_INTERNAL_DB); kfree(request); } mutex_unlock(®_regdb_apply_mutex); rtnl_unlock(); } static DECLARE_WORK(reg_regdb_work, reg_regdb_apply); static int reg_schedule_apply(const struct ieee80211_regdomain *regdom) { struct reg_regdb_apply_request *request; request = kzalloc(sizeof(struct reg_regdb_apply_request), GFP_KERNEL); if (!request) { kfree(regdom); return -ENOMEM; } request->regdom = regdom; mutex_lock(®_regdb_apply_mutex); list_add_tail(&request->list, ®_regdb_apply_list); mutex_unlock(®_regdb_apply_mutex); schedule_work(®_regdb_work); return 0; } #ifdef CONFIG_CFG80211_CRDA_SUPPORT /* Max number of consecutive attempts to communicate with CRDA */ #define REG_MAX_CRDA_TIMEOUTS 10 static u32 reg_crda_timeouts; static void crda_timeout_work(struct work_struct *work); static DECLARE_DELAYED_WORK(crda_timeout, crda_timeout_work); static void crda_timeout_work(struct work_struct *work) { pr_debug("Timeout while waiting for CRDA to reply, restoring regulatory settings\n"); rtnl_lock(); reg_crda_timeouts++; restore_regulatory_settings(true, false); rtnl_unlock(); } static void cancel_crda_timeout(void) { cancel_delayed_work(&crda_timeout); } static void cancel_crda_timeout_sync(void) { cancel_delayed_work_sync(&crda_timeout); } static void reset_crda_timeouts(void) { reg_crda_timeouts = 0; } /* * This lets us keep regulatory code which is updated on a regulatory * basis in userspace. */ static int call_crda(const char *alpha2) { char country[12]; char *env[] = { country, NULL }; int ret; snprintf(country, sizeof(country), "COUNTRY=%c%c", alpha2[0], alpha2[1]); if (reg_crda_timeouts > REG_MAX_CRDA_TIMEOUTS) { pr_debug("Exceeded CRDA call max attempts. Not calling CRDA\n"); return -EINVAL; } if (!is_world_regdom((char *) alpha2)) pr_debug("Calling CRDA for country: %c%c\n", alpha2[0], alpha2[1]); else pr_debug("Calling CRDA to update world regulatory domain\n"); ret = kobject_uevent_env(®_pdev->dev.kobj, KOBJ_CHANGE, env); if (ret) return ret; queue_delayed_work(system_power_efficient_wq, &crda_timeout, msecs_to_jiffies(3142)); return 0; } #else static inline void cancel_crda_timeout(void) {} static inline void cancel_crda_timeout_sync(void) {} static inline void reset_crda_timeouts(void) {} static inline int call_crda(const char *alpha2) { return -ENODATA; } #endif /* CONFIG_CFG80211_CRDA_SUPPORT */ /* code to directly load a firmware database through request_firmware */ static const struct fwdb_header *regdb; struct fwdb_country { u8 alpha2[2]; __be16 coll_ptr; /* this struct cannot be extended */ } __packed __aligned(4); struct fwdb_collection { u8 len; u8 n_rules; u8 dfs_region; /* no optional data yet */ /* aligned to 2, then followed by __be16 array of rule pointers */ } __packed __aligned(4); enum fwdb_flags { FWDB_FLAG_NO_OFDM = BIT(0), FWDB_FLAG_NO_OUTDOOR = BIT(1), FWDB_FLAG_DFS = BIT(2), FWDB_FLAG_NO_IR = BIT(3), FWDB_FLAG_AUTO_BW = BIT(4), }; struct fwdb_wmm_ac { u8 ecw; u8 aifsn; __be16 cot; } __packed; struct fwdb_wmm_rule { struct fwdb_wmm_ac client[IEEE80211_NUM_ACS]; struct fwdb_wmm_ac ap[IEEE80211_NUM_ACS]; } __packed; struct fwdb_rule { u8 len; u8 flags; __be16 max_eirp; __be32 start, end, max_bw; /* start of optional data */ __be16 cac_timeout; __be16 wmm_ptr; } __packed __aligned(4); #define FWDB_MAGIC 0x52474442 #define FWDB_VERSION 20 struct fwdb_header { __be32 magic; __be32 version; struct fwdb_country country[]; } __packed __aligned(4); static int ecw2cw(int ecw) { return (1 << ecw) - 1; } static bool valid_wmm(struct fwdb_wmm_rule *rule) { struct fwdb_wmm_ac *ac = (struct fwdb_wmm_ac *)rule; int i; for (i = 0; i < IEEE80211_NUM_ACS * 2; i++) { u16 cw_min = ecw2cw((ac[i].ecw & 0xf0) >> 4); u16 cw_max = ecw2cw(ac[i].ecw & 0x0f); u8 aifsn = ac[i].aifsn; if (cw_min >= cw_max) return false; if (aifsn < 1) return false; } return true; } static bool valid_rule(const u8 *data, unsigned int size, u16 rule_ptr) { struct fwdb_rule *rule = (void *)(data + (rule_ptr << 2)); if ((u8 *)rule + sizeof(rule->len) > data + size) return false; /* mandatory fields */ if (rule->len < offsetofend(struct fwdb_rule, max_bw)) return false; if (rule->len >= offsetofend(struct fwdb_rule, wmm_ptr)) { u32 wmm_ptr = be16_to_cpu(rule->wmm_ptr) << 2; struct fwdb_wmm_rule *wmm; if (wmm_ptr + sizeof(struct fwdb_wmm_rule) > size) return false; wmm = (void *)(data + wmm_ptr); if (!valid_wmm(wmm)) return false; } return true; } static bool valid_country(const u8 *data, unsigned int size, const struct fwdb_country *country) { unsigned int ptr = be16_to_cpu(country->coll_ptr) << 2; struct fwdb_collection *coll = (void *)(data + ptr); __be16 *rules_ptr; unsigned int i; /* make sure we can read len/n_rules */ if ((u8 *)coll + offsetofend(typeof(*coll), n_rules) > data + size) return false; /* make sure base struct and all rules fit */ if ((u8 *)coll + ALIGN(coll->len, 2) + (coll->n_rules * 2) > data + size) return false; /* mandatory fields must exist */ if (coll->len < offsetofend(struct fwdb_collection, dfs_region)) return false; rules_ptr = (void *)((u8 *)coll + ALIGN(coll->len, 2)); for (i = 0; i < coll->n_rules; i++) { u16 rule_ptr = be16_to_cpu(rules_ptr[i]); if (!valid_rule(data, size, rule_ptr)) return false; } return true; } #ifdef CONFIG_CFG80211_REQUIRE_SIGNED_REGDB #include <keys/asymmetric-type.h> static struct key *builtin_regdb_keys; static int __init load_builtin_regdb_keys(void) { builtin_regdb_keys = keyring_alloc(".builtin_regdb_keys", KUIDT_INIT(0), KGIDT_INIT(0), current_cred(), ((KEY_POS_ALL & ~KEY_POS_SETATTR) | KEY_USR_VIEW | KEY_USR_READ | KEY_USR_SEARCH), KEY_ALLOC_NOT_IN_QUOTA, NULL, NULL); if (IS_ERR(builtin_regdb_keys)) return PTR_ERR(builtin_regdb_keys); pr_notice("Loading compiled-in X.509 certificates for regulatory database\n"); #ifdef CONFIG_CFG80211_USE_KERNEL_REGDB_KEYS x509_load_certificate_list(shipped_regdb_certs, shipped_regdb_certs_len, builtin_regdb_keys); #endif #ifdef CONFIG_CFG80211_EXTRA_REGDB_KEYDIR if (CONFIG_CFG80211_EXTRA_REGDB_KEYDIR[0] != '\0') x509_load_certificate_list(extra_regdb_certs, extra_regdb_certs_len, builtin_regdb_keys); #endif return 0; } MODULE_FIRMWARE("regulatory.db.p7s"); static bool regdb_has_valid_signature(const u8 *data, unsigned int size) { const struct firmware *sig; bool result; if (request_firmware(&sig, "regulatory.db.p7s", ®_pdev->dev)) return false; result = verify_pkcs7_signature(data, size, sig->data, sig->size, builtin_regdb_keys, VERIFYING_UNSPECIFIED_SIGNATURE, NULL, NULL) == 0; release_firmware(sig); return result; } static void free_regdb_keyring(void) { key_put(builtin_regdb_keys); } #else static int load_builtin_regdb_keys(void) { return 0; } static bool regdb_has_valid_signature(const u8 *data, unsigned int size) { return true; } static void free_regdb_keyring(void) { } #endif /* CONFIG_CFG80211_REQUIRE_SIGNED_REGDB */ static bool valid_regdb(const u8 *data, unsigned int size) { const struct fwdb_header *hdr = (void *)data; const struct fwdb_country *country; if (size < sizeof(*hdr)) return false; if (hdr->magic != cpu_to_be32(FWDB_MAGIC)) return false; if (hdr->version != cpu_to_be32(FWDB_VERSION)) return false; if (!regdb_has_valid_signature(data, size)) return false; country = &hdr->country[0]; while ((u8 *)(country + 1) <= data + size) { if (!country->coll_ptr) break; if (!valid_country(data, size, country)) return false; country++; } return true; } static void set_wmm_rule(const struct fwdb_header *db, const struct fwdb_country *country, const struct fwdb_rule *rule, struct ieee80211_reg_rule *rrule) { struct ieee80211_wmm_rule *wmm_rule = &rrule->wmm_rule; struct fwdb_wmm_rule *wmm; unsigned int i, wmm_ptr; wmm_ptr = be16_to_cpu(rule->wmm_ptr) << 2; wmm = (void *)((u8 *)db + wmm_ptr); if (!valid_wmm(wmm)) { pr_err("Invalid regulatory WMM rule %u-%u in domain %c%c\n", be32_to_cpu(rule->start), be32_to_cpu(rule->end), country->alpha2[0], country->alpha2[1]); return; } for (i = 0; i < IEEE80211_NUM_ACS; i++) { wmm_rule->client[i].cw_min = ecw2cw((wmm->client[i].ecw & 0xf0) >> 4); wmm_rule->client[i].cw_max = ecw2cw(wmm->client[i].ecw & 0x0f); wmm_rule->client[i].aifsn = wmm->client[i].aifsn; wmm_rule->client[i].cot = 1000 * be16_to_cpu(wmm->client[i].cot); wmm_rule->ap[i].cw_min = ecw2cw((wmm->ap[i].ecw & 0xf0) >> 4); wmm_rule->ap[i].cw_max = ecw2cw(wmm->ap[i].ecw & 0x0f); wmm_rule->ap[i].aifsn = wmm->ap[i].aifsn; wmm_rule->ap[i].cot = 1000 * be16_to_cpu(wmm->ap[i].cot); } rrule->has_wmm = true; } static int __regdb_query_wmm(const struct fwdb_header *db, const struct fwdb_country *country, int freq, struct ieee80211_reg_rule *rrule) { unsigned int ptr = be16_to_cpu(country->coll_ptr) << 2; struct fwdb_collection *coll = (void *)((u8 *)db + ptr); int i; for (i = 0; i < coll->n_rules; i++) { __be16 *rules_ptr = (void *)((u8 *)coll + ALIGN(coll->len, 2)); unsigned int rule_ptr = be16_to_cpu(rules_ptr[i]) << 2; struct fwdb_rule *rule = (void *)((u8 *)db + rule_ptr); if (rule->len < offsetofend(struct fwdb_rule, wmm_ptr)) continue; if (freq >= KHZ_TO_MHZ(be32_to_cpu(rule->start)) && freq <= KHZ_TO_MHZ(be32_to_cpu(rule->end))) { set_wmm_rule(db, country, rule, rrule); return 0; } } return -ENODATA; } int reg_query_regdb_wmm(char *alpha2, int freq, struct ieee80211_reg_rule *rule) { const struct fwdb_header *hdr = regdb; const struct fwdb_country *country; if (!regdb) return -ENODATA; if (IS_ERR(regdb)) return PTR_ERR(regdb); country = &hdr->country[0]; while (country->coll_ptr) { if (alpha2_equal(alpha2, country->alpha2)) return __regdb_query_wmm(regdb, country, freq, rule); country++; } return -ENODATA; } EXPORT_SYMBOL(reg_query_regdb_wmm); static int regdb_query_country(const struct fwdb_header *db, const struct fwdb_country *country) { unsigned int ptr = be16_to_cpu(country->coll_ptr) << 2; struct fwdb_collection *coll = (void *)((u8 *)db + ptr); struct ieee80211_regdomain *regdom; unsigned int i; regdom = kzalloc(struct_size(regdom, reg_rules, coll->n_rules), GFP_KERNEL); if (!regdom) return -ENOMEM; regdom->n_reg_rules = coll->n_rules; regdom->alpha2[0] = country->alpha2[0]; regdom->alpha2[1] = country->alpha2[1]; regdom->dfs_region = coll->dfs_region; for (i = 0; i < regdom->n_reg_rules; i++) { __be16 *rules_ptr = (void *)((u8 *)coll + ALIGN(coll->len, 2)); unsigned int rule_ptr = be16_to_cpu(rules_ptr[i]) << 2; struct fwdb_rule *rule = (void *)((u8 *)db + rule_ptr); struct ieee80211_reg_rule *rrule = ®dom->reg_rules[i]; rrule->freq_range.start_freq_khz = be32_to_cpu(rule->start); rrule->freq_range.end_freq_khz = be32_to_cpu(rule->end); rrule->freq_range.max_bandwidth_khz = be32_to_cpu(rule->max_bw); rrule->power_rule.max_antenna_gain = 0; rrule->power_rule.max_eirp = be16_to_cpu(rule->max_eirp); rrule->flags = 0; if (rule->flags & FWDB_FLAG_NO_OFDM) rrule->flags |= NL80211_RRF_NO_OFDM; if (rule->flags & FWDB_FLAG_NO_OUTDOOR) rrule->flags |= NL80211_RRF_NO_OUTDOOR; if (rule->flags & FWDB_FLAG_DFS) rrule->flags |= NL80211_RRF_DFS; if (rule->flags & FWDB_FLAG_NO_IR) rrule->flags |= NL80211_RRF_NO_IR; if (rule->flags & FWDB_FLAG_AUTO_BW) rrule->flags |= NL80211_RRF_AUTO_BW; rrule->dfs_cac_ms = 0; /* handle optional data */ if (rule->len >= offsetofend(struct fwdb_rule, cac_timeout)) rrule->dfs_cac_ms = 1000 * be16_to_cpu(rule->cac_timeout); if (rule->len >= offsetofend(struct fwdb_rule, wmm_ptr)) set_wmm_rule(db, country, rule, rrule); } return reg_schedule_apply(regdom); } static int query_regdb(const char *alpha2) { const struct fwdb_header *hdr = regdb; const struct fwdb_country *country; ASSERT_RTNL(); if (IS_ERR(regdb)) return PTR_ERR(regdb); country = &hdr->country[0]; while (country->coll_ptr) { if (alpha2_equal(alpha2, country->alpha2)) return regdb_query_country(regdb, country); country++; } return -ENODATA; } static void regdb_fw_cb(const struct firmware *fw, void *context) { int set_error = 0; bool restore = true; void *db; if (!fw) { pr_info("failed to load regulatory.db\n"); set_error = -ENODATA; } else if (!valid_regdb(fw->data, fw->size)) { pr_info("loaded regulatory.db is malformed or signature is missing/invalid\n"); set_error = -EINVAL; } rtnl_lock(); if (regdb && !IS_ERR(regdb)) { /* negative case - a bug * positive case - can happen due to race in case of multiple cb's in * queue, due to usage of asynchronous callback * * Either case, just restore and free new db. */ } else if (set_error) { regdb = ERR_PTR(set_error); } else if (fw) { db = kmemdup(fw->data, fw->size, GFP_KERNEL); if (db) { regdb = db; restore = context && query_regdb(context); } else { restore = true; } } if (restore) restore_regulatory_settings(true, false); rtnl_unlock(); kfree(context); release_firmware(fw); } MODULE_FIRMWARE("regulatory.db"); static int query_regdb_file(const char *alpha2) { int err; ASSERT_RTNL(); if (regdb) return query_regdb(alpha2); alpha2 = kmemdup(alpha2, 2, GFP_KERNEL); if (!alpha2) return -ENOMEM; err = request_firmware_nowait(THIS_MODULE, true, "regulatory.db", ®_pdev->dev, GFP_KERNEL, (void *)alpha2, regdb_fw_cb); if (err) kfree(alpha2); return err; } int reg_reload_regdb(void) { const struct firmware *fw; void *db; int err; const struct ieee80211_regdomain *current_regdomain; struct regulatory_request *request; err = request_firmware(&fw, "regulatory.db", ®_pdev->dev); if (err) return err; if (!valid_regdb(fw->data, fw->size)) { err = -ENODATA; goto out; } db = kmemdup(fw->data, fw->size, GFP_KERNEL); if (!db) { err = -ENOMEM; goto out; } rtnl_lock(); if (!IS_ERR_OR_NULL(regdb)) kfree(regdb); regdb = db; /* reset regulatory domain */ current_regdomain = get_cfg80211_regdom(); request = kzalloc(sizeof(*request), GFP_KERNEL); if (!request) { err = -ENOMEM; goto out_unlock; } request->wiphy_idx = WIPHY_IDX_INVALID; request->alpha2[0] = current_regdomain->alpha2[0]; request->alpha2[1] = current_regdomain->alpha2[1]; request->initiator = NL80211_REGDOM_SET_BY_CORE; request->user_reg_hint_type = NL80211_USER_REG_HINT_USER; reg_process_hint(request); out_unlock: rtnl_unlock(); out: release_firmware(fw); return err; } static bool reg_query_database(struct regulatory_request *request) { if (query_regdb_file(request->alpha2) == 0) return true; if (call_crda(request->alpha2) == 0) return true; return false; } bool reg_is_valid_request(const char *alpha2) { struct regulatory_request *lr = get_last_request(); if (!lr || lr->processed) return false; return alpha2_equal(lr->alpha2, alpha2); } static const struct ieee80211_regdomain *reg_get_regdomain(struct wiphy *wiphy) { struct regulatory_request *lr = get_last_request(); /* * Follow the driver's regulatory domain, if present, unless a country * IE has been processed or a user wants to help complaince further */ if (lr->initiator != NL80211_REGDOM_SET_BY_COUNTRY_IE && lr->initiator != NL80211_REGDOM_SET_BY_USER && wiphy->regd) return get_wiphy_regdom(wiphy); return get_cfg80211_regdom(); } static unsigned int reg_get_max_bandwidth_from_range(const struct ieee80211_regdomain *rd, const struct ieee80211_reg_rule *rule) { const struct ieee80211_freq_range *freq_range = &rule->freq_range; const struct ieee80211_freq_range *freq_range_tmp; const struct ieee80211_reg_rule *tmp; u32 start_freq, end_freq, idx, no; for (idx = 0; idx < rd->n_reg_rules; idx++) if (rule == &rd->reg_rules[idx]) break; if (idx == rd->n_reg_rules) return 0; /* get start_freq */ no = idx; while (no) { tmp = &rd->reg_rules[--no]; freq_range_tmp = &tmp->freq_range; if (freq_range_tmp->end_freq_khz < freq_range->start_freq_khz) break; freq_range = freq_range_tmp; } start_freq = freq_range->start_freq_khz; /* get end_freq */ freq_range = &rule->freq_range; no = idx; while (no < rd->n_reg_rules - 1) { tmp = &rd->reg_rules[++no]; freq_range_tmp = &tmp->freq_range; if (freq_range_tmp->start_freq_khz > freq_range->end_freq_khz) break; freq_range = freq_range_tmp; } end_freq = freq_range->end_freq_khz; return end_freq - start_freq; } unsigned int reg_get_max_bandwidth(const struct ieee80211_regdomain *rd, const struct ieee80211_reg_rule *rule) { unsigned int bw = reg_get_max_bandwidth_from_range(rd, rule); if (rule->flags & NL80211_RRF_NO_320MHZ) bw = min_t(unsigned int, bw, MHZ_TO_KHZ(160)); if (rule->flags & NL80211_RRF_NO_160MHZ) bw = min_t(unsigned int, bw, MHZ_TO_KHZ(80)); if (rule->flags & NL80211_RRF_NO_80MHZ) bw = min_t(unsigned int, bw, MHZ_TO_KHZ(40)); /* * HT40+/HT40- limits are handled per-channel. Only limit BW if both * are not allowed. */ if (rule->flags & NL80211_RRF_NO_HT40MINUS && rule->flags & NL80211_RRF_NO_HT40PLUS) bw = min_t(unsigned int, bw, MHZ_TO_KHZ(20)); return bw; } /* Sanity check on a regulatory rule */ static bool is_valid_reg_rule(const struct ieee80211_reg_rule *rule) { const struct ieee80211_freq_range *freq_range = &rule->freq_range; u32 freq_diff; if (freq_range->start_freq_khz <= 0 || freq_range->end_freq_khz <= 0) return false; if (freq_range->start_freq_khz > freq_range->end_freq_khz) return false; freq_diff = freq_range->end_freq_khz - freq_range->start_freq_khz; if (freq_range->end_freq_khz <= freq_range->start_freq_khz || freq_range->max_bandwidth_khz > freq_diff) return false; return true; } static bool is_valid_rd(const struct ieee80211_regdomain *rd) { const struct ieee80211_reg_rule *reg_rule = NULL; unsigned int i; if (!rd->n_reg_rules) return false; if (WARN_ON(rd->n_reg_rules > NL80211_MAX_SUPP_REG_RULES)) return false; for (i = 0; i < rd->n_reg_rules; i++) { reg_rule = &rd->reg_rules[i]; if (!is_valid_reg_rule(reg_rule)) return false; } return true; } /** * freq_in_rule_band - tells us if a frequency is in a frequency band * @freq_range: frequency rule we want to query * @freq_khz: frequency we are inquiring about * * This lets us know if a specific frequency rule is or is not relevant to * a specific frequency's band. Bands are device specific and artificial * definitions (the "2.4 GHz band", the "5 GHz band" and the "60GHz band"), * however it is safe for now to assume that a frequency rule should not be * part of a frequency's band if the start freq or end freq are off by more * than 2 GHz for the 2.4 and 5 GHz bands, and by more than 20 GHz for the * 60 GHz band. * This resolution can be lowered and should be considered as we add * regulatory rule support for other "bands". * * Returns: whether or not the frequency is in the range */ static bool freq_in_rule_band(const struct ieee80211_freq_range *freq_range, u32 freq_khz) { /* * From 802.11ad: directional multi-gigabit (DMG): * Pertaining to operation in a frequency band containing a channel * with the Channel starting frequency above 45 GHz. */ u32 limit = freq_khz > 45 * KHZ_PER_GHZ ? 20 * KHZ_PER_GHZ : 2 * KHZ_PER_GHZ; if (abs(freq_khz - freq_range->start_freq_khz) <= limit) return true; if (abs(freq_khz - freq_range->end_freq_khz) <= limit) return true; return false; } /* * Later on we can perhaps use the more restrictive DFS * region but we don't have information for that yet so * for now simply disallow conflicts. */ static enum nl80211_dfs_regions reg_intersect_dfs_region(const enum nl80211_dfs_regions dfs_region1, const enum nl80211_dfs_regions dfs_region2) { if (dfs_region1 != dfs_region2) return NL80211_DFS_UNSET; return dfs_region1; } static void reg_wmm_rules_intersect(const struct ieee80211_wmm_ac *wmm_ac1, const struct ieee80211_wmm_ac *wmm_ac2, struct ieee80211_wmm_ac *intersect) { intersect->cw_min = max_t(u16, wmm_ac1->cw_min, wmm_ac2->cw_min); intersect->cw_max = max_t(u16, wmm_ac1->cw_max, wmm_ac2->cw_max); intersect->cot = min_t(u16, wmm_ac1->cot, wmm_ac2->cot); intersect->aifsn = max_t(u8, wmm_ac1->aifsn, wmm_ac2->aifsn); } /* * Helper for regdom_intersect(), this does the real * mathematical intersection fun */ static int reg_rules_intersect(const struct ieee80211_regdomain *rd1, const struct ieee80211_regdomain *rd2, const struct ieee80211_reg_rule *rule1, const struct ieee80211_reg_rule *rule2, struct ieee80211_reg_rule *intersected_rule) { const struct ieee80211_freq_range *freq_range1, *freq_range2; struct ieee80211_freq_range *freq_range; const struct ieee80211_power_rule *power_rule1, *power_rule2; struct ieee80211_power_rule *power_rule; const struct ieee80211_wmm_rule *wmm_rule1, *wmm_rule2; struct ieee80211_wmm_rule *wmm_rule; u32 freq_diff, max_bandwidth1, max_bandwidth2; freq_range1 = &rule1->freq_range; freq_range2 = &rule2->freq_range; freq_range = &intersected_rule->freq_range; power_rule1 = &rule1->power_rule; power_rule2 = &rule2->power_rule; power_rule = &intersected_rule->power_rule; wmm_rule1 = &rule1->wmm_rule; wmm_rule2 = &rule2->wmm_rule; wmm_rule = &intersected_rule->wmm_rule; freq_range->start_freq_khz = max(freq_range1->start_freq_khz, freq_range2->start_freq_khz); freq_range->end_freq_khz = min(freq_range1->end_freq_khz, freq_range2->end_freq_khz); max_bandwidth1 = freq_range1->max_bandwidth_khz; max_bandwidth2 = freq_range2->max_bandwidth_khz; if (rule1->flags & NL80211_RRF_AUTO_BW) max_bandwidth1 = reg_get_max_bandwidth(rd1, rule1); if (rule2->flags & NL80211_RRF_AUTO_BW) max_bandwidth2 = reg_get_max_bandwidth(rd2, rule2); freq_range->max_bandwidth_khz = min(max_bandwidth1, max_bandwidth2); intersected_rule->flags = rule1->flags | rule2->flags; /* * In case NL80211_RRF_AUTO_BW requested for both rules * set AUTO_BW in intersected rule also. Next we will * calculate BW correctly in handle_channel function. * In other case remove AUTO_BW flag while we calculate * maximum bandwidth correctly and auto calculation is * not required. */ if ((rule1->flags & NL80211_RRF_AUTO_BW) && (rule2->flags & NL80211_RRF_AUTO_BW)) intersected_rule->flags |= NL80211_RRF_AUTO_BW; else intersected_rule->flags &= ~NL80211_RRF_AUTO_BW; freq_diff = freq_range->end_freq_khz - freq_range->start_freq_khz; if (freq_range->max_bandwidth_khz > freq_diff) freq_range->max_bandwidth_khz = freq_diff; power_rule->max_eirp = min(power_rule1->max_eirp, power_rule2->max_eirp); power_rule->max_antenna_gain = min(power_rule1->max_antenna_gain, power_rule2->max_antenna_gain); intersected_rule->dfs_cac_ms = max(rule1->dfs_cac_ms, rule2->dfs_cac_ms); if (rule1->has_wmm && rule2->has_wmm) { u8 ac; for (ac = 0; ac < IEEE80211_NUM_ACS; ac++) { reg_wmm_rules_intersect(&wmm_rule1->client[ac], &wmm_rule2->client[ac], &wmm_rule->client[ac]); reg_wmm_rules_intersect(&wmm_rule1->ap[ac], &wmm_rule2->ap[ac], &wmm_rule->ap[ac]); } intersected_rule->has_wmm = true; } else if (rule1->has_wmm) { *wmm_rule = *wmm_rule1; intersected_rule->has_wmm = true; } else if (rule2->has_wmm) { *wmm_rule = *wmm_rule2; intersected_rule->has_wmm = true; } else { intersected_rule->has_wmm = false; } if (!is_valid_reg_rule(intersected_rule)) return -EINVAL; return 0; } /* check whether old rule contains new rule */ static bool rule_contains(struct ieee80211_reg_rule *r1, struct ieee80211_reg_rule *r2) { /* for simplicity, currently consider only same flags */ if (r1->flags != r2->flags) return false; /* verify r1 is more restrictive */ if ((r1->power_rule.max_antenna_gain > r2->power_rule.max_antenna_gain) || r1->power_rule.max_eirp > r2->power_rule.max_eirp) return false; /* make sure r2's range is contained within r1 */ if (r1->freq_range.start_freq_khz > r2->freq_range.start_freq_khz || r1->freq_range.end_freq_khz < r2->freq_range.end_freq_khz) return false; /* and finally verify that r1.max_bw >= r2.max_bw */ if (r1->freq_range.max_bandwidth_khz < r2->freq_range.max_bandwidth_khz) return false; return true; } /* add or extend current rules. do nothing if rule is already contained */ static void add_rule(struct ieee80211_reg_rule *rule, struct ieee80211_reg_rule *reg_rules, u32 *n_rules) { struct ieee80211_reg_rule *tmp_rule; int i; for (i = 0; i < *n_rules; i++) { tmp_rule = ®_rules[i]; /* rule is already contained - do nothing */ if (rule_contains(tmp_rule, rule)) return; /* extend rule if possible */ if (rule_contains(rule, tmp_rule)) { memcpy(tmp_rule, rule, sizeof(*rule)); return; } } memcpy(®_rules[*n_rules], rule, sizeof(*rule)); (*n_rules)++; } /** * regdom_intersect - do the intersection between two regulatory domains * @rd1: first regulatory domain * @rd2: second regulatory domain * * Use this function to get the intersection between two regulatory domains. * Once completed we will mark the alpha2 for the rd as intersected, "98", * as no one single alpha2 can represent this regulatory domain. * * Returns a pointer to the regulatory domain structure which will hold the * resulting intersection of rules between rd1 and rd2. We will * kzalloc() this structure for you. * * Returns: the intersected regdomain */ static struct ieee80211_regdomain * regdom_intersect(const struct ieee80211_regdomain *rd1, const struct ieee80211_regdomain *rd2) { int r; unsigned int x, y; unsigned int num_rules = 0; const struct ieee80211_reg_rule *rule1, *rule2; struct ieee80211_reg_rule intersected_rule; struct ieee80211_regdomain *rd; if (!rd1 || !rd2) return NULL; /* * First we get a count of the rules we'll need, then we actually * build them. This is to so we can malloc() and free() a * regdomain once. The reason we use reg_rules_intersect() here * is it will return -EINVAL if the rule computed makes no sense. * All rules that do check out OK are valid. */ for (x = 0; x < rd1->n_reg_rules; x++) { rule1 = &rd1->reg_rules[x]; for (y = 0; y < rd2->n_reg_rules; y++) { rule2 = &rd2->reg_rules[y]; if (!reg_rules_intersect(rd1, rd2, rule1, rule2, &intersected_rule)) num_rules++; } } if (!num_rules) return NULL; rd = kzalloc(struct_size(rd, reg_rules, num_rules), GFP_KERNEL); if (!rd) return NULL; for (x = 0; x < rd1->n_reg_rules; x++) { rule1 = &rd1->reg_rules[x]; for (y = 0; y < rd2->n_reg_rules; y++) { rule2 = &rd2->reg_rules[y]; r = reg_rules_intersect(rd1, rd2, rule1, rule2, &intersected_rule); /* * No need to memset here the intersected rule here as * we're not using the stack anymore */ if (r) continue; add_rule(&intersected_rule, rd->reg_rules, &rd->n_reg_rules); } } rd->alpha2[0] = '9'; rd->alpha2[1] = '8'; rd->dfs_region = reg_intersect_dfs_region(rd1->dfs_region, rd2->dfs_region); return rd; } /* * XXX: add support for the rest of enum nl80211_reg_rule_flags, we may * want to just have the channel structure use these */ static u32 map_regdom_flags(u32 rd_flags) { u32 channel_flags = 0; if (rd_flags & NL80211_RRF_NO_IR_ALL) channel_flags |= IEEE80211_CHAN_NO_IR; if (rd_flags & NL80211_RRF_DFS) channel_flags |= IEEE80211_CHAN_RADAR; if (rd_flags & NL80211_RRF_NO_OFDM) channel_flags |= IEEE80211_CHAN_NO_OFDM; if (rd_flags & NL80211_RRF_NO_OUTDOOR) channel_flags |= IEEE80211_CHAN_INDOOR_ONLY; if (rd_flags & NL80211_RRF_IR_CONCURRENT) channel_flags |= IEEE80211_CHAN_IR_CONCURRENT; if (rd_flags & NL80211_RRF_NO_HT40MINUS) channel_flags |= IEEE80211_CHAN_NO_HT40MINUS; if (rd_flags & NL80211_RRF_NO_HT40PLUS) channel_flags |= IEEE80211_CHAN_NO_HT40PLUS; if (rd_flags & NL80211_RRF_NO_80MHZ) channel_flags |= IEEE80211_CHAN_NO_80MHZ; if (rd_flags & NL80211_RRF_NO_160MHZ) channel_flags |= IEEE80211_CHAN_NO_160MHZ; if (rd_flags & NL80211_RRF_NO_HE) channel_flags |= IEEE80211_CHAN_NO_HE; if (rd_flags & NL80211_RRF_NO_320MHZ) channel_flags |= IEEE80211_CHAN_NO_320MHZ; if (rd_flags & NL80211_RRF_NO_EHT) channel_flags |= IEEE80211_CHAN_NO_EHT; if (rd_flags & NL80211_RRF_DFS_CONCURRENT) channel_flags |= IEEE80211_CHAN_DFS_CONCURRENT; if (rd_flags & NL80211_RRF_NO_6GHZ_VLP_CLIENT) channel_flags |= IEEE80211_CHAN_NO_6GHZ_VLP_CLIENT; if (rd_flags & NL80211_RRF_NO_6GHZ_AFC_CLIENT) channel_flags |= IEEE80211_CHAN_NO_6GHZ_AFC_CLIENT; if (rd_flags & NL80211_RRF_PSD) channel_flags |= IEEE80211_CHAN_PSD; if (rd_flags & NL80211_RRF_ALLOW_6GHZ_VLP_AP) channel_flags |= IEEE80211_CHAN_ALLOW_6GHZ_VLP_AP; return channel_flags; } static const struct ieee80211_reg_rule * freq_reg_info_regd(u32 center_freq, const struct ieee80211_regdomain *regd, u32 bw) { int i; bool band_rule_found = false; bool bw_fits = false; if (!regd) return ERR_PTR(-EINVAL); for (i = 0; i < regd->n_reg_rules; i++) { const struct ieee80211_reg_rule *rr; const struct ieee80211_freq_range *fr = NULL; rr = ®d->reg_rules[i]; fr = &rr->freq_range; /* * We only need to know if one frequency rule was * in center_freq's band, that's enough, so let's * not overwrite it once found */ if (!band_rule_found) band_rule_found = freq_in_rule_band(fr, center_freq); bw_fits = cfg80211_does_bw_fit_range(fr, center_freq, bw); if (band_rule_found && bw_fits) return rr; } if (!band_rule_found) return ERR_PTR(-ERANGE); return ERR_PTR(-EINVAL); } static const struct ieee80211_reg_rule * __freq_reg_info(struct wiphy *wiphy, u32 center_freq, u32 min_bw) { const struct ieee80211_regdomain *regd = reg_get_regdomain(wiphy); static const u32 bws[] = {0, 1, 2, 4, 5, 8, 10, 16, 20}; const struct ieee80211_reg_rule *reg_rule = ERR_PTR(-ERANGE); int i = ARRAY_SIZE(bws) - 1; u32 bw; for (bw = MHZ_TO_KHZ(bws[i]); bw >= min_bw; bw = MHZ_TO_KHZ(bws[i--])) { reg_rule = freq_reg_info_regd(center_freq, regd, bw); if (!IS_ERR(reg_rule)) return reg_rule; } return reg_rule; } const struct ieee80211_reg_rule *freq_reg_info(struct wiphy *wiphy, u32 center_freq) { u32 min_bw = center_freq < MHZ_TO_KHZ(1000) ? 1 : 20; return __freq_reg_info(wiphy, center_freq, MHZ_TO_KHZ(min_bw)); } EXPORT_SYMBOL(freq_reg_info); const char *reg_initiator_name(enum nl80211_reg_initiator initiator) { switch (initiator) { case NL80211_REGDOM_SET_BY_CORE: return "core"; case NL80211_REGDOM_SET_BY_USER: return "user"; case NL80211_REGDOM_SET_BY_DRIVER: return "driver"; case NL80211_REGDOM_SET_BY_COUNTRY_IE: return "country element"; default: WARN_ON(1); return "bug"; } } EXPORT_SYMBOL(reg_initiator_name); static uint32_t reg_rule_to_chan_bw_flags(const struct ieee80211_regdomain *regd, const struct ieee80211_reg_rule *reg_rule, const struct ieee80211_channel *chan) { const struct ieee80211_freq_range *freq_range = NULL; u32 max_bandwidth_khz, center_freq_khz, bw_flags = 0; bool is_s1g = chan->band == NL80211_BAND_S1GHZ; freq_range = ®_rule->freq_range; max_bandwidth_khz = freq_range->max_bandwidth_khz; center_freq_khz = ieee80211_channel_to_khz(chan); /* Check if auto calculation requested */ if (reg_rule->flags & NL80211_RRF_AUTO_BW) max_bandwidth_khz = reg_get_max_bandwidth(regd, reg_rule); /* If we get a reg_rule we can assume that at least 5Mhz fit */ if (!cfg80211_does_bw_fit_range(freq_range, center_freq_khz, MHZ_TO_KHZ(10))) bw_flags |= IEEE80211_CHAN_NO_10MHZ; if (!cfg80211_does_bw_fit_range(freq_range, center_freq_khz, MHZ_TO_KHZ(20))) bw_flags |= IEEE80211_CHAN_NO_20MHZ; if (is_s1g) { /* S1G is strict about non overlapping channels. We can * calculate which bandwidth is allowed per channel by finding * the largest bandwidth which cleanly divides the freq_range. */ int edge_offset; int ch_bw = max_bandwidth_khz; while (ch_bw) { edge_offset = (center_freq_khz - ch_bw / 2) - freq_range->start_freq_khz; if (edge_offset % ch_bw == 0) { switch (KHZ_TO_MHZ(ch_bw)) { case 1: bw_flags |= IEEE80211_CHAN_1MHZ; break; case 2: bw_flags |= IEEE80211_CHAN_2MHZ; break; case 4: bw_flags |= IEEE80211_CHAN_4MHZ; break; case 8: bw_flags |= IEEE80211_CHAN_8MHZ; break; case 16: bw_flags |= IEEE80211_CHAN_16MHZ; break; default: /* If we got here, no bandwidths fit on * this frequency, ie. band edge. */ bw_flags |= IEEE80211_CHAN_DISABLED; break; } break; } ch_bw /= 2; } } else { if (max_bandwidth_khz < MHZ_TO_KHZ(10)) bw_flags |= IEEE80211_CHAN_NO_10MHZ; if (max_bandwidth_khz < MHZ_TO_KHZ(20)) bw_flags |= IEEE80211_CHAN_NO_20MHZ; if (max_bandwidth_khz < MHZ_TO_KHZ(40)) bw_flags |= IEEE80211_CHAN_NO_HT40; if (max_bandwidth_khz < MHZ_TO_KHZ(80)) bw_flags |= IEEE80211_CHAN_NO_80MHZ; if (max_bandwidth_khz < MHZ_TO_KHZ(160)) bw_flags |= IEEE80211_CHAN_NO_160MHZ; if (max_bandwidth_khz < MHZ_TO_KHZ(320)) bw_flags |= IEEE80211_CHAN_NO_320MHZ; } return bw_flags; } static void handle_channel_single_rule(struct wiphy *wiphy, enum nl80211_reg_initiator initiator, struct ieee80211_channel *chan, u32 flags, struct regulatory_request *lr, struct wiphy *request_wiphy, const struct ieee80211_reg_rule *reg_rule) { u32 bw_flags = 0; const struct ieee80211_power_rule *power_rule = NULL; const struct ieee80211_regdomain *regd; regd = reg_get_regdomain(wiphy); power_rule = ®_rule->power_rule; bw_flags = reg_rule_to_chan_bw_flags(regd, reg_rule, chan); if (lr->initiator == NL80211_REGDOM_SET_BY_DRIVER && request_wiphy && request_wiphy == wiphy && request_wiphy->regulatory_flags & REGULATORY_STRICT_REG) { /* * This guarantees the driver's requested regulatory domain * will always be used as a base for further regulatory * settings */ chan->flags = chan->orig_flags = map_regdom_flags(reg_rule->flags) | bw_flags; chan->max_antenna_gain = chan->orig_mag = (int) MBI_TO_DBI(power_rule->max_antenna_gain); chan->max_reg_power = chan->max_power = chan->orig_mpwr = (int) MBM_TO_DBM(power_rule->max_eirp); if (chan->flags & IEEE80211_CHAN_RADAR) { chan->dfs_cac_ms = IEEE80211_DFS_MIN_CAC_TIME_MS; if (reg_rule->dfs_cac_ms) chan->dfs_cac_ms = reg_rule->dfs_cac_ms; } if (chan->flags & IEEE80211_CHAN_PSD) chan->psd = reg_rule->psd; return; } chan->dfs_state = NL80211_DFS_USABLE; chan->dfs_state_entered = jiffies; chan->beacon_found = false; chan->flags = flags | bw_flags | map_regdom_flags(reg_rule->flags); chan->max_antenna_gain = min_t(int, chan->orig_mag, MBI_TO_DBI(power_rule->max_antenna_gain)); chan->max_reg_power = (int) MBM_TO_DBM(power_rule->max_eirp); if (chan->flags & IEEE80211_CHAN_RADAR) { if (reg_rule->dfs_cac_ms) chan->dfs_cac_ms = reg_rule->dfs_cac_ms; else chan->dfs_cac_ms = IEEE80211_DFS_MIN_CAC_TIME_MS; } if (chan->flags & IEEE80211_CHAN_PSD) chan->psd = reg_rule->psd; if (chan->orig_mpwr) { /* * Devices that use REGULATORY_COUNTRY_IE_FOLLOW_POWER * will always follow the passed country IE power settings. */ if (initiator == NL80211_REGDOM_SET_BY_COUNTRY_IE && wiphy->regulatory_flags & REGULATORY_COUNTRY_IE_FOLLOW_POWER) chan->max_power = chan->max_reg_power; else chan->max_power = min(chan->orig_mpwr, chan->max_reg_power); } else chan->max_power = chan->max_reg_power; } static void handle_channel_adjacent_rules(struct wiphy *wiphy, enum nl80211_reg_initiator initiator, struct ieee80211_channel *chan, u32 flags, struct regulatory_request *lr, struct wiphy *request_wiphy, const struct ieee80211_reg_rule *rrule1, const struct ieee80211_reg_rule *rrule2, struct ieee80211_freq_range *comb_range) { u32 bw_flags1 = 0; u32 bw_flags2 = 0; const struct ieee80211_power_rule *power_rule1 = NULL; const struct ieee80211_power_rule *power_rule2 = NULL; const struct ieee80211_regdomain *regd; regd = reg_get_regdomain(wiphy); power_rule1 = &rrule1->power_rule; power_rule2 = &rrule2->power_rule; bw_flags1 = reg_rule_to_chan_bw_flags(regd, rrule1, chan); bw_flags2 = reg_rule_to_chan_bw_flags(regd, rrule2, chan); if (lr->initiator == NL80211_REGDOM_SET_BY_DRIVER && request_wiphy && request_wiphy == wiphy && request_wiphy->regulatory_flags & REGULATORY_STRICT_REG) { /* This guarantees the driver's requested regulatory domain * will always be used as a base for further regulatory * settings */ chan->flags = map_regdom_flags(rrule1->flags) | map_regdom_flags(rrule2->flags) | bw_flags1 | bw_flags2; chan->orig_flags = chan->flags; chan->max_antenna_gain = min_t(int, MBI_TO_DBI(power_rule1->max_antenna_gain), MBI_TO_DBI(power_rule2->max_antenna_gain)); chan->orig_mag = chan->max_antenna_gain; chan->max_reg_power = min_t(int, MBM_TO_DBM(power_rule1->max_eirp), MBM_TO_DBM(power_rule2->max_eirp)); chan->max_power = chan->max_reg_power; chan->orig_mpwr = chan->max_reg_power; if (chan->flags & IEEE80211_CHAN_RADAR) { chan->dfs_cac_ms = IEEE80211_DFS_MIN_CAC_TIME_MS; if (rrule1->dfs_cac_ms || rrule2->dfs_cac_ms) chan->dfs_cac_ms = max_t(unsigned int, rrule1->dfs_cac_ms, rrule2->dfs_cac_ms); } if ((rrule1->flags & NL80211_RRF_PSD) && (rrule2->flags & NL80211_RRF_PSD)) chan->psd = min_t(s8, rrule1->psd, rrule2->psd); else chan->flags &= ~NL80211_RRF_PSD; return; } chan->dfs_state = NL80211_DFS_USABLE; chan->dfs_state_entered = jiffies; chan->beacon_found = false; chan->flags = flags | bw_flags1 | bw_flags2 | map_regdom_flags(rrule1->flags) | map_regdom_flags(rrule2->flags); /* reg_rule_to_chan_bw_flags may forbids 10 and forbids 20 MHz * (otherwise no adj. rule case), recheck therefore */ if (cfg80211_does_bw_fit_range(comb_range, ieee80211_channel_to_khz(chan), MHZ_TO_KHZ(10))) chan->flags &= ~IEEE80211_CHAN_NO_10MHZ; if (cfg80211_does_bw_fit_range(comb_range, ieee80211_channel_to_khz(chan), MHZ_TO_KHZ(20))) chan->flags &= ~IEEE80211_CHAN_NO_20MHZ; chan->max_antenna_gain = min_t(int, chan->orig_mag, min_t(int, MBI_TO_DBI(power_rule1->max_antenna_gain), MBI_TO_DBI(power_rule2->max_antenna_gain))); chan->max_reg_power = min_t(int, MBM_TO_DBM(power_rule1->max_eirp), MBM_TO_DBM(power_rule2->max_eirp)); if (chan->flags & IEEE80211_CHAN_RADAR) { if (rrule1->dfs_cac_ms || rrule2->dfs_cac_ms) chan->dfs_cac_ms = max_t(unsigned int, rrule1->dfs_cac_ms, rrule2->dfs_cac_ms); else chan->dfs_cac_ms = IEEE80211_DFS_MIN_CAC_TIME_MS; } if (chan->orig_mpwr) { /* Devices that use REGULATORY_COUNTRY_IE_FOLLOW_POWER * will always follow the passed country IE power settings. */ if (initiator == NL80211_REGDOM_SET_BY_COUNTRY_IE && wiphy->regulatory_flags & REGULATORY_COUNTRY_IE_FOLLOW_POWER) chan->max_power = chan->max_reg_power; else chan->max_power = min(chan->orig_mpwr, chan->max_reg_power); } else { chan->max_power = chan->max_reg_power; } } /* Note that right now we assume the desired channel bandwidth * is always 20 MHz for each individual channel (HT40 uses 20 MHz * per channel, the primary and the extension channel). */ static void handle_channel(struct wiphy *wiphy, enum nl80211_reg_initiator initiator, struct ieee80211_channel *chan) { const u32 orig_chan_freq = ieee80211_channel_to_khz(chan); struct regulatory_request *lr = get_last_request(); struct wiphy *request_wiphy = wiphy_idx_to_wiphy(lr->wiphy_idx); const struct ieee80211_reg_rule *rrule = NULL; const struct ieee80211_reg_rule *rrule1 = NULL; const struct ieee80211_reg_rule *rrule2 = NULL; u32 flags = chan->orig_flags; rrule = freq_reg_info(wiphy, orig_chan_freq); if (IS_ERR(rrule)) { /* check for adjacent match, therefore get rules for * chan - 20 MHz and chan + 20 MHz and test * if reg rules are adjacent */ rrule1 = freq_reg_info(wiphy, orig_chan_freq - MHZ_TO_KHZ(20)); rrule2 = freq_reg_info(wiphy, orig_chan_freq + MHZ_TO_KHZ(20)); if (!IS_ERR(rrule1) && !IS_ERR(rrule2)) { struct ieee80211_freq_range comb_range; if (rrule1->freq_range.end_freq_khz != rrule2->freq_range.start_freq_khz) goto disable_chan; comb_range.start_freq_khz = rrule1->freq_range.start_freq_khz; comb_range.end_freq_khz = rrule2->freq_range.end_freq_khz; comb_range.max_bandwidth_khz = min_t(u32, rrule1->freq_range.max_bandwidth_khz, rrule2->freq_range.max_bandwidth_khz); if (!cfg80211_does_bw_fit_range(&comb_range, orig_chan_freq, MHZ_TO_KHZ(20))) goto disable_chan; handle_channel_adjacent_rules(wiphy, initiator, chan, flags, lr, request_wiphy, rrule1, rrule2, &comb_range); return; } disable_chan: /* We will disable all channels that do not match our * received regulatory rule unless the hint is coming * from a Country IE and the Country IE had no information * about a band. The IEEE 802.11 spec allows for an AP * to send only a subset of the regulatory rules allowed, * so an AP in the US that only supports 2.4 GHz may only send * a country IE with information for the 2.4 GHz band * while 5 GHz is still supported. */ if (initiator == NL80211_REGDOM_SET_BY_COUNTRY_IE && PTR_ERR(rrule) == -ERANGE) return; if (lr->initiator == NL80211_REGDOM_SET_BY_DRIVER && request_wiphy && request_wiphy == wiphy && request_wiphy->regulatory_flags & REGULATORY_STRICT_REG) { pr_debug("Disabling freq %d.%03d MHz for good\n", chan->center_freq, chan->freq_offset); chan->orig_flags |= IEEE80211_CHAN_DISABLED; chan->flags = chan->orig_flags; } else { pr_debug("Disabling freq %d.%03d MHz\n", chan->center_freq, chan->freq_offset); chan->flags |= IEEE80211_CHAN_DISABLED; } return; } handle_channel_single_rule(wiphy, initiator, chan, flags, lr, request_wiphy, rrule); } static void handle_band(struct wiphy *wiphy, enum nl80211_reg_initiator initiator, struct ieee80211_supported_band *sband) { unsigned int i; if (!sband) return; for (i = 0; i < sband->n_channels; i++) handle_channel(wiphy, initiator, &sband->channels[i]); } static bool reg_request_cell_base(struct regulatory_request *request) { if (request->initiator != NL80211_REGDOM_SET_BY_USER) return false; return request->user_reg_hint_type == NL80211_USER_REG_HINT_CELL_BASE; } bool reg_last_request_cell_base(void) { return reg_request_cell_base(get_last_request()); } #ifdef CONFIG_CFG80211_REG_CELLULAR_HINTS /* Core specific check */ static enum reg_request_treatment reg_ignore_cell_hint(struct regulatory_request *pending_request) { struct regulatory_request *lr = get_last_request(); if (!reg_num_devs_support_basehint) return REG_REQ_IGNORE; if (reg_request_cell_base(lr) && !regdom_changes(pending_request->alpha2)) return REG_REQ_ALREADY_SET; return REG_REQ_OK; } /* Device specific check */ static bool reg_dev_ignore_cell_hint(struct wiphy *wiphy) { return !(wiphy->features & NL80211_FEATURE_CELL_BASE_REG_HINTS); } #else static enum reg_request_treatment reg_ignore_cell_hint(struct regulatory_request *pending_request) { return REG_REQ_IGNORE; } static bool reg_dev_ignore_cell_hint(struct wiphy *wiphy) { return true; } #endif static bool wiphy_strict_alpha2_regd(struct wiphy *wiphy) { if (wiphy->regulatory_flags & REGULATORY_STRICT_REG && !(wiphy->regulatory_flags & REGULATORY_CUSTOM_REG)) return true; return false; } static bool ignore_reg_update(struct wiphy *wiphy, enum nl80211_reg_initiator initiator) { struct regulatory_request *lr = get_last_request(); if (wiphy->regulatory_flags & REGULATORY_WIPHY_SELF_MANAGED) return true; if (!lr) { pr_debug("Ignoring regulatory request set by %s since last_request is not set\n", reg_initiator_name(initiator)); return true; } if (initiator == NL80211_REGDOM_SET_BY_CORE && wiphy->regulatory_flags & REGULATORY_CUSTOM_REG) { pr_debug("Ignoring regulatory request set by %s since the driver uses its own custom regulatory domain\n", reg_initiator_name(initiator)); return true; } /* * wiphy->regd will be set once the device has its own * desired regulatory domain set */ if (wiphy_strict_alpha2_regd(wiphy) && !wiphy->regd && initiator != NL80211_REGDOM_SET_BY_COUNTRY_IE && !is_world_regdom(lr->alpha2)) { pr_debug("Ignoring regulatory request set by %s since the driver requires its own regulatory domain to be set first\n", reg_initiator_name(initiator)); return true; } if (reg_request_cell_base(lr)) return reg_dev_ignore_cell_hint(wiphy); return false; } static bool reg_is_world_roaming(struct wiphy *wiphy) { const struct ieee80211_regdomain *cr = get_cfg80211_regdom(); const struct ieee80211_regdomain *wr = get_wiphy_regdom(wiphy); struct regulatory_request *lr = get_last_request(); if (is_world_regdom(cr->alpha2) || (wr && is_world_regdom(wr->alpha2))) return true; if (lr && lr->initiator != NL80211_REGDOM_SET_BY_COUNTRY_IE && wiphy->regulatory_flags & REGULATORY_CUSTOM_REG) return true; return false; } static void reg_call_notifier(struct wiphy *wiphy, struct regulatory_request *request) { if (wiphy->reg_notifier) wiphy->reg_notifier(wiphy, request); } static void handle_reg_beacon(struct wiphy *wiphy, unsigned int chan_idx, struct reg_beacon *reg_beacon) { struct ieee80211_supported_band *sband; struct ieee80211_channel *chan; bool channel_changed = false; struct ieee80211_channel chan_before; struct regulatory_request *lr = get_last_request(); sband = wiphy->bands[reg_beacon->chan.band]; chan = &sband->channels[chan_idx]; if (likely(!ieee80211_channel_equal(chan, ®_beacon->chan))) return; if (chan->beacon_found) return; chan->beacon_found = true; if (!reg_is_world_roaming(wiphy)) return; if (wiphy->regulatory_flags & REGULATORY_DISABLE_BEACON_HINTS) return; chan_before = *chan; if (chan->flags & IEEE80211_CHAN_NO_IR) { chan->flags &= ~IEEE80211_CHAN_NO_IR; channel_changed = true; } if (channel_changed) { nl80211_send_beacon_hint_event(wiphy, &chan_before, chan); if (wiphy->flags & WIPHY_FLAG_CHANNEL_CHANGE_ON_BEACON) reg_call_notifier(wiphy, lr); } } /* * Called when a scan on a wiphy finds a beacon on * new channel */ static void wiphy_update_new_beacon(struct wiphy *wiphy, struct reg_beacon *reg_beacon) { unsigned int i; struct ieee80211_supported_band *sband; if (!wiphy->bands[reg_beacon->chan.band]) return; sband = wiphy->bands[reg_beacon->chan.band]; for (i = 0; i < sband->n_channels; i++) handle_reg_beacon(wiphy, i, reg_beacon); } /* * Called upon reg changes or a new wiphy is added */ static void wiphy_update_beacon_reg(struct wiphy *wiphy) { unsigned int i; struct ieee80211_supported_band *sband; struct reg_beacon *reg_beacon; list_for_each_entry(reg_beacon, ®_beacon_list, list) { if (!wiphy->bands[reg_beacon->chan.band]) continue; sband = wiphy->bands[reg_beacon->chan.band]; for (i = 0; i < sband->n_channels; i++) handle_reg_beacon(wiphy, i, reg_beacon); } } /* Reap the advantages of previously found beacons */ static void reg_process_beacons(struct wiphy *wiphy) { /* * Means we are just firing up cfg80211, so no beacons would * have been processed yet. */ if (!last_request) return; wiphy_update_beacon_reg(wiphy); } static bool is_ht40_allowed(struct ieee80211_channel *chan) { if (!chan) return false; if (chan->flags & IEEE80211_CHAN_DISABLED) return false; /* This would happen when regulatory rules disallow HT40 completely */ if ((chan->flags & IEEE80211_CHAN_NO_HT40) == IEEE80211_CHAN_NO_HT40) return false; return true; } static void reg_process_ht_flags_channel(struct wiphy *wiphy, struct ieee80211_channel *channel) { struct ieee80211_supported_band *sband = wiphy->bands[channel->band]; struct ieee80211_channel *channel_before = NULL, *channel_after = NULL; const struct ieee80211_regdomain *regd; unsigned int i; u32 flags; if (!is_ht40_allowed(channel)) { channel->flags |= IEEE80211_CHAN_NO_HT40; return; } /* * We need to ensure the extension channels exist to * be able to use HT40- or HT40+, this finds them (or not) */ for (i = 0; i < sband->n_channels; i++) { struct ieee80211_channel *c = &sband->channels[i]; if (c->center_freq == (channel->center_freq - 20)) channel_before = c; if (c->center_freq == (channel->center_freq + 20)) channel_after = c; } flags = 0; regd = get_wiphy_regdom(wiphy); if (regd) { const struct ieee80211_reg_rule *reg_rule = freq_reg_info_regd(MHZ_TO_KHZ(channel->center_freq), regd, MHZ_TO_KHZ(20)); if (!IS_ERR(reg_rule)) flags = reg_rule->flags; } /* * Please note that this assumes target bandwidth is 20 MHz, * if that ever changes we also need to change the below logic * to include that as well. */ if (!is_ht40_allowed(channel_before) || flags & NL80211_RRF_NO_HT40MINUS) channel->flags |= IEEE80211_CHAN_NO_HT40MINUS; else channel->flags &= ~IEEE80211_CHAN_NO_HT40MINUS; if (!is_ht40_allowed(channel_after) || flags & NL80211_RRF_NO_HT40PLUS) channel->flags |= IEEE80211_CHAN_NO_HT40PLUS; else channel->flags &= ~IEEE80211_CHAN_NO_HT40PLUS; } static void reg_process_ht_flags_band(struct wiphy *wiphy, struct ieee80211_supported_band *sband) { unsigned int i; if (!sband) return; for (i = 0; i < sband->n_channels; i++) reg_process_ht_flags_channel(wiphy, &sband->channels[i]); } static void reg_process_ht_flags(struct wiphy *wiphy) { enum nl80211_band band; if (!wiphy) return; for (band = 0; band < NUM_NL80211_BANDS; band++) reg_process_ht_flags_band(wiphy, wiphy->bands[band]); } static bool reg_wdev_chan_valid(struct wiphy *wiphy, struct wireless_dev *wdev) { struct cfg80211_chan_def chandef = {}; struct cfg80211_registered_device *rdev = wiphy_to_rdev(wiphy); enum nl80211_iftype iftype; bool ret; int link; iftype = wdev->iftype; /* make sure the interface is active */ if (!wdev->netdev || !netif_running(wdev->netdev)) return true; for (link = 0; link < ARRAY_SIZE(wdev->links); link++) { struct ieee80211_channel *chan; if (!wdev->valid_links && link > 0) break; if (wdev->valid_links && !(wdev->valid_links & BIT(link))) continue; switch (iftype) { case NL80211_IFTYPE_AP: case NL80211_IFTYPE_P2P_GO: if (!wdev->links[link].ap.beacon_interval) continue; chandef = wdev->links[link].ap.chandef; break; case NL80211_IFTYPE_MESH_POINT: if (!wdev->u.mesh.beacon_interval) continue; chandef = wdev->u.mesh.chandef; break; case NL80211_IFTYPE_ADHOC: if (!wdev->u.ibss.ssid_len) continue; chandef = wdev->u.ibss.chandef; break; case NL80211_IFTYPE_STATION: case NL80211_IFTYPE_P2P_CLIENT: /* Maybe we could consider disabling that link only? */ if (!wdev->links[link].client.current_bss) continue; chan = wdev->links[link].client.current_bss->pub.channel; if (!chan) continue; if (!rdev->ops->get_channel || rdev_get_channel(rdev, wdev, link, &chandef)) cfg80211_chandef_create(&chandef, chan, NL80211_CHAN_NO_HT); break; case NL80211_IFTYPE_MONITOR: case NL80211_IFTYPE_AP_VLAN: case NL80211_IFTYPE_P2P_DEVICE: /* no enforcement required */ break; case NL80211_IFTYPE_OCB: if (!wdev->u.ocb.chandef.chan) continue; chandef = wdev->u.ocb.chandef; break; case NL80211_IFTYPE_NAN: /* we have no info, but NAN is also pretty universal */ continue; default: /* others not implemented for now */ WARN_ON_ONCE(1); break; } switch (iftype) { case NL80211_IFTYPE_AP: case NL80211_IFTYPE_P2P_GO: case NL80211_IFTYPE_ADHOC: case NL80211_IFTYPE_MESH_POINT: ret = cfg80211_reg_can_beacon_relax(wiphy, &chandef, iftype); if (!ret) return ret; break; case NL80211_IFTYPE_STATION: case NL80211_IFTYPE_P2P_CLIENT: ret = cfg80211_chandef_usable(wiphy, &chandef, IEEE80211_CHAN_DISABLED); if (!ret) return ret; break; default: break; } } return true; } static void reg_leave_invalid_chans(struct wiphy *wiphy) { struct wireless_dev *wdev; struct cfg80211_registered_device *rdev = wiphy_to_rdev(wiphy); wiphy_lock(wiphy); list_for_each_entry(wdev, &rdev->wiphy.wdev_list, list) if (!reg_wdev_chan_valid(wiphy, wdev)) cfg80211_leave(rdev, wdev); wiphy_unlock(wiphy); } static void reg_check_chans_work(struct work_struct *work) { struct cfg80211_registered_device *rdev; pr_debug("Verifying active interfaces after reg change\n"); rtnl_lock(); for_each_rdev(rdev) reg_leave_invalid_chans(&rdev->wiphy); rtnl_unlock(); } void reg_check_channels(void) { /* * Give usermode a chance to do something nicer (move to another * channel, orderly disconnection), before forcing a disconnection. */ mod_delayed_work(system_power_efficient_wq, ®_check_chans, msecs_to_jiffies(REG_ENFORCE_GRACE_MS)); } static void wiphy_update_regulatory(struct wiphy *wiphy, enum nl80211_reg_initiator initiator) { enum nl80211_band band; struct regulatory_request *lr = get_last_request(); if (ignore_reg_update(wiphy, initiator)) { /* * Regulatory updates set by CORE are ignored for custom * regulatory cards. Let us notify the changes to the driver, * as some drivers used this to restore its orig_* reg domain. */ if (initiator == NL80211_REGDOM_SET_BY_CORE && wiphy->regulatory_flags & REGULATORY_CUSTOM_REG && !(wiphy->regulatory_flags & REGULATORY_WIPHY_SELF_MANAGED)) reg_call_notifier(wiphy, lr); return; } lr->dfs_region = get_cfg80211_regdom()->dfs_region; for (band = 0; band < NUM_NL80211_BANDS; band++) handle_band(wiphy, initiator, wiphy->bands[band]); reg_process_beacons(wiphy); reg_process_ht_flags(wiphy); reg_call_notifier(wiphy, lr); } static void update_all_wiphy_regulatory(enum nl80211_reg_initiator initiator) { struct cfg80211_registered_device *rdev; struct wiphy *wiphy; ASSERT_RTNL(); for_each_rdev(rdev) { wiphy = &rdev->wiphy; wiphy_update_regulatory(wiphy, initiator); } reg_check_channels(); } static void handle_channel_custom(struct wiphy *wiphy, struct ieee80211_channel *chan, const struct ieee80211_regdomain *regd, u32 min_bw) { u32 bw_flags = 0; const struct ieee80211_reg_rule *reg_rule = NULL; const struct ieee80211_power_rule *power_rule = NULL; u32 bw, center_freq_khz; center_freq_khz = ieee80211_channel_to_khz(chan); for (bw = MHZ_TO_KHZ(20); bw >= min_bw; bw = bw / 2) { reg_rule = freq_reg_info_regd(center_freq_khz, regd, bw); if (!IS_ERR(reg_rule)) break; } if (IS_ERR_OR_NULL(reg_rule)) { pr_debug("Disabling freq %d.%03d MHz as custom regd has no rule that fits it\n", chan->center_freq, chan->freq_offset); if (wiphy->regulatory_flags & REGULATORY_WIPHY_SELF_MANAGED) { chan->flags |= IEEE80211_CHAN_DISABLED; } else { chan->orig_flags |= IEEE80211_CHAN_DISABLED; chan->flags = chan->orig_flags; } return; } power_rule = ®_rule->power_rule; bw_flags = reg_rule_to_chan_bw_flags(regd, reg_rule, chan); chan->dfs_state_entered = jiffies; chan->dfs_state = NL80211_DFS_USABLE; chan->beacon_found = false; if (wiphy->regulatory_flags & REGULATORY_WIPHY_SELF_MANAGED) chan->flags = chan->orig_flags | bw_flags | map_regdom_flags(reg_rule->flags); else chan->flags |= map_regdom_flags(reg_rule->flags) | bw_flags; chan->max_antenna_gain = (int) MBI_TO_DBI(power_rule->max_antenna_gain); chan->max_reg_power = chan->max_power = (int) MBM_TO_DBM(power_rule->max_eirp); if (chan->flags & IEEE80211_CHAN_RADAR) { if (reg_rule->dfs_cac_ms) chan->dfs_cac_ms = reg_rule->dfs_cac_ms; else chan->dfs_cac_ms = IEEE80211_DFS_MIN_CAC_TIME_MS; } if (chan->flags & IEEE80211_CHAN_PSD) chan->psd = reg_rule->psd; chan->max_power = chan->max_reg_power; } static void handle_band_custom(struct wiphy *wiphy, struct ieee80211_supported_band *sband, const struct ieee80211_regdomain *regd) { unsigned int i; if (!sband) return; /* * We currently assume that you always want at least 20 MHz, * otherwise channel 12 might get enabled if this rule is * compatible to US, which permits 2402 - 2472 MHz. */ for (i = 0; i < sband->n_channels; i++) handle_channel_custom(wiphy, &sband->channels[i], regd, MHZ_TO_KHZ(20)); } /* Used by drivers prior to wiphy registration */ void wiphy_apply_custom_regulatory(struct wiphy *wiphy, const struct ieee80211_regdomain *regd) { const struct ieee80211_regdomain *new_regd, *tmp; enum nl80211_band band; unsigned int bands_set = 0; WARN(!(wiphy->regulatory_flags & REGULATORY_CUSTOM_REG), "wiphy should have REGULATORY_CUSTOM_REG\n"); wiphy->regulatory_flags |= REGULATORY_CUSTOM_REG; for (band = 0; band < NUM_NL80211_BANDS; band++) { if (!wiphy->bands[band]) continue; handle_band_custom(wiphy, wiphy->bands[band], regd); bands_set++; } /* * no point in calling this if it won't have any effect * on your device's supported bands. */ WARN_ON(!bands_set); new_regd = reg_copy_regd(regd); if (IS_ERR(new_regd)) return; rtnl_lock(); wiphy_lock(wiphy); tmp = get_wiphy_regdom(wiphy); rcu_assign_pointer(wiphy->regd, new_regd); rcu_free_regdom(tmp); wiphy_unlock(wiphy); rtnl_unlock(); } EXPORT_SYMBOL(wiphy_apply_custom_regulatory); static void reg_set_request_processed(void) { bool need_more_processing = false; struct regulatory_request *lr = get_last_request(); lr->processed = true; spin_lock(®_requests_lock); if (!list_empty(®_requests_list)) need_more_processing = true; spin_unlock(®_requests_lock); cancel_crda_timeout(); if (need_more_processing) schedule_work(®_work); } /** * reg_process_hint_core - process core regulatory requests * @core_request: a pending core regulatory request * * The wireless subsystem can use this function to process * a regulatory request issued by the regulatory core. * * Returns: %REG_REQ_OK or %REG_REQ_IGNORE, indicating if the * hint was processed or ignored */ static enum reg_request_treatment reg_process_hint_core(struct regulatory_request *core_request) { if (reg_query_database(core_request)) { core_request->intersect = false; core_request->processed = false; reg_update_last_request(core_request); return REG_REQ_OK; } return REG_REQ_IGNORE; } static enum reg_request_treatment __reg_process_hint_user(struct regulatory_request *user_request) { struct regulatory_request *lr = get_last_request(); if (reg_request_cell_base(user_request)) return reg_ignore_cell_hint(user_request); if (reg_request_cell_base(lr)) return REG_REQ_IGNORE; if (lr->initiator == NL80211_REGDOM_SET_BY_COUNTRY_IE) return REG_REQ_INTERSECT; /* * If the user knows better the user should set the regdom * to their country before the IE is picked up */ if (lr->initiator == NL80211_REGDOM_SET_BY_USER && lr->intersect) return REG_REQ_IGNORE; /* * Process user requests only after previous user/driver/core * requests have been processed */ if ((lr->initiator == NL80211_REGDOM_SET_BY_CORE || lr->initiator == NL80211_REGDOM_SET_BY_DRIVER || lr->initiator == NL80211_REGDOM_SET_BY_USER) && regdom_changes(lr->alpha2)) return REG_REQ_IGNORE; if (!regdom_changes(user_request->alpha2)) return REG_REQ_ALREADY_SET; return REG_REQ_OK; } /** * reg_process_hint_user - process user regulatory requests * @user_request: a pending user regulatory request * * The wireless subsystem can use this function to process * a regulatory request initiated by userspace. * * Returns: %REG_REQ_OK or %REG_REQ_IGNORE, indicating if the * hint was processed or ignored */ static enum reg_request_treatment reg_process_hint_user(struct regulatory_request *user_request) { enum reg_request_treatment treatment; treatment = __reg_process_hint_user(user_request); if (treatment == REG_REQ_IGNORE || treatment == REG_REQ_ALREADY_SET) return REG_REQ_IGNORE; user_request->intersect = treatment == REG_REQ_INTERSECT; user_request->processed = false; if (reg_query_database(user_request)) { reg_update_last_request(user_request); user_alpha2[0] = user_request->alpha2[0]; user_alpha2[1] = user_request->alpha2[1]; return REG_REQ_OK; } return REG_REQ_IGNORE; } static enum reg_request_treatment __reg_process_hint_driver(struct regulatory_request *driver_request) { struct regulatory_request *lr = get_last_request(); if (lr->initiator == NL80211_REGDOM_SET_BY_CORE) { if (regdom_changes(driver_request->alpha2)) return REG_REQ_OK; return REG_REQ_ALREADY_SET; } /* * This would happen if you unplug and plug your card * back in or if you add a new device for which the previously * loaded card also agrees on the regulatory domain. */ if (lr->initiator == NL80211_REGDOM_SET_BY_DRIVER && !regdom_changes(driver_request->alpha2)) return REG_REQ_ALREADY_SET; return REG_REQ_INTERSECT; } /** * reg_process_hint_driver - process driver regulatory requests * @wiphy: the wireless device for the regulatory request * @driver_request: a pending driver regulatory request * * The wireless subsystem can use this function to process * a regulatory request issued by an 802.11 driver. * * Returns: one of the different reg request treatment values. */ static enum reg_request_treatment reg_process_hint_driver(struct wiphy *wiphy, struct regulatory_request *driver_request) { const struct ieee80211_regdomain *regd, *tmp; enum reg_request_treatment treatment; treatment = __reg_process_hint_driver(driver_request); switch (treatment) { case REG_REQ_OK: break; case REG_REQ_IGNORE: return REG_REQ_IGNORE; case REG_REQ_INTERSECT: case REG_REQ_ALREADY_SET: regd = reg_copy_regd(get_cfg80211_regdom()); if (IS_ERR(regd)) return REG_REQ_IGNORE; tmp = get_wiphy_regdom(wiphy); ASSERT_RTNL(); wiphy_lock(wiphy); rcu_assign_pointer(wiphy->regd, regd); wiphy_unlock(wiphy); rcu_free_regdom(tmp); } driver_request->intersect = treatment == REG_REQ_INTERSECT; driver_request->processed = false; /* * Since CRDA will not be called in this case as we already * have applied the requested regulatory domain before we just * inform userspace we have processed the request */ if (treatment == REG_REQ_ALREADY_SET) { nl80211_send_reg_change_event(driver_request); reg_update_last_request(driver_request); reg_set_request_processed(); return REG_REQ_ALREADY_SET; } if (reg_query_database(driver_request)) { reg_update_last_request(driver_request); return REG_REQ_OK; } return REG_REQ_IGNORE; } static enum reg_request_treatment __reg_process_hint_country_ie(struct wiphy *wiphy, struct regulatory_request *country_ie_request) { struct wiphy *last_wiphy = NULL; struct regulatory_request *lr = get_last_request(); if (reg_request_cell_base(lr)) { /* Trust a Cell base station over the AP's country IE */ if (regdom_changes(country_ie_request->alpha2)) return REG_REQ_IGNORE; return REG_REQ_ALREADY_SET; } else { if (wiphy->regulatory_flags & REGULATORY_COUNTRY_IE_IGNORE) return REG_REQ_IGNORE; } if (unlikely(!is_an_alpha2(country_ie_request->alpha2))) return -EINVAL; if (lr->initiator != NL80211_REGDOM_SET_BY_COUNTRY_IE) return REG_REQ_OK; last_wiphy = wiphy_idx_to_wiphy(lr->wiphy_idx); if (last_wiphy != wiphy) { /* * Two cards with two APs claiming different * Country IE alpha2s. We could * intersect them, but that seems unlikely * to be correct. Reject second one for now. */ if (regdom_changes(country_ie_request->alpha2)) return REG_REQ_IGNORE; return REG_REQ_ALREADY_SET; } if (regdom_changes(country_ie_request->alpha2)) return REG_REQ_OK; return REG_REQ_ALREADY_SET; } /** * reg_process_hint_country_ie - process regulatory requests from country IEs * @wiphy: the wireless device for the regulatory request * @country_ie_request: a regulatory request from a country IE * * The wireless subsystem can use this function to process * a regulatory request issued by a country Information Element. * * Returns: one of the different reg request treatment values. */ static enum reg_request_treatment reg_process_hint_country_ie(struct wiphy *wiphy, struct regulatory_request *country_ie_request) { enum reg_request_treatment treatment; treatment = __reg_process_hint_country_ie(wiphy, country_ie_request); switch (treatment) { case REG_REQ_OK: break; case REG_REQ_IGNORE: return REG_REQ_IGNORE; case REG_REQ_ALREADY_SET: reg_free_request(country_ie_request); return REG_REQ_ALREADY_SET; case REG_REQ_INTERSECT: /* * This doesn't happen yet, not sure we * ever want to support it for this case. */ WARN_ONCE(1, "Unexpected intersection for country elements"); return REG_REQ_IGNORE; } country_ie_request->intersect = false; country_ie_request->processed = false; if (reg_query_database(country_ie_request)) { reg_update_last_request(country_ie_request); return REG_REQ_OK; } return REG_REQ_IGNORE; } bool reg_dfs_domain_same(struct wiphy *wiphy1, struct wiphy *wiphy2) { const struct ieee80211_regdomain *wiphy1_regd = NULL; const struct ieee80211_regdomain *wiphy2_regd = NULL; const struct ieee80211_regdomain *cfg80211_regd = NULL; bool dfs_domain_same; rcu_read_lock(); cfg80211_regd = rcu_dereference(cfg80211_regdomain); wiphy1_regd = rcu_dereference(wiphy1->regd); if (!wiphy1_regd) wiphy1_regd = cfg80211_regd; wiphy2_regd = rcu_dereference(wiphy2->regd); if (!wiphy2_regd) wiphy2_regd = cfg80211_regd; dfs_domain_same = wiphy1_regd->dfs_region == wiphy2_regd->dfs_region; rcu_read_unlock(); return dfs_domain_same; } static void reg_copy_dfs_chan_state(struct ieee80211_channel *dst_chan, struct ieee80211_channel *src_chan) { if (!(dst_chan->flags & IEEE80211_CHAN_RADAR) || !(src_chan->flags & IEEE80211_CHAN_RADAR)) return; if (dst_chan->flags & IEEE80211_CHAN_DISABLED || src_chan->flags & IEEE80211_CHAN_DISABLED) return; if (src_chan->center_freq == dst_chan->center_freq && dst_chan->dfs_state == NL80211_DFS_USABLE) { dst_chan->dfs_state = src_chan->dfs_state; dst_chan->dfs_state_entered = src_chan->dfs_state_entered; } } static void wiphy_share_dfs_chan_state(struct wiphy *dst_wiphy, struct wiphy *src_wiphy) { struct ieee80211_supported_band *src_sband, *dst_sband; struct ieee80211_channel *src_chan, *dst_chan; int i, j, band; if (!reg_dfs_domain_same(dst_wiphy, src_wiphy)) return; for (band = 0; band < NUM_NL80211_BANDS; band++) { dst_sband = dst_wiphy->bands[band]; src_sband = src_wiphy->bands[band]; if (!dst_sband || !src_sband) continue; for (i = 0; i < dst_sband->n_channels; i++) { dst_chan = &dst_sband->channels[i]; for (j = 0; j < src_sband->n_channels; j++) { src_chan = &src_sband->channels[j]; reg_copy_dfs_chan_state(dst_chan, src_chan); } } } } static void wiphy_all_share_dfs_chan_state(struct wiphy *wiphy) { struct cfg80211_registered_device *rdev; ASSERT_RTNL(); for_each_rdev(rdev) { if (wiphy == &rdev->wiphy) continue; wiphy_share_dfs_chan_state(wiphy, &rdev->wiphy); } } /* This processes *all* regulatory hints */ static void reg_process_hint(struct regulatory_request *reg_request) { struct wiphy *wiphy = NULL; enum reg_request_treatment treatment; enum nl80211_reg_initiator initiator = reg_request->initiator; if (reg_request->wiphy_idx != WIPHY_IDX_INVALID) wiphy = wiphy_idx_to_wiphy(reg_request->wiphy_idx); switch (initiator) { case NL80211_REGDOM_SET_BY_CORE: treatment = reg_process_hint_core(reg_request); break; case NL80211_REGDOM_SET_BY_USER: treatment = reg_process_hint_user(reg_request); break; case NL80211_REGDOM_SET_BY_DRIVER: if (!wiphy) goto out_free; treatment = reg_process_hint_driver(wiphy, reg_request); break; case NL80211_REGDOM_SET_BY_COUNTRY_IE: if (!wiphy) goto out_free; treatment = reg_process_hint_country_ie(wiphy, reg_request); break; default: WARN(1, "invalid initiator %d\n", initiator); goto out_free; } if (treatment == REG_REQ_IGNORE) goto out_free; WARN(treatment != REG_REQ_OK && treatment != REG_REQ_ALREADY_SET, "unexpected treatment value %d\n", treatment); /* This is required so that the orig_* parameters are saved. * NOTE: treatment must be set for any case that reaches here! */ if (treatment == REG_REQ_ALREADY_SET && wiphy && wiphy->regulatory_flags & REGULATORY_STRICT_REG) { wiphy_update_regulatory(wiphy, initiator); wiphy_all_share_dfs_chan_state(wiphy); reg_check_channels(); } return; out_free: reg_free_request(reg_request); } static void notify_self_managed_wiphys(struct regulatory_request *request) { struct cfg80211_registered_device *rdev; struct wiphy *wiphy; for_each_rdev(rdev) { wiphy = &rdev->wiphy; if (wiphy->regulatory_flags & REGULATORY_WIPHY_SELF_MANAGED && request->initiator == NL80211_REGDOM_SET_BY_USER) reg_call_notifier(wiphy, request); } } /* * Processes regulatory hints, this is all the NL80211_REGDOM_SET_BY_* * Regulatory hints come on a first come first serve basis and we * must process each one atomically. */ static void reg_process_pending_hints(void) { struct regulatory_request *reg_request, *lr; lr = get_last_request(); /* When last_request->processed becomes true this will be rescheduled */ if (lr && !lr->processed) { pr_debug("Pending regulatory request, waiting for it to be processed...\n"); return; } spin_lock(®_requests_lock); if (list_empty(®_requests_list)) { spin_unlock(®_requests_lock); return; } reg_request = list_first_entry(®_requests_list, struct regulatory_request, list); list_del_init(®_request->list); spin_unlock(®_requests_lock); notify_self_managed_wiphys(reg_request); reg_process_hint(reg_request); lr = get_last_request(); spin_lock(®_requests_lock); if (!list_empty(®_requests_list) && lr && lr->processed) schedule_work(®_work); spin_unlock(®_requests_lock); } /* Processes beacon hints -- this has nothing to do with country IEs */ static void reg_process_pending_beacon_hints(void) { struct cfg80211_registered_device *rdev; struct reg_beacon *pending_beacon, *tmp; /* This goes through the _pending_ beacon list */ spin_lock_bh(®_pending_beacons_lock); list_for_each_entry_safe(pending_beacon, tmp, ®_pending_beacons, list) { list_del_init(&pending_beacon->list); /* Applies the beacon hint to current wiphys */ for_each_rdev(rdev) wiphy_update_new_beacon(&rdev->wiphy, pending_beacon); /* Remembers the beacon hint for new wiphys or reg changes */ list_add_tail(&pending_beacon->list, ®_beacon_list); } spin_unlock_bh(®_pending_beacons_lock); } static void reg_process_self_managed_hint(struct wiphy *wiphy) { struct cfg80211_registered_device *rdev = wiphy_to_rdev(wiphy); const struct ieee80211_regdomain *tmp; const struct ieee80211_regdomain *regd; enum nl80211_band band; struct regulatory_request request = {}; ASSERT_RTNL(); lockdep_assert_wiphy(wiphy); spin_lock(®_requests_lock); regd = rdev->requested_regd; rdev->requested_regd = NULL; spin_unlock(®_requests_lock); if (!regd) return; tmp = get_wiphy_regdom(wiphy); rcu_assign_pointer(wiphy->regd, regd); rcu_free_regdom(tmp); for (band = 0; band < NUM_NL80211_BANDS; band++) handle_band_custom(wiphy, wiphy->bands[band], regd); reg_process_ht_flags(wiphy); request.wiphy_idx = get_wiphy_idx(wiphy); request.alpha2[0] = regd->alpha2[0]; request.alpha2[1] = regd->alpha2[1]; request.initiator = NL80211_REGDOM_SET_BY_DRIVER; if (wiphy->flags & WIPHY_FLAG_NOTIFY_REGDOM_BY_DRIVER) reg_call_notifier(wiphy, &request); nl80211_send_wiphy_reg_change_event(&request); } static void reg_process_self_managed_hints(void) { struct cfg80211_registered_device *rdev; ASSERT_RTNL(); for_each_rdev(rdev) { wiphy_lock(&rdev->wiphy); reg_process_self_managed_hint(&rdev->wiphy); wiphy_unlock(&rdev->wiphy); } reg_check_channels(); } static void reg_todo(struct work_struct *work) { rtnl_lock(); reg_process_pending_hints(); reg_process_pending_beacon_hints(); reg_process_self_managed_hints(); rtnl_unlock(); } static void queue_regulatory_request(struct regulatory_request *request) { request->alpha2[0] = toupper(request->alpha2[0]); request->alpha2[1] = toupper(request->alpha2[1]); spin_lock(®_requests_lock); list_add_tail(&request->list, ®_requests_list); spin_unlock(®_requests_lock); schedule_work(®_work); } /* * Core regulatory hint -- happens during cfg80211_init() * and when we restore regulatory settings. */ static int regulatory_hint_core(const char *alpha2) { struct regulatory_request *request; request = kzalloc(sizeof(struct regulatory_request), GFP_KERNEL); if (!request) return -ENOMEM; request->alpha2[0] = alpha2[0]; request->alpha2[1] = alpha2[1]; request->initiator = NL80211_REGDOM_SET_BY_CORE; request->wiphy_idx = WIPHY_IDX_INVALID; queue_regulatory_request(request); return 0; } /* User hints */ int regulatory_hint_user(const char *alpha2, enum nl80211_user_reg_hint_type user_reg_hint_type) { struct regulatory_request *request; if (WARN_ON(!alpha2)) return -EINVAL; if (!is_world_regdom(alpha2) && !is_an_alpha2(alpha2)) return -EINVAL; request = kzalloc(sizeof(struct regulatory_request), GFP_KERNEL); if (!request) return -ENOMEM; request->wiphy_idx = WIPHY_IDX_INVALID; request->alpha2[0] = alpha2[0]; request->alpha2[1] = alpha2[1]; request->initiator = NL80211_REGDOM_SET_BY_USER; request->user_reg_hint_type = user_reg_hint_type; /* Allow calling CRDA again */ reset_crda_timeouts(); queue_regulatory_request(request); return 0; } void regulatory_hint_indoor(bool is_indoor, u32 portid) { spin_lock(®_indoor_lock); /* It is possible that more than one user space process is trying to * configure the indoor setting. To handle such cases, clear the indoor * setting in case that some process does not think that the device * is operating in an indoor environment. In addition, if a user space * process indicates that it is controlling the indoor setting, save its * portid, i.e., make it the owner. */ reg_is_indoor = is_indoor; if (reg_is_indoor) { if (!reg_is_indoor_portid) reg_is_indoor_portid = portid; } else { reg_is_indoor_portid = 0; } spin_unlock(®_indoor_lock); if (!is_indoor) reg_check_channels(); } void regulatory_netlink_notify(u32 portid) { spin_lock(®_indoor_lock); if (reg_is_indoor_portid != portid) { spin_unlock(®_indoor_lock); return; } reg_is_indoor = false; reg_is_indoor_portid = 0; spin_unlock(®_indoor_lock); reg_check_channels(); } /* Driver hints */ int regulatory_hint(struct wiphy *wiphy, const char *alpha2) { struct regulatory_request *request; if (WARN_ON(!alpha2 || !wiphy)) return -EINVAL; wiphy->regulatory_flags &= ~REGULATORY_CUSTOM_REG; request = kzalloc(sizeof(struct regulatory_request), GFP_KERNEL); if (!request) return -ENOMEM; request->wiphy_idx = get_wiphy_idx(wiphy); request->alpha2[0] = alpha2[0]; request->alpha2[1] = alpha2[1]; request->initiator = NL80211_REGDOM_SET_BY_DRIVER; /* Allow calling CRDA again */ reset_crda_timeouts(); queue_regulatory_request(request); return 0; } EXPORT_SYMBOL(regulatory_hint); void regulatory_hint_country_ie(struct wiphy *wiphy, enum nl80211_band band, const u8 *country_ie, u8 country_ie_len) { char alpha2[2]; enum environment_cap env = ENVIRON_ANY; struct regulatory_request *request = NULL, *lr; /* IE len must be evenly divisible by 2 */ if (country_ie_len & 0x01) return; if (country_ie_len < IEEE80211_COUNTRY_IE_MIN_LEN) return; request = kzalloc(sizeof(*request), GFP_KERNEL); if (!request) return; alpha2[0] = country_ie[0]; alpha2[1] = country_ie[1]; if (country_ie[2] == 'I') env = ENVIRON_INDOOR; else if (country_ie[2] == 'O') env = ENVIRON_OUTDOOR; rcu_read_lock(); lr = get_last_request(); if (unlikely(!lr)) goto out; /* * We will run this only upon a successful connection on cfg80211. * We leave conflict resolution to the workqueue, where can hold * the RTNL. */ if (lr->initiator == NL80211_REGDOM_SET_BY_COUNTRY_IE && lr->wiphy_idx != WIPHY_IDX_INVALID) goto out; request->wiphy_idx = get_wiphy_idx(wiphy); request->alpha2[0] = alpha2[0]; request->alpha2[1] = alpha2[1]; request->initiator = NL80211_REGDOM_SET_BY_COUNTRY_IE; request->country_ie_env = env; /* Allow calling CRDA again */ reset_crda_timeouts(); queue_regulatory_request(request); request = NULL; out: kfree(request); rcu_read_unlock(); } static void restore_alpha2(char *alpha2, bool reset_user) { /* indicates there is no alpha2 to consider for restoration */ alpha2[0] = '9'; alpha2[1] = '7'; /* The user setting has precedence over the module parameter */ if (is_user_regdom_saved()) { /* Unless we're asked to ignore it and reset it */ if (reset_user) { pr_debug("Restoring regulatory settings including user preference\n"); user_alpha2[0] = '9'; user_alpha2[1] = '7'; /* * If we're ignoring user settings, we still need to * check the module parameter to ensure we put things * back as they were for a full restore. */ if (!is_world_regdom(ieee80211_regdom)) { pr_debug("Keeping preference on module parameter ieee80211_regdom: %c%c\n", ieee80211_regdom[0], ieee80211_regdom[1]); alpha2[0] = ieee80211_regdom[0]; alpha2[1] = ieee80211_regdom[1]; } } else { pr_debug("Restoring regulatory settings while preserving user preference for: %c%c\n", user_alpha2[0], user_alpha2[1]); alpha2[0] = user_alpha2[0]; alpha2[1] = user_alpha2[1]; } } else if (!is_world_regdom(ieee80211_regdom)) { pr_debug("Keeping preference on module parameter ieee80211_regdom: %c%c\n", ieee80211_regdom[0], ieee80211_regdom[1]); alpha2[0] = ieee80211_regdom[0]; alpha2[1] = ieee80211_regdom[1]; } else pr_debug("Restoring regulatory settings\n"); } static void restore_custom_reg_settings(struct wiphy *wiphy) { struct ieee80211_supported_band *sband; enum nl80211_band band; struct ieee80211_channel *chan; int i; for (band = 0; band < NUM_NL80211_BANDS; band++) { sband = wiphy->bands[band]; if (!sband) continue; for (i = 0; i < sband->n_channels; i++) { chan = &sband->channels[i]; chan->flags = chan->orig_flags; chan->max_antenna_gain = chan->orig_mag; chan->max_power = chan->orig_mpwr; chan->beacon_found = false; } } } /* * Restoring regulatory settings involves ignoring any * possibly stale country IE information and user regulatory * settings if so desired, this includes any beacon hints * learned as we could have traveled outside to another country * after disconnection. To restore regulatory settings we do * exactly what we did at bootup: * * - send a core regulatory hint * - send a user regulatory hint if applicable * * Device drivers that send a regulatory hint for a specific country * keep their own regulatory domain on wiphy->regd so that does * not need to be remembered. */ static void restore_regulatory_settings(bool reset_user, bool cached) { char alpha2[2]; char world_alpha2[2]; struct reg_beacon *reg_beacon, *btmp; LIST_HEAD(tmp_reg_req_list); struct cfg80211_registered_device *rdev; ASSERT_RTNL(); /* * Clear the indoor setting in case that it is not controlled by user * space, as otherwise there is no guarantee that the device is still * operating in an indoor environment. */ spin_lock(®_indoor_lock); if (reg_is_indoor && !reg_is_indoor_portid) { reg_is_indoor = false; reg_check_channels(); } spin_unlock(®_indoor_lock); reset_regdomains(true, &world_regdom); restore_alpha2(alpha2, reset_user); /* * If there's any pending requests we simply * stash them to a temporary pending queue and * add then after we've restored regulatory * settings. */ spin_lock(®_requests_lock); list_splice_tail_init(®_requests_list, &tmp_reg_req_list); spin_unlock(®_requests_lock); /* Clear beacon hints */ spin_lock_bh(®_pending_beacons_lock); list_for_each_entry_safe(reg_beacon, btmp, ®_pending_beacons, list) { list_del(®_beacon->list); kfree(reg_beacon); } spin_unlock_bh(®_pending_beacons_lock); list_for_each_entry_safe(reg_beacon, btmp, ®_beacon_list, list) { list_del(®_beacon->list); kfree(reg_beacon); } /* First restore to the basic regulatory settings */ world_alpha2[0] = cfg80211_world_regdom->alpha2[0]; world_alpha2[1] = cfg80211_world_regdom->alpha2[1]; for_each_rdev(rdev) { if (rdev->wiphy.regulatory_flags & REGULATORY_WIPHY_SELF_MANAGED) continue; if (rdev->wiphy.regulatory_flags & REGULATORY_CUSTOM_REG) restore_custom_reg_settings(&rdev->wiphy); } if (cached && (!is_an_alpha2(alpha2) || !IS_ERR_OR_NULL(cfg80211_user_regdom))) { reset_regdomains(false, cfg80211_world_regdom); update_all_wiphy_regulatory(NL80211_REGDOM_SET_BY_CORE); print_regdomain(get_cfg80211_regdom()); nl80211_send_reg_change_event(&core_request_world); reg_set_request_processed(); if (is_an_alpha2(alpha2) && !regulatory_hint_user(alpha2, NL80211_USER_REG_HINT_USER)) { struct regulatory_request *ureq; spin_lock(®_requests_lock); ureq = list_last_entry(®_requests_list, struct regulatory_request, list); list_del(&ureq->list); spin_unlock(®_requests_lock); notify_self_managed_wiphys(ureq); reg_update_last_request(ureq); set_regdom(reg_copy_regd(cfg80211_user_regdom), REGD_SOURCE_CACHED); } } else { regulatory_hint_core(world_alpha2); /* * This restores the ieee80211_regdom module parameter * preference or the last user requested regulatory * settings, user regulatory settings takes precedence. */ if (is_an_alpha2(alpha2)) regulatory_hint_user(alpha2, NL80211_USER_REG_HINT_USER); } spin_lock(®_requests_lock); list_splice_tail_init(&tmp_reg_req_list, ®_requests_list); spin_unlock(®_requests_lock); pr_debug("Kicking the queue\n"); schedule_work(®_work); } static bool is_wiphy_all_set_reg_flag(enum ieee80211_regulatory_flags flag) { struct cfg80211_registered_device *rdev; struct wireless_dev *wdev; for_each_rdev(rdev) { wiphy_lock(&rdev->wiphy); list_for_each_entry(wdev, &rdev->wiphy.wdev_list, list) { if (!(wdev->wiphy->regulatory_flags & flag)) { wiphy_unlock(&rdev->wiphy); return false; } } wiphy_unlock(&rdev->wiphy); } return true; } void regulatory_hint_disconnect(void) { /* Restore of regulatory settings is not required when wiphy(s) * ignore IE from connected access point but clearance of beacon hints * is required when wiphy(s) supports beacon hints. */ if (is_wiphy_all_set_reg_flag(REGULATORY_COUNTRY_IE_IGNORE)) { struct reg_beacon *reg_beacon, *btmp; if (is_wiphy_all_set_reg_flag(REGULATORY_DISABLE_BEACON_HINTS)) return; spin_lock_bh(®_pending_beacons_lock); list_for_each_entry_safe(reg_beacon, btmp, ®_pending_beacons, list) { list_del(®_beacon->list); kfree(reg_beacon); } spin_unlock_bh(®_pending_beacons_lock); list_for_each_entry_safe(reg_beacon, btmp, ®_beacon_list, list) { list_del(®_beacon->list); kfree(reg_beacon); } return; } pr_debug("All devices are disconnected, going to restore regulatory settings\n"); restore_regulatory_settings(false, true); } static bool freq_is_chan_12_13_14(u32 freq) { if (freq == ieee80211_channel_to_frequency(12, NL80211_BAND_2GHZ) || freq == ieee80211_channel_to_frequency(13, NL80211_BAND_2GHZ) || freq == ieee80211_channel_to_frequency(14, NL80211_BAND_2GHZ)) return true; return false; } static bool pending_reg_beacon(struct ieee80211_channel *beacon_chan) { struct reg_beacon *pending_beacon; list_for_each_entry(pending_beacon, ®_pending_beacons, list) if (ieee80211_channel_equal(beacon_chan, &pending_beacon->chan)) return true; return false; } void regulatory_hint_found_beacon(struct wiphy *wiphy, struct ieee80211_channel *beacon_chan, gfp_t gfp) { struct reg_beacon *reg_beacon; bool processing; if (beacon_chan->beacon_found || beacon_chan->flags & IEEE80211_CHAN_RADAR || (beacon_chan->band == NL80211_BAND_2GHZ && !freq_is_chan_12_13_14(beacon_chan->center_freq))) return; spin_lock_bh(®_pending_beacons_lock); processing = pending_reg_beacon(beacon_chan); spin_unlock_bh(®_pending_beacons_lock); if (processing) return; reg_beacon = kzalloc(sizeof(struct reg_beacon), gfp); if (!reg_beacon) return; pr_debug("Found new beacon on frequency: %d.%03d MHz (Ch %d) on %s\n", beacon_chan->center_freq, beacon_chan->freq_offset, ieee80211_freq_khz_to_channel( ieee80211_channel_to_khz(beacon_chan)), wiphy_name(wiphy)); memcpy(®_beacon->chan, beacon_chan, sizeof(struct ieee80211_channel)); /* * Since we can be called from BH or and non-BH context * we must use spin_lock_bh() */ spin_lock_bh(®_pending_beacons_lock); list_add_tail(®_beacon->list, ®_pending_beacons); spin_unlock_bh(®_pending_beacons_lock); schedule_work(®_work); } static void print_rd_rules(const struct ieee80211_regdomain *rd) { unsigned int i; const struct ieee80211_reg_rule *reg_rule = NULL; const struct ieee80211_freq_range *freq_range = NULL; const struct ieee80211_power_rule *power_rule = NULL; char bw[32], cac_time[32]; pr_debug(" (start_freq - end_freq @ bandwidth), (max_antenna_gain, max_eirp), (dfs_cac_time)\n"); for (i = 0; i < rd->n_reg_rules; i++) { reg_rule = &rd->reg_rules[i]; freq_range = ®_rule->freq_range; power_rule = ®_rule->power_rule; if (reg_rule->flags & NL80211_RRF_AUTO_BW) snprintf(bw, sizeof(bw), "%d KHz, %u KHz AUTO", freq_range->max_bandwidth_khz, reg_get_max_bandwidth(rd, reg_rule)); else snprintf(bw, sizeof(bw), "%d KHz", freq_range->max_bandwidth_khz); if (reg_rule->flags & NL80211_RRF_DFS) scnprintf(cac_time, sizeof(cac_time), "%u s", reg_rule->dfs_cac_ms/1000); else scnprintf(cac_time, sizeof(cac_time), "N/A"); /* * There may not be documentation for max antenna gain * in certain regions */ if (power_rule->max_antenna_gain) pr_debug(" (%d KHz - %d KHz @ %s), (%d mBi, %d mBm), (%s)\n", freq_range->start_freq_khz, freq_range->end_freq_khz, bw, power_rule->max_antenna_gain, power_rule->max_eirp, cac_time); else pr_debug(" (%d KHz - %d KHz @ %s), (N/A, %d mBm), (%s)\n", freq_range->start_freq_khz, freq_range->end_freq_khz, bw, power_rule->max_eirp, cac_time); } } bool reg_supported_dfs_region(enum nl80211_dfs_regions dfs_region) { switch (dfs_region) { case NL80211_DFS_UNSET: case NL80211_DFS_FCC: case NL80211_DFS_ETSI: case NL80211_DFS_JP: return true; default: pr_debug("Ignoring unknown DFS master region: %d\n", dfs_region); return false; } } static void print_regdomain(const struct ieee80211_regdomain *rd) { struct regulatory_request *lr = get_last_request(); if (is_intersected_alpha2(rd->alpha2)) { if (lr->initiator == NL80211_REGDOM_SET_BY_COUNTRY_IE) { struct cfg80211_registered_device *rdev; rdev = cfg80211_rdev_by_wiphy_idx(lr->wiphy_idx); if (rdev) { pr_debug("Current regulatory domain updated by AP to: %c%c\n", rdev->country_ie_alpha2[0], rdev->country_ie_alpha2[1]); } else pr_debug("Current regulatory domain intersected:\n"); } else pr_debug("Current regulatory domain intersected:\n"); } else if (is_world_regdom(rd->alpha2)) { pr_debug("World regulatory domain updated:\n"); } else { if (is_unknown_alpha2(rd->alpha2)) pr_debug("Regulatory domain changed to driver built-in settings (unknown country)\n"); else { if (reg_request_cell_base(lr)) pr_debug("Regulatory domain changed to country: %c%c by Cell Station\n", rd->alpha2[0], rd->alpha2[1]); else pr_debug("Regulatory domain changed to country: %c%c\n", rd->alpha2[0], rd->alpha2[1]); } } pr_debug(" DFS Master region: %s", reg_dfs_region_str(rd->dfs_region)); print_rd_rules(rd); } static void print_regdomain_info(const struct ieee80211_regdomain *rd) { pr_debug("Regulatory domain: %c%c\n", rd->alpha2[0], rd->alpha2[1]); print_rd_rules(rd); } static int reg_set_rd_core(const struct ieee80211_regdomain *rd) { if (!is_world_regdom(rd->alpha2)) return -EINVAL; update_world_regdomain(rd); return 0; } static int reg_set_rd_user(const struct ieee80211_regdomain *rd, struct regulatory_request *user_request) { const struct ieee80211_regdomain *intersected_rd = NULL; if (!regdom_changes(rd->alpha2)) return -EALREADY; if (!is_valid_rd(rd)) { pr_err("Invalid regulatory domain detected: %c%c\n", rd->alpha2[0], rd->alpha2[1]); print_regdomain_info(rd); return -EINVAL; } if (!user_request->intersect) { reset_regdomains(false, rd); return 0; } intersected_rd = regdom_intersect(rd, get_cfg80211_regdom()); if (!intersected_rd) return -EINVAL; kfree(rd); rd = NULL; reset_regdomains(false, intersected_rd); return 0; } static int reg_set_rd_driver(const struct ieee80211_regdomain *rd, struct regulatory_request *driver_request) { const struct ieee80211_regdomain *regd; const struct ieee80211_regdomain *intersected_rd = NULL; const struct ieee80211_regdomain *tmp = NULL; struct wiphy *request_wiphy; if (is_world_regdom(rd->alpha2)) return -EINVAL; if (!regdom_changes(rd->alpha2)) return -EALREADY; if (!is_valid_rd(rd)) { pr_err("Invalid regulatory domain detected: %c%c\n", rd->alpha2[0], rd->alpha2[1]); print_regdomain_info(rd); return -EINVAL; } request_wiphy = wiphy_idx_to_wiphy(driver_request->wiphy_idx); if (!request_wiphy) return -ENODEV; if (!driver_request->intersect) { ASSERT_RTNL(); wiphy_lock(request_wiphy); if (request_wiphy->regd) tmp = get_wiphy_regdom(request_wiphy); regd = reg_copy_regd(rd); if (IS_ERR(regd)) { wiphy_unlock(request_wiphy); return PTR_ERR(regd); } rcu_assign_pointer(request_wiphy->regd, regd); rcu_free_regdom(tmp); wiphy_unlock(request_wiphy); reset_regdomains(false, rd); return 0; } intersected_rd = regdom_intersect(rd, get_cfg80211_regdom()); if (!intersected_rd) return -EINVAL; /* * We can trash what CRDA provided now. * However if a driver requested this specific regulatory * domain we keep it for its private use */ tmp = get_wiphy_regdom(request_wiphy); rcu_assign_pointer(request_wiphy->regd, rd); rcu_free_regdom(tmp); rd = NULL; reset_regdomains(false, intersected_rd); return 0; } static int reg_set_rd_country_ie(const struct ieee80211_regdomain *rd, struct regulatory_request *country_ie_request) { struct wiphy *request_wiphy; if (!is_alpha2_set(rd->alpha2) && !is_an_alpha2(rd->alpha2) && !is_unknown_alpha2(rd->alpha2)) return -EINVAL; /* * Lets only bother proceeding on the same alpha2 if the current * rd is non static (it means CRDA was present and was used last) * and the pending request came in from a country IE */ if (!is_valid_rd(rd)) { pr_err("Invalid regulatory domain detected: %c%c\n", rd->alpha2[0], rd->alpha2[1]); print_regdomain_info(rd); return -EINVAL; } request_wiphy = wiphy_idx_to_wiphy(country_ie_request->wiphy_idx); if (!request_wiphy) return -ENODEV; if (country_ie_request->intersect) return -EINVAL; reset_regdomains(false, rd); return 0; } /* * Use this call to set the current regulatory domain. Conflicts with * multiple drivers can be ironed out later. Caller must've already * kmalloc'd the rd structure. */ int set_regdom(const struct ieee80211_regdomain *rd, enum ieee80211_regd_source regd_src) { struct regulatory_request *lr; bool user_reset = false; int r; if (IS_ERR_OR_NULL(rd)) return -ENODATA; if (!reg_is_valid_request(rd->alpha2)) { kfree(rd); return -EINVAL; } if (regd_src == REGD_SOURCE_CRDA) reset_crda_timeouts(); lr = get_last_request(); /* Note that this doesn't update the wiphys, this is done below */ switch (lr->initiator) { case NL80211_REGDOM_SET_BY_CORE: r = reg_set_rd_core(rd); break; case NL80211_REGDOM_SET_BY_USER: cfg80211_save_user_regdom(rd); r = reg_set_rd_user(rd, lr); user_reset = true; break; case NL80211_REGDOM_SET_BY_DRIVER: r = reg_set_rd_driver(rd, lr); break; case NL80211_REGDOM_SET_BY_COUNTRY_IE: r = reg_set_rd_country_ie(rd, lr); break; default: WARN(1, "invalid initiator %d\n", lr->initiator); kfree(rd); return -EINVAL; } if (r) { switch (r) { case -EALREADY: reg_set_request_processed(); break; default: /* Back to world regulatory in case of errors */ restore_regulatory_settings(user_reset, false); } kfree(rd); return r; } /* This would make this whole thing pointless */ if (WARN_ON(!lr->intersect && rd != get_cfg80211_regdom())) return -EINVAL; /* update all wiphys now with the new established regulatory domain */ update_all_wiphy_regulatory(lr->initiator); print_regdomain(get_cfg80211_regdom()); nl80211_send_reg_change_event(lr); reg_set_request_processed(); return 0; } static int __regulatory_set_wiphy_regd(struct wiphy *wiphy, struct ieee80211_regdomain *rd) { const struct ieee80211_regdomain *regd; const struct ieee80211_regdomain *prev_regd; struct cfg80211_registered_device *rdev; if (WARN_ON(!wiphy || !rd)) return -EINVAL; if (WARN(!(wiphy->regulatory_flags & REGULATORY_WIPHY_SELF_MANAGED), "wiphy should have REGULATORY_WIPHY_SELF_MANAGED\n")) return -EPERM; if (WARN(!is_valid_rd(rd), "Invalid regulatory domain detected: %c%c\n", rd->alpha2[0], rd->alpha2[1])) { print_regdomain_info(rd); return -EINVAL; } regd = reg_copy_regd(rd); if (IS_ERR(regd)) return PTR_ERR(regd); rdev = wiphy_to_rdev(wiphy); spin_lock(®_requests_lock); prev_regd = rdev->requested_regd; rdev->requested_regd = regd; spin_unlock(®_requests_lock); kfree(prev_regd); return 0; } int regulatory_set_wiphy_regd(struct wiphy *wiphy, struct ieee80211_regdomain *rd) { int ret = __regulatory_set_wiphy_regd(wiphy, rd); if (ret) return ret; schedule_work(®_work); return 0; } EXPORT_SYMBOL(regulatory_set_wiphy_regd); int regulatory_set_wiphy_regd_sync(struct wiphy *wiphy, struct ieee80211_regdomain *rd) { int ret; ASSERT_RTNL(); ret = __regulatory_set_wiphy_regd(wiphy, rd); if (ret) return ret; /* process the request immediately */ reg_process_self_managed_hint(wiphy); reg_check_channels(); return 0; } EXPORT_SYMBOL(regulatory_set_wiphy_regd_sync); void wiphy_regulatory_register(struct wiphy *wiphy) { struct regulatory_request *lr = get_last_request(); /* self-managed devices ignore beacon hints and country IE */ if (wiphy->regulatory_flags & REGULATORY_WIPHY_SELF_MANAGED) { wiphy->regulatory_flags |= REGULATORY_DISABLE_BEACON_HINTS | REGULATORY_COUNTRY_IE_IGNORE; /* * The last request may have been received before this * registration call. Call the driver notifier if * initiator is USER. */ if (lr->initiator == NL80211_REGDOM_SET_BY_USER) reg_call_notifier(wiphy, lr); } if (!reg_dev_ignore_cell_hint(wiphy)) reg_num_devs_support_basehint++; wiphy_update_regulatory(wiphy, lr->initiator); wiphy_all_share_dfs_chan_state(wiphy); reg_process_self_managed_hints(); } void wiphy_regulatory_deregister(struct wiphy *wiphy) { struct wiphy *request_wiphy = NULL; struct regulatory_request *lr; lr = get_last_request(); if (!reg_dev_ignore_cell_hint(wiphy)) reg_num_devs_support_basehint--; rcu_free_regdom(get_wiphy_regdom(wiphy)); RCU_INIT_POINTER(wiphy->regd, NULL); if (lr) request_wiphy = wiphy_idx_to_wiphy(lr->wiphy_idx); if (!request_wiphy || request_wiphy != wiphy) return; lr->wiphy_idx = WIPHY_IDX_INVALID; lr->country_ie_env = ENVIRON_ANY; } /* * See FCC notices for UNII band definitions * 5GHz: https://www.fcc.gov/document/5-ghz-unlicensed-spectrum-unii * 6GHz: https://www.fcc.gov/document/fcc-proposes-more-spectrum-unlicensed-use-0 */ int cfg80211_get_unii(int freq) { /* UNII-1 */ if (freq >= 5150 && freq <= 5250) return 0; /* UNII-2A */ if (freq > 5250 && freq <= 5350) return 1; /* UNII-2B */ if (freq > 5350 && freq <= 5470) return 2; /* UNII-2C */ if (freq > 5470 && freq <= 5725) return 3; /* UNII-3 */ if (freq > 5725 && freq <= 5825) return 4; /* UNII-5 */ if (freq > 5925 && freq <= 6425) return 5; /* UNII-6 */ if (freq > 6425 && freq <= 6525) return 6; /* UNII-7 */ if (freq > 6525 && freq <= 6875) return 7; /* UNII-8 */ if (freq > 6875 && freq <= 7125) return 8; return -EINVAL; } bool regulatory_indoor_allowed(void) { return reg_is_indoor; } bool regulatory_pre_cac_allowed(struct wiphy *wiphy) { const struct ieee80211_regdomain *regd = NULL; const struct ieee80211_regdomain *wiphy_regd = NULL; bool pre_cac_allowed = false; rcu_read_lock(); regd = rcu_dereference(cfg80211_regdomain); wiphy_regd = rcu_dereference(wiphy->regd); if (!wiphy_regd) { if (regd->dfs_region == NL80211_DFS_ETSI) pre_cac_allowed = true; rcu_read_unlock(); return pre_cac_allowed; } if (regd->dfs_region == wiphy_regd->dfs_region && wiphy_regd->dfs_region == NL80211_DFS_ETSI) pre_cac_allowed = true; rcu_read_unlock(); return pre_cac_allowed; } EXPORT_SYMBOL(regulatory_pre_cac_allowed); static void cfg80211_check_and_end_cac(struct cfg80211_registered_device *rdev) { struct wireless_dev *wdev; /* If we finished CAC or received radar, we should end any * CAC running on the same channels. * the check !cfg80211_chandef_dfs_usable contain 2 options: * either all channels are available - those the CAC_FINISHED * event has effected another wdev state, or there is a channel * in unavailable state in wdev chandef - those the RADAR_DETECTED * event has effected another wdev state. * In both cases we should end the CAC on the wdev. */ list_for_each_entry(wdev, &rdev->wiphy.wdev_list, list) { struct cfg80211_chan_def *chandef; if (!wdev->cac_started) continue; /* FIXME: radar detection is tied to link 0 for now */ chandef = wdev_chandef(wdev, 0); if (!chandef) continue; if (!cfg80211_chandef_dfs_usable(&rdev->wiphy, chandef)) rdev_end_cac(rdev, wdev->netdev); } } void regulatory_propagate_dfs_state(struct wiphy *wiphy, struct cfg80211_chan_def *chandef, enum nl80211_dfs_state dfs_state, enum nl80211_radar_event event) { struct cfg80211_registered_device *rdev; ASSERT_RTNL(); if (WARN_ON(!cfg80211_chandef_valid(chandef))) return; for_each_rdev(rdev) { if (wiphy == &rdev->wiphy) continue; if (!reg_dfs_domain_same(wiphy, &rdev->wiphy)) continue; if (!ieee80211_get_channel(&rdev->wiphy, chandef->chan->center_freq)) continue; cfg80211_set_dfs_state(&rdev->wiphy, chandef, dfs_state); if (event == NL80211_RADAR_DETECTED || event == NL80211_RADAR_CAC_FINISHED) { cfg80211_sched_dfs_chan_update(rdev); cfg80211_check_and_end_cac(rdev); } nl80211_radar_notify(rdev, chandef, event, NULL, GFP_KERNEL); } } static int __init regulatory_init_db(void) { int err; /* * It's possible that - due to other bugs/issues - cfg80211 * never called regulatory_init() below, or that it failed; * in that case, don't try to do any further work here as * it's doomed to lead to crashes. */ if (IS_ERR_OR_NULL(reg_pdev)) return -EINVAL; err = load_builtin_regdb_keys(); if (err) { platform_device_unregister(reg_pdev); return err; } /* We always try to get an update for the static regdomain */ err = regulatory_hint_core(cfg80211_world_regdom->alpha2); if (err) { if (err == -ENOMEM) { platform_device_unregister(reg_pdev); return err; } /* * N.B. kobject_uevent_env() can fail mainly for when we're out * memory which is handled and propagated appropriately above * but it can also fail during a netlink_broadcast() or during * early boot for call_usermodehelper(). For now treat these * errors as non-fatal. */ pr_err("kobject_uevent_env() was unable to call CRDA during init\n"); } /* * Finally, if the user set the module parameter treat it * as a user hint. */ if (!is_world_regdom(ieee80211_regdom)) regulatory_hint_user(ieee80211_regdom, NL80211_USER_REG_HINT_USER); return 0; } #ifndef MODULE late_initcall(regulatory_init_db); #endif int __init regulatory_init(void) { reg_pdev = platform_device_register_simple("regulatory", 0, NULL, 0); if (IS_ERR(reg_pdev)) return PTR_ERR(reg_pdev); rcu_assign_pointer(cfg80211_regdomain, cfg80211_world_regdom); user_alpha2[0] = '9'; user_alpha2[1] = '7'; #ifdef MODULE return regulatory_init_db(); #else return 0; #endif } void regulatory_exit(void) { struct regulatory_request *reg_request, *tmp; struct reg_beacon *reg_beacon, *btmp; cancel_work_sync(®_work); cancel_crda_timeout_sync(); cancel_delayed_work_sync(®_check_chans); /* Lock to suppress warnings */ rtnl_lock(); reset_regdomains(true, NULL); rtnl_unlock(); dev_set_uevent_suppress(®_pdev->dev, true); platform_device_unregister(reg_pdev); list_for_each_entry_safe(reg_beacon, btmp, ®_pending_beacons, list) { list_del(®_beacon->list); kfree(reg_beacon); } list_for_each_entry_safe(reg_beacon, btmp, ®_beacon_list, list) { list_del(®_beacon->list); kfree(reg_beacon); } list_for_each_entry_safe(reg_request, tmp, ®_requests_list, list) { list_del(®_request->list); kfree(reg_request); } if (!IS_ERR_OR_NULL(regdb)) kfree(regdb); if (!IS_ERR_OR_NULL(cfg80211_user_regdom)) kfree(cfg80211_user_regdom); free_regdb_keyring(); } |
| 6 6 4 2 6 3 3 2 1 3 3 3 2 1 3 3 3 2 1 3 2 2 1 2 3 3 3 2 2 1 3 4 4 2 1 4 4 4 3 2 1 4 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 | // SPDX-License-Identifier: GPL-2.0-or-later /* * Copyright Samuel Mendoza-Jonas, IBM Corporation 2018. */ #include <linux/module.h> #include <linux/kernel.h> #include <linux/if_arp.h> #include <linux/rtnetlink.h> #include <linux/etherdevice.h> #include <net/genetlink.h> #include <net/ncsi.h> #include <linux/skbuff.h> #include <net/sock.h> #include <uapi/linux/ncsi.h> #include "internal.h" #include "ncsi-pkt.h" #include "ncsi-netlink.h" static struct genl_family ncsi_genl_family; static const struct nla_policy ncsi_genl_policy[NCSI_ATTR_MAX + 1] = { [NCSI_ATTR_IFINDEX] = { .type = NLA_U32 }, [NCSI_ATTR_PACKAGE_LIST] = { .type = NLA_NESTED }, [NCSI_ATTR_PACKAGE_ID] = { .type = NLA_U32 }, [NCSI_ATTR_CHANNEL_ID] = { .type = NLA_U32 }, [NCSI_ATTR_DATA] = { .type = NLA_BINARY, .len = 2048 }, [NCSI_ATTR_MULTI_FLAG] = { .type = NLA_FLAG }, [NCSI_ATTR_PACKAGE_MASK] = { .type = NLA_U32 }, [NCSI_ATTR_CHANNEL_MASK] = { .type = NLA_U32 }, }; static struct ncsi_dev_priv *ndp_from_ifindex(struct net *net, u32 ifindex) { struct ncsi_dev_priv *ndp; struct net_device *dev; struct ncsi_dev *nd; struct ncsi_dev; if (!net) return NULL; dev = dev_get_by_index(net, ifindex); if (!dev) { pr_err("NCSI netlink: No device for ifindex %u\n", ifindex); return NULL; } nd = ncsi_find_dev(dev); ndp = nd ? TO_NCSI_DEV_PRIV(nd) : NULL; dev_put(dev); return ndp; } static int ncsi_write_channel_info(struct sk_buff *skb, struct ncsi_dev_priv *ndp, struct ncsi_channel *nc) { struct ncsi_channel_vlan_filter *ncf; struct ncsi_channel_mode *m; struct nlattr *vid_nest; int i; nla_put_u32(skb, NCSI_CHANNEL_ATTR_ID, nc->id); m = &nc->modes[NCSI_MODE_LINK]; nla_put_u32(skb, NCSI_CHANNEL_ATTR_LINK_STATE, m->data[2]); if (nc->state == NCSI_CHANNEL_ACTIVE) nla_put_flag(skb, NCSI_CHANNEL_ATTR_ACTIVE); if (nc == nc->package->preferred_channel) nla_put_flag(skb, NCSI_CHANNEL_ATTR_FORCED); nla_put_u32(skb, NCSI_CHANNEL_ATTR_VERSION_MAJOR, nc->version.major); nla_put_u32(skb, NCSI_CHANNEL_ATTR_VERSION_MINOR, nc->version.minor); nla_put_string(skb, NCSI_CHANNEL_ATTR_VERSION_STR, nc->version.fw_name); vid_nest = nla_nest_start_noflag(skb, NCSI_CHANNEL_ATTR_VLAN_LIST); if (!vid_nest) return -ENOMEM; ncf = &nc->vlan_filter; i = -1; while ((i = find_next_bit((void *)&ncf->bitmap, ncf->n_vids, i + 1)) < ncf->n_vids) { if (ncf->vids[i]) nla_put_u16(skb, NCSI_CHANNEL_ATTR_VLAN_ID, ncf->vids[i]); } nla_nest_end(skb, vid_nest); return 0; } static int ncsi_write_package_info(struct sk_buff *skb, struct ncsi_dev_priv *ndp, unsigned int id) { struct nlattr *pnest, *cnest, *nest; struct ncsi_package *np; struct ncsi_channel *nc; bool found; int rc; if (id > ndp->package_num - 1) { netdev_info(ndp->ndev.dev, "NCSI: No package with id %u\n", id); return -ENODEV; } found = false; NCSI_FOR_EACH_PACKAGE(ndp, np) { if (np->id != id) continue; pnest = nla_nest_start_noflag(skb, NCSI_PKG_ATTR); if (!pnest) return -ENOMEM; rc = nla_put_u32(skb, NCSI_PKG_ATTR_ID, np->id); if (rc) { nla_nest_cancel(skb, pnest); return rc; } if ((0x1 << np->id) == ndp->package_whitelist) nla_put_flag(skb, NCSI_PKG_ATTR_FORCED); cnest = nla_nest_start_noflag(skb, NCSI_PKG_ATTR_CHANNEL_LIST); if (!cnest) { nla_nest_cancel(skb, pnest); return -ENOMEM; } NCSI_FOR_EACH_CHANNEL(np, nc) { nest = nla_nest_start_noflag(skb, NCSI_CHANNEL_ATTR); if (!nest) { nla_nest_cancel(skb, cnest); nla_nest_cancel(skb, pnest); return -ENOMEM; } rc = ncsi_write_channel_info(skb, ndp, nc); if (rc) { nla_nest_cancel(skb, nest); nla_nest_cancel(skb, cnest); nla_nest_cancel(skb, pnest); return rc; } nla_nest_end(skb, nest); } nla_nest_end(skb, cnest); nla_nest_end(skb, pnest); found = true; } if (!found) return -ENODEV; return 0; } static int ncsi_pkg_info_nl(struct sk_buff *msg, struct genl_info *info) { struct ncsi_dev_priv *ndp; unsigned int package_id; struct sk_buff *skb; struct nlattr *attr; void *hdr; int rc; if (!info || !info->attrs) return -EINVAL; if (!info->attrs[NCSI_ATTR_IFINDEX]) return -EINVAL; if (!info->attrs[NCSI_ATTR_PACKAGE_ID]) return -EINVAL; ndp = ndp_from_ifindex(genl_info_net(info), nla_get_u32(info->attrs[NCSI_ATTR_IFINDEX])); if (!ndp) return -ENODEV; skb = genlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL); if (!skb) return -ENOMEM; hdr = genlmsg_put(skb, info->snd_portid, info->snd_seq, &ncsi_genl_family, 0, NCSI_CMD_PKG_INFO); if (!hdr) { kfree_skb(skb); return -EMSGSIZE; } package_id = nla_get_u32(info->attrs[NCSI_ATTR_PACKAGE_ID]); attr = nla_nest_start_noflag(skb, NCSI_ATTR_PACKAGE_LIST); if (!attr) { kfree_skb(skb); return -EMSGSIZE; } rc = ncsi_write_package_info(skb, ndp, package_id); if (rc) { nla_nest_cancel(skb, attr); goto err; } nla_nest_end(skb, attr); genlmsg_end(skb, hdr); return genlmsg_reply(skb, info); err: kfree_skb(skb); return rc; } static int ncsi_pkg_info_all_nl(struct sk_buff *skb, struct netlink_callback *cb) { struct nlattr *attrs[NCSI_ATTR_MAX + 1]; struct ncsi_package *np, *package; struct ncsi_dev_priv *ndp; unsigned int package_id; struct nlattr *attr; void *hdr; int rc; rc = genlmsg_parse_deprecated(cb->nlh, &ncsi_genl_family, attrs, NCSI_ATTR_MAX, ncsi_genl_policy, NULL); if (rc) return rc; if (!attrs[NCSI_ATTR_IFINDEX]) return -EINVAL; ndp = ndp_from_ifindex(get_net(sock_net(skb->sk)), nla_get_u32(attrs[NCSI_ATTR_IFINDEX])); if (!ndp) return -ENODEV; package_id = cb->args[0]; package = NULL; NCSI_FOR_EACH_PACKAGE(ndp, np) if (np->id == package_id) package = np; if (!package) return 0; /* done */ hdr = genlmsg_put(skb, NETLINK_CB(cb->skb).portid, cb->nlh->nlmsg_seq, &ncsi_genl_family, NLM_F_MULTI, NCSI_CMD_PKG_INFO); if (!hdr) { rc = -EMSGSIZE; goto err; } attr = nla_nest_start_noflag(skb, NCSI_ATTR_PACKAGE_LIST); if (!attr) { rc = -EMSGSIZE; goto err; } rc = ncsi_write_package_info(skb, ndp, package->id); if (rc) { nla_nest_cancel(skb, attr); goto err; } nla_nest_end(skb, attr); genlmsg_end(skb, hdr); cb->args[0] = package_id + 1; return skb->len; err: genlmsg_cancel(skb, hdr); return rc; } static int ncsi_set_interface_nl(struct sk_buff *msg, struct genl_info *info) { struct ncsi_package *np, *package; struct ncsi_channel *nc, *channel; u32 package_id, channel_id; struct ncsi_dev_priv *ndp; unsigned long flags; if (!info || !info->attrs) return -EINVAL; if (!info->attrs[NCSI_ATTR_IFINDEX]) return -EINVAL; if (!info->attrs[NCSI_ATTR_PACKAGE_ID]) return -EINVAL; ndp = ndp_from_ifindex(get_net(sock_net(msg->sk)), nla_get_u32(info->attrs[NCSI_ATTR_IFINDEX])); if (!ndp) return -ENODEV; package_id = nla_get_u32(info->attrs[NCSI_ATTR_PACKAGE_ID]); package = NULL; NCSI_FOR_EACH_PACKAGE(ndp, np) if (np->id == package_id) package = np; if (!package) { /* The user has set a package that does not exist */ return -ERANGE; } channel = NULL; if (info->attrs[NCSI_ATTR_CHANNEL_ID]) { channel_id = nla_get_u32(info->attrs[NCSI_ATTR_CHANNEL_ID]); NCSI_FOR_EACH_CHANNEL(package, nc) if (nc->id == channel_id) { channel = nc; break; } if (!channel) { netdev_info(ndp->ndev.dev, "NCSI: Channel %u does not exist!\n", channel_id); return -ERANGE; } } spin_lock_irqsave(&ndp->lock, flags); ndp->package_whitelist = 0x1 << package->id; ndp->multi_package = false; spin_unlock_irqrestore(&ndp->lock, flags); spin_lock_irqsave(&package->lock, flags); package->multi_channel = false; if (channel) { package->channel_whitelist = 0x1 << channel->id; package->preferred_channel = channel; } else { /* Allow any channel */ package->channel_whitelist = UINT_MAX; package->preferred_channel = NULL; } spin_unlock_irqrestore(&package->lock, flags); if (channel) netdev_info(ndp->ndev.dev, "Set package 0x%x, channel 0x%x as preferred\n", package_id, channel_id); else netdev_info(ndp->ndev.dev, "Set package 0x%x as preferred\n", package_id); /* Update channel configuration */ if (!(ndp->flags & NCSI_DEV_RESET)) ncsi_reset_dev(&ndp->ndev); return 0; } static int ncsi_clear_interface_nl(struct sk_buff *msg, struct genl_info *info) { struct ncsi_dev_priv *ndp; struct ncsi_package *np; unsigned long flags; if (!info || !info->attrs) return -EINVAL; if (!info->attrs[NCSI_ATTR_IFINDEX]) return -EINVAL; ndp = ndp_from_ifindex(get_net(sock_net(msg->sk)), nla_get_u32(info->attrs[NCSI_ATTR_IFINDEX])); if (!ndp) return -ENODEV; /* Reset any whitelists and disable multi mode */ spin_lock_irqsave(&ndp->lock, flags); ndp->package_whitelist = UINT_MAX; ndp->multi_package = false; spin_unlock_irqrestore(&ndp->lock, flags); NCSI_FOR_EACH_PACKAGE(ndp, np) { spin_lock_irqsave(&np->lock, flags); np->multi_channel = false; np->channel_whitelist = UINT_MAX; np->preferred_channel = NULL; spin_unlock_irqrestore(&np->lock, flags); } netdev_info(ndp->ndev.dev, "NCSI: Cleared preferred package/channel\n"); /* Update channel configuration */ if (!(ndp->flags & NCSI_DEV_RESET)) ncsi_reset_dev(&ndp->ndev); return 0; } static int ncsi_send_cmd_nl(struct sk_buff *msg, struct genl_info *info) { struct ncsi_dev_priv *ndp; struct ncsi_pkt_hdr *hdr; struct ncsi_cmd_arg nca; unsigned char *data; u32 package_id; u32 channel_id; int len, ret; if (!info || !info->attrs) { ret = -EINVAL; goto out; } if (!info->attrs[NCSI_ATTR_IFINDEX]) { ret = -EINVAL; goto out; } if (!info->attrs[NCSI_ATTR_PACKAGE_ID]) { ret = -EINVAL; goto out; } if (!info->attrs[NCSI_ATTR_CHANNEL_ID]) { ret = -EINVAL; goto out; } if (!info->attrs[NCSI_ATTR_DATA]) { ret = -EINVAL; goto out; } ndp = ndp_from_ifindex(get_net(sock_net(msg->sk)), nla_get_u32(info->attrs[NCSI_ATTR_IFINDEX])); if (!ndp) { ret = -ENODEV; goto out; } package_id = nla_get_u32(info->attrs[NCSI_ATTR_PACKAGE_ID]); channel_id = nla_get_u32(info->attrs[NCSI_ATTR_CHANNEL_ID]); if (package_id >= NCSI_MAX_PACKAGE || channel_id >= NCSI_MAX_CHANNEL) { ret = -ERANGE; goto out_netlink; } len = nla_len(info->attrs[NCSI_ATTR_DATA]); if (len < sizeof(struct ncsi_pkt_hdr)) { netdev_info(ndp->ndev.dev, "NCSI: no command to send %u\n", package_id); ret = -EINVAL; goto out_netlink; } else { data = (unsigned char *)nla_data(info->attrs[NCSI_ATTR_DATA]); } hdr = (struct ncsi_pkt_hdr *)data; nca.ndp = ndp; nca.package = (unsigned char)package_id; nca.channel = (unsigned char)channel_id; nca.type = hdr->type; nca.req_flags = NCSI_REQ_FLAG_NETLINK_DRIVEN; nca.info = info; nca.payload = ntohs(hdr->length); nca.data = data + sizeof(*hdr); ret = ncsi_xmit_cmd(&nca); out_netlink: if (ret != 0) { netdev_err(ndp->ndev.dev, "NCSI: Error %d sending command\n", ret); ncsi_send_netlink_err(ndp->ndev.dev, info->snd_seq, info->snd_portid, info->nlhdr, ret); } out: return ret; } int ncsi_send_netlink_rsp(struct ncsi_request *nr, struct ncsi_package *np, struct ncsi_channel *nc) { struct sk_buff *skb; struct net *net; void *hdr; int rc; net = dev_net(nr->rsp->dev); skb = genlmsg_new(NLMSG_DEFAULT_SIZE, GFP_ATOMIC); if (!skb) return -ENOMEM; hdr = genlmsg_put(skb, nr->snd_portid, nr->snd_seq, &ncsi_genl_family, 0, NCSI_CMD_SEND_CMD); if (!hdr) { kfree_skb(skb); return -EMSGSIZE; } nla_put_u32(skb, NCSI_ATTR_IFINDEX, nr->rsp->dev->ifindex); if (np) nla_put_u32(skb, NCSI_ATTR_PACKAGE_ID, np->id); if (nc) nla_put_u32(skb, NCSI_ATTR_CHANNEL_ID, nc->id); else nla_put_u32(skb, NCSI_ATTR_CHANNEL_ID, NCSI_RESERVED_CHANNEL); rc = nla_put(skb, NCSI_ATTR_DATA, nr->rsp->len, (void *)nr->rsp->data); if (rc) goto err; genlmsg_end(skb, hdr); return genlmsg_unicast(net, skb, nr->snd_portid); err: kfree_skb(skb); return rc; } int ncsi_send_netlink_timeout(struct ncsi_request *nr, struct ncsi_package *np, struct ncsi_channel *nc) { struct sk_buff *skb; struct net *net; void *hdr; skb = genlmsg_new(NLMSG_DEFAULT_SIZE, GFP_ATOMIC); if (!skb) return -ENOMEM; hdr = genlmsg_put(skb, nr->snd_portid, nr->snd_seq, &ncsi_genl_family, 0, NCSI_CMD_SEND_CMD); if (!hdr) { kfree_skb(skb); return -EMSGSIZE; } net = dev_net(nr->cmd->dev); nla_put_u32(skb, NCSI_ATTR_IFINDEX, nr->cmd->dev->ifindex); if (np) nla_put_u32(skb, NCSI_ATTR_PACKAGE_ID, np->id); else nla_put_u32(skb, NCSI_ATTR_PACKAGE_ID, NCSI_PACKAGE_INDEX((((struct ncsi_pkt_hdr *) nr->cmd->data)->channel))); if (nc) nla_put_u32(skb, NCSI_ATTR_CHANNEL_ID, nc->id); else nla_put_u32(skb, NCSI_ATTR_CHANNEL_ID, NCSI_RESERVED_CHANNEL); genlmsg_end(skb, hdr); return genlmsg_unicast(net, skb, nr->snd_portid); } int ncsi_send_netlink_err(struct net_device *dev, u32 snd_seq, u32 snd_portid, const struct nlmsghdr *nlhdr, int err) { struct nlmsghdr *nlh; struct nlmsgerr *nle; struct sk_buff *skb; struct net *net; skb = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_ATOMIC); if (!skb) return -ENOMEM; net = dev_net(dev); nlh = nlmsg_put(skb, snd_portid, snd_seq, NLMSG_ERROR, sizeof(*nle), 0); nle = (struct nlmsgerr *)nlmsg_data(nlh); nle->error = err; memcpy(&nle->msg, nlhdr, sizeof(*nlh)); nlmsg_end(skb, nlh); return nlmsg_unicast(net->genl_sock, skb, snd_portid); } static int ncsi_set_package_mask_nl(struct sk_buff *msg, struct genl_info *info) { struct ncsi_dev_priv *ndp; unsigned long flags; int rc; if (!info || !info->attrs) return -EINVAL; if (!info->attrs[NCSI_ATTR_IFINDEX]) return -EINVAL; if (!info->attrs[NCSI_ATTR_PACKAGE_MASK]) return -EINVAL; ndp = ndp_from_ifindex(get_net(sock_net(msg->sk)), nla_get_u32(info->attrs[NCSI_ATTR_IFINDEX])); if (!ndp) return -ENODEV; spin_lock_irqsave(&ndp->lock, flags); if (nla_get_flag(info->attrs[NCSI_ATTR_MULTI_FLAG])) { if (ndp->flags & NCSI_DEV_HWA) { ndp->multi_package = true; rc = 0; } else { netdev_err(ndp->ndev.dev, "NCSI: Can't use multiple packages without HWA\n"); rc = -EPERM; } } else { ndp->multi_package = false; rc = 0; } if (!rc) ndp->package_whitelist = nla_get_u32(info->attrs[NCSI_ATTR_PACKAGE_MASK]); spin_unlock_irqrestore(&ndp->lock, flags); if (!rc) { /* Update channel configuration */ if (!(ndp->flags & NCSI_DEV_RESET)) ncsi_reset_dev(&ndp->ndev); } return rc; } static int ncsi_set_channel_mask_nl(struct sk_buff *msg, struct genl_info *info) { struct ncsi_package *np, *package; struct ncsi_channel *nc, *channel; u32 package_id, channel_id; struct ncsi_dev_priv *ndp; unsigned long flags; if (!info || !info->attrs) return -EINVAL; if (!info->attrs[NCSI_ATTR_IFINDEX]) return -EINVAL; if (!info->attrs[NCSI_ATTR_PACKAGE_ID]) return -EINVAL; if (!info->attrs[NCSI_ATTR_CHANNEL_MASK]) return -EINVAL; ndp = ndp_from_ifindex(get_net(sock_net(msg->sk)), nla_get_u32(info->attrs[NCSI_ATTR_IFINDEX])); if (!ndp) return -ENODEV; package_id = nla_get_u32(info->attrs[NCSI_ATTR_PACKAGE_ID]); package = NULL; NCSI_FOR_EACH_PACKAGE(ndp, np) if (np->id == package_id) { package = np; break; } if (!package) return -ERANGE; spin_lock_irqsave(&package->lock, flags); channel = NULL; if (info->attrs[NCSI_ATTR_CHANNEL_ID]) { channel_id = nla_get_u32(info->attrs[NCSI_ATTR_CHANNEL_ID]); NCSI_FOR_EACH_CHANNEL(np, nc) if (nc->id == channel_id) { channel = nc; break; } if (!channel) { spin_unlock_irqrestore(&package->lock, flags); return -ERANGE; } netdev_dbg(ndp->ndev.dev, "NCSI: Channel %u set as preferred channel\n", channel->id); } package->channel_whitelist = nla_get_u32(info->attrs[NCSI_ATTR_CHANNEL_MASK]); if (package->channel_whitelist == 0) netdev_dbg(ndp->ndev.dev, "NCSI: Package %u set to all channels disabled\n", package->id); package->preferred_channel = channel; if (nla_get_flag(info->attrs[NCSI_ATTR_MULTI_FLAG])) { package->multi_channel = true; netdev_info(ndp->ndev.dev, "NCSI: Multi-channel enabled on package %u\n", package_id); } else { package->multi_channel = false; } spin_unlock_irqrestore(&package->lock, flags); /* Update channel configuration */ if (!(ndp->flags & NCSI_DEV_RESET)) ncsi_reset_dev(&ndp->ndev); return 0; } static const struct genl_small_ops ncsi_ops[] = { { .cmd = NCSI_CMD_PKG_INFO, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, .doit = ncsi_pkg_info_nl, .dumpit = ncsi_pkg_info_all_nl, .flags = 0, }, { .cmd = NCSI_CMD_SET_INTERFACE, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, .doit = ncsi_set_interface_nl, .flags = GENL_ADMIN_PERM, }, { .cmd = NCSI_CMD_CLEAR_INTERFACE, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, .doit = ncsi_clear_interface_nl, .flags = GENL_ADMIN_PERM, }, { .cmd = NCSI_CMD_SEND_CMD, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, .doit = ncsi_send_cmd_nl, .flags = GENL_ADMIN_PERM, }, { .cmd = NCSI_CMD_SET_PACKAGE_MASK, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, .doit = ncsi_set_package_mask_nl, .flags = GENL_ADMIN_PERM, }, { .cmd = NCSI_CMD_SET_CHANNEL_MASK, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, .doit = ncsi_set_channel_mask_nl, .flags = GENL_ADMIN_PERM, }, }; static struct genl_family ncsi_genl_family __ro_after_init = { .name = "NCSI", .version = 0, .maxattr = NCSI_ATTR_MAX, .policy = ncsi_genl_policy, .module = THIS_MODULE, .small_ops = ncsi_ops, .n_small_ops = ARRAY_SIZE(ncsi_ops), .resv_start_op = NCSI_CMD_SET_CHANNEL_MASK + 1, }; static int __init ncsi_init_netlink(void) { return genl_register_family(&ncsi_genl_family); } subsys_initcall(ncsi_init_netlink); |
| 31 846 47 4 8158 195 2108 10 428 428 427 305 4 37 1686 2 177 283 3736 11732 246 1603 127 17 1930 1 33 4 72 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 | /* SPDX-License-Identifier: GPL-2.0 */ #ifndef __LINUX_BITMAP_H #define __LINUX_BITMAP_H #ifndef __ASSEMBLY__ #include <linux/align.h> #include <linux/bitops.h> #include <linux/cleanup.h> #include <linux/errno.h> #include <linux/find.h> #include <linux/limits.h> #include <linux/string.h> #include <linux/types.h> #include <linux/bitmap-str.h> struct device; /* * bitmaps provide bit arrays that consume one or more unsigned * longs. The bitmap interface and available operations are listed * here, in bitmap.h * * Function implementations generic to all architectures are in * lib/bitmap.c. Functions implementations that are architecture * specific are in various include/asm-<arch>/bitops.h headers * and other arch/<arch> specific files. * * See lib/bitmap.c for more details. */ /** * DOC: bitmap overview * * The available bitmap operations and their rough meaning in the * case that the bitmap is a single unsigned long are thus: * * The generated code is more efficient when nbits is known at * compile-time and at most BITS_PER_LONG. * * :: * * bitmap_zero(dst, nbits) *dst = 0UL * bitmap_fill(dst, nbits) *dst = ~0UL * bitmap_copy(dst, src, nbits) *dst = *src * bitmap_and(dst, src1, src2, nbits) *dst = *src1 & *src2 * bitmap_or(dst, src1, src2, nbits) *dst = *src1 | *src2 * bitmap_xor(dst, src1, src2, nbits) *dst = *src1 ^ *src2 * bitmap_andnot(dst, src1, src2, nbits) *dst = *src1 & ~(*src2) * bitmap_complement(dst, src, nbits) *dst = ~(*src) * bitmap_equal(src1, src2, nbits) Are *src1 and *src2 equal? * bitmap_intersects(src1, src2, nbits) Do *src1 and *src2 overlap? * bitmap_subset(src1, src2, nbits) Is *src1 a subset of *src2? * bitmap_empty(src, nbits) Are all bits zero in *src? * bitmap_full(src, nbits) Are all bits set in *src? * bitmap_weight(src, nbits) Hamming Weight: number set bits * bitmap_weight_and(src1, src2, nbits) Hamming Weight of and'ed bitmap * bitmap_weight_andnot(src1, src2, nbits) Hamming Weight of andnot'ed bitmap * bitmap_set(dst, pos, nbits) Set specified bit area * bitmap_clear(dst, pos, nbits) Clear specified bit area * bitmap_find_next_zero_area(buf, len, pos, n, mask) Find bit free area * bitmap_find_next_zero_area_off(buf, len, pos, n, mask, mask_off) as above * bitmap_shift_right(dst, src, n, nbits) *dst = *src >> n * bitmap_shift_left(dst, src, n, nbits) *dst = *src << n * bitmap_cut(dst, src, first, n, nbits) Cut n bits from first, copy rest * bitmap_replace(dst, old, new, mask, nbits) *dst = (*old & ~(*mask)) | (*new & *mask) * bitmap_scatter(dst, src, mask, nbits) *dst = map(dense, sparse)(src) * bitmap_gather(dst, src, mask, nbits) *dst = map(sparse, dense)(src) * bitmap_remap(dst, src, old, new, nbits) *dst = map(old, new)(src) * bitmap_bitremap(oldbit, old, new, nbits) newbit = map(old, new)(oldbit) * bitmap_onto(dst, orig, relmap, nbits) *dst = orig relative to relmap * bitmap_fold(dst, orig, sz, nbits) dst bits = orig bits mod sz * bitmap_parse(buf, buflen, dst, nbits) Parse bitmap dst from kernel buf * bitmap_parse_user(ubuf, ulen, dst, nbits) Parse bitmap dst from user buf * bitmap_parselist(buf, dst, nbits) Parse bitmap dst from kernel buf * bitmap_parselist_user(buf, dst, nbits) Parse bitmap dst from user buf * bitmap_find_free_region(bitmap, bits, order) Find and allocate bit region * bitmap_release_region(bitmap, pos, order) Free specified bit region * bitmap_allocate_region(bitmap, pos, order) Allocate specified bit region * bitmap_from_arr32(dst, buf, nbits) Copy nbits from u32[] buf to dst * bitmap_from_arr64(dst, buf, nbits) Copy nbits from u64[] buf to dst * bitmap_to_arr32(buf, src, nbits) Copy nbits from buf to u32[] dst * bitmap_to_arr64(buf, src, nbits) Copy nbits from buf to u64[] dst * bitmap_get_value8(map, start) Get 8bit value from map at start * bitmap_set_value8(map, value, start) Set 8bit value to map at start * bitmap_read(map, start, nbits) Read an nbits-sized value from * map at start * bitmap_write(map, value, start, nbits) Write an nbits-sized value to * map at start * * Note, bitmap_zero() and bitmap_fill() operate over the region of * unsigned longs, that is, bits behind bitmap till the unsigned long * boundary will be zeroed or filled as well. Consider to use * bitmap_clear() or bitmap_set() to make explicit zeroing or filling * respectively. */ /** * DOC: bitmap bitops * * Also the following operations in asm/bitops.h apply to bitmaps.:: * * set_bit(bit, addr) *addr |= bit * clear_bit(bit, addr) *addr &= ~bit * change_bit(bit, addr) *addr ^= bit * test_bit(bit, addr) Is bit set in *addr? * test_and_set_bit(bit, addr) Set bit and return old value * test_and_clear_bit(bit, addr) Clear bit and return old value * test_and_change_bit(bit, addr) Change bit and return old value * find_first_zero_bit(addr, nbits) Position first zero bit in *addr * find_first_bit(addr, nbits) Position first set bit in *addr * find_next_zero_bit(addr, nbits, bit) * Position next zero bit in *addr >= bit * find_next_bit(addr, nbits, bit) Position next set bit in *addr >= bit * find_next_and_bit(addr1, addr2, nbits, bit) * Same as find_next_bit, but in * (*addr1 & *addr2) * */ /** * DOC: declare bitmap * The DECLARE_BITMAP(name,bits) macro, in linux/types.h, can be used * to declare an array named 'name' of just enough unsigned longs to * contain all bit positions from 0 to 'bits' - 1. */ /* * Allocation and deallocation of bitmap. * Provided in lib/bitmap.c to avoid circular dependency. */ unsigned long *bitmap_alloc(unsigned int nbits, gfp_t flags); unsigned long *bitmap_zalloc(unsigned int nbits, gfp_t flags); unsigned long *bitmap_alloc_node(unsigned int nbits, gfp_t flags, int node); unsigned long *bitmap_zalloc_node(unsigned int nbits, gfp_t flags, int node); void bitmap_free(const unsigned long *bitmap); DEFINE_FREE(bitmap, unsigned long *, if (_T) bitmap_free(_T)) /* Managed variants of the above. */ unsigned long *devm_bitmap_alloc(struct device *dev, unsigned int nbits, gfp_t flags); unsigned long *devm_bitmap_zalloc(struct device *dev, unsigned int nbits, gfp_t flags); /* * lib/bitmap.c provides these functions: */ bool __bitmap_equal(const unsigned long *bitmap1, const unsigned long *bitmap2, unsigned int nbits); bool __pure __bitmap_or_equal(const unsigned long *src1, const unsigned long *src2, const unsigned long *src3, unsigned int nbits); void __bitmap_complement(unsigned long *dst, const unsigned long *src, unsigned int nbits); void __bitmap_shift_right(unsigned long *dst, const unsigned long *src, unsigned int shift, unsigned int nbits); void __bitmap_shift_left(unsigned long *dst, const unsigned long *src, unsigned int shift, unsigned int nbits); void bitmap_cut(unsigned long *dst, const unsigned long *src, unsigned int first, unsigned int cut, unsigned int nbits); bool __bitmap_and(unsigned long *dst, const unsigned long *bitmap1, const unsigned long *bitmap2, unsigned int nbits); void __bitmap_or(unsigned long *dst, const unsigned long *bitmap1, const unsigned long *bitmap2, unsigned int nbits); void __bitmap_xor(unsigned long *dst, const unsigned long *bitmap1, const unsigned long *bitmap2, unsigned int nbits); bool __bitmap_andnot(unsigned long *dst, const unsigned long *bitmap1, const unsigned long *bitmap2, unsigned int nbits); void __bitmap_replace(unsigned long *dst, const unsigned long *old, const unsigned long *new, const unsigned long *mask, unsigned int nbits); bool __bitmap_intersects(const unsigned long *bitmap1, const unsigned long *bitmap2, unsigned int nbits); bool __bitmap_subset(const unsigned long *bitmap1, const unsigned long *bitmap2, unsigned int nbits); unsigned int __bitmap_weight(const unsigned long *bitmap, unsigned int nbits); unsigned int __bitmap_weight_and(const unsigned long *bitmap1, const unsigned long *bitmap2, unsigned int nbits); unsigned int __bitmap_weight_andnot(const unsigned long *bitmap1, const unsigned long *bitmap2, unsigned int nbits); void __bitmap_set(unsigned long *map, unsigned int start, int len); void __bitmap_clear(unsigned long *map, unsigned int start, int len); unsigned long bitmap_find_next_zero_area_off(unsigned long *map, unsigned long size, unsigned long start, unsigned int nr, unsigned long align_mask, unsigned long align_offset); /** * bitmap_find_next_zero_area - find a contiguous aligned zero area * @map: The address to base the search on * @size: The bitmap size in bits * @start: The bitnumber to start searching at * @nr: The number of zeroed bits we're looking for * @align_mask: Alignment mask for zero area * * The @align_mask should be one less than a power of 2; the effect is that * the bit offset of all zero areas this function finds is multiples of that * power of 2. A @align_mask of 0 means no alignment is required. */ static inline unsigned long bitmap_find_next_zero_area(unsigned long *map, unsigned long size, unsigned long start, unsigned int nr, unsigned long align_mask) { return bitmap_find_next_zero_area_off(map, size, start, nr, align_mask, 0); } void bitmap_remap(unsigned long *dst, const unsigned long *src, const unsigned long *old, const unsigned long *new, unsigned int nbits); int bitmap_bitremap(int oldbit, const unsigned long *old, const unsigned long *new, int bits); void bitmap_onto(unsigned long *dst, const unsigned long *orig, const unsigned long *relmap, unsigned int bits); void bitmap_fold(unsigned long *dst, const unsigned long *orig, unsigned int sz, unsigned int nbits); #define BITMAP_FIRST_WORD_MASK(start) (~0UL << ((start) & (BITS_PER_LONG - 1))) #define BITMAP_LAST_WORD_MASK(nbits) (~0UL >> (-(nbits) & (BITS_PER_LONG - 1))) #define bitmap_size(nbits) (ALIGN(nbits, BITS_PER_LONG) / BITS_PER_BYTE) static inline void bitmap_zero(unsigned long *dst, unsigned int nbits) { unsigned int len = bitmap_size(nbits); if (small_const_nbits(nbits)) *dst = 0; else memset(dst, 0, len); } static inline void bitmap_fill(unsigned long *dst, unsigned int nbits) { unsigned int len = bitmap_size(nbits); if (small_const_nbits(nbits)) *dst = ~0UL; else memset(dst, 0xff, len); } static inline void bitmap_copy(unsigned long *dst, const unsigned long *src, unsigned int nbits) { unsigned int len = bitmap_size(nbits); if (small_const_nbits(nbits)) *dst = *src; else memcpy(dst, src, len); } /* * Copy bitmap and clear tail bits in last word. */ static inline void bitmap_copy_clear_tail(unsigned long *dst, const unsigned long *src, unsigned int nbits) { bitmap_copy(dst, src, nbits); if (nbits % BITS_PER_LONG) dst[nbits / BITS_PER_LONG] &= BITMAP_LAST_WORD_MASK(nbits); } static inline void bitmap_copy_and_extend(unsigned long *to, const unsigned long *from, unsigned int count, unsigned int size) { unsigned int copy = BITS_TO_LONGS(count); memcpy(to, from, copy * sizeof(long)); if (count % BITS_PER_LONG) to[copy - 1] &= BITMAP_LAST_WORD_MASK(count); memset(to + copy, 0, bitmap_size(size) - copy * sizeof(long)); } /* * On 32-bit systems bitmaps are represented as u32 arrays internally. On LE64 * machines the order of hi and lo parts of numbers match the bitmap structure. * In both cases conversion is not needed when copying data from/to arrays of * u32. But in LE64 case, typecast in bitmap_copy_clear_tail() may lead * to out-of-bound access. To avoid that, both LE and BE variants of 64-bit * architectures are not using bitmap_copy_clear_tail(). */ #if BITS_PER_LONG == 64 void bitmap_from_arr32(unsigned long *bitmap, const u32 *buf, unsigned int nbits); void bitmap_to_arr32(u32 *buf, const unsigned long *bitmap, unsigned int nbits); #else #define bitmap_from_arr32(bitmap, buf, nbits) \ bitmap_copy_clear_tail((unsigned long *) (bitmap), \ (const unsigned long *) (buf), (nbits)) #define bitmap_to_arr32(buf, bitmap, nbits) \ bitmap_copy_clear_tail((unsigned long *) (buf), \ (const unsigned long *) (bitmap), (nbits)) #endif /* * On 64-bit systems bitmaps are represented as u64 arrays internally. So, * the conversion is not needed when copying data from/to arrays of u64. */ #if BITS_PER_LONG == 32 void bitmap_from_arr64(unsigned long *bitmap, const u64 *buf, unsigned int nbits); void bitmap_to_arr64(u64 *buf, const unsigned long *bitmap, unsigned int nbits); #else #define bitmap_from_arr64(bitmap, buf, nbits) \ bitmap_copy_clear_tail((unsigned long *)(bitmap), (const unsigned long *)(buf), (nbits)) #define bitmap_to_arr64(buf, bitmap, nbits) \ bitmap_copy_clear_tail((unsigned long *)(buf), (const unsigned long *)(bitmap), (nbits)) #endif static inline bool bitmap_and(unsigned long *dst, const unsigned long *src1, const unsigned long *src2, unsigned int nbits) { if (small_const_nbits(nbits)) return (*dst = *src1 & *src2 & BITMAP_LAST_WORD_MASK(nbits)) != 0; return __bitmap_and(dst, src1, src2, nbits); } static inline void bitmap_or(unsigned long *dst, const unsigned long *src1, const unsigned long *src2, unsigned int nbits) { if (small_const_nbits(nbits)) *dst = *src1 | *src2; else __bitmap_or(dst, src1, src2, nbits); } static inline void bitmap_xor(unsigned long *dst, const unsigned long *src1, const unsigned long *src2, unsigned int nbits) { if (small_const_nbits(nbits)) *dst = *src1 ^ *src2; else __bitmap_xor(dst, src1, src2, nbits); } static inline bool bitmap_andnot(unsigned long *dst, const unsigned long *src1, const unsigned long *src2, unsigned int nbits) { if (small_const_nbits(nbits)) return (*dst = *src1 & ~(*src2) & BITMAP_LAST_WORD_MASK(nbits)) != 0; return __bitmap_andnot(dst, src1, src2, nbits); } static inline void bitmap_complement(unsigned long *dst, const unsigned long *src, unsigned int nbits) { if (small_const_nbits(nbits)) *dst = ~(*src); else __bitmap_complement(dst, src, nbits); } #ifdef __LITTLE_ENDIAN #define BITMAP_MEM_ALIGNMENT 8 #else #define BITMAP_MEM_ALIGNMENT (8 * sizeof(unsigned long)) #endif #define BITMAP_MEM_MASK (BITMAP_MEM_ALIGNMENT - 1) static inline bool bitmap_equal(const unsigned long *src1, const unsigned long *src2, unsigned int nbits) { if (small_const_nbits(nbits)) return !((*src1 ^ *src2) & BITMAP_LAST_WORD_MASK(nbits)); if (__builtin_constant_p(nbits & BITMAP_MEM_MASK) && IS_ALIGNED(nbits, BITMAP_MEM_ALIGNMENT)) return !memcmp(src1, src2, nbits / 8); return __bitmap_equal(src1, src2, nbits); } /** * bitmap_or_equal - Check whether the or of two bitmaps is equal to a third * @src1: Pointer to bitmap 1 * @src2: Pointer to bitmap 2 will be or'ed with bitmap 1 * @src3: Pointer to bitmap 3. Compare to the result of *@src1 | *@src2 * @nbits: number of bits in each of these bitmaps * * Returns: True if (*@src1 | *@src2) == *@src3, false otherwise */ static inline bool bitmap_or_equal(const unsigned long *src1, const unsigned long *src2, const unsigned long *src3, unsigned int nbits) { if (!small_const_nbits(nbits)) return __bitmap_or_equal(src1, src2, src3, nbits); return !(((*src1 | *src2) ^ *src3) & BITMAP_LAST_WORD_MASK(nbits)); } static inline bool bitmap_intersects(const unsigned long *src1, const unsigned long *src2, unsigned int nbits) { if (small_const_nbits(nbits)) return ((*src1 & *src2) & BITMAP_LAST_WORD_MASK(nbits)) != 0; else return __bitmap_intersects(src1, src2, nbits); } static inline bool bitmap_subset(const unsigned long *src1, const unsigned long *src2, unsigned int nbits) { if (small_const_nbits(nbits)) return ! ((*src1 & ~(*src2)) & BITMAP_LAST_WORD_MASK(nbits)); else return __bitmap_subset(src1, src2, nbits); } static inline bool bitmap_empty(const unsigned long *src, unsigned nbits) { if (small_const_nbits(nbits)) return ! (*src & BITMAP_LAST_WORD_MASK(nbits)); return find_first_bit(src, nbits) == nbits; } static inline bool bitmap_full(const unsigned long *src, unsigned int nbits) { if (small_const_nbits(nbits)) return ! (~(*src) & BITMAP_LAST_WORD_MASK(nbits)); return find_first_zero_bit(src, nbits) == nbits; } static __always_inline unsigned int bitmap_weight(const unsigned long *src, unsigned int nbits) { if (small_const_nbits(nbits)) return hweight_long(*src & BITMAP_LAST_WORD_MASK(nbits)); return __bitmap_weight(src, nbits); } static __always_inline unsigned long bitmap_weight_and(const unsigned long *src1, const unsigned long *src2, unsigned int nbits) { if (small_const_nbits(nbits)) return hweight_long(*src1 & *src2 & BITMAP_LAST_WORD_MASK(nbits)); return __bitmap_weight_and(src1, src2, nbits); } static __always_inline unsigned long bitmap_weight_andnot(const unsigned long *src1, const unsigned long *src2, unsigned int nbits) { if (small_const_nbits(nbits)) return hweight_long(*src1 & ~(*src2) & BITMAP_LAST_WORD_MASK(nbits)); return __bitmap_weight_andnot(src1, src2, nbits); } static __always_inline void bitmap_set(unsigned long *map, unsigned int start, unsigned int nbits) { if (__builtin_constant_p(nbits) && nbits == 1) __set_bit(start, map); else if (small_const_nbits(start + nbits)) *map |= GENMASK(start + nbits - 1, start); else if (__builtin_constant_p(start & BITMAP_MEM_MASK) && IS_ALIGNED(start, BITMAP_MEM_ALIGNMENT) && __builtin_constant_p(nbits & BITMAP_MEM_MASK) && IS_ALIGNED(nbits, BITMAP_MEM_ALIGNMENT)) memset((char *)map + start / 8, 0xff, nbits / 8); else __bitmap_set(map, start, nbits); } static __always_inline void bitmap_clear(unsigned long *map, unsigned int start, unsigned int nbits) { if (__builtin_constant_p(nbits) && nbits == 1) __clear_bit(start, map); else if (small_const_nbits(start + nbits)) *map &= ~GENMASK(start + nbits - 1, start); else if (__builtin_constant_p(start & BITMAP_MEM_MASK) && IS_ALIGNED(start, BITMAP_MEM_ALIGNMENT) && __builtin_constant_p(nbits & BITMAP_MEM_MASK) && IS_ALIGNED(nbits, BITMAP_MEM_ALIGNMENT)) memset((char *)map + start / 8, 0, nbits / 8); else __bitmap_clear(map, start, nbits); } static inline void bitmap_shift_right(unsigned long *dst, const unsigned long *src, unsigned int shift, unsigned int nbits) { if (small_const_nbits(nbits)) *dst = (*src & BITMAP_LAST_WORD_MASK(nbits)) >> shift; else __bitmap_shift_right(dst, src, shift, nbits); } static inline void bitmap_shift_left(unsigned long *dst, const unsigned long *src, unsigned int shift, unsigned int nbits) { if (small_const_nbits(nbits)) *dst = (*src << shift) & BITMAP_LAST_WORD_MASK(nbits); else __bitmap_shift_left(dst, src, shift, nbits); } static inline void bitmap_replace(unsigned long *dst, const unsigned long *old, const unsigned long *new, const unsigned long *mask, unsigned int nbits) { if (small_const_nbits(nbits)) *dst = (*old & ~(*mask)) | (*new & *mask); else __bitmap_replace(dst, old, new, mask, nbits); } /** * bitmap_scatter - Scatter a bitmap according to the given mask * @dst: scattered bitmap * @src: gathered bitmap * @mask: mask representing bits to assign to in the scattered bitmap * @nbits: number of bits in each of these bitmaps * * Scatters bitmap with sequential bits according to the given @mask. * * Example: * If @src bitmap = 0x005a, with @mask = 0x1313, @dst will be 0x0302. * * Or in binary form * @src @mask @dst * 0000000001011010 0001001100010011 0000001100000010 * * (Bits 0, 1, 2, 3, 4, 5 are copied to the bits 0, 1, 4, 8, 9, 12) * * A more 'visual' description of the operation:: * * src: 0000000001011010 * |||||| * +------+||||| * | +----+|||| * | |+----+||| * | || +-+|| * | || | || * mask: ...v..vv...v..vv * ...0..11...0..10 * dst: 0000001100000010 * * A relationship exists between bitmap_scatter() and bitmap_gather(). * bitmap_gather() can be seen as the 'reverse' bitmap_scatter() operation. * See bitmap_scatter() for details related to this relationship. */ static inline void bitmap_scatter(unsigned long *dst, const unsigned long *src, const unsigned long *mask, unsigned int nbits) { unsigned int n = 0; unsigned int bit; bitmap_zero(dst, nbits); for_each_set_bit(bit, mask, nbits) __assign_bit(bit, dst, test_bit(n++, src)); } /** * bitmap_gather - Gather a bitmap according to given mask * @dst: gathered bitmap * @src: scattered bitmap * @mask: mask representing bits to extract from in the scattered bitmap * @nbits: number of bits in each of these bitmaps * * Gathers bitmap with sparse bits according to the given @mask. * * Example: * If @src bitmap = 0x0302, with @mask = 0x1313, @dst will be 0x001a. * * Or in binary form * @src @mask @dst * 0000001100000010 0001001100010011 0000000000011010 * * (Bits 0, 1, 4, 8, 9, 12 are copied to the bits 0, 1, 2, 3, 4, 5) * * A more 'visual' description of the operation:: * * mask: ...v..vv...v..vv * src: 0000001100000010 * ^ ^^ ^ 0 * | || | 10 * | || > 010 * | |+--> 1010 * | +--> 11010 * +----> 011010 * dst: 0000000000011010 * * A relationship exists between bitmap_gather() and bitmap_scatter(). See * bitmap_scatter() for the bitmap scatter detailed operations. * Suppose scattered computed using bitmap_scatter(scattered, src, mask, n). * The operation bitmap_gather(result, scattered, mask, n) leads to a result * equal or equivalent to src. * * The result can be 'equivalent' because bitmap_scatter() and bitmap_gather() * are not bijective. * The result and src values are equivalent in that sense that a call to * bitmap_scatter(res, src, mask, n) and a call to * bitmap_scatter(res, result, mask, n) will lead to the same res value. */ static inline void bitmap_gather(unsigned long *dst, const unsigned long *src, const unsigned long *mask, unsigned int nbits) { unsigned int n = 0; unsigned int bit; bitmap_zero(dst, nbits); for_each_set_bit(bit, mask, nbits) __assign_bit(n++, dst, test_bit(bit, src)); } static inline void bitmap_next_set_region(unsigned long *bitmap, unsigned int *rs, unsigned int *re, unsigned int end) { *rs = find_next_bit(bitmap, end, *rs); *re = find_next_zero_bit(bitmap, end, *rs + 1); } /** * bitmap_release_region - release allocated bitmap region * @bitmap: array of unsigned longs corresponding to the bitmap * @pos: beginning of bit region to release * @order: region size (log base 2 of number of bits) to release * * This is the complement to __bitmap_find_free_region() and releases * the found region (by clearing it in the bitmap). */ static inline void bitmap_release_region(unsigned long *bitmap, unsigned int pos, int order) { bitmap_clear(bitmap, pos, BIT(order)); } /** * bitmap_allocate_region - allocate bitmap region * @bitmap: array of unsigned longs corresponding to the bitmap * @pos: beginning of bit region to allocate * @order: region size (log base 2 of number of bits) to allocate * * Allocate (set bits in) a specified region of a bitmap. * * Returns: 0 on success, or %-EBUSY if specified region wasn't * free (not all bits were zero). */ static inline int bitmap_allocate_region(unsigned long *bitmap, unsigned int pos, int order) { unsigned int len = BIT(order); if (find_next_bit(bitmap, pos + len, pos) < pos + len) return -EBUSY; bitmap_set(bitmap, pos, len); return 0; } /** * bitmap_find_free_region - find a contiguous aligned mem region * @bitmap: array of unsigned longs corresponding to the bitmap * @bits: number of bits in the bitmap * @order: region size (log base 2 of number of bits) to find * * Find a region of free (zero) bits in a @bitmap of @bits bits and * allocate them (set them to one). Only consider regions of length * a power (@order) of two, aligned to that power of two, which * makes the search algorithm much faster. * * Returns: the bit offset in bitmap of the allocated region, * or -errno on failure. */ static inline int bitmap_find_free_region(unsigned long *bitmap, unsigned int bits, int order) { unsigned int pos, end; /* scans bitmap by regions of size order */ for (pos = 0; (end = pos + BIT(order)) <= bits; pos = end) { if (!bitmap_allocate_region(bitmap, pos, order)) return pos; } return -ENOMEM; } /** * BITMAP_FROM_U64() - Represent u64 value in the format suitable for bitmap. * @n: u64 value * * Linux bitmaps are internally arrays of unsigned longs, i.e. 32-bit * integers in 32-bit environment, and 64-bit integers in 64-bit one. * * There are four combinations of endianness and length of the word in linux * ABIs: LE64, BE64, LE32 and BE32. * * On 64-bit kernels 64-bit LE and BE numbers are naturally ordered in * bitmaps and therefore don't require any special handling. * * On 32-bit kernels 32-bit LE ABI orders lo word of 64-bit number in memory * prior to hi, and 32-bit BE orders hi word prior to lo. The bitmap on the * other hand is represented as an array of 32-bit words and the position of * bit N may therefore be calculated as: word #(N/32) and bit #(N%32) in that * word. For example, bit #42 is located at 10th position of 2nd word. * It matches 32-bit LE ABI, and we can simply let the compiler store 64-bit * values in memory as it usually does. But for BE we need to swap hi and lo * words manually. * * With all that, the macro BITMAP_FROM_U64() does explicit reordering of hi and * lo parts of u64. For LE32 it does nothing, and for BE environment it swaps * hi and lo words, as is expected by bitmap. */ #if __BITS_PER_LONG == 64 #define BITMAP_FROM_U64(n) (n) #else #define BITMAP_FROM_U64(n) ((unsigned long) ((u64)(n) & ULONG_MAX)), \ ((unsigned long) ((u64)(n) >> 32)) #endif /** * bitmap_from_u64 - Check and swap words within u64. * @mask: source bitmap * @dst: destination bitmap * * In 32-bit Big Endian kernel, when using ``(u32 *)(&val)[*]`` * to read u64 mask, we will get the wrong word. * That is ``(u32 *)(&val)[0]`` gets the upper 32 bits, * but we expect the lower 32-bits of u64. */ static inline void bitmap_from_u64(unsigned long *dst, u64 mask) { bitmap_from_arr64(dst, &mask, 64); } /** * bitmap_read - read a value of n-bits from the memory region * @map: address to the bitmap memory region * @start: bit offset of the n-bit value * @nbits: size of value in bits, nonzero, up to BITS_PER_LONG * * Returns: value of @nbits bits located at the @start bit offset within the * @map memory region. For @nbits = 0 and @nbits > BITS_PER_LONG the return * value is undefined. */ static inline unsigned long bitmap_read(const unsigned long *map, unsigned long start, unsigned long nbits) { size_t index = BIT_WORD(start); unsigned long offset = start % BITS_PER_LONG; unsigned long space = BITS_PER_LONG - offset; unsigned long value_low, value_high; if (unlikely(!nbits || nbits > BITS_PER_LONG)) return 0; if (space >= nbits) return (map[index] >> offset) & BITMAP_LAST_WORD_MASK(nbits); value_low = map[index] & BITMAP_FIRST_WORD_MASK(start); value_high = map[index + 1] & BITMAP_LAST_WORD_MASK(start + nbits); return (value_low >> offset) | (value_high << space); } /** * bitmap_write - write n-bit value within a memory region * @map: address to the bitmap memory region * @value: value to write, clamped to nbits * @start: bit offset of the n-bit value * @nbits: size of value in bits, nonzero, up to BITS_PER_LONG. * * bitmap_write() behaves as-if implemented as @nbits calls of __assign_bit(), * i.e. bits beyond @nbits are ignored: * * for (bit = 0; bit < nbits; bit++) * __assign_bit(start + bit, bitmap, val & BIT(bit)); * * For @nbits == 0 and @nbits > BITS_PER_LONG no writes are performed. */ static inline void bitmap_write(unsigned long *map, unsigned long value, unsigned long start, unsigned long nbits) { size_t index; unsigned long offset; unsigned long space; unsigned long mask; bool fit; if (unlikely(!nbits || nbits > BITS_PER_LONG)) return; mask = BITMAP_LAST_WORD_MASK(nbits); value &= mask; offset = start % BITS_PER_LONG; space = BITS_PER_LONG - offset; fit = space >= nbits; index = BIT_WORD(start); map[index] &= (fit ? (~(mask << offset)) : ~BITMAP_FIRST_WORD_MASK(start)); map[index] |= value << offset; if (fit) return; map[index + 1] &= BITMAP_FIRST_WORD_MASK(start + nbits); map[index + 1] |= (value >> space); } #define bitmap_get_value8(map, start) \ bitmap_read(map, start, BITS_PER_BYTE) #define bitmap_set_value8(map, value, start) \ bitmap_write(map, value, start, BITS_PER_BYTE) #endif /* __ASSEMBLY__ */ #endif /* __LINUX_BITMAP_H */ |
| 5388 2359 1 5391 1 1 5414 5410 5414 5306 5291 5289 5287 5200 5185 4148 4135 1485 1485 5390 5392 5414 5193 5414 4138 5414 5413 5413 5418 197 5305 98 5247 19 17 5236 11 9 5235 8 6 5226 9 9 5222 14 5210 5419 5414 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 | // SPDX-License-Identifier: GPL-2.0 /* * security/tomoyo/mount.c * * Copyright (C) 2005-2011 NTT DATA CORPORATION */ #include <linux/slab.h> #include <uapi/linux/mount.h> #include "common.h" /* String table for special mount operations. */ static const char * const tomoyo_mounts[TOMOYO_MAX_SPECIAL_MOUNT] = { [TOMOYO_MOUNT_BIND] = "--bind", [TOMOYO_MOUNT_MOVE] = "--move", [TOMOYO_MOUNT_REMOUNT] = "--remount", [TOMOYO_MOUNT_MAKE_UNBINDABLE] = "--make-unbindable", [TOMOYO_MOUNT_MAKE_PRIVATE] = "--make-private", [TOMOYO_MOUNT_MAKE_SLAVE] = "--make-slave", [TOMOYO_MOUNT_MAKE_SHARED] = "--make-shared", }; /** * tomoyo_audit_mount_log - Audit mount log. * * @r: Pointer to "struct tomoyo_request_info". * * Returns 0 on success, negative value otherwise. */ static int tomoyo_audit_mount_log(struct tomoyo_request_info *r) { return tomoyo_supervisor(r, "file mount %s %s %s 0x%lX\n", r->param.mount.dev->name, r->param.mount.dir->name, r->param.mount.type->name, r->param.mount.flags); } /** * tomoyo_check_mount_acl - Check permission for path path path number operation. * * @r: Pointer to "struct tomoyo_request_info". * @ptr: Pointer to "struct tomoyo_acl_info". * * Returns true if granted, false otherwise. */ static bool tomoyo_check_mount_acl(struct tomoyo_request_info *r, const struct tomoyo_acl_info *ptr) { const struct tomoyo_mount_acl *acl = container_of(ptr, typeof(*acl), head); return tomoyo_compare_number_union(r->param.mount.flags, &acl->flags) && tomoyo_compare_name_union(r->param.mount.type, &acl->fs_type) && tomoyo_compare_name_union(r->param.mount.dir, &acl->dir_name) && (!r->param.mount.need_dev || tomoyo_compare_name_union(r->param.mount.dev, &acl->dev_name)); } /** * tomoyo_mount_acl - Check permission for mount() operation. * * @r: Pointer to "struct tomoyo_request_info". * @dev_name: Name of device file. Maybe NULL. * @dir: Pointer to "struct path". * @type: Name of filesystem type. * @flags: Mount options. * * Returns 0 on success, negative value otherwise. * * Caller holds tomoyo_read_lock(). */ static int tomoyo_mount_acl(struct tomoyo_request_info *r, const char *dev_name, const struct path *dir, const char *type, unsigned long flags) { struct tomoyo_obj_info obj = { }; struct path path; struct file_system_type *fstype = NULL; const char *requested_type = NULL; const char *requested_dir_name = NULL; const char *requested_dev_name = NULL; struct tomoyo_path_info rtype; struct tomoyo_path_info rdev; struct tomoyo_path_info rdir; int need_dev = 0; int error = -ENOMEM; r->obj = &obj; /* Get fstype. */ requested_type = tomoyo_encode(type); if (!requested_type) goto out; rtype.name = requested_type; tomoyo_fill_path_info(&rtype); /* Get mount point. */ obj.path2 = *dir; requested_dir_name = tomoyo_realpath_from_path(dir); if (!requested_dir_name) { error = -ENOMEM; goto out; } rdir.name = requested_dir_name; tomoyo_fill_path_info(&rdir); /* Compare fs name. */ if (type == tomoyo_mounts[TOMOYO_MOUNT_REMOUNT]) { /* dev_name is ignored. */ } else if (type == tomoyo_mounts[TOMOYO_MOUNT_MAKE_UNBINDABLE] || type == tomoyo_mounts[TOMOYO_MOUNT_MAKE_PRIVATE] || type == tomoyo_mounts[TOMOYO_MOUNT_MAKE_SLAVE] || type == tomoyo_mounts[TOMOYO_MOUNT_MAKE_SHARED]) { /* dev_name is ignored. */ } else if (type == tomoyo_mounts[TOMOYO_MOUNT_BIND] || type == tomoyo_mounts[TOMOYO_MOUNT_MOVE]) { need_dev = -1; /* dev_name is a directory */ } else { fstype = get_fs_type(type); if (!fstype) { error = -ENODEV; goto out; } if (fstype->fs_flags & FS_REQUIRES_DEV) /* dev_name is a block device file. */ need_dev = 1; } if (need_dev) { /* Get mount point or device file. */ if (!dev_name || kern_path(dev_name, LOOKUP_FOLLOW, &path)) { error = -ENOENT; goto out; } obj.path1 = path; requested_dev_name = tomoyo_realpath_from_path(&path); if (!requested_dev_name) { error = -ENOENT; goto out; } } else { /* Map dev_name to "<NULL>" if no dev_name given. */ if (!dev_name) dev_name = "<NULL>"; requested_dev_name = tomoyo_encode(dev_name); if (!requested_dev_name) { error = -ENOMEM; goto out; } } rdev.name = requested_dev_name; tomoyo_fill_path_info(&rdev); r->param_type = TOMOYO_TYPE_MOUNT_ACL; r->param.mount.need_dev = need_dev; r->param.mount.dev = &rdev; r->param.mount.dir = &rdir; r->param.mount.type = &rtype; r->param.mount.flags = flags; do { tomoyo_check_acl(r, tomoyo_check_mount_acl); error = tomoyo_audit_mount_log(r); } while (error == TOMOYO_RETRY_REQUEST); out: kfree(requested_dev_name); kfree(requested_dir_name); if (fstype) put_filesystem(fstype); kfree(requested_type); /* Drop refcount obtained by kern_path(). */ if (obj.path1.dentry) path_put(&obj.path1); return error; } /** * tomoyo_mount_permission - Check permission for mount() operation. * * @dev_name: Name of device file. Maybe NULL. * @path: Pointer to "struct path". * @type: Name of filesystem type. Maybe NULL. * @flags: Mount options. * @data_page: Optional data. Maybe NULL. * * Returns 0 on success, negative value otherwise. */ int tomoyo_mount_permission(const char *dev_name, const struct path *path, const char *type, unsigned long flags, void *data_page) { struct tomoyo_request_info r; int error; int idx; if (tomoyo_init_request_info(&r, NULL, TOMOYO_MAC_FILE_MOUNT) == TOMOYO_CONFIG_DISABLED) return 0; if ((flags & MS_MGC_MSK) == MS_MGC_VAL) flags &= ~MS_MGC_MSK; if (flags & MS_REMOUNT) { type = tomoyo_mounts[TOMOYO_MOUNT_REMOUNT]; flags &= ~MS_REMOUNT; } else if (flags & MS_BIND) { type = tomoyo_mounts[TOMOYO_MOUNT_BIND]; flags &= ~MS_BIND; } else if (flags & MS_SHARED) { if (flags & (MS_PRIVATE | MS_SLAVE | MS_UNBINDABLE)) return -EINVAL; type = tomoyo_mounts[TOMOYO_MOUNT_MAKE_SHARED]; flags &= ~MS_SHARED; } else if (flags & MS_PRIVATE) { if (flags & (MS_SHARED | MS_SLAVE | MS_UNBINDABLE)) return -EINVAL; type = tomoyo_mounts[TOMOYO_MOUNT_MAKE_PRIVATE]; flags &= ~MS_PRIVATE; } else if (flags & MS_SLAVE) { if (flags & (MS_SHARED | MS_PRIVATE | MS_UNBINDABLE)) return -EINVAL; type = tomoyo_mounts[TOMOYO_MOUNT_MAKE_SLAVE]; flags &= ~MS_SLAVE; } else if (flags & MS_UNBINDABLE) { if (flags & (MS_SHARED | MS_PRIVATE | MS_SLAVE)) return -EINVAL; type = tomoyo_mounts[TOMOYO_MOUNT_MAKE_UNBINDABLE]; flags &= ~MS_UNBINDABLE; } else if (flags & MS_MOVE) { type = tomoyo_mounts[TOMOYO_MOUNT_MOVE]; flags &= ~MS_MOVE; } if (!type) type = "<NULL>"; idx = tomoyo_read_lock(); error = tomoyo_mount_acl(&r, dev_name, path, type, flags); tomoyo_read_unlock(idx); return error; } |
| 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 | /* * * Author Karsten Keil <kkeil@novell.com> * * Copyright 2008 by Karsten Keil <kkeil@novell.com> * * This code is free software; you can redistribute it and/or modify * it under the terms of the GNU LESSER GENERAL PUBLIC LICENSE * version 2.1 as published by the Free Software Foundation. * * This code is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU LESSER GENERAL PUBLIC LICENSE for more details. * */ #ifndef mISDNIF_H #define mISDNIF_H #include <linux/types.h> #include <linux/errno.h> #include <linux/socket.h> /* * ABI Version 32 bit * * <8 bit> Major version * - changed if any interface become backwards incompatible * * <8 bit> Minor version * - changed if any interface is extended but backwards compatible * * <16 bit> Release number * - should be incremented on every checkin */ #define MISDN_MAJOR_VERSION 1 #define MISDN_MINOR_VERSION 1 #define MISDN_RELEASE 29 /* primitives for information exchange * generell format * <16 bit 0 > * <8 bit command> * BIT 8 = 1 LAYER private * BIT 7 = 1 answer * BIT 6 = 1 DATA * <8 bit target layer mask> * * Layer = 00 is reserved for general commands Layer = 01 L2 -> HW Layer = 02 HW -> L2 Layer = 04 L3 -> L2 Layer = 08 L2 -> L3 * Layer = FF is reserved for broadcast commands */ #define MISDN_CMDMASK 0xff00 #define MISDN_LAYERMASK 0x00ff /* generell commands */ #define OPEN_CHANNEL 0x0100 #define CLOSE_CHANNEL 0x0200 #define CONTROL_CHANNEL 0x0300 #define CHECK_DATA 0x0400 /* layer 2 -> layer 1 */ #define PH_ACTIVATE_REQ 0x0101 #define PH_DEACTIVATE_REQ 0x0201 #define PH_DATA_REQ 0x2001 #define MPH_ACTIVATE_REQ 0x0501 #define MPH_DEACTIVATE_REQ 0x0601 #define MPH_INFORMATION_REQ 0x0701 #define PH_CONTROL_REQ 0x0801 /* layer 1 -> layer 2 */ #define PH_ACTIVATE_IND 0x0102 #define PH_ACTIVATE_CNF 0x4102 #define PH_DEACTIVATE_IND 0x0202 #define PH_DEACTIVATE_CNF 0x4202 #define PH_DATA_IND 0x2002 #define PH_DATA_E_IND 0x3002 #define MPH_ACTIVATE_IND 0x0502 #define MPH_DEACTIVATE_IND 0x0602 #define MPH_INFORMATION_IND 0x0702 #define PH_DATA_CNF 0x6002 #define PH_CONTROL_IND 0x0802 #define PH_CONTROL_CNF 0x4802 /* layer 3 -> layer 2 */ #define DL_ESTABLISH_REQ 0x1004 #define DL_RELEASE_REQ 0x1104 #define DL_DATA_REQ 0x3004 #define DL_UNITDATA_REQ 0x3104 #define DL_INFORMATION_REQ 0x0004 /* layer 2 -> layer 3 */ #define DL_ESTABLISH_IND 0x1008 #define DL_ESTABLISH_CNF 0x5008 #define DL_RELEASE_IND 0x1108 #define DL_RELEASE_CNF 0x5108 #define DL_DATA_IND 0x3008 #define DL_UNITDATA_IND 0x3108 #define DL_INFORMATION_IND 0x0008 /* intern layer 2 management */ #define MDL_ASSIGN_REQ 0x1804 #define MDL_ASSIGN_IND 0x1904 #define MDL_REMOVE_REQ 0x1A04 #define MDL_REMOVE_IND 0x1B04 #define MDL_STATUS_UP_IND 0x1C04 #define MDL_STATUS_DOWN_IND 0x1D04 #define MDL_STATUS_UI_IND 0x1E04 #define MDL_ERROR_IND 0x1F04 #define MDL_ERROR_RSP 0x5F04 /* intern layer 2 */ #define DL_TIMER200_IND 0x7004 #define DL_TIMER203_IND 0x7304 #define DL_INTERN_MSG 0x7804 /* DL_INFORMATION_IND types */ #define DL_INFO_L2_CONNECT 0x0001 #define DL_INFO_L2_REMOVED 0x0002 /* PH_CONTROL types */ /* TOUCH TONE IS 0x20XX XX "0"..."9", "A","B","C","D","*","#" */ #define DTMF_TONE_VAL 0x2000 #define DTMF_TONE_MASK 0x007F #define DTMF_TONE_START 0x2100 #define DTMF_TONE_STOP 0x2200 #define DTMF_HFC_COEF 0x4000 #define DSP_CONF_JOIN 0x2403 #define DSP_CONF_SPLIT 0x2404 #define DSP_RECEIVE_OFF 0x2405 #define DSP_RECEIVE_ON 0x2406 #define DSP_ECHO_ON 0x2407 #define DSP_ECHO_OFF 0x2408 #define DSP_MIX_ON 0x2409 #define DSP_MIX_OFF 0x240a #define DSP_DELAY 0x240b #define DSP_JITTER 0x240c #define DSP_TXDATA_ON 0x240d #define DSP_TXDATA_OFF 0x240e #define DSP_TX_DEJITTER 0x240f #define DSP_TX_DEJ_OFF 0x2410 #define DSP_TONE_PATT_ON 0x2411 #define DSP_TONE_PATT_OFF 0x2412 #define DSP_VOL_CHANGE_TX 0x2413 #define DSP_VOL_CHANGE_RX 0x2414 #define DSP_BF_ENABLE_KEY 0x2415 #define DSP_BF_DISABLE 0x2416 #define DSP_BF_ACCEPT 0x2416 #define DSP_BF_REJECT 0x2417 #define DSP_PIPELINE_CFG 0x2418 #define HFC_VOL_CHANGE_TX 0x2601 #define HFC_VOL_CHANGE_RX 0x2602 #define HFC_SPL_LOOP_ON 0x2603 #define HFC_SPL_LOOP_OFF 0x2604 /* for T30 FAX and analog modem */ #define HW_MOD_FRM 0x4000 #define HW_MOD_FRH 0x4001 #define HW_MOD_FTM 0x4002 #define HW_MOD_FTH 0x4003 #define HW_MOD_FTS 0x4004 #define HW_MOD_CONNECT 0x4010 #define HW_MOD_OK 0x4011 #define HW_MOD_NOCARR 0x4012 #define HW_MOD_FCERROR 0x4013 #define HW_MOD_READY 0x4014 #define HW_MOD_LASTDATA 0x4015 /* DSP_TONE_PATT_ON parameter */ #define TONE_OFF 0x0000 #define TONE_GERMAN_DIALTONE 0x0001 #define TONE_GERMAN_OLDDIALTONE 0x0002 #define TONE_AMERICAN_DIALTONE 0x0003 #define TONE_GERMAN_DIALPBX 0x0004 #define TONE_GERMAN_OLDDIALPBX 0x0005 #define TONE_AMERICAN_DIALPBX 0x0006 #define TONE_GERMAN_RINGING 0x0007 #define TONE_GERMAN_OLDRINGING 0x0008 #define TONE_AMERICAN_RINGPBX 0x000b #define TONE_GERMAN_RINGPBX 0x000c #define TONE_GERMAN_OLDRINGPBX 0x000d #define TONE_AMERICAN_RINGING 0x000e #define TONE_GERMAN_BUSY 0x000f #define TONE_GERMAN_OLDBUSY 0x0010 #define TONE_AMERICAN_BUSY 0x0011 #define TONE_GERMAN_HANGUP 0x0012 #define TONE_GERMAN_OLDHANGUP 0x0013 #define TONE_AMERICAN_HANGUP 0x0014 #define TONE_SPECIAL_INFO 0x0015 #define TONE_GERMAN_GASSENBESETZT 0x0016 #define TONE_GERMAN_AUFSCHALTTON 0x0016 /* MPH_INFORMATION_IND */ #define L1_SIGNAL_LOS_OFF 0x0010 #define L1_SIGNAL_LOS_ON 0x0011 #define L1_SIGNAL_AIS_OFF 0x0012 #define L1_SIGNAL_AIS_ON 0x0013 #define L1_SIGNAL_RDI_OFF 0x0014 #define L1_SIGNAL_RDI_ON 0x0015 #define L1_SIGNAL_SLIP_RX 0x0020 #define L1_SIGNAL_SLIP_TX 0x0021 /* * protocol ids * D channel 1-31 * B channel 33 - 63 */ #define ISDN_P_NONE 0 #define ISDN_P_BASE 0 #define ISDN_P_TE_S0 0x01 #define ISDN_P_NT_S0 0x02 #define ISDN_P_TE_E1 0x03 #define ISDN_P_NT_E1 0x04 #define ISDN_P_TE_UP0 0x05 #define ISDN_P_NT_UP0 0x06 #define IS_ISDN_P_TE(p) ((p == ISDN_P_TE_S0) || (p == ISDN_P_TE_E1) || \ (p == ISDN_P_TE_UP0) || (p == ISDN_P_LAPD_TE)) #define IS_ISDN_P_NT(p) ((p == ISDN_P_NT_S0) || (p == ISDN_P_NT_E1) || \ (p == ISDN_P_NT_UP0) || (p == ISDN_P_LAPD_NT)) #define IS_ISDN_P_S0(p) ((p == ISDN_P_TE_S0) || (p == ISDN_P_NT_S0)) #define IS_ISDN_P_E1(p) ((p == ISDN_P_TE_E1) || (p == ISDN_P_NT_E1)) #define IS_ISDN_P_UP0(p) ((p == ISDN_P_TE_UP0) || (p == ISDN_P_NT_UP0)) #define ISDN_P_LAPD_TE 0x10 #define ISDN_P_LAPD_NT 0x11 #define ISDN_P_B_MASK 0x1f #define ISDN_P_B_START 0x20 #define ISDN_P_B_RAW 0x21 #define ISDN_P_B_HDLC 0x22 #define ISDN_P_B_X75SLP 0x23 #define ISDN_P_B_L2DTMF 0x24 #define ISDN_P_B_L2DSP 0x25 #define ISDN_P_B_L2DSPHDLC 0x26 #define ISDN_P_B_T30_FAX 0x27 #define ISDN_P_B_MODEM_ASYNC 0x28 #define OPTION_L2_PMX 1 #define OPTION_L2_PTP 2 #define OPTION_L2_FIXEDTEI 3 #define OPTION_L2_CLEANUP 4 #define OPTION_L1_HOLD 5 /* should be in sync with linux/kobject.h:KOBJ_NAME_LEN */ #define MISDN_MAX_IDLEN 20 struct mISDNhead { unsigned int prim; unsigned int id; } __packed; #define MISDN_HEADER_LEN sizeof(struct mISDNhead) #define MAX_DATA_SIZE 2048 #define MAX_DATA_MEM (MAX_DATA_SIZE + MISDN_HEADER_LEN) #define MAX_DFRAME_LEN 260 #define MISDN_ID_ADDR_MASK 0xFFFF #define MISDN_ID_TEI_MASK 0xFF00 #define MISDN_ID_SAPI_MASK 0x00FF #define MISDN_ID_TEI_ANY 0x7F00 #define MISDN_ID_ANY 0xFFFF #define MISDN_ID_NONE 0xFFFE #define GROUP_TEI 127 #define TEI_SAPI 63 #define CTRL_SAPI 0 #define MISDN_MAX_CHANNEL 127 #define MISDN_CHMAP_SIZE ((MISDN_MAX_CHANNEL + 1) >> 3) #define SOL_MISDN 0 struct sockaddr_mISDN { sa_family_t family; unsigned char dev; unsigned char channel; unsigned char sapi; unsigned char tei; }; struct mISDNversion { unsigned char major; unsigned char minor; unsigned short release; }; struct mISDN_devinfo { u_int id; u_int Dprotocols; u_int Bprotocols; u_int protocol; u_char channelmap[MISDN_CHMAP_SIZE]; u_int nrbchan; char name[MISDN_MAX_IDLEN]; }; struct mISDN_devrename { u_int id; char name[MISDN_MAX_IDLEN]; /* new name */ }; /* MPH_INFORMATION_REQ payload */ struct ph_info_ch { __u32 protocol; __u64 Flags; }; struct ph_info_dch { struct ph_info_ch ch; __u16 state; __u16 num_bch; }; struct ph_info { struct ph_info_dch dch; struct ph_info_ch bch[]; }; /* timer device ioctl */ #define IMADDTIMER _IOR('I', 64, int) #define IMDELTIMER _IOR('I', 65, int) /* socket ioctls */ #define IMGETVERSION _IOR('I', 66, int) #define IMGETCOUNT _IOR('I', 67, int) #define IMGETDEVINFO _IOR('I', 68, int) #define IMCTRLREQ _IOR('I', 69, int) #define IMCLEAR_L2 _IOR('I', 70, int) #define IMSETDEVNAME _IOR('I', 71, struct mISDN_devrename) #define IMHOLD_L1 _IOR('I', 72, int) static inline int test_channelmap(u_int nr, u_char *map) { if (nr <= MISDN_MAX_CHANNEL) return map[nr >> 3] & (1 << (nr & 7)); else return 0; } static inline void set_channelmap(u_int nr, u_char *map) { map[nr >> 3] |= (1 << (nr & 7)); } static inline void clear_channelmap(u_int nr, u_char *map) { map[nr >> 3] &= ~(1 << (nr & 7)); } /* CONTROL_CHANNEL parameters */ #define MISDN_CTRL_GETOP 0x0000 #define MISDN_CTRL_LOOP 0x0001 #define MISDN_CTRL_CONNECT 0x0002 #define MISDN_CTRL_DISCONNECT 0x0004 #define MISDN_CTRL_RX_BUFFER 0x0008 #define MISDN_CTRL_PCMCONNECT 0x0010 #define MISDN_CTRL_PCMDISCONNECT 0x0020 #define MISDN_CTRL_SETPEER 0x0040 #define MISDN_CTRL_UNSETPEER 0x0080 #define MISDN_CTRL_RX_OFF 0x0100 #define MISDN_CTRL_FILL_EMPTY 0x0200 #define MISDN_CTRL_GETPEER 0x0400 #define MISDN_CTRL_L1_TIMER3 0x0800 #define MISDN_CTRL_HW_FEATURES_OP 0x2000 #define MISDN_CTRL_HW_FEATURES 0x2001 #define MISDN_CTRL_HFC_OP 0x4000 #define MISDN_CTRL_HFC_PCM_CONN 0x4001 #define MISDN_CTRL_HFC_PCM_DISC 0x4002 #define MISDN_CTRL_HFC_CONF_JOIN 0x4003 #define MISDN_CTRL_HFC_CONF_SPLIT 0x4004 #define MISDN_CTRL_HFC_RECEIVE_OFF 0x4005 #define MISDN_CTRL_HFC_RECEIVE_ON 0x4006 #define MISDN_CTRL_HFC_ECHOCAN_ON 0x4007 #define MISDN_CTRL_HFC_ECHOCAN_OFF 0x4008 #define MISDN_CTRL_HFC_WD_INIT 0x4009 #define MISDN_CTRL_HFC_WD_RESET 0x400A /* special RX buffer value for MISDN_CTRL_RX_BUFFER request.p1 is the minimum * buffer size request.p2 the maximum. Using MISDN_CTRL_RX_SIZE_IGNORE will * not change the value, but still read back the actual stetting. */ #define MISDN_CTRL_RX_SIZE_IGNORE -1 /* socket options */ #define MISDN_TIME_STAMP 0x0001 struct mISDN_ctrl_req { int op; int channel; int p1; int p2; }; /* muxer options */ #define MISDN_OPT_ALL 1 #define MISDN_OPT_TEIMGR 2 #ifdef __KERNEL__ #include <linux/list.h> #include <linux/skbuff.h> #include <linux/net.h> #include <net/sock.h> #include <linux/completion.h> #define DEBUG_CORE 0x000000ff #define DEBUG_CORE_FUNC 0x00000002 #define DEBUG_SOCKET 0x00000004 #define DEBUG_MANAGER 0x00000008 #define DEBUG_SEND_ERR 0x00000010 #define DEBUG_MSG_THREAD 0x00000020 #define DEBUG_QUEUE_FUNC 0x00000040 #define DEBUG_L1 0x0000ff00 #define DEBUG_L1_FSM 0x00000200 #define DEBUG_L2 0x00ff0000 #define DEBUG_L2_FSM 0x00020000 #define DEBUG_L2_CTRL 0x00040000 #define DEBUG_L2_RECV 0x00080000 #define DEBUG_L2_TEI 0x00100000 #define DEBUG_L2_TEIFSM 0x00200000 #define DEBUG_TIMER 0x01000000 #define DEBUG_CLOCK 0x02000000 #define mISDN_HEAD_P(s) ((struct mISDNhead *)&s->cb[0]) #define mISDN_HEAD_PRIM(s) (((struct mISDNhead *)&s->cb[0])->prim) #define mISDN_HEAD_ID(s) (((struct mISDNhead *)&s->cb[0])->id) /* socket states */ #define MISDN_OPEN 1 #define MISDN_BOUND 2 #define MISDN_CLOSED 3 struct mISDNchannel; struct mISDNdevice; struct mISDNstack; struct mISDNclock; struct channel_req { u_int protocol; struct sockaddr_mISDN adr; struct mISDNchannel *ch; }; typedef int (ctrl_func_t)(struct mISDNchannel *, u_int, void *); typedef int (send_func_t)(struct mISDNchannel *, struct sk_buff *); typedef int (create_func_t)(struct channel_req *); struct Bprotocol { struct list_head list; char *name; u_int Bprotocols; create_func_t *create; }; struct mISDNchannel { struct list_head list; u_int protocol; u_int nr; u_long opt; u_int addr; struct mISDNstack *st; struct mISDNchannel *peer; send_func_t *send; send_func_t *recv; ctrl_func_t *ctrl; }; struct mISDN_sock_list { struct hlist_head head; rwlock_t lock; }; struct mISDN_sock { struct sock sk; struct mISDNchannel ch; u_int cmask; struct mISDNdevice *dev; }; struct mISDNdevice { struct mISDNchannel D; u_int id; u_int Dprotocols; u_int Bprotocols; u_int nrbchan; u_char channelmap[MISDN_CHMAP_SIZE]; struct list_head bchannels; struct mISDNchannel *teimgr; struct device dev; }; struct mISDNstack { u_long status; struct mISDNdevice *dev; struct task_struct *thread; struct completion *notify; wait_queue_head_t workq; struct sk_buff_head msgq; struct list_head layer2; struct mISDNchannel *layer1; struct mISDNchannel own; struct mutex lmutex; /* protect lists */ struct mISDN_sock_list l1sock; #ifdef MISDN_MSG_STATS u_int msg_cnt; u_int sleep_cnt; u_int stopped_cnt; #endif }; typedef int (clockctl_func_t)(void *, int); struct mISDNclock { struct list_head list; char name[64]; int pri; clockctl_func_t *ctl; void *priv; }; /* global alloc/queue functions */ static inline struct sk_buff * mI_alloc_skb(unsigned int len, gfp_t gfp_mask) { struct sk_buff *skb; skb = alloc_skb(len + MISDN_HEADER_LEN, gfp_mask); if (likely(skb)) skb_reserve(skb, MISDN_HEADER_LEN); return skb; } static inline struct sk_buff * _alloc_mISDN_skb(u_int prim, u_int id, u_int len, void *dp, gfp_t gfp_mask) { struct sk_buff *skb = mI_alloc_skb(len, gfp_mask); struct mISDNhead *hh; if (!skb) return NULL; if (len) skb_put_data(skb, dp, len); hh = mISDN_HEAD_P(skb); hh->prim = prim; hh->id = id; return skb; } static inline void _queue_data(struct mISDNchannel *ch, u_int prim, u_int id, u_int len, void *dp, gfp_t gfp_mask) { struct sk_buff *skb; if (!ch->peer) return; skb = _alloc_mISDN_skb(prim, id, len, dp, gfp_mask); if (!skb) return; if (ch->recv(ch->peer, skb)) dev_kfree_skb(skb); } /* global register/unregister functions */ extern int mISDN_register_device(struct mISDNdevice *, struct device *parent, char *name); extern void mISDN_unregister_device(struct mISDNdevice *); extern int mISDN_register_Bprotocol(struct Bprotocol *); extern void mISDN_unregister_Bprotocol(struct Bprotocol *); extern struct mISDNclock *mISDN_register_clock(char *, int, clockctl_func_t *, void *); extern void mISDN_unregister_clock(struct mISDNclock *); static inline struct mISDNdevice *dev_to_mISDN(const struct device *dev) { if (dev) return dev_get_drvdata(dev); else return NULL; } extern void set_channel_address(struct mISDNchannel *, u_int, u_int); extern void mISDN_clock_update(struct mISDNclock *, int, ktime_t *); extern unsigned short mISDN_clock_get(void); extern const char *mISDNDevName4ch(struct mISDNchannel *); #endif /* __KERNEL__ */ #endif /* mISDNIF_H */ |
| 3 3 3 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 | // SPDX-License-Identifier: GPL-2.0 /* * LED Triggers for USB Activity * * Copyright 2014 Michal Sojka <sojka@merica.cz> */ #include <linux/module.h> #include <linux/kernel.h> #include <linux/init.h> #include <linux/leds.h> #include <linux/usb.h> #include "common.h" #define BLINK_DELAY 30 DEFINE_LED_TRIGGER(ledtrig_usb_gadget); DEFINE_LED_TRIGGER(ledtrig_usb_host); void usb_led_activity(enum usb_led_event ev) { struct led_trigger *trig = NULL; switch (ev) { case USB_LED_EVENT_GADGET: trig = ledtrig_usb_gadget; break; case USB_LED_EVENT_HOST: trig = ledtrig_usb_host; break; } /* led_trigger_blink_oneshot() handles trig == NULL gracefully */ led_trigger_blink_oneshot(trig, BLINK_DELAY, BLINK_DELAY, 0); } EXPORT_SYMBOL_GPL(usb_led_activity); void __init ledtrig_usb_init(void) { led_trigger_register_simple("usb-gadget", &ledtrig_usb_gadget); led_trigger_register_simple("usb-host", &ledtrig_usb_host); } void __exit ledtrig_usb_exit(void) { led_trigger_unregister_simple(ledtrig_usb_gadget); led_trigger_unregister_simple(ledtrig_usb_host); } |
| 46 45 17 17 26 26 26 7 7 23 23 23 41 2 40 11 9 11 4 4 4 3 1 4 2 2 4 2 4 1 1 1 1 1 4 4 4 4 4 17 41 40 37 36 36 33 28 32 2 1 28 27 1 26 1 26 26 26 26 26 26 26 28 32 9 11 8 7 5 1 6 5 6 9 11 2 3 2 2 3 9 8 6 2 1 6 8 9 5 5 4 3 4 5 16 17 6 5 4 2 3 2 2 3 18 4 3 1 5 4 3 2 2 10 9 7 6 4 1 1 3 2 6 10 6 5 6 1 1 3 1 3 2 2 2 1 2 2 5 37 37 37 22 17 21 22 21 10 3 3 3 3 17 4 1 1 3 3 3 3 3 17 17 4 4 4 4 4 41 1 1 41 37 37 36 36 13 23 22 26 1 12 22 22 13 21 4 4 30 22 48 44 35 44 26 26 26 25 24 24 18 3 18 41 47 9 9 9 9 9 8 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080 1081 1082 1083 1084 1085 1086 1087 1088 1089 1090 1091 1092 1093 1094 1095 1096 1097 1098 1099 1100 1101 1102 1103 1104 1105 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125 1126 1127 1128 1129 1130 1131 1132 1133 1134 1135 1136 1137 1138 1139 1140 1141 1142 1143 1144 1145 1146 1147 1148 1149 1150 1151 1152 1153 1154 1155 1156 1157 1158 1159 1160 1161 1162 1163 1164 1165 1166 1167 1168 1169 1170 1171 1172 1173 1174 1175 1176 1177 1178 1179 1180 1181 1182 1183 1184 1185 1186 1187 1188 1189 1190 1191 1192 1193 1194 1195 1196 1197 1198 1199 1200 1201 1202 1203 1204 1205 1206 1207 1208 1209 1210 1211 1212 1213 1214 1215 1216 1217 1218 1219 1220 1221 1222 1223 1224 1225 1226 1227 1228 1229 1230 1231 1232 1233 1234 1235 1236 1237 1238 1239 1240 1241 1242 1243 1244 1245 1246 1247 1248 1249 1250 1251 1252 1253 1254 1255 1256 1257 1258 1259 1260 1261 1262 1263 1264 1265 1266 1267 1268 1269 1270 1271 1272 1273 1274 1275 1276 1277 1278 1279 1280 1281 1282 1283 1284 1285 1286 1287 1288 1289 1290 1291 1292 1293 1294 1295 1296 1297 1298 1299 1300 1301 1302 1303 1304 1305 1306 1307 1308 1309 1310 1311 1312 1313 1314 1315 1316 1317 1318 1319 1320 1321 1322 1323 1324 1325 1326 1327 1328 1329 1330 1331 1332 1333 1334 1335 1336 1337 1338 1339 | // SPDX-License-Identifier: GPL-2.0 // Copyright (c) 2010-2011 EIA Electronics, // Pieter Beyens <pieter.beyens@eia.be> // Copyright (c) 2010-2011 EIA Electronics, // Kurt Van Dijck <kurt.van.dijck@eia.be> // Copyright (c) 2018 Protonic, // Robin van der Gracht <robin@protonic.nl> // Copyright (c) 2017-2019 Pengutronix, // Marc Kleine-Budde <kernel@pengutronix.de> // Copyright (c) 2017-2019 Pengutronix, // Oleksij Rempel <kernel@pengutronix.de> #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt #include <linux/can/can-ml.h> #include <linux/can/core.h> #include <linux/can/skb.h> #include <linux/errqueue.h> #include <linux/if_arp.h> #include "j1939-priv.h" #define J1939_MIN_NAMELEN CAN_REQUIRED_SIZE(struct sockaddr_can, can_addr.j1939) /* conversion function between struct sock::sk_priority from linux and * j1939 priority field */ static inline priority_t j1939_prio(u32 sk_priority) { sk_priority = min(sk_priority, 7U); return 7 - sk_priority; } static inline u32 j1939_to_sk_priority(priority_t prio) { return 7 - prio; } /* function to see if pgn is to be evaluated */ static inline bool j1939_pgn_is_valid(pgn_t pgn) { return pgn <= J1939_PGN_MAX; } /* test function to avoid non-zero DA placeholder for pdu1 pgn's */ static inline bool j1939_pgn_is_clean_pdu(pgn_t pgn) { if (j1939_pgn_is_pdu1(pgn)) return !(pgn & 0xff); else return true; } static inline void j1939_sock_pending_add(struct sock *sk) { struct j1939_sock *jsk = j1939_sk(sk); atomic_inc(&jsk->skb_pending); } static int j1939_sock_pending_get(struct sock *sk) { struct j1939_sock *jsk = j1939_sk(sk); return atomic_read(&jsk->skb_pending); } void j1939_sock_pending_del(struct sock *sk) { struct j1939_sock *jsk = j1939_sk(sk); /* atomic_dec_return returns the new value */ if (!atomic_dec_return(&jsk->skb_pending)) wake_up(&jsk->waitq); /* no pending SKB's */ } static void j1939_jsk_add(struct j1939_priv *priv, struct j1939_sock *jsk) { jsk->state |= J1939_SOCK_BOUND; j1939_priv_get(priv); write_lock_bh(&priv->j1939_socks_lock); list_add_tail(&jsk->list, &priv->j1939_socks); write_unlock_bh(&priv->j1939_socks_lock); } static void j1939_jsk_del(struct j1939_priv *priv, struct j1939_sock *jsk) { write_lock_bh(&priv->j1939_socks_lock); list_del_init(&jsk->list); write_unlock_bh(&priv->j1939_socks_lock); j1939_priv_put(priv); jsk->state &= ~J1939_SOCK_BOUND; } static bool j1939_sk_queue_session(struct j1939_session *session) { struct j1939_sock *jsk = j1939_sk(session->sk); bool empty; spin_lock_bh(&jsk->sk_session_queue_lock); empty = list_empty(&jsk->sk_session_queue); j1939_session_get(session); list_add_tail(&session->sk_session_queue_entry, &jsk->sk_session_queue); spin_unlock_bh(&jsk->sk_session_queue_lock); j1939_sock_pending_add(&jsk->sk); return empty; } static struct j1939_session *j1939_sk_get_incomplete_session(struct j1939_sock *jsk) { struct j1939_session *session = NULL; spin_lock_bh(&jsk->sk_session_queue_lock); if (!list_empty(&jsk->sk_session_queue)) { session = list_last_entry(&jsk->sk_session_queue, struct j1939_session, sk_session_queue_entry); if (session->total_queued_size == session->total_message_size) session = NULL; else j1939_session_get(session); } spin_unlock_bh(&jsk->sk_session_queue_lock); return session; } static void j1939_sk_queue_drop_all(struct j1939_priv *priv, struct j1939_sock *jsk, int err) { struct j1939_session *session, *tmp; netdev_dbg(priv->ndev, "%s: err: %i\n", __func__, err); spin_lock_bh(&jsk->sk_session_queue_lock); list_for_each_entry_safe(session, tmp, &jsk->sk_session_queue, sk_session_queue_entry) { list_del_init(&session->sk_session_queue_entry); session->err = err; j1939_session_put(session); } spin_unlock_bh(&jsk->sk_session_queue_lock); } static void j1939_sk_queue_activate_next_locked(struct j1939_session *session) { struct j1939_sock *jsk; struct j1939_session *first; int err; /* RX-Session don't have a socket (yet) */ if (!session->sk) return; jsk = j1939_sk(session->sk); lockdep_assert_held(&jsk->sk_session_queue_lock); err = session->err; first = list_first_entry_or_null(&jsk->sk_session_queue, struct j1939_session, sk_session_queue_entry); /* Some else has already activated the next session */ if (first != session) return; activate_next: list_del_init(&first->sk_session_queue_entry); j1939_session_put(first); first = list_first_entry_or_null(&jsk->sk_session_queue, struct j1939_session, sk_session_queue_entry); if (!first) return; if (j1939_session_activate(first)) { netdev_warn_once(first->priv->ndev, "%s: 0x%p: Identical session is already activated.\n", __func__, first); first->err = -EBUSY; goto activate_next; } else { /* Give receiver some time (arbitrary chosen) to recover */ int time_ms = 0; if (err) time_ms = 10 + get_random_u32_below(16); j1939_tp_schedule_txtimer(first, time_ms); } } void j1939_sk_queue_activate_next(struct j1939_session *session) { struct j1939_sock *jsk; if (!session->sk) return; jsk = j1939_sk(session->sk); spin_lock_bh(&jsk->sk_session_queue_lock); j1939_sk_queue_activate_next_locked(session); spin_unlock_bh(&jsk->sk_session_queue_lock); } static bool j1939_sk_match_dst(struct j1939_sock *jsk, const struct j1939_sk_buff_cb *skcb) { if ((jsk->state & J1939_SOCK_PROMISC)) return true; /* Destination address filter */ if (jsk->addr.src_name && skcb->addr.dst_name) { if (jsk->addr.src_name != skcb->addr.dst_name) return false; } else { /* receive (all sockets) if * - all packages that match our bind() address * - all broadcast on a socket if SO_BROADCAST * is set */ if (j1939_address_is_unicast(skcb->addr.da)) { if (jsk->addr.sa != skcb->addr.da) return false; } else if (!sock_flag(&jsk->sk, SOCK_BROADCAST)) { /* receiving broadcast without SO_BROADCAST * flag is not allowed */ return false; } } /* Source address filter */ if (jsk->state & J1939_SOCK_CONNECTED) { /* receive (all sockets) if * - all packages that match our connect() name or address */ if (jsk->addr.dst_name && skcb->addr.src_name) { if (jsk->addr.dst_name != skcb->addr.src_name) return false; } else { if (jsk->addr.da != skcb->addr.sa) return false; } } /* PGN filter */ if (j1939_pgn_is_valid(jsk->pgn_rx_filter) && jsk->pgn_rx_filter != skcb->addr.pgn) return false; return true; } /* matches skb control buffer (addr) with a j1939 filter */ static bool j1939_sk_match_filter(struct j1939_sock *jsk, const struct j1939_sk_buff_cb *skcb) { const struct j1939_filter *f; int nfilter; spin_lock_bh(&jsk->filters_lock); f = jsk->filters; nfilter = jsk->nfilters; if (!nfilter) /* receive all when no filters are assigned */ goto filter_match_found; for (; nfilter; ++f, --nfilter) { if ((skcb->addr.pgn & f->pgn_mask) != f->pgn) continue; if ((skcb->addr.sa & f->addr_mask) != f->addr) continue; if ((skcb->addr.src_name & f->name_mask) != f->name) continue; goto filter_match_found; } spin_unlock_bh(&jsk->filters_lock); return false; filter_match_found: spin_unlock_bh(&jsk->filters_lock); return true; } static bool j1939_sk_recv_match_one(struct j1939_sock *jsk, const struct j1939_sk_buff_cb *skcb) { if (!(jsk->state & J1939_SOCK_BOUND)) return false; if (!j1939_sk_match_dst(jsk, skcb)) return false; if (!j1939_sk_match_filter(jsk, skcb)) return false; return true; } static void j1939_sk_recv_one(struct j1939_sock *jsk, struct sk_buff *oskb) { const struct j1939_sk_buff_cb *oskcb = j1939_skb_to_cb(oskb); struct j1939_sk_buff_cb *skcb; struct sk_buff *skb; if (oskb->sk == &jsk->sk) return; if (!j1939_sk_recv_match_one(jsk, oskcb)) return; skb = skb_clone(oskb, GFP_ATOMIC); if (!skb) { pr_warn("skb clone failed\n"); return; } can_skb_set_owner(skb, oskb->sk); skcb = j1939_skb_to_cb(skb); skcb->msg_flags &= ~(MSG_DONTROUTE); if (skb->sk) skcb->msg_flags |= MSG_DONTROUTE; if (sock_queue_rcv_skb(&jsk->sk, skb) < 0) kfree_skb(skb); } bool j1939_sk_recv_match(struct j1939_priv *priv, struct j1939_sk_buff_cb *skcb) { struct j1939_sock *jsk; bool match = false; read_lock_bh(&priv->j1939_socks_lock); list_for_each_entry(jsk, &priv->j1939_socks, list) { match = j1939_sk_recv_match_one(jsk, skcb); if (match) break; } read_unlock_bh(&priv->j1939_socks_lock); return match; } void j1939_sk_recv(struct j1939_priv *priv, struct sk_buff *skb) { struct j1939_sock *jsk; read_lock_bh(&priv->j1939_socks_lock); list_for_each_entry(jsk, &priv->j1939_socks, list) { j1939_sk_recv_one(jsk, skb); } read_unlock_bh(&priv->j1939_socks_lock); } static void j1939_sk_sock_destruct(struct sock *sk) { struct j1939_sock *jsk = j1939_sk(sk); /* This function will be called by the generic networking code, when * the socket is ultimately closed (sk->sk_destruct). * * The race between * - processing a received CAN frame * (can_receive -> j1939_can_recv) * and accessing j1939_priv * ... and ... * - closing a socket * (j1939_can_rx_unregister -> can_rx_unregister) * and calling the final j1939_priv_put() * * is avoided by calling the final j1939_priv_put() from this * RCU deferred cleanup call. */ if (jsk->priv) { j1939_priv_put(jsk->priv); jsk->priv = NULL; } /* call generic CAN sock destruct */ can_sock_destruct(sk); } static int j1939_sk_init(struct sock *sk) { struct j1939_sock *jsk = j1939_sk(sk); /* Ensure that "sk" is first member in "struct j1939_sock", so that we * can skip it during memset(). */ BUILD_BUG_ON(offsetof(struct j1939_sock, sk) != 0); memset((void *)jsk + sizeof(jsk->sk), 0x0, sizeof(*jsk) - sizeof(jsk->sk)); INIT_LIST_HEAD(&jsk->list); init_waitqueue_head(&jsk->waitq); jsk->sk.sk_priority = j1939_to_sk_priority(6); jsk->sk.sk_reuse = 1; /* per default */ jsk->addr.sa = J1939_NO_ADDR; jsk->addr.da = J1939_NO_ADDR; jsk->addr.pgn = J1939_NO_PGN; jsk->pgn_rx_filter = J1939_NO_PGN; atomic_set(&jsk->skb_pending, 0); spin_lock_init(&jsk->sk_session_queue_lock); INIT_LIST_HEAD(&jsk->sk_session_queue); spin_lock_init(&jsk->filters_lock); /* j1939_sk_sock_destruct() depends on SOCK_RCU_FREE flag */ sock_set_flag(sk, SOCK_RCU_FREE); sk->sk_destruct = j1939_sk_sock_destruct; sk->sk_protocol = CAN_J1939; return 0; } static int j1939_sk_sanity_check(struct sockaddr_can *addr, int len) { if (!addr) return -EDESTADDRREQ; if (len < J1939_MIN_NAMELEN) return -EINVAL; if (addr->can_family != AF_CAN) return -EINVAL; if (!addr->can_ifindex) return -ENODEV; if (j1939_pgn_is_valid(addr->can_addr.j1939.pgn) && !j1939_pgn_is_clean_pdu(addr->can_addr.j1939.pgn)) return -EINVAL; return 0; } static int j1939_sk_bind(struct socket *sock, struct sockaddr *uaddr, int len) { struct sockaddr_can *addr = (struct sockaddr_can *)uaddr; struct j1939_sock *jsk = j1939_sk(sock->sk); struct j1939_priv *priv; struct sock *sk; struct net *net; int ret = 0; ret = j1939_sk_sanity_check(addr, len); if (ret) return ret; lock_sock(sock->sk); priv = jsk->priv; sk = sock->sk; net = sock_net(sk); /* Already bound to an interface? */ if (jsk->state & J1939_SOCK_BOUND) { /* A re-bind() to a different interface is not * supported. */ if (jsk->ifindex != addr->can_ifindex) { ret = -EINVAL; goto out_release_sock; } /* drop old references */ j1939_jsk_del(priv, jsk); j1939_local_ecu_put(priv, jsk->addr.src_name, jsk->addr.sa); } else { struct can_ml_priv *can_ml; struct net_device *ndev; ndev = dev_get_by_index(net, addr->can_ifindex); if (!ndev) { ret = -ENODEV; goto out_release_sock; } can_ml = can_get_ml_priv(ndev); if (!can_ml) { dev_put(ndev); ret = -ENODEV; goto out_release_sock; } if (!(ndev->flags & IFF_UP)) { dev_put(ndev); ret = -ENETDOWN; goto out_release_sock; } priv = j1939_netdev_start(ndev); dev_put(ndev); if (IS_ERR(priv)) { ret = PTR_ERR(priv); goto out_release_sock; } jsk->ifindex = addr->can_ifindex; /* the corresponding j1939_priv_put() is called via * sk->sk_destruct, which points to j1939_sk_sock_destruct() */ j1939_priv_get(priv); jsk->priv = priv; } /* set default transmit pgn */ if (j1939_pgn_is_valid(addr->can_addr.j1939.pgn)) jsk->pgn_rx_filter = addr->can_addr.j1939.pgn; jsk->addr.src_name = addr->can_addr.j1939.name; jsk->addr.sa = addr->can_addr.j1939.addr; /* get new references */ ret = j1939_local_ecu_get(priv, jsk->addr.src_name, jsk->addr.sa); if (ret) { j1939_netdev_stop(priv); goto out_release_sock; } j1939_jsk_add(priv, jsk); out_release_sock: /* fall through */ release_sock(sock->sk); return ret; } static int j1939_sk_connect(struct socket *sock, struct sockaddr *uaddr, int len, int flags) { struct sockaddr_can *addr = (struct sockaddr_can *)uaddr; struct j1939_sock *jsk = j1939_sk(sock->sk); int ret = 0; ret = j1939_sk_sanity_check(addr, len); if (ret) return ret; lock_sock(sock->sk); /* bind() before connect() is mandatory */ if (!(jsk->state & J1939_SOCK_BOUND)) { ret = -EINVAL; goto out_release_sock; } /* A connect() to a different interface is not supported. */ if (jsk->ifindex != addr->can_ifindex) { ret = -EINVAL; goto out_release_sock; } if (!addr->can_addr.j1939.name && addr->can_addr.j1939.addr == J1939_NO_ADDR && !sock_flag(&jsk->sk, SOCK_BROADCAST)) { /* broadcast, but SO_BROADCAST not set */ ret = -EACCES; goto out_release_sock; } jsk->addr.dst_name = addr->can_addr.j1939.name; jsk->addr.da = addr->can_addr.j1939.addr; if (j1939_pgn_is_valid(addr->can_addr.j1939.pgn)) jsk->addr.pgn = addr->can_addr.j1939.pgn; jsk->state |= J1939_SOCK_CONNECTED; out_release_sock: /* fall through */ release_sock(sock->sk); return ret; } static void j1939_sk_sock2sockaddr_can(struct sockaddr_can *addr, const struct j1939_sock *jsk, int peer) { /* There are two holes (2 bytes and 3 bytes) to clear to avoid * leaking kernel information to user space. */ memset(addr, 0, J1939_MIN_NAMELEN); addr->can_family = AF_CAN; addr->can_ifindex = jsk->ifindex; addr->can_addr.j1939.pgn = jsk->addr.pgn; if (peer) { addr->can_addr.j1939.name = jsk->addr.dst_name; addr->can_addr.j1939.addr = jsk->addr.da; } else { addr->can_addr.j1939.name = jsk->addr.src_name; addr->can_addr.j1939.addr = jsk->addr.sa; } } static int j1939_sk_getname(struct socket *sock, struct sockaddr *uaddr, int peer) { struct sockaddr_can *addr = (struct sockaddr_can *)uaddr; struct sock *sk = sock->sk; struct j1939_sock *jsk = j1939_sk(sk); int ret = 0; lock_sock(sk); if (peer && !(jsk->state & J1939_SOCK_CONNECTED)) { ret = -EADDRNOTAVAIL; goto failure; } j1939_sk_sock2sockaddr_can(addr, jsk, peer); ret = J1939_MIN_NAMELEN; failure: release_sock(sk); return ret; } static int j1939_sk_release(struct socket *sock) { struct sock *sk = sock->sk; struct j1939_sock *jsk; if (!sk) return 0; lock_sock(sk); jsk = j1939_sk(sk); if (jsk->state & J1939_SOCK_BOUND) { struct j1939_priv *priv = jsk->priv; if (wait_event_interruptible(jsk->waitq, !j1939_sock_pending_get(&jsk->sk))) { j1939_cancel_active_session(priv, sk); j1939_sk_queue_drop_all(priv, jsk, ESHUTDOWN); } j1939_jsk_del(priv, jsk); j1939_local_ecu_put(priv, jsk->addr.src_name, jsk->addr.sa); j1939_netdev_stop(priv); } kfree(jsk->filters); sock_orphan(sk); sock->sk = NULL; release_sock(sk); sock_put(sk); return 0; } static int j1939_sk_setsockopt_flag(struct j1939_sock *jsk, sockptr_t optval, unsigned int optlen, int flag) { int tmp; if (optlen != sizeof(tmp)) return -EINVAL; if (copy_from_sockptr(&tmp, optval, optlen)) return -EFAULT; lock_sock(&jsk->sk); if (tmp) jsk->state |= flag; else jsk->state &= ~flag; release_sock(&jsk->sk); return tmp; } static int j1939_sk_setsockopt(struct socket *sock, int level, int optname, sockptr_t optval, unsigned int optlen) { struct sock *sk = sock->sk; struct j1939_sock *jsk = j1939_sk(sk); int tmp, count = 0, ret = 0; struct j1939_filter *filters = NULL, *ofilters; if (level != SOL_CAN_J1939) return -EINVAL; switch (optname) { case SO_J1939_FILTER: if (!sockptr_is_null(optval) && optlen != 0) { struct j1939_filter *f; int c; if (optlen % sizeof(*filters) != 0) return -EINVAL; if (optlen > J1939_FILTER_MAX * sizeof(struct j1939_filter)) return -EINVAL; count = optlen / sizeof(*filters); filters = memdup_sockptr(optval, optlen); if (IS_ERR(filters)) return PTR_ERR(filters); for (f = filters, c = count; c; f++, c--) { f->name &= f->name_mask; f->pgn &= f->pgn_mask; f->addr &= f->addr_mask; } } lock_sock(&jsk->sk); spin_lock_bh(&jsk->filters_lock); ofilters = jsk->filters; jsk->filters = filters; jsk->nfilters = count; spin_unlock_bh(&jsk->filters_lock); release_sock(&jsk->sk); kfree(ofilters); return 0; case SO_J1939_PROMISC: return j1939_sk_setsockopt_flag(jsk, optval, optlen, J1939_SOCK_PROMISC); case SO_J1939_ERRQUEUE: ret = j1939_sk_setsockopt_flag(jsk, optval, optlen, J1939_SOCK_ERRQUEUE); if (ret < 0) return ret; if (!(jsk->state & J1939_SOCK_ERRQUEUE)) skb_queue_purge(&sk->sk_error_queue); return ret; case SO_J1939_SEND_PRIO: if (optlen != sizeof(tmp)) return -EINVAL; if (copy_from_sockptr(&tmp, optval, optlen)) return -EFAULT; if (tmp < 0 || tmp > 7) return -EDOM; if (tmp < 2 && !capable(CAP_NET_ADMIN)) return -EPERM; lock_sock(&jsk->sk); jsk->sk.sk_priority = j1939_to_sk_priority(tmp); release_sock(&jsk->sk); return 0; default: return -ENOPROTOOPT; } } static int j1939_sk_getsockopt(struct socket *sock, int level, int optname, char __user *optval, int __user *optlen) { struct sock *sk = sock->sk; struct j1939_sock *jsk = j1939_sk(sk); int ret, ulen; /* set defaults for using 'int' properties */ int tmp = 0; int len = sizeof(tmp); void *val = &tmp; if (level != SOL_CAN_J1939) return -EINVAL; if (get_user(ulen, optlen)) return -EFAULT; if (ulen < 0) return -EINVAL; lock_sock(&jsk->sk); switch (optname) { case SO_J1939_PROMISC: tmp = (jsk->state & J1939_SOCK_PROMISC) ? 1 : 0; break; case SO_J1939_ERRQUEUE: tmp = (jsk->state & J1939_SOCK_ERRQUEUE) ? 1 : 0; break; case SO_J1939_SEND_PRIO: tmp = j1939_prio(jsk->sk.sk_priority); break; default: ret = -ENOPROTOOPT; goto no_copy; } /* copy to user, based on 'len' & 'val' * but most sockopt's are 'int' properties, and have 'len' & 'val' * left unchanged, but instead modified 'tmp' */ if (len > ulen) ret = -EFAULT; else if (put_user(len, optlen)) ret = -EFAULT; else if (copy_to_user(optval, val, len)) ret = -EFAULT; else ret = 0; no_copy: release_sock(&jsk->sk); return ret; } static int j1939_sk_recvmsg(struct socket *sock, struct msghdr *msg, size_t size, int flags) { struct sock *sk = sock->sk; struct sk_buff *skb; struct j1939_sk_buff_cb *skcb; int ret = 0; if (flags & ~(MSG_DONTWAIT | MSG_ERRQUEUE | MSG_CMSG_COMPAT)) return -EINVAL; if (flags & MSG_ERRQUEUE) return sock_recv_errqueue(sock->sk, msg, size, SOL_CAN_J1939, SCM_J1939_ERRQUEUE); skb = skb_recv_datagram(sk, flags, &ret); if (!skb) return ret; if (size < skb->len) msg->msg_flags |= MSG_TRUNC; else size = skb->len; ret = memcpy_to_msg(msg, skb->data, size); if (ret < 0) { skb_free_datagram(sk, skb); return ret; } skcb = j1939_skb_to_cb(skb); if (j1939_address_is_valid(skcb->addr.da)) put_cmsg(msg, SOL_CAN_J1939, SCM_J1939_DEST_ADDR, sizeof(skcb->addr.da), &skcb->addr.da); if (skcb->addr.dst_name) put_cmsg(msg, SOL_CAN_J1939, SCM_J1939_DEST_NAME, sizeof(skcb->addr.dst_name), &skcb->addr.dst_name); put_cmsg(msg, SOL_CAN_J1939, SCM_J1939_PRIO, sizeof(skcb->priority), &skcb->priority); if (msg->msg_name) { struct sockaddr_can *paddr = msg->msg_name; msg->msg_namelen = J1939_MIN_NAMELEN; memset(msg->msg_name, 0, msg->msg_namelen); paddr->can_family = AF_CAN; paddr->can_ifindex = skb->skb_iif; paddr->can_addr.j1939.name = skcb->addr.src_name; paddr->can_addr.j1939.addr = skcb->addr.sa; paddr->can_addr.j1939.pgn = skcb->addr.pgn; } sock_recv_cmsgs(msg, sk, skb); msg->msg_flags |= skcb->msg_flags; skb_free_datagram(sk, skb); return size; } static struct sk_buff *j1939_sk_alloc_skb(struct net_device *ndev, struct sock *sk, struct msghdr *msg, size_t size, int *errcode) { struct j1939_sock *jsk = j1939_sk(sk); struct j1939_sk_buff_cb *skcb; struct sk_buff *skb; int ret; skb = sock_alloc_send_skb(sk, size + sizeof(struct can_frame) - sizeof(((struct can_frame *)NULL)->data) + sizeof(struct can_skb_priv), msg->msg_flags & MSG_DONTWAIT, &ret); if (!skb) goto failure; can_skb_reserve(skb); can_skb_prv(skb)->ifindex = ndev->ifindex; can_skb_prv(skb)->skbcnt = 0; skb_reserve(skb, offsetof(struct can_frame, data)); ret = memcpy_from_msg(skb_put(skb, size), msg, size); if (ret < 0) goto free_skb; skb->dev = ndev; skcb = j1939_skb_to_cb(skb); memset(skcb, 0, sizeof(*skcb)); skcb->addr = jsk->addr; skcb->priority = j1939_prio(READ_ONCE(sk->sk_priority)); if (msg->msg_name) { struct sockaddr_can *addr = msg->msg_name; if (addr->can_addr.j1939.name || addr->can_addr.j1939.addr != J1939_NO_ADDR) { skcb->addr.dst_name = addr->can_addr.j1939.name; skcb->addr.da = addr->can_addr.j1939.addr; } if (j1939_pgn_is_valid(addr->can_addr.j1939.pgn)) skcb->addr.pgn = addr->can_addr.j1939.pgn; } *errcode = ret; return skb; free_skb: kfree_skb(skb); failure: *errcode = ret; return NULL; } static size_t j1939_sk_opt_stats_get_size(enum j1939_sk_errqueue_type type) { switch (type) { case J1939_ERRQUEUE_RX_RTS: return nla_total_size(sizeof(u32)) + /* J1939_NLA_TOTAL_SIZE */ nla_total_size(sizeof(u32)) + /* J1939_NLA_PGN */ nla_total_size(sizeof(u64)) + /* J1939_NLA_SRC_NAME */ nla_total_size(sizeof(u64)) + /* J1939_NLA_DEST_NAME */ nla_total_size(sizeof(u8)) + /* J1939_NLA_SRC_ADDR */ nla_total_size(sizeof(u8)) + /* J1939_NLA_DEST_ADDR */ 0; default: return nla_total_size(sizeof(u32)) + /* J1939_NLA_BYTES_ACKED */ 0; } } static struct sk_buff * j1939_sk_get_timestamping_opt_stats(struct j1939_session *session, enum j1939_sk_errqueue_type type) { struct sk_buff *stats; u32 size; stats = alloc_skb(j1939_sk_opt_stats_get_size(type), GFP_ATOMIC); if (!stats) return NULL; if (session->skcb.addr.type == J1939_SIMPLE) size = session->total_message_size; else size = min(session->pkt.tx_acked * 7, session->total_message_size); switch (type) { case J1939_ERRQUEUE_RX_RTS: nla_put_u32(stats, J1939_NLA_TOTAL_SIZE, session->total_message_size); nla_put_u32(stats, J1939_NLA_PGN, session->skcb.addr.pgn); nla_put_u64_64bit(stats, J1939_NLA_SRC_NAME, session->skcb.addr.src_name, J1939_NLA_PAD); nla_put_u64_64bit(stats, J1939_NLA_DEST_NAME, session->skcb.addr.dst_name, J1939_NLA_PAD); nla_put_u8(stats, J1939_NLA_SRC_ADDR, session->skcb.addr.sa); nla_put_u8(stats, J1939_NLA_DEST_ADDR, session->skcb.addr.da); break; default: nla_put_u32(stats, J1939_NLA_BYTES_ACKED, size); } return stats; } static void __j1939_sk_errqueue(struct j1939_session *session, struct sock *sk, enum j1939_sk_errqueue_type type) { struct j1939_priv *priv = session->priv; struct j1939_sock *jsk; struct sock_exterr_skb *serr; struct sk_buff *skb; char *state = "UNK"; u32 tsflags; int err; jsk = j1939_sk(sk); if (!(jsk->state & J1939_SOCK_ERRQUEUE)) return; tsflags = READ_ONCE(sk->sk_tsflags); switch (type) { case J1939_ERRQUEUE_TX_ACK: if (!(tsflags & SOF_TIMESTAMPING_TX_ACK)) return; break; case J1939_ERRQUEUE_TX_SCHED: if (!(tsflags & SOF_TIMESTAMPING_TX_SCHED)) return; break; case J1939_ERRQUEUE_TX_ABORT: break; case J1939_ERRQUEUE_RX_RTS: fallthrough; case J1939_ERRQUEUE_RX_DPO: fallthrough; case J1939_ERRQUEUE_RX_ABORT: if (!(tsflags & SOF_TIMESTAMPING_RX_SOFTWARE)) return; break; default: netdev_err(priv->ndev, "Unknown errqueue type %i\n", type); } skb = j1939_sk_get_timestamping_opt_stats(session, type); if (!skb) return; skb->tstamp = ktime_get_real(); BUILD_BUG_ON(sizeof(struct sock_exterr_skb) > sizeof(skb->cb)); serr = SKB_EXT_ERR(skb); memset(serr, 0, sizeof(*serr)); switch (type) { case J1939_ERRQUEUE_TX_ACK: serr->ee.ee_errno = ENOMSG; serr->ee.ee_origin = SO_EE_ORIGIN_TIMESTAMPING; serr->ee.ee_info = SCM_TSTAMP_ACK; state = "TX ACK"; break; case J1939_ERRQUEUE_TX_SCHED: serr->ee.ee_errno = ENOMSG; serr->ee.ee_origin = SO_EE_ORIGIN_TIMESTAMPING; serr->ee.ee_info = SCM_TSTAMP_SCHED; state = "TX SCH"; break; case J1939_ERRQUEUE_TX_ABORT: serr->ee.ee_errno = session->err; serr->ee.ee_origin = SO_EE_ORIGIN_LOCAL; serr->ee.ee_info = J1939_EE_INFO_TX_ABORT; state = "TX ABT"; break; case J1939_ERRQUEUE_RX_RTS: serr->ee.ee_errno = ENOMSG; serr->ee.ee_origin = SO_EE_ORIGIN_LOCAL; serr->ee.ee_info = J1939_EE_INFO_RX_RTS; state = "RX RTS"; break; case J1939_ERRQUEUE_RX_DPO: serr->ee.ee_errno = ENOMSG; serr->ee.ee_origin = SO_EE_ORIGIN_LOCAL; serr->ee.ee_info = J1939_EE_INFO_RX_DPO; state = "RX DPO"; break; case J1939_ERRQUEUE_RX_ABORT: serr->ee.ee_errno = session->err; serr->ee.ee_origin = SO_EE_ORIGIN_LOCAL; serr->ee.ee_info = J1939_EE_INFO_RX_ABORT; state = "RX ABT"; break; } serr->opt_stats = true; if (tsflags & SOF_TIMESTAMPING_OPT_ID) serr->ee.ee_data = session->tskey; netdev_dbg(session->priv->ndev, "%s: 0x%p tskey: %i, state: %s\n", __func__, session, session->tskey, state); err = sock_queue_err_skb(sk, skb); if (err) kfree_skb(skb); }; void j1939_sk_errqueue(struct j1939_session *session, enum j1939_sk_errqueue_type type) { struct j1939_priv *priv = session->priv; struct j1939_sock *jsk; if (session->sk) { /* send TX notifications to the socket of origin */ __j1939_sk_errqueue(session, session->sk, type); return; } /* spread RX notifications to all sockets subscribed to this session */ read_lock_bh(&priv->j1939_socks_lock); list_for_each_entry(jsk, &priv->j1939_socks, list) { if (j1939_sk_recv_match_one(jsk, &session->skcb)) __j1939_sk_errqueue(session, &jsk->sk, type); } read_unlock_bh(&priv->j1939_socks_lock); }; void j1939_sk_send_loop_abort(struct sock *sk, int err) { struct j1939_sock *jsk = j1939_sk(sk); if (jsk->state & J1939_SOCK_ERRQUEUE) return; sk->sk_err = err; sk_error_report(sk); } static int j1939_sk_send_loop(struct j1939_priv *priv, struct sock *sk, struct msghdr *msg, size_t size) { struct j1939_sock *jsk = j1939_sk(sk); struct j1939_session *session = j1939_sk_get_incomplete_session(jsk); struct sk_buff *skb; size_t segment_size, todo_size; int ret = 0; if (session && session->total_message_size != session->total_queued_size + size) { j1939_session_put(session); return -EIO; } todo_size = size; while (todo_size) { struct j1939_sk_buff_cb *skcb; segment_size = min_t(size_t, J1939_MAX_TP_PACKET_SIZE, todo_size); /* Allocate skb for one segment */ skb = j1939_sk_alloc_skb(priv->ndev, sk, msg, segment_size, &ret); if (ret) break; skcb = j1939_skb_to_cb(skb); if (!session) { /* at this point the size should be full size * of the session */ skcb->offset = 0; session = j1939_tp_send(priv, skb, size); if (IS_ERR(session)) { ret = PTR_ERR(session); goto kfree_skb; } if (j1939_sk_queue_session(session)) { /* try to activate session if we a * fist in the queue */ if (!j1939_session_activate(session)) { j1939_tp_schedule_txtimer(session, 0); } else { ret = -EBUSY; session->err = ret; j1939_sk_queue_drop_all(priv, jsk, EBUSY); break; } } } else { skcb->offset = session->total_queued_size; j1939_session_skb_queue(session, skb); } todo_size -= segment_size; session->total_queued_size += segment_size; } switch (ret) { case 0: /* OK */ if (todo_size) netdev_warn(priv->ndev, "no error found and not completely queued?! %zu\n", todo_size); ret = size; break; case -ERESTARTSYS: ret = -EINTR; fallthrough; case -EAGAIN: /* OK */ if (todo_size != size) ret = size - todo_size; break; default: /* ERROR */ break; } if (session) j1939_session_put(session); return ret; kfree_skb: kfree_skb(skb); return ret; } static int j1939_sk_sendmsg(struct socket *sock, struct msghdr *msg, size_t size) { struct sock *sk = sock->sk; struct j1939_sock *jsk = j1939_sk(sk); struct j1939_priv *priv; int ifindex; int ret; lock_sock(sock->sk); /* various socket state tests */ if (!(jsk->state & J1939_SOCK_BOUND)) { ret = -EBADFD; goto sendmsg_done; } priv = jsk->priv; ifindex = jsk->ifindex; if (!jsk->addr.src_name && jsk->addr.sa == J1939_NO_ADDR) { /* no source address assigned yet */ ret = -EBADFD; goto sendmsg_done; } /* deal with provided destination address info */ if (msg->msg_name) { struct sockaddr_can *addr = msg->msg_name; if (msg->msg_namelen < J1939_MIN_NAMELEN) { ret = -EINVAL; goto sendmsg_done; } if (addr->can_family != AF_CAN) { ret = -EINVAL; goto sendmsg_done; } if (addr->can_ifindex && addr->can_ifindex != ifindex) { ret = -EBADFD; goto sendmsg_done; } if (j1939_pgn_is_valid(addr->can_addr.j1939.pgn) && !j1939_pgn_is_clean_pdu(addr->can_addr.j1939.pgn)) { ret = -EINVAL; goto sendmsg_done; } if (!addr->can_addr.j1939.name && addr->can_addr.j1939.addr == J1939_NO_ADDR && !sock_flag(sk, SOCK_BROADCAST)) { /* broadcast, but SO_BROADCAST not set */ ret = -EACCES; goto sendmsg_done; } } else { if (!jsk->addr.dst_name && jsk->addr.da == J1939_NO_ADDR && !sock_flag(sk, SOCK_BROADCAST)) { /* broadcast, but SO_BROADCAST not set */ ret = -EACCES; goto sendmsg_done; } } ret = j1939_sk_send_loop(priv, sk, msg, size); sendmsg_done: release_sock(sock->sk); return ret; } void j1939_sk_netdev_event_netdown(struct j1939_priv *priv) { struct j1939_sock *jsk; int error_code = ENETDOWN; read_lock_bh(&priv->j1939_socks_lock); list_for_each_entry(jsk, &priv->j1939_socks, list) { jsk->sk.sk_err = error_code; if (!sock_flag(&jsk->sk, SOCK_DEAD)) sk_error_report(&jsk->sk); j1939_sk_queue_drop_all(priv, jsk, error_code); } read_unlock_bh(&priv->j1939_socks_lock); } static int j1939_sk_no_ioctlcmd(struct socket *sock, unsigned int cmd, unsigned long arg) { /* no ioctls for socket layer -> hand it down to NIC layer */ return -ENOIOCTLCMD; } static const struct proto_ops j1939_ops = { .family = PF_CAN, .release = j1939_sk_release, .bind = j1939_sk_bind, .connect = j1939_sk_connect, .socketpair = sock_no_socketpair, .accept = sock_no_accept, .getname = j1939_sk_getname, .poll = datagram_poll, .ioctl = j1939_sk_no_ioctlcmd, .listen = sock_no_listen, .shutdown = sock_no_shutdown, .setsockopt = j1939_sk_setsockopt, .getsockopt = j1939_sk_getsockopt, .sendmsg = j1939_sk_sendmsg, .recvmsg = j1939_sk_recvmsg, .mmap = sock_no_mmap, }; static struct proto j1939_proto __read_mostly = { .name = "CAN_J1939", .owner = THIS_MODULE, .obj_size = sizeof(struct j1939_sock), .init = j1939_sk_init, }; const struct can_proto j1939_can_proto = { .type = SOCK_DGRAM, .protocol = CAN_J1939, .ops = &j1939_ops, .prot = &j1939_proto, }; |
| 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080 1081 1082 1083 1084 1085 1086 1087 1088 1089 1090 1091 1092 1093 1094 1095 1096 1097 1098 1099 1100 1101 1102 1103 1104 1105 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125 1126 1127 1128 1129 1130 1131 1132 1133 1134 1135 1136 1137 1138 1139 1140 1141 1142 1143 1144 1145 1146 1147 1148 1149 1150 1151 1152 1153 1154 1155 1156 1157 1158 1159 1160 1161 1162 1163 1164 1165 1166 1167 1168 1169 1170 1171 1172 1173 1174 1175 1176 1177 1178 1179 1180 1181 1182 1183 1184 1185 1186 1187 1188 1189 1190 1191 1192 1193 1194 1195 1196 1197 1198 1199 1200 1201 1202 1203 1204 1205 1206 1207 1208 1209 1210 1211 1212 1213 1214 1215 1216 1217 1218 1219 1220 1221 1222 1223 1224 1225 1226 1227 1228 1229 1230 1231 1232 1233 1234 1235 1236 1237 1238 1239 1240 1241 1242 1243 1244 1245 1246 1247 1248 1249 1250 1251 1252 1253 1254 1255 1256 1257 1258 1259 1260 1261 1262 1263 1264 1265 1266 1267 1268 1269 1270 1271 1272 1273 1274 1275 1276 1277 1278 1279 1280 1281 1282 1283 1284 1285 1286 1287 1288 1289 1290 1291 1292 1293 1294 1295 1296 1297 1298 1299 1300 1301 1302 1303 1304 1305 1306 1307 1308 1309 1310 1311 1312 1313 1314 1315 1316 1317 1318 1319 1320 1321 1322 1323 1324 1325 1326 1327 1328 1329 1330 1331 1332 1333 1334 1335 1336 1337 1338 1339 1340 1341 1342 1343 1344 1345 1346 1347 1348 1349 1350 1351 1352 1353 1354 1355 1356 1357 1358 1359 1360 1361 1362 1363 1364 1365 1366 1367 1368 1369 1370 1371 1372 1373 1374 1375 1376 1377 1378 1379 1380 1381 1382 1383 1384 1385 1386 1387 1388 1389 1390 1391 1392 1393 1394 1395 1396 1397 1398 1399 1400 1401 1402 1403 1404 1405 1406 1407 1408 1409 1410 1411 1412 1413 1414 1415 1416 1417 1418 1419 1420 1421 1422 1423 1424 1425 1426 1427 1428 1429 1430 1431 1432 1433 1434 1435 1436 1437 1438 1439 1440 1441 1442 1443 1444 1445 1446 1447 1448 1449 1450 1451 1452 1453 1454 1455 1456 1457 1458 1459 1460 1461 1462 1463 1464 1465 1466 1467 1468 1469 1470 1471 1472 1473 1474 1475 1476 1477 1478 1479 1480 1481 1482 1483 1484 1485 1486 1487 1488 1489 1490 1491 1492 1493 1494 1495 1496 1497 1498 1499 1500 1501 1502 1503 1504 1505 1506 1507 1508 1509 1510 1511 1512 1513 1514 1515 1516 1517 1518 1519 1520 1521 1522 1523 1524 1525 1526 1527 1528 1529 1530 1531 1532 1533 1534 1535 1536 1537 1538 1539 1540 1541 1542 1543 1544 1545 1546 1547 1548 1549 1550 1551 1552 1553 1554 1555 1556 1557 1558 1559 1560 1561 1562 1563 1564 1565 1566 1567 1568 1569 1570 1571 1572 1573 1574 1575 1576 1577 1578 1579 1580 1581 1582 1583 1584 1585 1586 1587 1588 1589 1590 1591 1592 1593 1594 1595 1596 1597 1598 1599 1600 1601 1602 1603 1604 1605 1606 1607 1608 1609 1610 1611 1612 1613 1614 1615 1616 1617 1618 1619 1620 1621 1622 1623 1624 1625 1626 1627 1628 1629 1630 1631 1632 1633 1634 1635 1636 1637 1638 1639 1640 1641 1642 1643 1644 1645 1646 1647 1648 1649 1650 1651 1652 1653 1654 1655 1656 1657 1658 1659 1660 1661 1662 1663 1664 1665 1666 1667 1668 1669 1670 1671 1672 1673 1674 1675 1676 1677 1678 1679 1680 1681 1682 1683 1684 1685 1686 1687 1688 1689 1690 1691 1692 1693 1694 1695 1696 1697 1698 1699 1700 1701 1702 1703 1704 1705 1706 1707 1708 1709 1710 1711 1712 1713 1714 1715 1716 1717 1718 1719 1720 1721 1722 1723 1724 1725 1726 1727 1728 1729 1730 1731 1732 1733 1734 1735 1736 1737 1738 1739 1740 1741 1742 1743 1744 1745 1746 1747 1748 1749 1750 1751 1752 1753 1754 1755 1756 1757 1758 1759 1760 1761 1762 1763 1764 1765 1766 1767 1768 1769 1770 1771 1772 1773 1774 1775 1776 1777 1778 1779 1780 1781 1782 1783 1784 1785 1786 1787 1788 1789 1790 1791 1792 1793 1794 1795 1796 1797 1798 1799 1800 1801 1802 1803 1804 1805 1806 1807 1808 1809 1810 1811 1812 1813 1814 1815 1816 1817 1818 1819 1820 1821 1822 1823 1824 1825 1826 1827 1828 1829 1830 1831 1832 1833 1834 1835 1836 1837 1838 1839 1840 1841 1842 1843 1844 1845 1846 1847 1848 1849 1850 1851 1852 1853 1854 1855 1856 1857 1858 1859 1860 1861 1862 1863 1864 1865 1866 1867 1868 1869 1870 1871 1872 1873 1874 1875 1876 1877 1878 1879 1880 1881 1882 1883 1884 1885 1886 1887 1888 1889 1890 1891 1892 1893 1894 1895 1896 1897 1898 1899 1900 1901 1902 1903 1904 1905 1906 1907 1908 1909 1910 1911 1912 1913 1914 1915 1916 1917 1918 1919 1920 1921 1922 1923 1924 1925 1926 1927 1928 1929 1930 1931 1932 1933 1934 1935 1936 1937 1938 1939 1940 1941 1942 1943 1944 1945 1946 1947 1948 1949 1950 1951 1952 1953 1954 1955 1956 1957 1958 1959 1960 1961 1962 1963 1964 1965 1966 1967 1968 1969 1970 1971 1972 1973 1974 1975 1976 1977 1978 1979 1980 1981 1982 1983 1984 1985 1986 1987 1988 1989 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020 2021 2022 2023 2024 2025 2026 2027 2028 2029 2030 2031 2032 2033 2034 2035 2036 2037 2038 2039 2040 2041 2042 2043 2044 2045 2046 2047 2048 2049 2050 2051 2052 2053 2054 2055 2056 2057 2058 2059 2060 2061 2062 2063 2064 2065 2066 2067 2068 2069 2070 2071 2072 2073 2074 2075 2076 2077 2078 2079 2080 2081 2082 2083 2084 2085 2086 2087 2088 2089 2090 2091 2092 2093 2094 2095 2096 2097 2098 2099 2100 2101 2102 2103 2104 2105 2106 2107 2108 2109 2110 2111 2112 2113 2114 2115 2116 2117 2118 2119 2120 2121 2122 2123 2124 2125 2126 2127 2128 2129 2130 2131 2132 2133 2134 2135 2136 2137 2138 2139 2140 2141 2142 2143 2144 2145 2146 2147 2148 2149 2150 2151 2152 2153 2154 2155 2156 2157 2158 2159 2160 2161 2162 2163 2164 2165 2166 2167 2168 2169 2170 2171 2172 2173 2174 2175 2176 2177 2178 2179 2180 2181 2182 2183 2184 2185 2186 2187 2188 2189 2190 2191 2192 2193 2194 2195 2196 2197 2198 2199 2200 | // SPDX-License-Identifier: GPL-2.0-only /* * The industrial I/O core * * Copyright (c) 2008 Jonathan Cameron * * Based on elements of hwmon and input subsystems. */ #define pr_fmt(fmt) "iio-core: " fmt #include <linux/anon_inodes.h> #include <linux/cdev.h> #include <linux/cleanup.h> #include <linux/debugfs.h> #include <linux/device.h> #include <linux/err.h> #include <linux/fs.h> #include <linux/idr.h> #include <linux/kdev_t.h> #include <linux/kernel.h> #include <linux/module.h> #include <linux/mutex.h> #include <linux/poll.h> #include <linux/property.h> #include <linux/sched.h> #include <linux/slab.h> #include <linux/wait.h> #include <linux/iio/buffer.h> #include <linux/iio/buffer_impl.h> #include <linux/iio/events.h> #include <linux/iio/iio-opaque.h> #include <linux/iio/iio.h> #include <linux/iio/sysfs.h> #include "iio_core.h" #include "iio_core_trigger.h" /* IDA to assign each registered device a unique id */ static DEFINE_IDA(iio_ida); static dev_t iio_devt; #define IIO_DEV_MAX 256 const struct bus_type iio_bus_type = { .name = "iio", }; EXPORT_SYMBOL(iio_bus_type); static struct dentry *iio_debugfs_dentry; static const char * const iio_direction[] = { [0] = "in", [1] = "out", }; static const char * const iio_chan_type_name_spec[] = { [IIO_VOLTAGE] = "voltage", [IIO_CURRENT] = "current", [IIO_POWER] = "power", [IIO_ACCEL] = "accel", [IIO_ANGL_VEL] = "anglvel", [IIO_MAGN] = "magn", [IIO_LIGHT] = "illuminance", [IIO_INTENSITY] = "intensity", [IIO_PROXIMITY] = "proximity", [IIO_TEMP] = "temp", [IIO_INCLI] = "incli", [IIO_ROT] = "rot", [IIO_ANGL] = "angl", [IIO_TIMESTAMP] = "timestamp", [IIO_CAPACITANCE] = "capacitance", [IIO_ALTVOLTAGE] = "altvoltage", [IIO_CCT] = "cct", [IIO_PRESSURE] = "pressure", [IIO_HUMIDITYRELATIVE] = "humidityrelative", [IIO_ACTIVITY] = "activity", [IIO_STEPS] = "steps", [IIO_ENERGY] = "energy", [IIO_DISTANCE] = "distance", [IIO_VELOCITY] = "velocity", [IIO_CONCENTRATION] = "concentration", [IIO_RESISTANCE] = "resistance", [IIO_PH] = "ph", [IIO_UVINDEX] = "uvindex", [IIO_ELECTRICALCONDUCTIVITY] = "electricalconductivity", [IIO_COUNT] = "count", [IIO_INDEX] = "index", [IIO_GRAVITY] = "gravity", [IIO_POSITIONRELATIVE] = "positionrelative", [IIO_PHASE] = "phase", [IIO_MASSCONCENTRATION] = "massconcentration", [IIO_DELTA_ANGL] = "deltaangl", [IIO_DELTA_VELOCITY] = "deltavelocity", [IIO_COLORTEMP] = "colortemp", [IIO_CHROMATICITY] = "chromaticity", }; static const char * const iio_modifier_names[] = { [IIO_MOD_X] = "x", [IIO_MOD_Y] = "y", [IIO_MOD_Z] = "z", [IIO_MOD_X_AND_Y] = "x&y", [IIO_MOD_X_AND_Z] = "x&z", [IIO_MOD_Y_AND_Z] = "y&z", [IIO_MOD_X_AND_Y_AND_Z] = "x&y&z", [IIO_MOD_X_OR_Y] = "x|y", [IIO_MOD_X_OR_Z] = "x|z", [IIO_MOD_Y_OR_Z] = "y|z", [IIO_MOD_X_OR_Y_OR_Z] = "x|y|z", [IIO_MOD_ROOT_SUM_SQUARED_X_Y] = "sqrt(x^2+y^2)", [IIO_MOD_SUM_SQUARED_X_Y_Z] = "x^2+y^2+z^2", [IIO_MOD_LIGHT_BOTH] = "both", [IIO_MOD_LIGHT_IR] = "ir", [IIO_MOD_LIGHT_CLEAR] = "clear", [IIO_MOD_LIGHT_RED] = "red", [IIO_MOD_LIGHT_GREEN] = "green", [IIO_MOD_LIGHT_BLUE] = "blue", [IIO_MOD_LIGHT_UV] = "uv", [IIO_MOD_LIGHT_UVA] = "uva", [IIO_MOD_LIGHT_UVB] = "uvb", [IIO_MOD_LIGHT_DUV] = "duv", [IIO_MOD_QUATERNION] = "quaternion", [IIO_MOD_TEMP_AMBIENT] = "ambient", [IIO_MOD_TEMP_OBJECT] = "object", [IIO_MOD_NORTH_MAGN] = "from_north_magnetic", [IIO_MOD_NORTH_TRUE] = "from_north_true", [IIO_MOD_NORTH_MAGN_TILT_COMP] = "from_north_magnetic_tilt_comp", [IIO_MOD_NORTH_TRUE_TILT_COMP] = "from_north_true_tilt_comp", [IIO_MOD_RUNNING] = "running", [IIO_MOD_JOGGING] = "jogging", [IIO_MOD_WALKING] = "walking", [IIO_MOD_STILL] = "still", [IIO_MOD_ROOT_SUM_SQUARED_X_Y_Z] = "sqrt(x^2+y^2+z^2)", [IIO_MOD_I] = "i", [IIO_MOD_Q] = "q", [IIO_MOD_CO2] = "co2", [IIO_MOD_VOC] = "voc", [IIO_MOD_PM1] = "pm1", [IIO_MOD_PM2P5] = "pm2p5", [IIO_MOD_PM4] = "pm4", [IIO_MOD_PM10] = "pm10", [IIO_MOD_ETHANOL] = "ethanol", [IIO_MOD_H2] = "h2", [IIO_MOD_O2] = "o2", [IIO_MOD_LINEAR_X] = "linear_x", [IIO_MOD_LINEAR_Y] = "linear_y", [IIO_MOD_LINEAR_Z] = "linear_z", [IIO_MOD_PITCH] = "pitch", [IIO_MOD_YAW] = "yaw", [IIO_MOD_ROLL] = "roll", }; /* relies on pairs of these shared then separate */ static const char * const iio_chan_info_postfix[] = { [IIO_CHAN_INFO_RAW] = "raw", [IIO_CHAN_INFO_PROCESSED] = "input", [IIO_CHAN_INFO_SCALE] = "scale", [IIO_CHAN_INFO_OFFSET] = "offset", [IIO_CHAN_INFO_CALIBSCALE] = "calibscale", [IIO_CHAN_INFO_CALIBBIAS] = "calibbias", [IIO_CHAN_INFO_PEAK] = "peak_raw", [IIO_CHAN_INFO_PEAK_SCALE] = "peak_scale", [IIO_CHAN_INFO_QUADRATURE_CORRECTION_RAW] = "quadrature_correction_raw", [IIO_CHAN_INFO_AVERAGE_RAW] = "mean_raw", [IIO_CHAN_INFO_LOW_PASS_FILTER_3DB_FREQUENCY] = "filter_low_pass_3db_frequency", [IIO_CHAN_INFO_HIGH_PASS_FILTER_3DB_FREQUENCY] = "filter_high_pass_3db_frequency", [IIO_CHAN_INFO_SAMP_FREQ] = "sampling_frequency", [IIO_CHAN_INFO_FREQUENCY] = "frequency", [IIO_CHAN_INFO_PHASE] = "phase", [IIO_CHAN_INFO_HARDWAREGAIN] = "hardwaregain", [IIO_CHAN_INFO_HYSTERESIS] = "hysteresis", [IIO_CHAN_INFO_HYSTERESIS_RELATIVE] = "hysteresis_relative", [IIO_CHAN_INFO_INT_TIME] = "integration_time", [IIO_CHAN_INFO_ENABLE] = "en", [IIO_CHAN_INFO_CALIBHEIGHT] = "calibheight", [IIO_CHAN_INFO_CALIBWEIGHT] = "calibweight", [IIO_CHAN_INFO_DEBOUNCE_COUNT] = "debounce_count", [IIO_CHAN_INFO_DEBOUNCE_TIME] = "debounce_time", [IIO_CHAN_INFO_CALIBEMISSIVITY] = "calibemissivity", [IIO_CHAN_INFO_OVERSAMPLING_RATIO] = "oversampling_ratio", [IIO_CHAN_INFO_THERMOCOUPLE_TYPE] = "thermocouple_type", [IIO_CHAN_INFO_CALIBAMBIENT] = "calibambient", [IIO_CHAN_INFO_ZEROPOINT] = "zeropoint", [IIO_CHAN_INFO_TROUGH] = "trough_raw", }; /** * iio_device_id() - query the unique ID for the device * @indio_dev: Device structure whose ID is being queried * * The IIO device ID is a unique index used for example for the naming * of the character device /dev/iio\:device[ID]. * * Returns: Unique ID for the device. */ int iio_device_id(struct iio_dev *indio_dev) { struct iio_dev_opaque *iio_dev_opaque = to_iio_dev_opaque(indio_dev); return iio_dev_opaque->id; } EXPORT_SYMBOL_GPL(iio_device_id); /** * iio_buffer_enabled() - helper function to test if the buffer is enabled * @indio_dev: IIO device structure for device * * Returns: True, if the buffer is enabled. */ bool iio_buffer_enabled(struct iio_dev *indio_dev) { struct iio_dev_opaque *iio_dev_opaque = to_iio_dev_opaque(indio_dev); return iio_dev_opaque->currentmode & INDIO_ALL_BUFFER_MODES; } EXPORT_SYMBOL_GPL(iio_buffer_enabled); #if defined(CONFIG_DEBUG_FS) /* * There's also a CONFIG_DEBUG_FS guard in include/linux/iio/iio.h for * iio_get_debugfs_dentry() to make it inline if CONFIG_DEBUG_FS is undefined */ struct dentry *iio_get_debugfs_dentry(struct iio_dev *indio_dev) { struct iio_dev_opaque *iio_dev_opaque = to_iio_dev_opaque(indio_dev); return iio_dev_opaque->debugfs_dentry; } EXPORT_SYMBOL_GPL(iio_get_debugfs_dentry); #endif /** * iio_find_channel_from_si() - get channel from its scan index * @indio_dev: device * @si: scan index to match * * Returns: * Constant pointer to iio_chan_spec, if scan index matches, NULL on failure. */ const struct iio_chan_spec *iio_find_channel_from_si(struct iio_dev *indio_dev, int si) { int i; for (i = 0; i < indio_dev->num_channels; i++) if (indio_dev->channels[i].scan_index == si) return &indio_dev->channels[i]; return NULL; } /* This turns up an awful lot */ ssize_t iio_read_const_attr(struct device *dev, struct device_attribute *attr, char *buf) { return sysfs_emit(buf, "%s\n", to_iio_const_attr(attr)->string); } EXPORT_SYMBOL(iio_read_const_attr); /** * iio_device_set_clock() - Set current timestamping clock for the device * @indio_dev: IIO device structure containing the device * @clock_id: timestamping clock POSIX identifier to set. * * Returns: 0 on success, or a negative error code. */ int iio_device_set_clock(struct iio_dev *indio_dev, clockid_t clock_id) { int ret; struct iio_dev_opaque *iio_dev_opaque = to_iio_dev_opaque(indio_dev); const struct iio_event_interface *ev_int = iio_dev_opaque->event_interface; ret = mutex_lock_interruptible(&iio_dev_opaque->mlock); if (ret) return ret; if ((ev_int && iio_event_enabled(ev_int)) || iio_buffer_enabled(indio_dev)) { mutex_unlock(&iio_dev_opaque->mlock); return -EBUSY; } iio_dev_opaque->clock_id = clock_id; mutex_unlock(&iio_dev_opaque->mlock); return 0; } EXPORT_SYMBOL(iio_device_set_clock); /** * iio_device_get_clock() - Retrieve current timestamping clock for the device * @indio_dev: IIO device structure containing the device * * Returns: Clock ID of the current timestamping clock for the device. */ clockid_t iio_device_get_clock(const struct iio_dev *indio_dev) { struct iio_dev_opaque *iio_dev_opaque = to_iio_dev_opaque(indio_dev); return iio_dev_opaque->clock_id; } EXPORT_SYMBOL(iio_device_get_clock); /** * iio_get_time_ns() - utility function to get a time stamp for events etc * @indio_dev: device * * Returns: Timestamp of the event in nanoseconds. */ s64 iio_get_time_ns(const struct iio_dev *indio_dev) { struct timespec64 tp; switch (iio_device_get_clock(indio_dev)) { case CLOCK_REALTIME: return ktime_get_real_ns(); case CLOCK_MONOTONIC: return ktime_get_ns(); case CLOCK_MONOTONIC_RAW: return ktime_get_raw_ns(); case CLOCK_REALTIME_COARSE: return ktime_to_ns(ktime_get_coarse_real()); case CLOCK_MONOTONIC_COARSE: ktime_get_coarse_ts64(&tp); return timespec64_to_ns(&tp); case CLOCK_BOOTTIME: return ktime_get_boottime_ns(); case CLOCK_TAI: return ktime_get_clocktai_ns(); default: BUG(); } } EXPORT_SYMBOL(iio_get_time_ns); static int __init iio_init(void) { int ret; /* Register sysfs bus */ ret = bus_register(&iio_bus_type); if (ret < 0) { pr_err("could not register bus type\n"); goto error_nothing; } ret = alloc_chrdev_region(&iio_devt, 0, IIO_DEV_MAX, "iio"); if (ret < 0) { pr_err("failed to allocate char dev region\n"); goto error_unregister_bus_type; } iio_debugfs_dentry = debugfs_create_dir("iio", NULL); return 0; error_unregister_bus_type: bus_unregister(&iio_bus_type); error_nothing: return ret; } static void __exit iio_exit(void) { if (iio_devt) unregister_chrdev_region(iio_devt, IIO_DEV_MAX); bus_unregister(&iio_bus_type); debugfs_remove(iio_debugfs_dentry); } #if defined(CONFIG_DEBUG_FS) static ssize_t iio_debugfs_read_reg(struct file *file, char __user *userbuf, size_t count, loff_t *ppos) { struct iio_dev *indio_dev = file->private_data; struct iio_dev_opaque *iio_dev_opaque = to_iio_dev_opaque(indio_dev); unsigned int val = 0; int ret; if (*ppos > 0) return simple_read_from_buffer(userbuf, count, ppos, iio_dev_opaque->read_buf, iio_dev_opaque->read_buf_len); ret = indio_dev->info->debugfs_reg_access(indio_dev, iio_dev_opaque->cached_reg_addr, 0, &val); if (ret) { dev_err(indio_dev->dev.parent, "%s: read failed\n", __func__); return ret; } iio_dev_opaque->read_buf_len = snprintf(iio_dev_opaque->read_buf, sizeof(iio_dev_opaque->read_buf), "0x%X\n", val); return simple_read_from_buffer(userbuf, count, ppos, iio_dev_opaque->read_buf, iio_dev_opaque->read_buf_len); } static ssize_t iio_debugfs_write_reg(struct file *file, const char __user *userbuf, size_t count, loff_t *ppos) { struct iio_dev *indio_dev = file->private_data; struct iio_dev_opaque *iio_dev_opaque = to_iio_dev_opaque(indio_dev); unsigned int reg, val; char buf[80]; int ret; count = min(count, sizeof(buf) - 1); if (copy_from_user(buf, userbuf, count)) return -EFAULT; buf[count] = 0; ret = sscanf(buf, "%i %i", ®, &val); switch (ret) { case 1: iio_dev_opaque->cached_reg_addr = reg; break; case 2: iio_dev_opaque->cached_reg_addr = reg; ret = indio_dev->info->debugfs_reg_access(indio_dev, reg, val, NULL); if (ret) { dev_err(indio_dev->dev.parent, "%s: write failed\n", __func__); return ret; } break; default: return -EINVAL; } return count; } static const struct file_operations iio_debugfs_reg_fops = { .open = simple_open, .read = iio_debugfs_read_reg, .write = iio_debugfs_write_reg, }; static void iio_device_unregister_debugfs(struct iio_dev *indio_dev) { struct iio_dev_opaque *iio_dev_opaque = to_iio_dev_opaque(indio_dev); debugfs_remove_recursive(iio_dev_opaque->debugfs_dentry); } static void iio_device_register_debugfs(struct iio_dev *indio_dev) { struct iio_dev_opaque *iio_dev_opaque; if (indio_dev->info->debugfs_reg_access == NULL) return; if (!iio_debugfs_dentry) return; iio_dev_opaque = to_iio_dev_opaque(indio_dev); iio_dev_opaque->debugfs_dentry = debugfs_create_dir(dev_name(&indio_dev->dev), iio_debugfs_dentry); debugfs_create_file("direct_reg_access", 0644, iio_dev_opaque->debugfs_dentry, indio_dev, &iio_debugfs_reg_fops); } #else static void iio_device_register_debugfs(struct iio_dev *indio_dev) { } static void iio_device_unregister_debugfs(struct iio_dev *indio_dev) { } #endif /* CONFIG_DEBUG_FS */ static ssize_t iio_read_channel_ext_info(struct device *dev, struct device_attribute *attr, char *buf) { struct iio_dev *indio_dev = dev_to_iio_dev(dev); struct iio_dev_attr *this_attr = to_iio_dev_attr(attr); const struct iio_chan_spec_ext_info *ext_info; ext_info = &this_attr->c->ext_info[this_attr->address]; return ext_info->read(indio_dev, ext_info->private, this_attr->c, buf); } static ssize_t iio_write_channel_ext_info(struct device *dev, struct device_attribute *attr, const char *buf, size_t len) { struct iio_dev *indio_dev = dev_to_iio_dev(dev); struct iio_dev_attr *this_attr = to_iio_dev_attr(attr); const struct iio_chan_spec_ext_info *ext_info; ext_info = &this_attr->c->ext_info[this_attr->address]; return ext_info->write(indio_dev, ext_info->private, this_attr->c, buf, len); } ssize_t iio_enum_available_read(struct iio_dev *indio_dev, uintptr_t priv, const struct iio_chan_spec *chan, char *buf) { const struct iio_enum *e = (const struct iio_enum *)priv; unsigned int i; size_t len = 0; if (!e->num_items) return 0; for (i = 0; i < e->num_items; ++i) { if (!e->items[i]) continue; len += sysfs_emit_at(buf, len, "%s ", e->items[i]); } /* replace last space with a newline */ buf[len - 1] = '\n'; return len; } EXPORT_SYMBOL_GPL(iio_enum_available_read); ssize_t iio_enum_read(struct iio_dev *indio_dev, uintptr_t priv, const struct iio_chan_spec *chan, char *buf) { const struct iio_enum *e = (const struct iio_enum *)priv; int i; if (!e->get) return -EINVAL; i = e->get(indio_dev, chan); if (i < 0) return i; if (i >= e->num_items || !e->items[i]) return -EINVAL; return sysfs_emit(buf, "%s\n", e->items[i]); } EXPORT_SYMBOL_GPL(iio_enum_read); ssize_t iio_enum_write(struct iio_dev *indio_dev, uintptr_t priv, const struct iio_chan_spec *chan, const char *buf, size_t len) { const struct iio_enum *e = (const struct iio_enum *)priv; int ret; if (!e->set) return -EINVAL; ret = __sysfs_match_string(e->items, e->num_items, buf); if (ret < 0) return ret; ret = e->set(indio_dev, chan, ret); return ret ? ret : len; } EXPORT_SYMBOL_GPL(iio_enum_write); static const struct iio_mount_matrix iio_mount_idmatrix = { .rotation = { "1", "0", "0", "0", "1", "0", "0", "0", "1" } }; static int iio_setup_mount_idmatrix(const struct device *dev, struct iio_mount_matrix *matrix) { *matrix = iio_mount_idmatrix; dev_info(dev, "mounting matrix not found: using identity...\n"); return 0; } ssize_t iio_show_mount_matrix(struct iio_dev *indio_dev, uintptr_t priv, const struct iio_chan_spec *chan, char *buf) { const struct iio_mount_matrix *mtx; mtx = ((iio_get_mount_matrix_t *)priv)(indio_dev, chan); if (IS_ERR(mtx)) return PTR_ERR(mtx); if (!mtx) mtx = &iio_mount_idmatrix; return sysfs_emit(buf, "%s, %s, %s; %s, %s, %s; %s, %s, %s\n", mtx->rotation[0], mtx->rotation[1], mtx->rotation[2], mtx->rotation[3], mtx->rotation[4], mtx->rotation[5], mtx->rotation[6], mtx->rotation[7], mtx->rotation[8]); } EXPORT_SYMBOL_GPL(iio_show_mount_matrix); /** * iio_read_mount_matrix() - retrieve iio device mounting matrix from * device "mount-matrix" property * @dev: device the mounting matrix property is assigned to * @matrix: where to store retrieved matrix * * If device is assigned no mounting matrix property, a default 3x3 identity * matrix will be filled in. * * Returns: 0 if success, or a negative error code on failure. */ int iio_read_mount_matrix(struct device *dev, struct iio_mount_matrix *matrix) { size_t len = ARRAY_SIZE(iio_mount_idmatrix.rotation); int err; err = device_property_read_string_array(dev, "mount-matrix", matrix->rotation, len); if (err == len) return 0; if (err >= 0) /* Invalid number of matrix entries. */ return -EINVAL; if (err != -EINVAL) /* Invalid matrix declaration format. */ return err; /* Matrix was not declared at all: fallback to identity. */ return iio_setup_mount_idmatrix(dev, matrix); } EXPORT_SYMBOL(iio_read_mount_matrix); static ssize_t __iio_format_value(char *buf, size_t offset, unsigned int type, int size, const int *vals) { int tmp0, tmp1; s64 tmp2; bool scale_db = false; switch (type) { case IIO_VAL_INT: return sysfs_emit_at(buf, offset, "%d", vals[0]); case IIO_VAL_INT_PLUS_MICRO_DB: scale_db = true; fallthrough; case IIO_VAL_INT_PLUS_MICRO: if (vals[1] < 0) return sysfs_emit_at(buf, offset, "-%d.%06u%s", abs(vals[0]), -vals[1], scale_db ? " dB" : ""); else return sysfs_emit_at(buf, offset, "%d.%06u%s", vals[0], vals[1], scale_db ? " dB" : ""); case IIO_VAL_INT_PLUS_NANO: if (vals[1] < 0) return sysfs_emit_at(buf, offset, "-%d.%09u", abs(vals[0]), -vals[1]); else return sysfs_emit_at(buf, offset, "%d.%09u", vals[0], vals[1]); case IIO_VAL_FRACTIONAL: tmp2 = div_s64((s64)vals[0] * 1000000000LL, vals[1]); tmp1 = vals[1]; tmp0 = (int)div_s64_rem(tmp2, 1000000000, &tmp1); if ((tmp2 < 0) && (tmp0 == 0)) return sysfs_emit_at(buf, offset, "-0.%09u", abs(tmp1)); else return sysfs_emit_at(buf, offset, "%d.%09u", tmp0, abs(tmp1)); case IIO_VAL_FRACTIONAL_LOG2: tmp2 = shift_right((s64)vals[0] * 1000000000LL, vals[1]); tmp0 = (int)div_s64_rem(tmp2, 1000000000LL, &tmp1); if (tmp0 == 0 && tmp2 < 0) return sysfs_emit_at(buf, offset, "-0.%09u", abs(tmp1)); else return sysfs_emit_at(buf, offset, "%d.%09u", tmp0, abs(tmp1)); case IIO_VAL_INT_MULTIPLE: { int i; int l = 0; for (i = 0; i < size; ++i) l += sysfs_emit_at(buf, offset + l, "%d ", vals[i]); return l; } case IIO_VAL_CHAR: return sysfs_emit_at(buf, offset, "%c", (char)vals[0]); case IIO_VAL_INT_64: tmp2 = (s64)((((u64)vals[1]) << 32) | (u32)vals[0]); return sysfs_emit_at(buf, offset, "%lld", tmp2); default: return 0; } } /** * iio_format_value() - Formats a IIO value into its string representation * @buf: The buffer to which the formatted value gets written * which is assumed to be big enough (i.e. PAGE_SIZE). * @type: One of the IIO_VAL_* constants. This decides how the val * and val2 parameters are formatted. * @size: Number of IIO value entries contained in vals * @vals: Pointer to the values, exact meaning depends on the * type parameter. * * Returns: * 0 by default, a negative number on failure or the total number of characters * written for a type that belongs to the IIO_VAL_* constant. */ ssize_t iio_format_value(char *buf, unsigned int type, int size, int *vals) { ssize_t len; len = __iio_format_value(buf, 0, type, size, vals); if (len >= PAGE_SIZE - 1) return -EFBIG; return len + sysfs_emit_at(buf, len, "\n"); } EXPORT_SYMBOL_GPL(iio_format_value); ssize_t do_iio_read_channel_label(struct iio_dev *indio_dev, const struct iio_chan_spec *c, char *buf) { if (indio_dev->info->read_label) return indio_dev->info->read_label(indio_dev, c, buf); if (c->extend_name) return sysfs_emit(buf, "%s\n", c->extend_name); return -EINVAL; } static ssize_t iio_read_channel_label(struct device *dev, struct device_attribute *attr, char *buf) { return do_iio_read_channel_label(dev_to_iio_dev(dev), to_iio_dev_attr(attr)->c, buf); } static ssize_t iio_read_channel_info(struct device *dev, struct device_attribute *attr, char *buf) { struct iio_dev *indio_dev = dev_to_iio_dev(dev); struct iio_dev_attr *this_attr = to_iio_dev_attr(attr); int vals[INDIO_MAX_RAW_ELEMENTS]; int ret; int val_len = 2; if (indio_dev->info->read_raw_multi) ret = indio_dev->info->read_raw_multi(indio_dev, this_attr->c, INDIO_MAX_RAW_ELEMENTS, vals, &val_len, this_attr->address); else if (indio_dev->info->read_raw) ret = indio_dev->info->read_raw(indio_dev, this_attr->c, &vals[0], &vals[1], this_attr->address); else return -EINVAL; if (ret < 0) return ret; return iio_format_value(buf, ret, val_len, vals); } static ssize_t iio_format_list(char *buf, const int *vals, int type, int length, const char *prefix, const char *suffix) { ssize_t len; int stride; int i; switch (type) { case IIO_VAL_INT: stride = 1; break; default: stride = 2; break; } len = sysfs_emit(buf, prefix); for (i = 0; i <= length - stride; i += stride) { if (i != 0) { len += sysfs_emit_at(buf, len, " "); if (len >= PAGE_SIZE) return -EFBIG; } len += __iio_format_value(buf, len, type, stride, &vals[i]); if (len >= PAGE_SIZE) return -EFBIG; } len += sysfs_emit_at(buf, len, "%s\n", suffix); return len; } static ssize_t iio_format_avail_list(char *buf, const int *vals, int type, int length) { return iio_format_list(buf, vals, type, length, "", ""); } static ssize_t iio_format_avail_range(char *buf, const int *vals, int type) { int length; /* * length refers to the array size , not the number of elements. * The purpose is to print the range [min , step ,max] so length should * be 3 in case of int, and 6 for other types. */ switch (type) { case IIO_VAL_INT: length = 3; break; default: length = 6; break; } return iio_format_list(buf, vals, type, length, "[", "]"); } static ssize_t iio_read_channel_info_avail(struct device *dev, struct device_attribute *attr, char *buf) { struct iio_dev *indio_dev = dev_to_iio_dev(dev); struct iio_dev_attr *this_attr = to_iio_dev_attr(attr); const int *vals; int ret; int length; int type; if (!indio_dev->info->read_avail) return -EINVAL; ret = indio_dev->info->read_avail(indio_dev, this_attr->c, &vals, &type, &length, this_attr->address); if (ret < 0) return ret; switch (ret) { case IIO_AVAIL_LIST: return iio_format_avail_list(buf, vals, type, length); case IIO_AVAIL_RANGE: return iio_format_avail_range(buf, vals, type); default: return -EINVAL; } } /** * __iio_str_to_fixpoint() - Parse a fixed-point number from a string * @str: The string to parse * @fract_mult: Multiplier for the first decimal place, should be a power of 10 * @integer: The integer part of the number * @fract: The fractional part of the number * @scale_db: True if this should parse as dB * * Returns: * 0 on success, or a negative error code if the string could not be parsed. */ static int __iio_str_to_fixpoint(const char *str, int fract_mult, int *integer, int *fract, bool scale_db) { int i = 0, f = 0; bool integer_part = true, negative = false; if (fract_mult == 0) { *fract = 0; return kstrtoint(str, 0, integer); } if (str[0] == '-') { negative = true; str++; } else if (str[0] == '+') { str++; } while (*str) { if ('0' <= *str && *str <= '9') { if (integer_part) { i = i * 10 + *str - '0'; } else { f += fract_mult * (*str - '0'); fract_mult /= 10; } } else if (*str == '\n') { if (*(str + 1) == '\0') break; return -EINVAL; } else if (!strncmp(str, " dB", sizeof(" dB") - 1) && scale_db) { /* Ignore the dB suffix */ str += sizeof(" dB") - 1; continue; } else if (!strncmp(str, "dB", sizeof("dB") - 1) && scale_db) { /* Ignore the dB suffix */ str += sizeof("dB") - 1; continue; } else if (*str == '.' && integer_part) { integer_part = false; } else { return -EINVAL; } str++; } if (negative) { if (i) i = -i; else f = -f; } *integer = i; *fract = f; return 0; } /** * iio_str_to_fixpoint() - Parse a fixed-point number from a string * @str: The string to parse * @fract_mult: Multiplier for the first decimal place, should be a power of 10 * @integer: The integer part of the number * @fract: The fractional part of the number * * Returns: * 0 on success, or a negative error code if the string could not be parsed. */ int iio_str_to_fixpoint(const char *str, int fract_mult, int *integer, int *fract) { return __iio_str_to_fixpoint(str, fract_mult, integer, fract, false); } EXPORT_SYMBOL_GPL(iio_str_to_fixpoint); static ssize_t iio_write_channel_info(struct device *dev, struct device_attribute *attr, const char *buf, size_t len) { struct iio_dev *indio_dev = dev_to_iio_dev(dev); struct iio_dev_attr *this_attr = to_iio_dev_attr(attr); int ret, fract_mult = 100000; int integer, fract = 0; bool is_char = false; bool scale_db = false; /* Assumes decimal - precision based on number of digits */ if (!indio_dev->info->write_raw) return -EINVAL; if (indio_dev->info->write_raw_get_fmt) switch (indio_dev->info->write_raw_get_fmt(indio_dev, this_attr->c, this_attr->address)) { case IIO_VAL_INT: fract_mult = 0; break; case IIO_VAL_INT_PLUS_MICRO_DB: scale_db = true; fallthrough; case IIO_VAL_INT_PLUS_MICRO: fract_mult = 100000; break; case IIO_VAL_INT_PLUS_NANO: fract_mult = 100000000; break; case IIO_VAL_CHAR: is_char = true; break; default: return -EINVAL; } if (is_char) { char ch; if (sscanf(buf, "%c", &ch) != 1) return -EINVAL; integer = ch; } else { ret = __iio_str_to_fixpoint(buf, fract_mult, &integer, &fract, scale_db); if (ret) return ret; } ret = indio_dev->info->write_raw(indio_dev, this_attr->c, integer, fract, this_attr->address); if (ret) return ret; return len; } static int __iio_device_attr_init(struct device_attribute *dev_attr, const char *postfix, struct iio_chan_spec const *chan, ssize_t (*readfunc)(struct device *dev, struct device_attribute *attr, char *buf), ssize_t (*writefunc)(struct device *dev, struct device_attribute *attr, const char *buf, size_t len), enum iio_shared_by shared_by) { int ret = 0; char *name = NULL; char *full_postfix; sysfs_attr_init(&dev_attr->attr); /* Build up postfix of <extend_name>_<modifier>_postfix */ if (chan->modified && (shared_by == IIO_SEPARATE)) { if (chan->extend_name) full_postfix = kasprintf(GFP_KERNEL, "%s_%s_%s", iio_modifier_names[chan->channel2], chan->extend_name, postfix); else full_postfix = kasprintf(GFP_KERNEL, "%s_%s", iio_modifier_names[chan->channel2], postfix); } else { if (chan->extend_name == NULL || shared_by != IIO_SEPARATE) full_postfix = kstrdup(postfix, GFP_KERNEL); else full_postfix = kasprintf(GFP_KERNEL, "%s_%s", chan->extend_name, postfix); } if (full_postfix == NULL) return -ENOMEM; if (chan->differential) { /* Differential can not have modifier */ switch (shared_by) { case IIO_SHARED_BY_ALL: name = kasprintf(GFP_KERNEL, "%s", full_postfix); break; case IIO_SHARED_BY_DIR: name = kasprintf(GFP_KERNEL, "%s_%s", iio_direction[chan->output], full_postfix); break; case IIO_SHARED_BY_TYPE: name = kasprintf(GFP_KERNEL, "%s_%s-%s_%s", iio_direction[chan->output], iio_chan_type_name_spec[chan->type], iio_chan_type_name_spec[chan->type], full_postfix); break; case IIO_SEPARATE: if (!chan->indexed) { WARN(1, "Differential channels must be indexed\n"); ret = -EINVAL; goto error_free_full_postfix; } name = kasprintf(GFP_KERNEL, "%s_%s%d-%s%d_%s", iio_direction[chan->output], iio_chan_type_name_spec[chan->type], chan->channel, iio_chan_type_name_spec[chan->type], chan->channel2, full_postfix); break; } } else { /* Single ended */ switch (shared_by) { case IIO_SHARED_BY_ALL: name = kasprintf(GFP_KERNEL, "%s", full_postfix); break; case IIO_SHARED_BY_DIR: name = kasprintf(GFP_KERNEL, "%s_%s", iio_direction[chan->output], full_postfix); break; case IIO_SHARED_BY_TYPE: name = kasprintf(GFP_KERNEL, "%s_%s_%s", iio_direction[chan->output], iio_chan_type_name_spec[chan->type], full_postfix); break; case IIO_SEPARATE: if (chan->indexed) name = kasprintf(GFP_KERNEL, "%s_%s%d_%s", iio_direction[chan->output], iio_chan_type_name_spec[chan->type], chan->channel, full_postfix); else name = kasprintf(GFP_KERNEL, "%s_%s_%s", iio_direction[chan->output], iio_chan_type_name_spec[chan->type], full_postfix); break; } } if (name == NULL) { ret = -ENOMEM; goto error_free_full_postfix; } dev_attr->attr.name = name; if (readfunc) { dev_attr->attr.mode |= 0444; dev_attr->show = readfunc; } if (writefunc) { dev_attr->attr.mode |= 0200; dev_attr->store = writefunc; } error_free_full_postfix: kfree(full_postfix); return ret; } static void __iio_device_attr_deinit(struct device_attribute *dev_attr) { kfree(dev_attr->attr.name); } int __iio_add_chan_devattr(const char *postfix, struct iio_chan_spec const *chan, ssize_t (*readfunc)(struct device *dev, struct device_attribute *attr, char *buf), ssize_t (*writefunc)(struct device *dev, struct device_attribute *attr, const char *buf, size_t len), u64 mask, enum iio_shared_by shared_by, struct device *dev, struct iio_buffer *buffer, struct list_head *attr_list) { int ret; struct iio_dev_attr *iio_attr, *t; iio_attr = kzalloc(sizeof(*iio_attr), GFP_KERNEL); if (iio_attr == NULL) return -ENOMEM; ret = __iio_device_attr_init(&iio_attr->dev_attr, postfix, chan, readfunc, writefunc, shared_by); if (ret) goto error_iio_dev_attr_free; iio_attr->c = chan; iio_attr->address = mask; iio_attr->buffer = buffer; list_for_each_entry(t, attr_list, l) if (strcmp(t->dev_attr.attr.name, iio_attr->dev_attr.attr.name) == 0) { if (shared_by == IIO_SEPARATE) dev_err(dev, "tried to double register : %s\n", t->dev_attr.attr.name); ret = -EBUSY; goto error_device_attr_deinit; } list_add(&iio_attr->l, attr_list); return 0; error_device_attr_deinit: __iio_device_attr_deinit(&iio_attr->dev_attr); error_iio_dev_attr_free: kfree(iio_attr); return ret; } static int iio_device_add_channel_label(struct iio_dev *indio_dev, struct iio_chan_spec const *chan) { struct iio_dev_opaque *iio_dev_opaque = to_iio_dev_opaque(indio_dev); int ret; if (!indio_dev->info->read_label && !chan->extend_name) return 0; ret = __iio_add_chan_devattr("label", chan, &iio_read_channel_label, NULL, 0, IIO_SEPARATE, &indio_dev->dev, NULL, &iio_dev_opaque->channel_attr_list); if (ret < 0) return ret; return 1; } static int iio_device_add_info_mask_type(struct iio_dev *indio_dev, struct iio_chan_spec const *chan, enum iio_shared_by shared_by, const long *infomask) { struct iio_dev_opaque *iio_dev_opaque = to_iio_dev_opaque(indio_dev); int i, ret, attrcount = 0; for_each_set_bit(i, infomask, sizeof(*infomask)*8) { if (i >= ARRAY_SIZE(iio_chan_info_postfix)) return -EINVAL; ret = __iio_add_chan_devattr(iio_chan_info_postfix[i], chan, &iio_read_channel_info, &iio_write_channel_info, i, shared_by, &indio_dev->dev, NULL, &iio_dev_opaque->channel_attr_list); if ((ret == -EBUSY) && (shared_by != IIO_SEPARATE)) continue; if (ret < 0) return ret; attrcount++; } return attrcount; } static int iio_device_add_info_mask_type_avail(struct iio_dev *indio_dev, struct iio_chan_spec const *chan, enum iio_shared_by shared_by, const long *infomask) { struct iio_dev_opaque *iio_dev_opaque = to_iio_dev_opaque(indio_dev); int i, ret, attrcount = 0; char *avail_postfix; for_each_set_bit(i, infomask, sizeof(*infomask) * 8) { if (i >= ARRAY_SIZE(iio_chan_info_postfix)) return -EINVAL; avail_postfix = kasprintf(GFP_KERNEL, "%s_available", iio_chan_info_postfix[i]); if (!avail_postfix) return -ENOMEM; ret = __iio_add_chan_devattr(avail_postfix, chan, &iio_read_channel_info_avail, NULL, i, shared_by, &indio_dev->dev, NULL, &iio_dev_opaque->channel_attr_list); kfree(avail_postfix); if ((ret == -EBUSY) && (shared_by != IIO_SEPARATE)) continue; if (ret < 0) return ret; attrcount++; } return attrcount; } static int iio_device_add_channel_sysfs(struct iio_dev *indio_dev, struct iio_chan_spec const *chan) { struct iio_dev_opaque *iio_dev_opaque = to_iio_dev_opaque(indio_dev); int ret, attrcount = 0; const struct iio_chan_spec_ext_info *ext_info; if (chan->channel < 0) return 0; ret = iio_device_add_info_mask_type(indio_dev, chan, IIO_SEPARATE, &chan->info_mask_separate); if (ret < 0) return ret; attrcount += ret; ret = iio_device_add_info_mask_type_avail(indio_dev, chan, IIO_SEPARATE, &chan->info_mask_separate_available); if (ret < 0) return ret; attrcount += ret; ret = iio_device_add_info_mask_type(indio_dev, chan, IIO_SHARED_BY_TYPE, &chan->info_mask_shared_by_type); if (ret < 0) return ret; attrcount += ret; ret = iio_device_add_info_mask_type_avail(indio_dev, chan, IIO_SHARED_BY_TYPE, &chan->info_mask_shared_by_type_available); if (ret < 0) return ret; attrcount += ret; ret = iio_device_add_info_mask_type(indio_dev, chan, IIO_SHARED_BY_DIR, &chan->info_mask_shared_by_dir); if (ret < 0) return ret; attrcount += ret; ret = iio_device_add_info_mask_type_avail(indio_dev, chan, IIO_SHARED_BY_DIR, &chan->info_mask_shared_by_dir_available); if (ret < 0) return ret; attrcount += ret; ret = iio_device_add_info_mask_type(indio_dev, chan, IIO_SHARED_BY_ALL, &chan->info_mask_shared_by_all); if (ret < 0) return ret; attrcount += ret; ret = iio_device_add_info_mask_type_avail(indio_dev, chan, IIO_SHARED_BY_ALL, &chan->info_mask_shared_by_all_available); if (ret < 0) return ret; attrcount += ret; ret = iio_device_add_channel_label(indio_dev, chan); if (ret < 0) return ret; attrcount += ret; if (chan->ext_info) { unsigned int i = 0; for (ext_info = chan->ext_info; ext_info->name; ext_info++) { ret = __iio_add_chan_devattr(ext_info->name, chan, ext_info->read ? &iio_read_channel_ext_info : NULL, ext_info->write ? &iio_write_channel_ext_info : NULL, i, ext_info->shared, &indio_dev->dev, NULL, &iio_dev_opaque->channel_attr_list); i++; if (ret == -EBUSY && ext_info->shared) continue; if (ret) return ret; attrcount++; } } return attrcount; } /** * iio_free_chan_devattr_list() - Free a list of IIO device attributes * @attr_list: List of IIO device attributes * * This function frees the memory allocated for each of the IIO device * attributes in the list. */ void iio_free_chan_devattr_list(struct list_head *attr_list) { struct iio_dev_attr *p, *n; list_for_each_entry_safe(p, n, attr_list, l) { kfree_const(p->dev_attr.attr.name); list_del(&p->l); kfree(p); } } static ssize_t name_show(struct device *dev, struct device_attribute *attr, char *buf) { struct iio_dev *indio_dev = dev_to_iio_dev(dev); return sysfs_emit(buf, "%s\n", indio_dev->name); } static DEVICE_ATTR_RO(name); static ssize_t label_show(struct device *dev, struct device_attribute *attr, char *buf) { struct iio_dev *indio_dev = dev_to_iio_dev(dev); return sysfs_emit(buf, "%s\n", indio_dev->label); } static DEVICE_ATTR_RO(label); static const char * const clock_names[] = { [CLOCK_REALTIME] = "realtime", [CLOCK_MONOTONIC] = "monotonic", [CLOCK_PROCESS_CPUTIME_ID] = "process_cputime_id", [CLOCK_THREAD_CPUTIME_ID] = "thread_cputime_id", [CLOCK_MONOTONIC_RAW] = "monotonic_raw", [CLOCK_REALTIME_COARSE] = "realtime_coarse", [CLOCK_MONOTONIC_COARSE] = "monotonic_coarse", [CLOCK_BOOTTIME] = "boottime", [CLOCK_REALTIME_ALARM] = "realtime_alarm", [CLOCK_BOOTTIME_ALARM] = "boottime_alarm", [CLOCK_SGI_CYCLE] = "sgi_cycle", [CLOCK_TAI] = "tai", }; static ssize_t current_timestamp_clock_show(struct device *dev, struct device_attribute *attr, char *buf) { const struct iio_dev *indio_dev = dev_to_iio_dev(dev); const clockid_t clk = iio_device_get_clock(indio_dev); switch (clk) { case CLOCK_REALTIME: case CLOCK_MONOTONIC: case CLOCK_MONOTONIC_RAW: case CLOCK_REALTIME_COARSE: case CLOCK_MONOTONIC_COARSE: case CLOCK_BOOTTIME: case CLOCK_TAI: break; default: BUG(); } return sysfs_emit(buf, "%s\n", clock_names[clk]); } static ssize_t current_timestamp_clock_store(struct device *dev, struct device_attribute *attr, const char *buf, size_t len) { clockid_t clk; int ret; ret = sysfs_match_string(clock_names, buf); if (ret < 0) return ret; clk = ret; switch (clk) { case CLOCK_REALTIME: case CLOCK_MONOTONIC: case CLOCK_MONOTONIC_RAW: case CLOCK_REALTIME_COARSE: case CLOCK_MONOTONIC_COARSE: case CLOCK_BOOTTIME: case CLOCK_TAI: break; default: return -EINVAL; } ret = iio_device_set_clock(dev_to_iio_dev(dev), clk); if (ret) return ret; return len; } int iio_device_register_sysfs_group(struct iio_dev *indio_dev, const struct attribute_group *group) { struct iio_dev_opaque *iio_dev_opaque = to_iio_dev_opaque(indio_dev); const struct attribute_group **new, **old = iio_dev_opaque->groups; unsigned int cnt = iio_dev_opaque->groupcounter; new = krealloc_array(old, cnt + 2, sizeof(*new), GFP_KERNEL); if (!new) return -ENOMEM; new[iio_dev_opaque->groupcounter++] = group; new[iio_dev_opaque->groupcounter] = NULL; iio_dev_opaque->groups = new; return 0; } static DEVICE_ATTR_RW(current_timestamp_clock); static int iio_device_register_sysfs(struct iio_dev *indio_dev) { struct iio_dev_opaque *iio_dev_opaque = to_iio_dev_opaque(indio_dev); int i, ret = 0, attrcount, attrn, attrcount_orig = 0; struct iio_dev_attr *p; struct attribute **attr, *clk = NULL; /* First count elements in any existing group */ if (indio_dev->info->attrs) { attr = indio_dev->info->attrs->attrs; while (*attr++ != NULL) attrcount_orig++; } attrcount = attrcount_orig; /* * New channel registration method - relies on the fact a group does * not need to be initialized if its name is NULL. */ if (indio_dev->channels) for (i = 0; i < indio_dev->num_channels; i++) { const struct iio_chan_spec *chan = &indio_dev->channels[i]; if (chan->type == IIO_TIMESTAMP) clk = &dev_attr_current_timestamp_clock.attr; ret = iio_device_add_channel_sysfs(indio_dev, chan); if (ret < 0) goto error_clear_attrs; attrcount += ret; } if (iio_dev_opaque->event_interface) clk = &dev_attr_current_timestamp_clock.attr; if (indio_dev->name) attrcount++; if (indio_dev->label) attrcount++; if (clk) attrcount++; iio_dev_opaque->chan_attr_group.attrs = kcalloc(attrcount + 1, sizeof(iio_dev_opaque->chan_attr_group.attrs[0]), GFP_KERNEL); if (iio_dev_opaque->chan_attr_group.attrs == NULL) { ret = -ENOMEM; goto error_clear_attrs; } /* Copy across original attributes, and point to original binary attributes */ if (indio_dev->info->attrs) { memcpy(iio_dev_opaque->chan_attr_group.attrs, indio_dev->info->attrs->attrs, sizeof(iio_dev_opaque->chan_attr_group.attrs[0]) *attrcount_orig); iio_dev_opaque->chan_attr_group.is_visible = indio_dev->info->attrs->is_visible; iio_dev_opaque->chan_attr_group.bin_attrs = indio_dev->info->attrs->bin_attrs; } attrn = attrcount_orig; /* Add all elements from the list. */ list_for_each_entry(p, &iio_dev_opaque->channel_attr_list, l) iio_dev_opaque->chan_attr_group.attrs[attrn++] = &p->dev_attr.attr; if (indio_dev->name) iio_dev_opaque->chan_attr_group.attrs[attrn++] = &dev_attr_name.attr; if (indio_dev->label) iio_dev_opaque->chan_attr_group.attrs[attrn++] = &dev_attr_label.attr; if (clk) iio_dev_opaque->chan_attr_group.attrs[attrn++] = clk; ret = iio_device_register_sysfs_group(indio_dev, &iio_dev_opaque->chan_attr_group); if (ret) goto error_free_chan_attrs; return 0; error_free_chan_attrs: kfree(iio_dev_opaque->chan_attr_group.attrs); iio_dev_opaque->chan_attr_group.attrs = NULL; error_clear_attrs: iio_free_chan_devattr_list(&iio_dev_opaque->channel_attr_list); return ret; } static void iio_device_unregister_sysfs(struct iio_dev *indio_dev) { struct iio_dev_opaque *iio_dev_opaque = to_iio_dev_opaque(indio_dev); iio_free_chan_devattr_list(&iio_dev_opaque->channel_attr_list); kfree(iio_dev_opaque->chan_attr_group.attrs); iio_dev_opaque->chan_attr_group.attrs = NULL; kfree(iio_dev_opaque->groups); iio_dev_opaque->groups = NULL; } static void iio_dev_release(struct device *device) { struct iio_dev *indio_dev = dev_to_iio_dev(device); struct iio_dev_opaque *iio_dev_opaque = to_iio_dev_opaque(indio_dev); if (indio_dev->modes & INDIO_ALL_TRIGGERED_MODES) iio_device_unregister_trigger_consumer(indio_dev); iio_device_unregister_eventset(indio_dev); iio_device_unregister_sysfs(indio_dev); iio_device_detach_buffers(indio_dev); lockdep_unregister_key(&iio_dev_opaque->mlock_key); ida_free(&iio_ida, iio_dev_opaque->id); kfree(iio_dev_opaque); } const struct device_type iio_device_type = { .name = "iio_device", .release = iio_dev_release, }; /** * iio_device_alloc() - allocate an iio_dev from a driver * @parent: Parent device. * @sizeof_priv: Space to allocate for private structure. * * Returns: * Pointer to allocated iio_dev on success, NULL on failure. */ struct iio_dev *iio_device_alloc(struct device *parent, int sizeof_priv) { struct iio_dev_opaque *iio_dev_opaque; struct iio_dev *indio_dev; size_t alloc_size; if (sizeof_priv) alloc_size = ALIGN(sizeof(*iio_dev_opaque), IIO_DMA_MINALIGN) + sizeof_priv; else alloc_size = sizeof(*iio_dev_opaque); iio_dev_opaque = kzalloc(alloc_size, GFP_KERNEL); if (!iio_dev_opaque) return NULL; indio_dev = &iio_dev_opaque->indio_dev; if (sizeof_priv) indio_dev->priv = (char *)iio_dev_opaque + ALIGN(sizeof(*iio_dev_opaque), IIO_DMA_MINALIGN); indio_dev->dev.parent = parent; indio_dev->dev.type = &iio_device_type; indio_dev->dev.bus = &iio_bus_type; device_initialize(&indio_dev->dev); mutex_init(&iio_dev_opaque->mlock); mutex_init(&iio_dev_opaque->info_exist_lock); INIT_LIST_HEAD(&iio_dev_opaque->channel_attr_list); iio_dev_opaque->id = ida_alloc(&iio_ida, GFP_KERNEL); if (iio_dev_opaque->id < 0) { /* cannot use a dev_err as the name isn't available */ pr_err("failed to get device id\n"); kfree(iio_dev_opaque); return NULL; } if (dev_set_name(&indio_dev->dev, "iio:device%d", iio_dev_opaque->id)) { ida_free(&iio_ida, iio_dev_opaque->id); kfree(iio_dev_opaque); return NULL; } INIT_LIST_HEAD(&iio_dev_opaque->buffer_list); INIT_LIST_HEAD(&iio_dev_opaque->ioctl_handlers); lockdep_register_key(&iio_dev_opaque->mlock_key); lockdep_set_class(&iio_dev_opaque->mlock, &iio_dev_opaque->mlock_key); return indio_dev; } EXPORT_SYMBOL(iio_device_alloc); /** * iio_device_free() - free an iio_dev from a driver * @dev: the iio_dev associated with the device */ void iio_device_free(struct iio_dev *dev) { if (dev) put_device(&dev->dev); } EXPORT_SYMBOL(iio_device_free); static void devm_iio_device_release(void *iio_dev) { iio_device_free(iio_dev); } /** * devm_iio_device_alloc - Resource-managed iio_device_alloc() * @parent: Device to allocate iio_dev for, and parent for this IIO device * @sizeof_priv: Space to allocate for private structure. * * Managed iio_device_alloc. iio_dev allocated with this function is * automatically freed on driver detach. * * Returns: * Pointer to allocated iio_dev on success, NULL on failure. */ struct iio_dev *devm_iio_device_alloc(struct device *parent, int sizeof_priv) { struct iio_dev *iio_dev; int ret; iio_dev = iio_device_alloc(parent, sizeof_priv); if (!iio_dev) return NULL; ret = devm_add_action_or_reset(parent, devm_iio_device_release, iio_dev); if (ret) return NULL; return iio_dev; } EXPORT_SYMBOL_GPL(devm_iio_device_alloc); /** * iio_chrdev_open() - chrdev file open for buffer access and ioctls * @inode: Inode structure for identifying the device in the file system * @filp: File structure for iio device used to keep and later access * private data * * Returns: 0 on success or -EBUSY if the device is already opened */ static int iio_chrdev_open(struct inode *inode, struct file *filp) { struct iio_dev_opaque *iio_dev_opaque = container_of(inode->i_cdev, struct iio_dev_opaque, chrdev); struct iio_dev *indio_dev = &iio_dev_opaque->indio_dev; struct iio_dev_buffer_pair *ib; if (test_and_set_bit(IIO_BUSY_BIT_POS, &iio_dev_opaque->flags)) return -EBUSY; iio_device_get(indio_dev); ib = kmalloc(sizeof(*ib), GFP_KERNEL); if (!ib) { iio_device_put(indio_dev); clear_bit(IIO_BUSY_BIT_POS, &iio_dev_opaque->flags); return -ENOMEM; } ib->indio_dev = indio_dev; ib->buffer = indio_dev->buffer; filp->private_data = ib; return 0; } /** * iio_chrdev_release() - chrdev file close buffer access and ioctls * @inode: Inode structure pointer for the char device * @filp: File structure pointer for the char device * * Returns: 0 for successful release. */ static int iio_chrdev_release(struct inode *inode, struct file *filp) { struct iio_dev_buffer_pair *ib = filp->private_data; struct iio_dev_opaque *iio_dev_opaque = container_of(inode->i_cdev, struct iio_dev_opaque, chrdev); struct iio_dev *indio_dev = &iio_dev_opaque->indio_dev; kfree(ib); clear_bit(IIO_BUSY_BIT_POS, &iio_dev_opaque->flags); iio_device_put(indio_dev); return 0; } void iio_device_ioctl_handler_register(struct iio_dev *indio_dev, struct iio_ioctl_handler *h) { struct iio_dev_opaque *iio_dev_opaque = to_iio_dev_opaque(indio_dev); list_add_tail(&h->entry, &iio_dev_opaque->ioctl_handlers); } void iio_device_ioctl_handler_unregister(struct iio_ioctl_handler *h) { list_del(&h->entry); } static long iio_ioctl(struct file *filp, unsigned int cmd, unsigned long arg) { struct iio_dev_buffer_pair *ib = filp->private_data; struct iio_dev *indio_dev = ib->indio_dev; struct iio_dev_opaque *iio_dev_opaque = to_iio_dev_opaque(indio_dev); struct iio_ioctl_handler *h; int ret; guard(mutex)(&iio_dev_opaque->info_exist_lock); /* * The NULL check here is required to prevent crashing when a device * is being removed while userspace would still have open file handles * to try to access this device. */ if (!indio_dev->info) return -ENODEV; list_for_each_entry(h, &iio_dev_opaque->ioctl_handlers, entry) { ret = h->ioctl(indio_dev, filp, cmd, arg); if (ret != IIO_IOCTL_UNHANDLED) return ret; } return -ENODEV; } static const struct file_operations iio_buffer_fileops = { .owner = THIS_MODULE, .llseek = noop_llseek, .read = iio_buffer_read_outer_addr, .write = iio_buffer_write_outer_addr, .poll = iio_buffer_poll_addr, .unlocked_ioctl = iio_ioctl, .compat_ioctl = compat_ptr_ioctl, .open = iio_chrdev_open, .release = iio_chrdev_release, }; static const struct file_operations iio_event_fileops = { .owner = THIS_MODULE, .llseek = noop_llseek, .unlocked_ioctl = iio_ioctl, .compat_ioctl = compat_ptr_ioctl, .open = iio_chrdev_open, .release = iio_chrdev_release, }; static int iio_check_unique_scan_index(struct iio_dev *indio_dev) { int i, j; const struct iio_chan_spec *channels = indio_dev->channels; if (!(indio_dev->modes & INDIO_ALL_BUFFER_MODES)) return 0; for (i = 0; i < indio_dev->num_channels - 1; i++) { if (channels[i].scan_index < 0) continue; for (j = i + 1; j < indio_dev->num_channels; j++) if (channels[i].scan_index == channels[j].scan_index) { dev_err(&indio_dev->dev, "Duplicate scan index %d\n", channels[i].scan_index); return -EINVAL; } } return 0; } static int iio_check_extended_name(const struct iio_dev *indio_dev) { unsigned int i; if (!indio_dev->info->read_label) return 0; for (i = 0; i < indio_dev->num_channels; i++) { if (indio_dev->channels[i].extend_name) { dev_err(&indio_dev->dev, "Cannot use labels and extend_name at the same time\n"); return -EINVAL; } } return 0; } static const struct iio_buffer_setup_ops noop_ring_setup_ops; static void iio_sanity_check_avail_scan_masks(struct iio_dev *indio_dev) { unsigned int num_masks, masklength, longs_per_mask; const unsigned long *av_masks; int i; av_masks = indio_dev->available_scan_masks; masklength = indio_dev->masklength; longs_per_mask = BITS_TO_LONGS(masklength); /* * The code determining how many available_scan_masks is in the array * will be assuming the end of masks when first long with all bits * zeroed is encountered. This is incorrect for masks where mask * consists of more than one long, and where some of the available masks * has long worth of bits zeroed (but has subsequent bit(s) set). This * is a safety measure against bug where array of masks is terminated by * a single zero while mask width is greater than width of a long. */ if (longs_per_mask > 1) dev_warn(indio_dev->dev.parent, "multi long available scan masks not fully supported\n"); if (bitmap_empty(av_masks, masklength)) dev_warn(indio_dev->dev.parent, "empty scan mask\n"); for (num_masks = 0; *av_masks; num_masks++) av_masks += longs_per_mask; if (num_masks < 2) return; av_masks = indio_dev->available_scan_masks; /* * Go through all the masks from first to one before the last, and see * that no mask found later from the available_scan_masks array is a * subset of mask found earlier. If this happens, then the mask found * later will never get used because scanning the array is stopped when * the first suitable mask is found. Drivers should order the array of * available masks in the order of preference (presumably the least * costy to access masks first). */ for (i = 0; i < num_masks - 1; i++) { const unsigned long *mask1; int j; mask1 = av_masks + i * longs_per_mask; for (j = i + 1; j < num_masks; j++) { const unsigned long *mask2; mask2 = av_masks + j * longs_per_mask; if (bitmap_subset(mask2, mask1, masklength)) dev_warn(indio_dev->dev.parent, "available_scan_mask %d subset of %d. Never used\n", j, i); } } } int __iio_device_register(struct iio_dev *indio_dev, struct module *this_mod) { struct iio_dev_opaque *iio_dev_opaque = to_iio_dev_opaque(indio_dev); struct fwnode_handle *fwnode = NULL; int ret; if (!indio_dev->info) return -EINVAL; iio_dev_opaque->driver_module = this_mod; /* If the calling driver did not initialize firmware node, do it here */ if (dev_fwnode(&indio_dev->dev)) fwnode = dev_fwnode(&indio_dev->dev); /* The default dummy IIO device has no parent */ else if (indio_dev->dev.parent) fwnode = dev_fwnode(indio_dev->dev.parent); device_set_node(&indio_dev->dev, fwnode); fwnode_property_read_string(fwnode, "label", &indio_dev->label); ret = iio_check_unique_scan_index(indio_dev); if (ret < 0) return ret; ret = iio_check_extended_name(indio_dev); if (ret < 0) return ret; iio_device_register_debugfs(indio_dev); ret = iio_buffers_alloc_sysfs_and_mask(indio_dev); if (ret) { dev_err(indio_dev->dev.parent, "Failed to create buffer sysfs interfaces\n"); goto error_unreg_debugfs; } if (indio_dev->available_scan_masks) iio_sanity_check_avail_scan_masks(indio_dev); ret = iio_device_register_sysfs(indio_dev); if (ret) { dev_err(indio_dev->dev.parent, "Failed to register sysfs interfaces\n"); goto error_buffer_free_sysfs; } ret = iio_device_register_eventset(indio_dev); if (ret) { dev_err(indio_dev->dev.parent, "Failed to register event set\n"); goto error_free_sysfs; } if (indio_dev->modes & INDIO_ALL_TRIGGERED_MODES) iio_device_register_trigger_consumer(indio_dev); if ((indio_dev->modes & INDIO_ALL_BUFFER_MODES) && indio_dev->setup_ops == NULL) indio_dev->setup_ops = &noop_ring_setup_ops; if (iio_dev_opaque->attached_buffers_cnt) cdev_init(&iio_dev_opaque->chrdev, &iio_buffer_fileops); else if (iio_dev_opaque->event_interface) cdev_init(&iio_dev_opaque->chrdev, &iio_event_fileops); if (iio_dev_opaque->attached_buffers_cnt || iio_dev_opaque->event_interface) { indio_dev->dev.devt = MKDEV(MAJOR(iio_devt), iio_dev_opaque->id); iio_dev_opaque->chrdev.owner = this_mod; } /* assign device groups now; they should be all registered now */ indio_dev->dev.groups = iio_dev_opaque->groups; ret = cdev_device_add(&iio_dev_opaque->chrdev, &indio_dev->dev); if (ret < 0) goto error_unreg_eventset; return 0; error_unreg_eventset: iio_device_unregister_eventset(indio_dev); error_free_sysfs: iio_device_unregister_sysfs(indio_dev); error_buffer_free_sysfs: iio_buffers_free_sysfs_and_mask(indio_dev); error_unreg_debugfs: iio_device_unregister_debugfs(indio_dev); return ret; } EXPORT_SYMBOL(__iio_device_register); /** * iio_device_unregister() - unregister a device from the IIO subsystem * @indio_dev: Device structure representing the device. */ void iio_device_unregister(struct iio_dev *indio_dev) { struct iio_dev_opaque *iio_dev_opaque = to_iio_dev_opaque(indio_dev); cdev_device_del(&iio_dev_opaque->chrdev, &indio_dev->dev); scoped_guard(mutex, &iio_dev_opaque->info_exist_lock) { iio_device_unregister_debugfs(indio_dev); iio_disable_all_buffers(indio_dev); indio_dev->info = NULL; iio_device_wakeup_eventset(indio_dev); iio_buffer_wakeup_poll(indio_dev); } iio_buffers_free_sysfs_and_mask(indio_dev); } EXPORT_SYMBOL(iio_device_unregister); static void devm_iio_device_unreg(void *indio_dev) { iio_device_unregister(indio_dev); } int __devm_iio_device_register(struct device *dev, struct iio_dev *indio_dev, struct module *this_mod) { int ret; ret = __iio_device_register(indio_dev, this_mod); if (ret) return ret; return devm_add_action_or_reset(dev, devm_iio_device_unreg, indio_dev); } EXPORT_SYMBOL_GPL(__devm_iio_device_register); /** * iio_device_claim_direct_mode - Keep device in direct mode * @indio_dev: the iio_dev associated with the device * * If the device is in direct mode it is guaranteed to stay * that way until iio_device_release_direct_mode() is called. * * Use with iio_device_release_direct_mode() * * Returns: 0 on success, -EBUSY on failure. */ int iio_device_claim_direct_mode(struct iio_dev *indio_dev) { struct iio_dev_opaque *iio_dev_opaque = to_iio_dev_opaque(indio_dev); mutex_lock(&iio_dev_opaque->mlock); if (iio_buffer_enabled(indio_dev)) { mutex_unlock(&iio_dev_opaque->mlock); return -EBUSY; } return 0; } EXPORT_SYMBOL_GPL(iio_device_claim_direct_mode); /** * iio_device_release_direct_mode - releases claim on direct mode * @indio_dev: the iio_dev associated with the device * * Release the claim. Device is no longer guaranteed to stay * in direct mode. * * Use with iio_device_claim_direct_mode() */ void iio_device_release_direct_mode(struct iio_dev *indio_dev) { mutex_unlock(&to_iio_dev_opaque(indio_dev)->mlock); } EXPORT_SYMBOL_GPL(iio_device_release_direct_mode); /** * iio_device_claim_buffer_mode - Keep device in buffer mode * @indio_dev: the iio_dev associated with the device * * If the device is in buffer mode it is guaranteed to stay * that way until iio_device_release_buffer_mode() is called. * * Use with iio_device_release_buffer_mode(). * * Returns: 0 on success, -EBUSY on failure. */ int iio_device_claim_buffer_mode(struct iio_dev *indio_dev) { struct iio_dev_opaque *iio_dev_opaque = to_iio_dev_opaque(indio_dev); mutex_lock(&iio_dev_opaque->mlock); if (iio_buffer_enabled(indio_dev)) return 0; mutex_unlock(&iio_dev_opaque->mlock); return -EBUSY; } EXPORT_SYMBOL_GPL(iio_device_claim_buffer_mode); /** * iio_device_release_buffer_mode - releases claim on buffer mode * @indio_dev: the iio_dev associated with the device * * Release the claim. Device is no longer guaranteed to stay * in buffer mode. * * Use with iio_device_claim_buffer_mode(). */ void iio_device_release_buffer_mode(struct iio_dev *indio_dev) { mutex_unlock(&to_iio_dev_opaque(indio_dev)->mlock); } EXPORT_SYMBOL_GPL(iio_device_release_buffer_mode); /** * iio_device_get_current_mode() - helper function providing read-only access to * the opaque @currentmode variable * @indio_dev: IIO device structure for device */ int iio_device_get_current_mode(struct iio_dev *indio_dev) { struct iio_dev_opaque *iio_dev_opaque = to_iio_dev_opaque(indio_dev); return iio_dev_opaque->currentmode; } EXPORT_SYMBOL_GPL(iio_device_get_current_mode); subsys_initcall(iio_init); module_exit(iio_exit); MODULE_AUTHOR("Jonathan Cameron <jic23@kernel.org>"); MODULE_DESCRIPTION("Industrial I/O core"); MODULE_LICENSE("GPL"); |
| 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 3 3 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 24 23 24 5 24 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080 1081 1082 1083 1084 1085 1086 1087 1088 1089 1090 1091 1092 1093 1094 1095 1096 1097 1098 1099 1100 1101 1102 1103 1104 1105 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125 1126 1127 1128 1129 1130 1131 1132 1133 1134 1135 1136 1137 1138 1139 1140 1141 1142 1143 1144 1145 1146 1147 1148 1149 1150 1151 1152 1153 1154 1155 1156 1157 1158 1159 1160 1161 1162 1163 1164 1165 1166 1167 1168 1169 1170 1171 1172 1173 1174 1175 1176 1177 1178 1179 1180 1181 1182 1183 1184 1185 1186 1187 1188 1189 1190 1191 1192 1193 1194 1195 1196 1197 1198 1199 1200 1201 1202 1203 1204 1205 1206 1207 1208 1209 1210 1211 1212 1213 1214 1215 1216 1217 1218 1219 1220 1221 1222 1223 1224 1225 1226 1227 1228 1229 1230 1231 1232 1233 1234 1235 1236 1237 1238 1239 1240 1241 1242 1243 1244 1245 1246 1247 1248 1249 1250 1251 1252 1253 1254 1255 1256 1257 1258 1259 1260 1261 1262 1263 1264 1265 1266 1267 1268 1269 1270 1271 1272 1273 1274 1275 1276 1277 1278 1279 1280 1281 1282 1283 1284 1285 1286 1287 1288 1289 1290 1291 1292 1293 1294 1295 1296 1297 1298 1299 1300 1301 1302 1303 1304 1305 1306 1307 1308 1309 1310 1311 1312 1313 1314 1315 1316 1317 1318 1319 1320 1321 1322 1323 1324 1325 1326 1327 1328 1329 1330 1331 1332 1333 1334 1335 1336 1337 1338 1339 1340 1341 1342 1343 1344 1345 1346 1347 1348 1349 1350 1351 1352 1353 1354 1355 1356 1357 1358 1359 1360 1361 1362 1363 1364 1365 1366 1367 1368 1369 1370 1371 1372 1373 1374 1375 1376 1377 1378 1379 1380 1381 1382 1383 1384 1385 1386 1387 1388 1389 1390 1391 1392 1393 1394 1395 1396 1397 1398 1399 1400 1401 1402 1403 1404 1405 1406 1407 1408 1409 1410 1411 1412 1413 1414 1415 1416 1417 1418 1419 1420 1421 1422 1423 1424 1425 1426 1427 1428 1429 1430 1431 1432 1433 1434 1435 1436 1437 1438 1439 1440 1441 1442 1443 1444 1445 1446 1447 1448 1449 1450 1451 1452 1453 1454 1455 1456 1457 1458 1459 1460 1461 1462 1463 1464 1465 1466 1467 1468 1469 1470 1471 1472 1473 1474 1475 1476 1477 1478 1479 1480 1481 1482 1483 1484 1485 1486 1487 1488 1489 1490 1491 1492 1493 1494 1495 1496 1497 1498 1499 1500 1501 1502 1503 1504 1505 1506 1507 1508 1509 1510 1511 1512 1513 1514 1515 1516 1517 1518 1519 1520 1521 1522 1523 1524 1525 1526 1527 1528 1529 1530 1531 1532 1533 1534 1535 1536 1537 1538 1539 1540 1541 1542 1543 1544 1545 1546 1547 1548 1549 1550 1551 1552 1553 1554 1555 1556 1557 1558 1559 1560 1561 1562 1563 1564 1565 1566 1567 1568 1569 1570 1571 1572 1573 1574 1575 1576 1577 1578 1579 1580 1581 1582 1583 1584 1585 1586 1587 1588 1589 1590 1591 1592 1593 1594 1595 1596 1597 1598 1599 1600 1601 1602 1603 1604 1605 1606 1607 1608 1609 1610 1611 1612 1613 1614 1615 1616 1617 1618 1619 1620 1621 1622 1623 1624 1625 1626 1627 1628 1629 1630 1631 1632 1633 1634 1635 1636 1637 1638 1639 1640 1641 1642 1643 1644 1645 1646 1647 1648 1649 1650 1651 1652 1653 1654 1655 1656 1657 1658 1659 1660 1661 1662 1663 1664 1665 1666 1667 1668 1669 1670 1671 1672 1673 1674 1675 1676 1677 1678 1679 1680 1681 1682 1683 1684 1685 1686 1687 1688 1689 1690 1691 1692 1693 1694 1695 1696 1697 1698 1699 1700 1701 1702 1703 1704 1705 1706 1707 1708 1709 1710 1711 1712 1713 1714 1715 1716 1717 1718 1719 1720 1721 1722 1723 1724 1725 1726 1727 1728 1729 1730 1731 1732 1733 1734 1735 1736 1737 1738 1739 1740 1741 1742 1743 1744 1745 1746 1747 1748 1749 1750 1751 1752 1753 1754 1755 1756 1757 1758 1759 1760 1761 1762 1763 1764 1765 1766 1767 1768 1769 1770 1771 1772 1773 1774 1775 1776 1777 1778 1779 1780 1781 1782 1783 1784 1785 1786 1787 1788 1789 1790 1791 1792 1793 1794 1795 1796 1797 1798 1799 1800 1801 1802 1803 1804 1805 1806 1807 1808 1809 1810 1811 1812 1813 1814 1815 1816 1817 1818 1819 1820 1821 1822 1823 1824 1825 1826 1827 1828 1829 1830 1831 1832 1833 1834 1835 1836 1837 1838 1839 1840 1841 1842 1843 1844 1845 1846 1847 1848 1849 1850 1851 1852 1853 1854 1855 1856 1857 1858 1859 1860 1861 1862 1863 1864 1865 1866 | // SPDX-License-Identifier: GPL-2.0 /* * thermal.c - Generic Thermal Management Sysfs support. * * Copyright (C) 2008 Intel Corp * Copyright (C) 2008 Zhang Rui <rui.zhang@intel.com> * Copyright (C) 2008 Sujith Thomas <sujith.thomas@intel.com> */ #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt #include <linux/device.h> #include <linux/err.h> #include <linux/export.h> #include <linux/slab.h> #include <linux/kdev_t.h> #include <linux/idr.h> #include <linux/list_sort.h> #include <linux/thermal.h> #include <linux/reboot.h> #include <linux/string.h> #include <linux/of.h> #include <linux/suspend.h> #define CREATE_TRACE_POINTS #include "thermal_trace.h" #include "thermal_core.h" #include "thermal_hwmon.h" static DEFINE_IDA(thermal_tz_ida); static DEFINE_IDA(thermal_cdev_ida); static LIST_HEAD(thermal_tz_list); static LIST_HEAD(thermal_cdev_list); static LIST_HEAD(thermal_governor_list); static DEFINE_MUTEX(thermal_list_lock); static DEFINE_MUTEX(thermal_governor_lock); static struct thermal_governor *def_governor; /* * Governor section: set of functions to handle thermal governors * * Functions to help in the life cycle of thermal governors within * the thermal core and by the thermal governor code. */ static struct thermal_governor *__find_governor(const char *name) { struct thermal_governor *pos; if (!name || !name[0]) return def_governor; list_for_each_entry(pos, &thermal_governor_list, governor_list) if (!strncasecmp(name, pos->name, THERMAL_NAME_LENGTH)) return pos; return NULL; } /** * bind_previous_governor() - bind the previous governor of the thermal zone * @tz: a valid pointer to a struct thermal_zone_device * @failed_gov_name: the name of the governor that failed to register * * Register the previous governor of the thermal zone after a new * governor has failed to be bound. */ static void bind_previous_governor(struct thermal_zone_device *tz, const char *failed_gov_name) { if (tz->governor && tz->governor->bind_to_tz) { if (tz->governor->bind_to_tz(tz)) { dev_err(&tz->device, "governor %s failed to bind and the previous one (%s) failed to bind again, thermal zone %s has no governor\n", failed_gov_name, tz->governor->name, tz->type); tz->governor = NULL; } } } /** * thermal_set_governor() - Switch to another governor * @tz: a valid pointer to a struct thermal_zone_device * @new_gov: pointer to the new governor * * Change the governor of thermal zone @tz. * * Return: 0 on success, an error if the new governor's bind_to_tz() failed. */ static int thermal_set_governor(struct thermal_zone_device *tz, struct thermal_governor *new_gov) { int ret = 0; if (tz->governor && tz->governor->unbind_from_tz) tz->governor->unbind_from_tz(tz); if (new_gov && new_gov->bind_to_tz) { ret = new_gov->bind_to_tz(tz); if (ret) { bind_previous_governor(tz, new_gov->name); return ret; } } tz->governor = new_gov; return ret; } int thermal_register_governor(struct thermal_governor *governor) { int err; const char *name; struct thermal_zone_device *pos; if (!governor) return -EINVAL; mutex_lock(&thermal_governor_lock); err = -EBUSY; if (!__find_governor(governor->name)) { bool match_default; err = 0; list_add(&governor->governor_list, &thermal_governor_list); match_default = !strncmp(governor->name, DEFAULT_THERMAL_GOVERNOR, THERMAL_NAME_LENGTH); if (!def_governor && match_default) def_governor = governor; } mutex_lock(&thermal_list_lock); list_for_each_entry(pos, &thermal_tz_list, node) { /* * only thermal zones with specified tz->tzp->governor_name * may run with tz->govenor unset */ if (pos->governor) continue; name = pos->tzp->governor_name; if (!strncasecmp(name, governor->name, THERMAL_NAME_LENGTH)) { int ret; ret = thermal_set_governor(pos, governor); if (ret) dev_err(&pos->device, "Failed to set governor %s for thermal zone %s: %d\n", governor->name, pos->type, ret); } } mutex_unlock(&thermal_list_lock); mutex_unlock(&thermal_governor_lock); return err; } void thermal_unregister_governor(struct thermal_governor *governor) { struct thermal_zone_device *pos; if (!governor) return; mutex_lock(&thermal_governor_lock); if (!__find_governor(governor->name)) goto exit; mutex_lock(&thermal_list_lock); list_for_each_entry(pos, &thermal_tz_list, node) { if (!strncasecmp(pos->governor->name, governor->name, THERMAL_NAME_LENGTH)) thermal_set_governor(pos, NULL); } mutex_unlock(&thermal_list_lock); list_del(&governor->governor_list); exit: mutex_unlock(&thermal_governor_lock); } int thermal_zone_device_set_policy(struct thermal_zone_device *tz, char *policy) { struct thermal_governor *gov; int ret = -EINVAL; mutex_lock(&thermal_governor_lock); mutex_lock(&tz->lock); gov = __find_governor(strim(policy)); if (!gov) goto exit; ret = thermal_set_governor(tz, gov); exit: mutex_unlock(&tz->lock); mutex_unlock(&thermal_governor_lock); thermal_notify_tz_gov_change(tz, policy); return ret; } int thermal_build_list_of_policies(char *buf) { struct thermal_governor *pos; ssize_t count = 0; mutex_lock(&thermal_governor_lock); list_for_each_entry(pos, &thermal_governor_list, governor_list) { count += sysfs_emit_at(buf, count, "%s ", pos->name); } count += sysfs_emit_at(buf, count, "\n"); mutex_unlock(&thermal_governor_lock); return count; } static void __init thermal_unregister_governors(void) { struct thermal_governor **governor; for_each_governor_table(governor) thermal_unregister_governor(*governor); } static int __init thermal_register_governors(void) { int ret = 0; struct thermal_governor **governor; for_each_governor_table(governor) { ret = thermal_register_governor(*governor); if (ret) { pr_err("Failed to register governor: '%s'", (*governor)->name); break; } pr_info("Registered thermal governor '%s'", (*governor)->name); } if (ret) { struct thermal_governor **gov; for_each_governor_table(gov) { if (gov == governor) break; thermal_unregister_governor(*gov); } } return ret; } static int __thermal_zone_device_set_mode(struct thermal_zone_device *tz, enum thermal_device_mode mode) { if (tz->ops.change_mode) { int ret; ret = tz->ops.change_mode(tz, mode); if (ret) return ret; } tz->mode = mode; return 0; } static void thermal_zone_broken_disable(struct thermal_zone_device *tz) { struct thermal_trip_desc *td; dev_err(&tz->device, "Unable to get temperature, disabling!\n"); /* * This function only runs for enabled thermal zones, so no need to * check for the current mode. */ __thermal_zone_device_set_mode(tz, THERMAL_DEVICE_DISABLED); thermal_notify_tz_disable(tz); for_each_trip_desc(tz, td) { if (td->trip.type == THERMAL_TRIP_CRITICAL && td->trip.temperature > THERMAL_TEMP_INVALID) { dev_crit(&tz->device, "Disabled thermal zone with critical trip point\n"); return; } } } /* * Zone update section: main control loop applied to each zone while monitoring * in polling mode. The monitoring is done using a workqueue. * Same update may be done on a zone by calling thermal_zone_device_update(). * * An update means: * - Non-critical trips will invoke the governor responsible for that zone; * - Hot trips will produce a notification to userspace; * - Critical trip point will cause a system shutdown. */ static void thermal_zone_device_set_polling(struct thermal_zone_device *tz, unsigned long delay) { if (delay) mod_delayed_work(system_freezable_power_efficient_wq, &tz->poll_queue, delay); else cancel_delayed_work(&tz->poll_queue); } static void thermal_zone_recheck(struct thermal_zone_device *tz, int error) { if (error == -EAGAIN) { thermal_zone_device_set_polling(tz, THERMAL_RECHECK_DELAY); return; } /* * Print the message once to reduce log noise. It will be followed by * another one if the temperature cannot be determined after multiple * attempts. */ if (tz->recheck_delay_jiffies == THERMAL_RECHECK_DELAY) dev_info(&tz->device, "Temperature check failed (%d)\n", error); thermal_zone_device_set_polling(tz, tz->recheck_delay_jiffies); tz->recheck_delay_jiffies += max(tz->recheck_delay_jiffies >> 1, 1ULL); if (tz->recheck_delay_jiffies > THERMAL_MAX_RECHECK_DELAY) { thermal_zone_broken_disable(tz); /* * Restore the original recheck delay value to allow the thermal * zone to try to recover when it is reenabled by user space. */ tz->recheck_delay_jiffies = THERMAL_RECHECK_DELAY; } } static void monitor_thermal_zone(struct thermal_zone_device *tz) { if (tz->mode != THERMAL_DEVICE_ENABLED) thermal_zone_device_set_polling(tz, 0); else if (tz->passive > 0) thermal_zone_device_set_polling(tz, tz->passive_delay_jiffies); else if (tz->polling_delay_jiffies) thermal_zone_device_set_polling(tz, tz->polling_delay_jiffies); } static struct thermal_governor *thermal_get_tz_governor(struct thermal_zone_device *tz) { if (tz->governor) return tz->governor; return def_governor; } void thermal_governor_update_tz(struct thermal_zone_device *tz, enum thermal_notify_event reason) { if (!tz->governor || !tz->governor->update_tz) return; tz->governor->update_tz(tz, reason); } static void thermal_zone_device_halt(struct thermal_zone_device *tz, bool shutdown) { /* * poweroff_delay_ms must be a carefully profiled positive value. * Its a must for forced_emergency_poweroff_work to be scheduled. */ int poweroff_delay_ms = CONFIG_THERMAL_EMERGENCY_POWEROFF_DELAY_MS; const char *msg = "Temperature too high"; dev_emerg(&tz->device, "%s: critical temperature reached\n", tz->type); if (shutdown) hw_protection_shutdown(msg, poweroff_delay_ms); else hw_protection_reboot(msg, poweroff_delay_ms); } void thermal_zone_device_critical(struct thermal_zone_device *tz) { thermal_zone_device_halt(tz, true); } EXPORT_SYMBOL(thermal_zone_device_critical); void thermal_zone_device_critical_reboot(struct thermal_zone_device *tz) { thermal_zone_device_halt(tz, false); } static void handle_critical_trips(struct thermal_zone_device *tz, const struct thermal_trip *trip) { trace_thermal_zone_trip(tz, thermal_zone_trip_id(tz, trip), trip->type); if (trip->type == THERMAL_TRIP_CRITICAL) tz->ops.critical(tz); else if (tz->ops.hot) tz->ops.hot(tz); } static void handle_thermal_trip(struct thermal_zone_device *tz, struct thermal_trip_desc *td, struct list_head *way_up_list, struct list_head *way_down_list) { const struct thermal_trip *trip = &td->trip; int old_threshold; if (trip->temperature == THERMAL_TEMP_INVALID) return; /* * If the trip temperature or hysteresis has been updated recently, * the threshold needs to be computed again using the new values. * However, its initial value still reflects the old ones and that * is what needs to be compared with the previous zone temperature * to decide which action to take. */ old_threshold = td->threshold; td->threshold = trip->temperature; if (tz->last_temperature >= old_threshold && tz->last_temperature != THERMAL_TEMP_INIT) { /* * Mitigation is under way, so it needs to stop if the zone * temperature falls below the low temperature of the trip. * In that case, the trip temperature becomes the new threshold. */ if (tz->temperature < trip->temperature - trip->hysteresis) { list_add(&td->notify_list_node, way_down_list); td->notify_temp = trip->temperature - trip->hysteresis; if (trip->type == THERMAL_TRIP_PASSIVE) { tz->passive--; WARN_ON(tz->passive < 0); } } else { td->threshold -= trip->hysteresis; } } else if (tz->temperature >= trip->temperature) { /* * There is no mitigation under way, so it needs to be started * if the zone temperature exceeds the trip one. The new * threshold is then set to the low temperature of the trip. */ list_add_tail(&td->notify_list_node, way_up_list); td->notify_temp = trip->temperature; td->threshold -= trip->hysteresis; if (trip->type == THERMAL_TRIP_PASSIVE) tz->passive++; else if (trip->type == THERMAL_TRIP_CRITICAL || trip->type == THERMAL_TRIP_HOT) handle_critical_trips(tz, trip); } } static void thermal_zone_device_check(struct work_struct *work) { struct thermal_zone_device *tz = container_of(work, struct thermal_zone_device, poll_queue.work); thermal_zone_device_update(tz, THERMAL_EVENT_UNSPECIFIED); } static void thermal_zone_device_init(struct thermal_zone_device *tz) { struct thermal_instance *pos; INIT_DELAYED_WORK(&tz->poll_queue, thermal_zone_device_check); tz->temperature = THERMAL_TEMP_INIT; tz->passive = 0; tz->prev_low_trip = -INT_MAX; tz->prev_high_trip = INT_MAX; list_for_each_entry(pos, &tz->thermal_instances, tz_node) pos->initialized = false; } static void thermal_governor_trip_crossed(struct thermal_governor *governor, struct thermal_zone_device *tz, const struct thermal_trip *trip, bool crossed_up) { if (trip->type == THERMAL_TRIP_HOT || trip->type == THERMAL_TRIP_CRITICAL) return; if (governor->trip_crossed) governor->trip_crossed(tz, trip, crossed_up); } static void thermal_trip_crossed(struct thermal_zone_device *tz, const struct thermal_trip *trip, struct thermal_governor *governor, bool crossed_up) { if (crossed_up) { thermal_notify_tz_trip_up(tz, trip); thermal_debug_tz_trip_up(tz, trip); } else { thermal_notify_tz_trip_down(tz, trip); thermal_debug_tz_trip_down(tz, trip); } thermal_governor_trip_crossed(governor, tz, trip, crossed_up); } static int thermal_trip_notify_cmp(void *not_used, const struct list_head *a, const struct list_head *b) { struct thermal_trip_desc *tda = container_of(a, struct thermal_trip_desc, notify_list_node); struct thermal_trip_desc *tdb = container_of(b, struct thermal_trip_desc, notify_list_node); return tda->notify_temp - tdb->notify_temp; } void __thermal_zone_device_update(struct thermal_zone_device *tz, enum thermal_notify_event event) { struct thermal_governor *governor = thermal_get_tz_governor(tz); struct thermal_trip_desc *td; LIST_HEAD(way_down_list); LIST_HEAD(way_up_list); int temp, ret; if (tz->suspended) return; if (!thermal_zone_device_is_enabled(tz)) return; ret = __thermal_zone_get_temp(tz, &temp); if (ret) { thermal_zone_recheck(tz, ret); return; } else if (temp <= THERMAL_TEMP_INVALID) { /* * Special case: No valid temperature value is available, but * the zone owner does not want the core to do anything about * it. Continue regular zone polling if needed, so that this * function can be called again, but skip everything else. */ goto monitor; } tz->recheck_delay_jiffies = THERMAL_RECHECK_DELAY; tz->last_temperature = tz->temperature; tz->temperature = temp; trace_thermal_temperature(tz); thermal_genl_sampling_temp(tz->id, temp); tz->notify_event = event; for_each_trip_desc(tz, td) handle_thermal_trip(tz, td, &way_up_list, &way_down_list); thermal_zone_set_trips(tz); list_sort(NULL, &way_up_list, thermal_trip_notify_cmp); list_for_each_entry(td, &way_up_list, notify_list_node) thermal_trip_crossed(tz, &td->trip, governor, true); list_sort(NULL, &way_down_list, thermal_trip_notify_cmp); list_for_each_entry_reverse(td, &way_down_list, notify_list_node) thermal_trip_crossed(tz, &td->trip, governor, false); if (governor->manage) governor->manage(tz); thermal_debug_update_trip_stats(tz); monitor: monitor_thermal_zone(tz); } static int thermal_zone_device_set_mode(struct thermal_zone_device *tz, enum thermal_device_mode mode) { int ret; mutex_lock(&tz->lock); /* do nothing if mode isn't changing */ if (mode == tz->mode) { mutex_unlock(&tz->lock); return 0; } ret = __thermal_zone_device_set_mode(tz, mode); if (ret) { mutex_unlock(&tz->lock); return ret; } __thermal_zone_device_update(tz, THERMAL_EVENT_UNSPECIFIED); mutex_unlock(&tz->lock); if (mode == THERMAL_DEVICE_ENABLED) thermal_notify_tz_enable(tz); else thermal_notify_tz_disable(tz); return 0; } int thermal_zone_device_enable(struct thermal_zone_device *tz) { return thermal_zone_device_set_mode(tz, THERMAL_DEVICE_ENABLED); } EXPORT_SYMBOL_GPL(thermal_zone_device_enable); int thermal_zone_device_disable(struct thermal_zone_device *tz) { return thermal_zone_device_set_mode(tz, THERMAL_DEVICE_DISABLED); } EXPORT_SYMBOL_GPL(thermal_zone_device_disable); int thermal_zone_device_is_enabled(struct thermal_zone_device *tz) { lockdep_assert_held(&tz->lock); return tz->mode == THERMAL_DEVICE_ENABLED; } static bool thermal_zone_is_present(struct thermal_zone_device *tz) { return !list_empty(&tz->node); } void thermal_zone_device_update(struct thermal_zone_device *tz, enum thermal_notify_event event) { mutex_lock(&tz->lock); if (thermal_zone_is_present(tz)) __thermal_zone_device_update(tz, event); mutex_unlock(&tz->lock); } EXPORT_SYMBOL_GPL(thermal_zone_device_update); void thermal_zone_trip_down(struct thermal_zone_device *tz, const struct thermal_trip *trip) { thermal_trip_crossed(tz, trip, thermal_get_tz_governor(tz), false); } int for_each_thermal_governor(int (*cb)(struct thermal_governor *, void *), void *data) { struct thermal_governor *gov; int ret = 0; mutex_lock(&thermal_governor_lock); list_for_each_entry(gov, &thermal_governor_list, governor_list) { ret = cb(gov, data); if (ret) break; } mutex_unlock(&thermal_governor_lock); return ret; } int for_each_thermal_cooling_device(int (*cb)(struct thermal_cooling_device *, void *), void *data) { struct thermal_cooling_device *cdev; int ret = 0; mutex_lock(&thermal_list_lock); list_for_each_entry(cdev, &thermal_cdev_list, node) { ret = cb(cdev, data); if (ret) break; } mutex_unlock(&thermal_list_lock); return ret; } int for_each_thermal_zone(int (*cb)(struct thermal_zone_device *, void *), void *data) { struct thermal_zone_device *tz; int ret = 0; mutex_lock(&thermal_list_lock); list_for_each_entry(tz, &thermal_tz_list, node) { ret = cb(tz, data); if (ret) break; } mutex_unlock(&thermal_list_lock); return ret; } struct thermal_zone_device *thermal_zone_get_by_id(int id) { struct thermal_zone_device *tz, *match = NULL; mutex_lock(&thermal_list_lock); list_for_each_entry(tz, &thermal_tz_list, node) { if (tz->id == id) { match = tz; break; } } mutex_unlock(&thermal_list_lock); return match; } /* * Device management section: cooling devices, zones devices, and binding * * Set of functions provided by the thermal core for: * - cooling devices lifecycle: registration, unregistration, * binding, and unbinding. * - thermal zone devices lifecycle: registration, unregistration, * binding, and unbinding. */ /** * thermal_bind_cdev_to_trip - bind a cooling device to a thermal zone * @tz: pointer to struct thermal_zone_device * @trip: trip point the cooling devices is associated with in this zone. * @cdev: pointer to struct thermal_cooling_device * @upper: the Maximum cooling state for this trip point. * THERMAL_NO_LIMIT means no upper limit, * and the cooling device can be in max_state. * @lower: the Minimum cooling state can be used for this trip point. * THERMAL_NO_LIMIT means no lower limit, * and the cooling device can be in cooling state 0. * @weight: The weight of the cooling device to be bound to the * thermal zone. Use THERMAL_WEIGHT_DEFAULT for the * default value * * This interface function bind a thermal cooling device to the certain trip * point of a thermal zone device. * This function is usually called in the thermal zone device .bind callback. * * Return: 0 on success, the proper error value otherwise. */ int thermal_bind_cdev_to_trip(struct thermal_zone_device *tz, const struct thermal_trip *trip, struct thermal_cooling_device *cdev, unsigned long upper, unsigned long lower, unsigned int weight) { struct thermal_instance *dev; struct thermal_instance *pos; struct thermal_zone_device *pos1; struct thermal_cooling_device *pos2; bool upper_no_limit; int result; list_for_each_entry(pos1, &thermal_tz_list, node) { if (pos1 == tz) break; } list_for_each_entry(pos2, &thermal_cdev_list, node) { if (pos2 == cdev) break; } if (tz != pos1 || cdev != pos2) return -EINVAL; /* lower default 0, upper default max_state */ lower = lower == THERMAL_NO_LIMIT ? 0 : lower; if (upper == THERMAL_NO_LIMIT) { upper = cdev->max_state; upper_no_limit = true; } else { upper_no_limit = false; } if (lower > upper || upper > cdev->max_state) return -EINVAL; dev = kzalloc(sizeof(*dev), GFP_KERNEL); if (!dev) return -ENOMEM; dev->tz = tz; dev->cdev = cdev; dev->trip = trip; dev->upper = upper; dev->upper_no_limit = upper_no_limit; dev->lower = lower; dev->target = THERMAL_NO_TARGET; dev->weight = weight; result = ida_alloc(&tz->ida, GFP_KERNEL); if (result < 0) goto free_mem; dev->id = result; sprintf(dev->name, "cdev%d", dev->id); result = sysfs_create_link(&tz->device.kobj, &cdev->device.kobj, dev->name); if (result) goto release_ida; snprintf(dev->attr_name, sizeof(dev->attr_name), "cdev%d_trip_point", dev->id); sysfs_attr_init(&dev->attr.attr); dev->attr.attr.name = dev->attr_name; dev->attr.attr.mode = 0444; dev->attr.show = trip_point_show; result = device_create_file(&tz->device, &dev->attr); if (result) goto remove_symbol_link; snprintf(dev->weight_attr_name, sizeof(dev->weight_attr_name), "cdev%d_weight", dev->id); sysfs_attr_init(&dev->weight_attr.attr); dev->weight_attr.attr.name = dev->weight_attr_name; dev->weight_attr.attr.mode = S_IWUSR | S_IRUGO; dev->weight_attr.show = weight_show; dev->weight_attr.store = weight_store; result = device_create_file(&tz->device, &dev->weight_attr); if (result) goto remove_trip_file; mutex_lock(&tz->lock); mutex_lock(&cdev->lock); list_for_each_entry(pos, &tz->thermal_instances, tz_node) if (pos->tz == tz && pos->trip == trip && pos->cdev == cdev) { result = -EEXIST; break; } if (!result) { list_add_tail(&dev->tz_node, &tz->thermal_instances); list_add_tail(&dev->cdev_node, &cdev->thermal_instances); atomic_set(&tz->need_update, 1); thermal_governor_update_tz(tz, THERMAL_TZ_BIND_CDEV); } mutex_unlock(&cdev->lock); mutex_unlock(&tz->lock); if (!result) return 0; device_remove_file(&tz->device, &dev->weight_attr); remove_trip_file: device_remove_file(&tz->device, &dev->attr); remove_symbol_link: sysfs_remove_link(&tz->device.kobj, dev->name); release_ida: ida_free(&tz->ida, dev->id); free_mem: kfree(dev); return result; } EXPORT_SYMBOL_GPL(thermal_bind_cdev_to_trip); int thermal_zone_bind_cooling_device(struct thermal_zone_device *tz, int trip_index, struct thermal_cooling_device *cdev, unsigned long upper, unsigned long lower, unsigned int weight) { if (trip_index < 0 || trip_index >= tz->num_trips) return -EINVAL; return thermal_bind_cdev_to_trip(tz, &tz->trips[trip_index].trip, cdev, upper, lower, weight); } EXPORT_SYMBOL_GPL(thermal_zone_bind_cooling_device); /** * thermal_unbind_cdev_from_trip - unbind a cooling device from a thermal zone. * @tz: pointer to a struct thermal_zone_device. * @trip: trip point the cooling devices is associated with in this zone. * @cdev: pointer to a struct thermal_cooling_device. * * This interface function unbind a thermal cooling device from the certain * trip point of a thermal zone device. * This function is usually called in the thermal zone device .unbind callback. * * Return: 0 on success, the proper error value otherwise. */ int thermal_unbind_cdev_from_trip(struct thermal_zone_device *tz, const struct thermal_trip *trip, struct thermal_cooling_device *cdev) { struct thermal_instance *pos, *next; mutex_lock(&tz->lock); mutex_lock(&cdev->lock); list_for_each_entry_safe(pos, next, &tz->thermal_instances, tz_node) { if (pos->tz == tz && pos->trip == trip && pos->cdev == cdev) { list_del(&pos->tz_node); list_del(&pos->cdev_node); thermal_governor_update_tz(tz, THERMAL_TZ_UNBIND_CDEV); mutex_unlock(&cdev->lock); mutex_unlock(&tz->lock); goto unbind; } } mutex_unlock(&cdev->lock); mutex_unlock(&tz->lock); return -ENODEV; unbind: device_remove_file(&tz->device, &pos->weight_attr); device_remove_file(&tz->device, &pos->attr); sysfs_remove_link(&tz->device.kobj, pos->name); ida_free(&tz->ida, pos->id); kfree(pos); return 0; } EXPORT_SYMBOL_GPL(thermal_unbind_cdev_from_trip); int thermal_zone_unbind_cooling_device(struct thermal_zone_device *tz, int trip_index, struct thermal_cooling_device *cdev) { if (trip_index < 0 || trip_index >= tz->num_trips) return -EINVAL; return thermal_unbind_cdev_from_trip(tz, &tz->trips[trip_index].trip, cdev); } EXPORT_SYMBOL_GPL(thermal_zone_unbind_cooling_device); static void thermal_release(struct device *dev) { struct thermal_zone_device *tz; struct thermal_cooling_device *cdev; if (!strncmp(dev_name(dev), "thermal_zone", sizeof("thermal_zone") - 1)) { tz = to_thermal_zone(dev); thermal_zone_destroy_device_groups(tz); mutex_destroy(&tz->lock); complete(&tz->removal); } else if (!strncmp(dev_name(dev), "cooling_device", sizeof("cooling_device") - 1)) { cdev = to_cooling_device(dev); thermal_cooling_device_destroy_sysfs(cdev); kfree_const(cdev->type); ida_free(&thermal_cdev_ida, cdev->id); kfree(cdev); } } static struct class *thermal_class; static inline void print_bind_err_msg(struct thermal_zone_device *tz, struct thermal_cooling_device *cdev, int ret) { dev_err(&tz->device, "binding zone %s with cdev %s failed:%d\n", tz->type, cdev->type, ret); } static void bind_cdev(struct thermal_cooling_device *cdev) { int ret; struct thermal_zone_device *pos = NULL; list_for_each_entry(pos, &thermal_tz_list, node) { if (pos->ops.bind) { ret = pos->ops.bind(pos, cdev); if (ret) print_bind_err_msg(pos, cdev, ret); } } } /** * __thermal_cooling_device_register() - register a new thermal cooling device * @np: a pointer to a device tree node. * @type: the thermal cooling device type. * @devdata: device private data. * @ops: standard thermal cooling devices callbacks. * * This interface function adds a new thermal cooling device (fan/processor/...) * to /sys/class/thermal/ folder as cooling_device[0-*]. It tries to bind itself * to all the thermal zone devices registered at the same time. * It also gives the opportunity to link the cooling device to a device tree * node, so that it can be bound to a thermal zone created out of device tree. * * Return: a pointer to the created struct thermal_cooling_device or an * ERR_PTR. Caller must check return value with IS_ERR*() helpers. */ static struct thermal_cooling_device * __thermal_cooling_device_register(struct device_node *np, const char *type, void *devdata, const struct thermal_cooling_device_ops *ops) { struct thermal_cooling_device *cdev; struct thermal_zone_device *pos = NULL; unsigned long current_state; int id, ret; if (!ops || !ops->get_max_state || !ops->get_cur_state || !ops->set_cur_state) return ERR_PTR(-EINVAL); if (!thermal_class) return ERR_PTR(-ENODEV); cdev = kzalloc(sizeof(*cdev), GFP_KERNEL); if (!cdev) return ERR_PTR(-ENOMEM); ret = ida_alloc(&thermal_cdev_ida, GFP_KERNEL); if (ret < 0) goto out_kfree_cdev; cdev->id = ret; id = ret; cdev->type = kstrdup_const(type ? type : "", GFP_KERNEL); if (!cdev->type) { ret = -ENOMEM; goto out_ida_remove; } mutex_init(&cdev->lock); INIT_LIST_HEAD(&cdev->thermal_instances); cdev->np = np; cdev->ops = ops; cdev->updated = false; cdev->device.class = thermal_class; cdev->devdata = devdata; ret = cdev->ops->get_max_state(cdev, &cdev->max_state); if (ret) goto out_cdev_type; /* * The cooling device's current state is only needed for debug * initialization below, so a failure to get it does not cause * the entire cooling device initialization to fail. However, * the debug will not work for the device if its initial state * cannot be determined and drivers are responsible for ensuring * that this will not happen. */ ret = cdev->ops->get_cur_state(cdev, ¤t_state); if (ret) current_state = ULONG_MAX; thermal_cooling_device_setup_sysfs(cdev); ret = dev_set_name(&cdev->device, "cooling_device%d", cdev->id); if (ret) goto out_cooling_dev; ret = device_register(&cdev->device); if (ret) { /* thermal_release() handles rest of the cleanup */ put_device(&cdev->device); return ERR_PTR(ret); } if (current_state <= cdev->max_state) thermal_debug_cdev_add(cdev, current_state); /* Add 'this' new cdev to the global cdev list */ mutex_lock(&thermal_list_lock); list_add(&cdev->node, &thermal_cdev_list); /* Update binding information for 'this' new cdev */ bind_cdev(cdev); list_for_each_entry(pos, &thermal_tz_list, node) if (atomic_cmpxchg(&pos->need_update, 1, 0)) thermal_zone_device_update(pos, THERMAL_EVENT_UNSPECIFIED); mutex_unlock(&thermal_list_lock); return cdev; out_cooling_dev: thermal_cooling_device_destroy_sysfs(cdev); out_cdev_type: kfree_const(cdev->type); out_ida_remove: ida_free(&thermal_cdev_ida, id); out_kfree_cdev: kfree(cdev); return ERR_PTR(ret); } /** * thermal_cooling_device_register() - register a new thermal cooling device * @type: the thermal cooling device type. * @devdata: device private data. * @ops: standard thermal cooling devices callbacks. * * This interface function adds a new thermal cooling device (fan/processor/...) * to /sys/class/thermal/ folder as cooling_device[0-*]. It tries to bind itself * to all the thermal zone devices registered at the same time. * * Return: a pointer to the created struct thermal_cooling_device or an * ERR_PTR. Caller must check return value with IS_ERR*() helpers. */ struct thermal_cooling_device * thermal_cooling_device_register(const char *type, void *devdata, const struct thermal_cooling_device_ops *ops) { return __thermal_cooling_device_register(NULL, type, devdata, ops); } EXPORT_SYMBOL_GPL(thermal_cooling_device_register); /** * thermal_of_cooling_device_register() - register an OF thermal cooling device * @np: a pointer to a device tree node. * @type: the thermal cooling device type. * @devdata: device private data. * @ops: standard thermal cooling devices callbacks. * * This function will register a cooling device with device tree node reference. * This interface function adds a new thermal cooling device (fan/processor/...) * to /sys/class/thermal/ folder as cooling_device[0-*]. It tries to bind itself * to all the thermal zone devices registered at the same time. * * Return: a pointer to the created struct thermal_cooling_device or an * ERR_PTR. Caller must check return value with IS_ERR*() helpers. */ struct thermal_cooling_device * thermal_of_cooling_device_register(struct device_node *np, const char *type, void *devdata, const struct thermal_cooling_device_ops *ops) { return __thermal_cooling_device_register(np, type, devdata, ops); } EXPORT_SYMBOL_GPL(thermal_of_cooling_device_register); static void thermal_cooling_device_release(struct device *dev, void *res) { thermal_cooling_device_unregister( *(struct thermal_cooling_device **)res); } /** * devm_thermal_of_cooling_device_register() - register an OF thermal cooling * device * @dev: a valid struct device pointer of a sensor device. * @np: a pointer to a device tree node. * @type: the thermal cooling device type. * @devdata: device private data. * @ops: standard thermal cooling devices callbacks. * * This function will register a cooling device with device tree node reference. * This interface function adds a new thermal cooling device (fan/processor/...) * to /sys/class/thermal/ folder as cooling_device[0-*]. It tries to bind itself * to all the thermal zone devices registered at the same time. * * Return: a pointer to the created struct thermal_cooling_device or an * ERR_PTR. Caller must check return value with IS_ERR*() helpers. */ struct thermal_cooling_device * devm_thermal_of_cooling_device_register(struct device *dev, struct device_node *np, const char *type, void *devdata, const struct thermal_cooling_device_ops *ops) { struct thermal_cooling_device **ptr, *tcd; ptr = devres_alloc(thermal_cooling_device_release, sizeof(*ptr), GFP_KERNEL); if (!ptr) return ERR_PTR(-ENOMEM); tcd = __thermal_cooling_device_register(np, type, devdata, ops); if (IS_ERR(tcd)) { devres_free(ptr); return tcd; } *ptr = tcd; devres_add(dev, ptr); return tcd; } EXPORT_SYMBOL_GPL(devm_thermal_of_cooling_device_register); static bool thermal_cooling_device_present(struct thermal_cooling_device *cdev) { struct thermal_cooling_device *pos = NULL; list_for_each_entry(pos, &thermal_cdev_list, node) { if (pos == cdev) return true; } return false; } /** * thermal_cooling_device_update - Update a cooling device object * @cdev: Target cooling device. * * Update @cdev to reflect a change of the underlying hardware or platform. * * Must be called when the maximum cooling state of @cdev becomes invalid and so * its .get_max_state() callback needs to be run to produce the new maximum * cooling state value. */ void thermal_cooling_device_update(struct thermal_cooling_device *cdev) { struct thermal_instance *ti; unsigned long state; if (IS_ERR_OR_NULL(cdev)) return; /* * Hold thermal_list_lock throughout the update to prevent the device * from going away while being updated. */ mutex_lock(&thermal_list_lock); if (!thermal_cooling_device_present(cdev)) goto unlock_list; /* * Update under the cdev lock to prevent the state from being set beyond * the new limit concurrently. */ mutex_lock(&cdev->lock); if (cdev->ops->get_max_state(cdev, &cdev->max_state)) goto unlock; thermal_cooling_device_stats_reinit(cdev); list_for_each_entry(ti, &cdev->thermal_instances, cdev_node) { if (ti->upper == cdev->max_state) continue; if (ti->upper < cdev->max_state) { if (ti->upper_no_limit) ti->upper = cdev->max_state; continue; } ti->upper = cdev->max_state; if (ti->lower > ti->upper) ti->lower = ti->upper; if (ti->target == THERMAL_NO_TARGET) continue; if (ti->target > ti->upper) ti->target = ti->upper; } if (cdev->ops->get_cur_state(cdev, &state) || state > cdev->max_state) goto unlock; thermal_cooling_device_stats_update(cdev, state); unlock: mutex_unlock(&cdev->lock); unlock_list: mutex_unlock(&thermal_list_lock); } EXPORT_SYMBOL_GPL(thermal_cooling_device_update); /** * thermal_cooling_device_unregister - removes a thermal cooling device * @cdev: the thermal cooling device to remove. * * thermal_cooling_device_unregister() must be called when a registered * thermal cooling device is no longer needed. */ void thermal_cooling_device_unregister(struct thermal_cooling_device *cdev) { struct thermal_zone_device *tz; if (!cdev) return; thermal_debug_cdev_remove(cdev); mutex_lock(&thermal_list_lock); if (!thermal_cooling_device_present(cdev)) { mutex_unlock(&thermal_list_lock); return; } list_del(&cdev->node); /* Unbind all thermal zones associated with 'this' cdev */ list_for_each_entry(tz, &thermal_tz_list, node) { if (tz->ops.unbind) tz->ops.unbind(tz, cdev); } mutex_unlock(&thermal_list_lock); device_unregister(&cdev->device); } EXPORT_SYMBOL_GPL(thermal_cooling_device_unregister); static void bind_tz(struct thermal_zone_device *tz) { int ret; struct thermal_cooling_device *pos = NULL; if (!tz->ops.bind) return; mutex_lock(&thermal_list_lock); list_for_each_entry(pos, &thermal_cdev_list, node) { ret = tz->ops.bind(tz, pos); if (ret) print_bind_err_msg(tz, pos, ret); } mutex_unlock(&thermal_list_lock); } static void thermal_set_delay_jiffies(unsigned long *delay_jiffies, int delay_ms) { *delay_jiffies = msecs_to_jiffies(delay_ms); if (delay_ms > 1000) *delay_jiffies = round_jiffies(*delay_jiffies); } int thermal_zone_get_crit_temp(struct thermal_zone_device *tz, int *temp) { const struct thermal_trip_desc *td; int ret = -EINVAL; if (tz->ops.get_crit_temp) return tz->ops.get_crit_temp(tz, temp); mutex_lock(&tz->lock); for_each_trip_desc(tz, td) { const struct thermal_trip *trip = &td->trip; if (trip->type == THERMAL_TRIP_CRITICAL) { *temp = trip->temperature; ret = 0; break; } } mutex_unlock(&tz->lock); return ret; } EXPORT_SYMBOL_GPL(thermal_zone_get_crit_temp); /** * thermal_zone_device_register_with_trips() - register a new thermal zone device * @type: the thermal zone device type * @trips: a pointer to an array of thermal trips * @num_trips: the number of trip points the thermal zone support * @devdata: private device data * @ops: standard thermal zone device callbacks * @tzp: thermal zone platform parameters * @passive_delay: number of milliseconds to wait between polls when * performing passive cooling * @polling_delay: number of milliseconds to wait between polls when checking * whether trip points have been crossed (0 for interrupt * driven systems) * * This interface function adds a new thermal zone device (sensor) to * /sys/class/thermal folder as thermal_zone[0-*]. It tries to bind all the * thermal cooling devices registered at the same time. * thermal_zone_device_unregister() must be called when the device is no * longer needed. The passive cooling depends on the .get_trend() return value. * * Return: a pointer to the created struct thermal_zone_device or an * in case of error, an ERR_PTR. Caller must check return value with * IS_ERR*() helpers. */ struct thermal_zone_device * thermal_zone_device_register_with_trips(const char *type, const struct thermal_trip *trips, int num_trips, void *devdata, const struct thermal_zone_device_ops *ops, const struct thermal_zone_params *tzp, unsigned int passive_delay, unsigned int polling_delay) { const struct thermal_trip *trip = trips; struct thermal_zone_device *tz; struct thermal_trip_desc *td; int id; int result; struct thermal_governor *governor; if (!type || strlen(type) == 0) { pr_err("No thermal zone type defined\n"); return ERR_PTR(-EINVAL); } if (strlen(type) >= THERMAL_NAME_LENGTH) { pr_err("Thermal zone name (%s) too long, should be under %d chars\n", type, THERMAL_NAME_LENGTH); return ERR_PTR(-EINVAL); } if (num_trips < 0) { pr_err("Incorrect number of thermal trips\n"); return ERR_PTR(-EINVAL); } if (!ops || !ops->get_temp) { pr_err("Thermal zone device ops not defined\n"); return ERR_PTR(-EINVAL); } if (num_trips > 0 && !trips) return ERR_PTR(-EINVAL); if (polling_delay) { if (passive_delay > polling_delay) return ERR_PTR(-EINVAL); if (!passive_delay) passive_delay = polling_delay; } if (!thermal_class) return ERR_PTR(-ENODEV); tz = kzalloc(struct_size(tz, trips, num_trips), GFP_KERNEL); if (!tz) return ERR_PTR(-ENOMEM); if (tzp) { tz->tzp = kmemdup(tzp, sizeof(*tzp), GFP_KERNEL); if (!tz->tzp) { result = -ENOMEM; goto free_tz; } } INIT_LIST_HEAD(&tz->thermal_instances); INIT_LIST_HEAD(&tz->node); ida_init(&tz->ida); mutex_init(&tz->lock); init_completion(&tz->removal); init_completion(&tz->resume); id = ida_alloc(&thermal_tz_ida, GFP_KERNEL); if (id < 0) { result = id; goto free_tzp; } tz->id = id; strscpy(tz->type, type, sizeof(tz->type)); tz->ops = *ops; if (!tz->ops.critical) tz->ops.critical = thermal_zone_device_critical; tz->device.class = thermal_class; tz->devdata = devdata; tz->num_trips = num_trips; for_each_trip_desc(tz, td) { td->trip = *trip++; /* * Mark all thresholds as invalid to start with even though * this only matters for the trips that start as invalid and * become valid later. */ td->threshold = INT_MAX; } thermal_set_delay_jiffies(&tz->passive_delay_jiffies, passive_delay); thermal_set_delay_jiffies(&tz->polling_delay_jiffies, polling_delay); tz->recheck_delay_jiffies = THERMAL_RECHECK_DELAY; /* sys I/F */ /* Add nodes that are always present via .groups */ result = thermal_zone_create_device_groups(tz); if (result) goto remove_id; /* A new thermal zone needs to be updated anyway. */ atomic_set(&tz->need_update, 1); result = dev_set_name(&tz->device, "thermal_zone%d", tz->id); if (result) { thermal_zone_destroy_device_groups(tz); goto remove_id; } result = device_register(&tz->device); if (result) goto release_device; /* Update 'this' zone's governor information */ mutex_lock(&thermal_governor_lock); if (tz->tzp) governor = __find_governor(tz->tzp->governor_name); else governor = def_governor; result = thermal_set_governor(tz, governor); if (result) { mutex_unlock(&thermal_governor_lock); goto unregister; } mutex_unlock(&thermal_governor_lock); if (!tz->tzp || !tz->tzp->no_hwmon) { result = thermal_add_hwmon_sysfs(tz); if (result) goto unregister; } mutex_lock(&thermal_list_lock); mutex_lock(&tz->lock); list_add_tail(&tz->node, &thermal_tz_list); mutex_unlock(&tz->lock); mutex_unlock(&thermal_list_lock); /* Bind cooling devices for this zone */ bind_tz(tz); thermal_zone_device_init(tz); /* Update the new thermal zone and mark it as already updated. */ if (atomic_cmpxchg(&tz->need_update, 1, 0)) thermal_zone_device_update(tz, THERMAL_EVENT_UNSPECIFIED); thermal_notify_tz_create(tz); thermal_debug_tz_add(tz); return tz; unregister: device_del(&tz->device); release_device: put_device(&tz->device); remove_id: ida_free(&thermal_tz_ida, id); free_tzp: kfree(tz->tzp); free_tz: kfree(tz); return ERR_PTR(result); } EXPORT_SYMBOL_GPL(thermal_zone_device_register_with_trips); struct thermal_zone_device *thermal_tripless_zone_device_register( const char *type, void *devdata, const struct thermal_zone_device_ops *ops, const struct thermal_zone_params *tzp) { return thermal_zone_device_register_with_trips(type, NULL, 0, devdata, ops, tzp, 0, 0); } EXPORT_SYMBOL_GPL(thermal_tripless_zone_device_register); void *thermal_zone_device_priv(struct thermal_zone_device *tzd) { return tzd->devdata; } EXPORT_SYMBOL_GPL(thermal_zone_device_priv); const char *thermal_zone_device_type(struct thermal_zone_device *tzd) { return tzd->type; } EXPORT_SYMBOL_GPL(thermal_zone_device_type); int thermal_zone_device_id(struct thermal_zone_device *tzd) { return tzd->id; } EXPORT_SYMBOL_GPL(thermal_zone_device_id); struct device *thermal_zone_device(struct thermal_zone_device *tzd) { return &tzd->device; } EXPORT_SYMBOL_GPL(thermal_zone_device); /** * thermal_zone_device_unregister - removes the registered thermal zone device * @tz: the thermal zone device to remove */ void thermal_zone_device_unregister(struct thermal_zone_device *tz) { struct thermal_cooling_device *cdev; struct thermal_zone_device *pos = NULL; if (!tz) return; thermal_debug_tz_remove(tz); mutex_lock(&thermal_list_lock); list_for_each_entry(pos, &thermal_tz_list, node) if (pos == tz) break; if (pos != tz) { /* thermal zone device not found */ mutex_unlock(&thermal_list_lock); return; } mutex_lock(&tz->lock); list_del(&tz->node); mutex_unlock(&tz->lock); /* Unbind all cdevs associated with 'this' thermal zone */ list_for_each_entry(cdev, &thermal_cdev_list, node) if (tz->ops.unbind) tz->ops.unbind(tz, cdev); mutex_unlock(&thermal_list_lock); cancel_delayed_work_sync(&tz->poll_queue); thermal_set_governor(tz, NULL); thermal_remove_hwmon_sysfs(tz); ida_free(&thermal_tz_ida, tz->id); ida_destroy(&tz->ida); device_del(&tz->device); kfree(tz->tzp); put_device(&tz->device); thermal_notify_tz_delete(tz); wait_for_completion(&tz->removal); kfree(tz); } EXPORT_SYMBOL_GPL(thermal_zone_device_unregister); /** * thermal_zone_get_zone_by_name() - search for a zone and returns its ref * @name: thermal zone name to fetch the temperature * * When only one zone is found with the passed name, returns a reference to it. * * Return: On success returns a reference to an unique thermal zone with * matching name equals to @name, an ERR_PTR otherwise (-EINVAL for invalid * paramenters, -ENODEV for not found and -EEXIST for multiple matches). */ struct thermal_zone_device *thermal_zone_get_zone_by_name(const char *name) { struct thermal_zone_device *pos = NULL, *ref = ERR_PTR(-EINVAL); unsigned int found = 0; if (!name) goto exit; mutex_lock(&thermal_list_lock); list_for_each_entry(pos, &thermal_tz_list, node) if (!strncasecmp(name, pos->type, THERMAL_NAME_LENGTH)) { found++; ref = pos; } mutex_unlock(&thermal_list_lock); /* nothing has been found, thus an error code for it */ if (found == 0) ref = ERR_PTR(-ENODEV); else if (found > 1) /* Success only when an unique zone is found */ ref = ERR_PTR(-EEXIST); exit: return ref; } EXPORT_SYMBOL_GPL(thermal_zone_get_zone_by_name); static void thermal_zone_device_resume(struct work_struct *work) { struct thermal_zone_device *tz; tz = container_of(work, struct thermal_zone_device, poll_queue.work); mutex_lock(&tz->lock); tz->suspended = false; thermal_debug_tz_resume(tz); thermal_zone_device_init(tz); thermal_governor_update_tz(tz, THERMAL_TZ_RESUME); __thermal_zone_device_update(tz, THERMAL_TZ_RESUME); complete(&tz->resume); tz->resuming = false; mutex_unlock(&tz->lock); } static int thermal_pm_notify(struct notifier_block *nb, unsigned long mode, void *_unused) { struct thermal_zone_device *tz; switch (mode) { case PM_HIBERNATION_PREPARE: case PM_RESTORE_PREPARE: case PM_SUSPEND_PREPARE: mutex_lock(&thermal_list_lock); list_for_each_entry(tz, &thermal_tz_list, node) { mutex_lock(&tz->lock); if (tz->resuming) { /* * thermal_zone_device_resume() queued up for * this zone has not acquired the lock yet, so * release it to let the function run and wait * util it has done the work. */ mutex_unlock(&tz->lock); wait_for_completion(&tz->resume); mutex_lock(&tz->lock); } tz->suspended = true; mutex_unlock(&tz->lock); } mutex_unlock(&thermal_list_lock); break; case PM_POST_HIBERNATION: case PM_POST_RESTORE: case PM_POST_SUSPEND: mutex_lock(&thermal_list_lock); list_for_each_entry(tz, &thermal_tz_list, node) { mutex_lock(&tz->lock); cancel_delayed_work(&tz->poll_queue); reinit_completion(&tz->resume); tz->resuming = true; /* * Replace the work function with the resume one, which * will restore the original work function and schedule * the polling work if needed. */ INIT_DELAYED_WORK(&tz->poll_queue, thermal_zone_device_resume); /* Queue up the work without a delay. */ mod_delayed_work(system_freezable_power_efficient_wq, &tz->poll_queue, 0); mutex_unlock(&tz->lock); } mutex_unlock(&thermal_list_lock); break; default: break; } return 0; } static struct notifier_block thermal_pm_nb = { .notifier_call = thermal_pm_notify, /* * Run at the lowest priority to avoid interference between the thermal * zone resume work items spawned by thermal_pm_notify() and the other * PM notifiers. */ .priority = INT_MIN, }; static int __init thermal_init(void) { int result; thermal_debug_init(); result = thermal_netlink_init(); if (result) goto error; result = thermal_register_governors(); if (result) goto unregister_netlink; thermal_class = kzalloc(sizeof(*thermal_class), GFP_KERNEL); if (!thermal_class) { result = -ENOMEM; goto unregister_governors; } thermal_class->name = "thermal"; thermal_class->dev_release = thermal_release; result = class_register(thermal_class); if (result) { kfree(thermal_class); thermal_class = NULL; goto unregister_governors; } result = register_pm_notifier(&thermal_pm_nb); if (result) pr_warn("Thermal: Can not register suspend notifier, return %d\n", result); return 0; unregister_governors: thermal_unregister_governors(); unregister_netlink: thermal_netlink_exit(); error: mutex_destroy(&thermal_list_lock); mutex_destroy(&thermal_governor_lock); return result; } postcore_initcall(thermal_init); |
| 46 46 28 12 48 48 48 46 46 16 46 52 48 47 47 47 47 47 46 46 46 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 | // SPDX-License-Identifier: GPL-2.0-or-later /* * ALSA sequencer device management * Copyright (c) 1999 by Takashi Iwai <tiwai@suse.de> * *---------------------------------------------------------------- * * This device handler separates the card driver module from sequencer * stuff (sequencer core, synth drivers, etc), so that user can avoid * to spend unnecessary resources e.g. if he needs only listening to * MP3s. * * The card (or lowlevel) driver creates a sequencer device entry * via snd_seq_device_new(). This is an entry pointer to communicate * with the sequencer device "driver", which is involved with the * actual part to communicate with the sequencer core. * Each sequencer device entry has an id string and the corresponding * driver with the same id is loaded when required. For example, * lowlevel codes to access emu8000 chip on sbawe card are included in * emu8000-synth module. To activate this module, the hardware * resources like i/o port are passed via snd_seq_device argument. */ #include <linux/device.h> #include <linux/init.h> #include <linux/module.h> #include <sound/core.h> #include <sound/info.h> #include <sound/seq_device.h> #include <sound/seq_kernel.h> #include <sound/initval.h> #include <linux/kmod.h> #include <linux/slab.h> #include <linux/mutex.h> MODULE_AUTHOR("Takashi Iwai <tiwai@suse.de>"); MODULE_DESCRIPTION("ALSA sequencer device management"); MODULE_LICENSE("GPL"); /* * bus definition */ static int snd_seq_bus_match(struct device *dev, const struct device_driver *drv) { struct snd_seq_device *sdev = to_seq_dev(dev); struct snd_seq_driver *sdrv = to_seq_drv(drv); return strcmp(sdrv->id, sdev->id) == 0 && sdrv->argsize == sdev->argsize; } static const struct bus_type snd_seq_bus_type = { .name = "snd_seq", .match = snd_seq_bus_match, }; /* * proc interface -- just for compatibility */ #ifdef CONFIG_SND_PROC_FS static struct snd_info_entry *info_entry; static int print_dev_info(struct device *dev, void *data) { struct snd_seq_device *sdev = to_seq_dev(dev); struct snd_info_buffer *buffer = data; snd_iprintf(buffer, "snd-%s,%s,%d\n", sdev->id, dev->driver ? "loaded" : "empty", dev->driver ? 1 : 0); return 0; } static void snd_seq_device_info(struct snd_info_entry *entry, struct snd_info_buffer *buffer) { bus_for_each_dev(&snd_seq_bus_type, NULL, buffer, print_dev_info); } #endif /* * load all registered drivers (called from seq_clientmgr.c) */ #ifdef CONFIG_MODULES /* flag to block auto-loading */ static atomic_t snd_seq_in_init = ATOMIC_INIT(1); /* blocked as default */ static int request_seq_drv(struct device *dev, void *data) { struct snd_seq_device *sdev = to_seq_dev(dev); if (!dev->driver) request_module("snd-%s", sdev->id); return 0; } static void autoload_drivers(struct work_struct *work) { /* avoid reentrance */ if (atomic_inc_return(&snd_seq_in_init) == 1) bus_for_each_dev(&snd_seq_bus_type, NULL, NULL, request_seq_drv); atomic_dec(&snd_seq_in_init); } static DECLARE_WORK(autoload_work, autoload_drivers); static void queue_autoload_drivers(void) { schedule_work(&autoload_work); } void snd_seq_autoload_init(void) { atomic_dec(&snd_seq_in_init); #ifdef CONFIG_SND_SEQUENCER_MODULE /* initial autoload only when snd-seq is a module */ queue_autoload_drivers(); #endif } EXPORT_SYMBOL(snd_seq_autoload_init); void snd_seq_autoload_exit(void) { atomic_inc(&snd_seq_in_init); } EXPORT_SYMBOL(snd_seq_autoload_exit); void snd_seq_device_load_drivers(void) { queue_autoload_drivers(); flush_work(&autoload_work); } EXPORT_SYMBOL(snd_seq_device_load_drivers); static inline void cancel_autoload_drivers(void) { cancel_work_sync(&autoload_work); } #else static inline void queue_autoload_drivers(void) { } static inline void cancel_autoload_drivers(void) { } #endif /* * device management */ static int snd_seq_device_dev_free(struct snd_device *device) { struct snd_seq_device *dev = device->device_data; cancel_autoload_drivers(); if (dev->private_free) dev->private_free(dev); put_device(&dev->dev); return 0; } static int snd_seq_device_dev_register(struct snd_device *device) { struct snd_seq_device *dev = device->device_data; int err; err = device_add(&dev->dev); if (err < 0) return err; if (!dev->dev.driver) queue_autoload_drivers(); return 0; } static int snd_seq_device_dev_disconnect(struct snd_device *device) { struct snd_seq_device *dev = device->device_data; device_del(&dev->dev); return 0; } static void snd_seq_dev_release(struct device *dev) { kfree(to_seq_dev(dev)); } /* * register a sequencer device * card = card info * device = device number (if any) * id = id of driver * result = return pointer (NULL allowed if unnecessary) */ int snd_seq_device_new(struct snd_card *card, int device, const char *id, int argsize, struct snd_seq_device **result) { struct snd_seq_device *dev; int err; static const struct snd_device_ops dops = { .dev_free = snd_seq_device_dev_free, .dev_register = snd_seq_device_dev_register, .dev_disconnect = snd_seq_device_dev_disconnect, }; if (result) *result = NULL; if (snd_BUG_ON(!id)) return -EINVAL; dev = kzalloc(sizeof(*dev) + argsize, GFP_KERNEL); if (!dev) return -ENOMEM; /* set up device info */ dev->card = card; dev->device = device; dev->id = id; dev->argsize = argsize; device_initialize(&dev->dev); dev->dev.parent = &card->card_dev; dev->dev.bus = &snd_seq_bus_type; dev->dev.release = snd_seq_dev_release; dev_set_name(&dev->dev, "%s-%d-%d", dev->id, card->number, device); /* add this device to the list */ err = snd_device_new(card, SNDRV_DEV_SEQUENCER, dev, &dops); if (err < 0) { put_device(&dev->dev); return err; } if (result) *result = dev; return 0; } EXPORT_SYMBOL(snd_seq_device_new); /* * driver registration */ int __snd_seq_driver_register(struct snd_seq_driver *drv, struct module *mod) { if (WARN_ON(!drv->driver.name || !drv->id)) return -EINVAL; drv->driver.bus = &snd_seq_bus_type; drv->driver.owner = mod; return driver_register(&drv->driver); } EXPORT_SYMBOL_GPL(__snd_seq_driver_register); void snd_seq_driver_unregister(struct snd_seq_driver *drv) { driver_unregister(&drv->driver); } EXPORT_SYMBOL_GPL(snd_seq_driver_unregister); /* * module part */ static int __init seq_dev_proc_init(void) { #ifdef CONFIG_SND_PROC_FS info_entry = snd_info_create_module_entry(THIS_MODULE, "drivers", snd_seq_root); if (info_entry == NULL) return -ENOMEM; info_entry->content = SNDRV_INFO_CONTENT_TEXT; info_entry->c.text.read = snd_seq_device_info; if (snd_info_register(info_entry) < 0) { snd_info_free_entry(info_entry); return -ENOMEM; } #endif return 0; } static int __init alsa_seq_device_init(void) { int err; err = bus_register(&snd_seq_bus_type); if (err < 0) return err; err = seq_dev_proc_init(); if (err < 0) bus_unregister(&snd_seq_bus_type); return err; } static void __exit alsa_seq_device_exit(void) { #ifdef CONFIG_MODULES cancel_work_sync(&autoload_work); #endif #ifdef CONFIG_SND_PROC_FS snd_info_free_entry(info_entry); #endif bus_unregister(&snd_seq_bus_type); } subsys_initcall(alsa_seq_device_init) module_exit(alsa_seq_device_exit) |
| 6 4 4 2 1 1 2 6 2 2 2 1 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 | /* SPDX-License-Identifier: GPL-2.0 */ #include <linux/module.h> #include <linux/netfilter/nf_tables.h> #include <net/netfilter/nf_tables.h> #include <net/netfilter/nf_tables_core.h> #include <net/netfilter/nf_socket.h> #include <net/inet_sock.h> #include <net/tcp.h> struct nft_socket { enum nft_socket_keys key:8; u8 level; u8 len; union { u8 dreg; }; }; static void nft_socket_wildcard(const struct nft_pktinfo *pkt, struct nft_regs *regs, struct sock *sk, u32 *dest) { switch (nft_pf(pkt)) { case NFPROTO_IPV4: nft_reg_store8(dest, inet_sk(sk)->inet_rcv_saddr == 0); break; #if IS_ENABLED(CONFIG_NF_TABLES_IPV6) case NFPROTO_IPV6: nft_reg_store8(dest, ipv6_addr_any(&sk->sk_v6_rcv_saddr)); break; #endif default: regs->verdict.code = NFT_BREAK; return; } } #ifdef CONFIG_SOCK_CGROUP_DATA static noinline bool nft_sock_get_eval_cgroupv2(u32 *dest, struct sock *sk, const struct nft_pktinfo *pkt, u32 level) { struct cgroup *cgrp; u64 cgid; if (!sk_fullsock(sk)) return false; cgrp = cgroup_ancestor(sock_cgroup_ptr(&sk->sk_cgrp_data), level); if (!cgrp) return false; cgid = cgroup_id(cgrp); memcpy(dest, &cgid, sizeof(u64)); return true; } #endif static struct sock *nft_socket_do_lookup(const struct nft_pktinfo *pkt) { const struct net_device *indev = nft_in(pkt); const struct sk_buff *skb = pkt->skb; struct sock *sk = NULL; if (!indev) return NULL; switch (nft_pf(pkt)) { case NFPROTO_IPV4: sk = nf_sk_lookup_slow_v4(nft_net(pkt), skb, indev); break; #if IS_ENABLED(CONFIG_NF_TABLES_IPV6) case NFPROTO_IPV6: sk = nf_sk_lookup_slow_v6(nft_net(pkt), skb, indev); break; #endif default: WARN_ON_ONCE(1); break; } return sk; } static void nft_socket_eval(const struct nft_expr *expr, struct nft_regs *regs, const struct nft_pktinfo *pkt) { const struct nft_socket *priv = nft_expr_priv(expr); struct sk_buff *skb = pkt->skb; struct sock *sk = skb->sk; u32 *dest = ®s->data[priv->dreg]; if (sk && !net_eq(nft_net(pkt), sock_net(sk))) sk = NULL; if (!sk) sk = nft_socket_do_lookup(pkt); if (!sk) { regs->verdict.code = NFT_BREAK; return; } switch(priv->key) { case NFT_SOCKET_TRANSPARENT: nft_reg_store8(dest, inet_sk_transparent(sk)); break; case NFT_SOCKET_MARK: if (sk_fullsock(sk)) { *dest = READ_ONCE(sk->sk_mark); } else { regs->verdict.code = NFT_BREAK; return; } break; case NFT_SOCKET_WILDCARD: if (!sk_fullsock(sk)) { regs->verdict.code = NFT_BREAK; return; } nft_socket_wildcard(pkt, regs, sk, dest); break; #ifdef CONFIG_SOCK_CGROUP_DATA case NFT_SOCKET_CGROUPV2: if (!nft_sock_get_eval_cgroupv2(dest, sk, pkt, priv->level)) { regs->verdict.code = NFT_BREAK; return; } break; #endif default: WARN_ON(1); regs->verdict.code = NFT_BREAK; } if (sk != skb->sk) sock_gen_put(sk); } static const struct nla_policy nft_socket_policy[NFTA_SOCKET_MAX + 1] = { [NFTA_SOCKET_KEY] = NLA_POLICY_MAX(NLA_BE32, 255), [NFTA_SOCKET_DREG] = { .type = NLA_U32 }, [NFTA_SOCKET_LEVEL] = NLA_POLICY_MAX(NLA_BE32, 255), }; static int nft_socket_init(const struct nft_ctx *ctx, const struct nft_expr *expr, const struct nlattr * const tb[]) { struct nft_socket *priv = nft_expr_priv(expr); unsigned int len; if (!tb[NFTA_SOCKET_DREG] || !tb[NFTA_SOCKET_KEY]) return -EINVAL; switch(ctx->family) { case NFPROTO_IPV4: #if IS_ENABLED(CONFIG_NF_TABLES_IPV6) case NFPROTO_IPV6: #endif case NFPROTO_INET: break; default: return -EOPNOTSUPP; } priv->key = ntohl(nla_get_be32(tb[NFTA_SOCKET_KEY])); switch(priv->key) { case NFT_SOCKET_TRANSPARENT: case NFT_SOCKET_WILDCARD: len = sizeof(u8); break; case NFT_SOCKET_MARK: len = sizeof(u32); break; #ifdef CONFIG_CGROUPS case NFT_SOCKET_CGROUPV2: { unsigned int level; if (!tb[NFTA_SOCKET_LEVEL]) return -EINVAL; level = ntohl(nla_get_be32(tb[NFTA_SOCKET_LEVEL])); if (level > 255) return -EOPNOTSUPP; priv->level = level; len = sizeof(u64); break; } #endif default: return -EOPNOTSUPP; } priv->len = len; return nft_parse_register_store(ctx, tb[NFTA_SOCKET_DREG], &priv->dreg, NULL, NFT_DATA_VALUE, len); } static int nft_socket_dump(struct sk_buff *skb, const struct nft_expr *expr, bool reset) { const struct nft_socket *priv = nft_expr_priv(expr); if (nla_put_be32(skb, NFTA_SOCKET_KEY, htonl(priv->key))) return -1; if (nft_dump_register(skb, NFTA_SOCKET_DREG, priv->dreg)) return -1; if (priv->key == NFT_SOCKET_CGROUPV2 && nla_put_be32(skb, NFTA_SOCKET_LEVEL, htonl(priv->level))) return -1; return 0; } static bool nft_socket_reduce(struct nft_regs_track *track, const struct nft_expr *expr) { const struct nft_socket *priv = nft_expr_priv(expr); const struct nft_socket *socket; if (!nft_reg_track_cmp(track, expr, priv->dreg)) { nft_reg_track_update(track, expr, priv->dreg, priv->len); return false; } socket = nft_expr_priv(track->regs[priv->dreg].selector); if (priv->key != socket->key || priv->dreg != socket->dreg || priv->level != socket->level) { nft_reg_track_update(track, expr, priv->dreg, priv->len); return false; } if (!track->regs[priv->dreg].bitwise) return true; return nft_expr_reduce_bitwise(track, expr); } static int nft_socket_validate(const struct nft_ctx *ctx, const struct nft_expr *expr, const struct nft_data **data) { if (ctx->family != NFPROTO_IPV4 && ctx->family != NFPROTO_IPV6 && ctx->family != NFPROTO_INET) return -EOPNOTSUPP; return nft_chain_validate_hooks(ctx->chain, (1 << NF_INET_PRE_ROUTING) | (1 << NF_INET_LOCAL_IN) | (1 << NF_INET_LOCAL_OUT)); } static struct nft_expr_type nft_socket_type; static const struct nft_expr_ops nft_socket_ops = { .type = &nft_socket_type, .size = NFT_EXPR_SIZE(sizeof(struct nft_socket)), .eval = nft_socket_eval, .init = nft_socket_init, .dump = nft_socket_dump, .validate = nft_socket_validate, .reduce = nft_socket_reduce, }; static struct nft_expr_type nft_socket_type __read_mostly = { .name = "socket", .ops = &nft_socket_ops, .policy = nft_socket_policy, .maxattr = NFTA_SOCKET_MAX, .owner = THIS_MODULE, }; static int __init nft_socket_module_init(void) { return nft_register_expr(&nft_socket_type); } static void __exit nft_socket_module_exit(void) { nft_unregister_expr(&nft_socket_type); } module_init(nft_socket_module_init); module_exit(nft_socket_module_exit); MODULE_LICENSE("GPL"); MODULE_AUTHOR("Máté Eckl"); MODULE_DESCRIPTION("nf_tables socket match module"); MODULE_ALIAS_NFT_EXPR("socket"); |
| 17 16 2 4 9 9 8 5 7 10 14 13 10 11 12 15 15 11 12 6 12 10 10 10 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 | // SPDX-License-Identifier: GPL-2.0-only /* * vMTRR implementation * * Copyright (C) 2006 Qumranet, Inc. * Copyright 2010 Red Hat, Inc. and/or its affiliates. * Copyright(C) 2015 Intel Corporation. * * Authors: * Yaniv Kamay <yaniv@qumranet.com> * Avi Kivity <avi@qumranet.com> * Marcelo Tosatti <mtosatti@redhat.com> * Paolo Bonzini <pbonzini@redhat.com> * Xiao Guangrong <guangrong.xiao@linux.intel.com> */ #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt #include <linux/kvm_host.h> #include <asm/mtrr.h> #include "cpuid.h" static u64 *find_mtrr(struct kvm_vcpu *vcpu, unsigned int msr) { int index; switch (msr) { case MTRRphysBase_MSR(0) ... MTRRphysMask_MSR(KVM_NR_VAR_MTRR - 1): index = msr - MTRRphysBase_MSR(0); return &vcpu->arch.mtrr_state.var[index]; case MSR_MTRRfix64K_00000: return &vcpu->arch.mtrr_state.fixed_64k; case MSR_MTRRfix16K_80000: case MSR_MTRRfix16K_A0000: index = msr - MSR_MTRRfix16K_80000; return &vcpu->arch.mtrr_state.fixed_16k[index]; case MSR_MTRRfix4K_C0000: case MSR_MTRRfix4K_C8000: case MSR_MTRRfix4K_D0000: case MSR_MTRRfix4K_D8000: case MSR_MTRRfix4K_E0000: case MSR_MTRRfix4K_E8000: case MSR_MTRRfix4K_F0000: case MSR_MTRRfix4K_F8000: index = msr - MSR_MTRRfix4K_C0000; return &vcpu->arch.mtrr_state.fixed_4k[index]; case MSR_MTRRdefType: return &vcpu->arch.mtrr_state.deftype; default: break; } return NULL; } static bool valid_mtrr_type(unsigned t) { return t < 8 && (1 << t) & 0x73; /* 0, 1, 4, 5, 6 */ } static bool kvm_mtrr_valid(struct kvm_vcpu *vcpu, u32 msr, u64 data) { int i; u64 mask; if (msr == MSR_MTRRdefType) { if (data & ~0xcff) return false; return valid_mtrr_type(data & 0xff); } else if (msr >= MSR_MTRRfix64K_00000 && msr <= MSR_MTRRfix4K_F8000) { for (i = 0; i < 8 ; i++) if (!valid_mtrr_type((data >> (i * 8)) & 0xff)) return false; return true; } /* variable MTRRs */ if (WARN_ON_ONCE(!(msr >= MTRRphysBase_MSR(0) && msr <= MTRRphysMask_MSR(KVM_NR_VAR_MTRR - 1)))) return false; mask = kvm_vcpu_reserved_gpa_bits_raw(vcpu); if ((msr & 1) == 0) { /* MTRR base */ if (!valid_mtrr_type(data & 0xff)) return false; mask |= 0xf00; } else { /* MTRR mask */ mask |= 0x7ff; } return (data & mask) == 0; } int kvm_mtrr_set_msr(struct kvm_vcpu *vcpu, u32 msr, u64 data) { u64 *mtrr; mtrr = find_mtrr(vcpu, msr); if (!mtrr) return 1; if (!kvm_mtrr_valid(vcpu, msr, data)) return 1; *mtrr = data; return 0; } int kvm_mtrr_get_msr(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata) { u64 *mtrr; /* MSR_MTRRcap is a readonly MSR. */ if (msr == MSR_MTRRcap) { /* * SMRR = 0 * WC = 1 * FIX = 1 * VCNT = KVM_NR_VAR_MTRR */ *pdata = 0x500 | KVM_NR_VAR_MTRR; return 0; } mtrr = find_mtrr(vcpu, msr); if (!mtrr) return 1; *pdata = *mtrr; return 0; } |
| 11 2 5 4 11 4 4 4 4 4 4 7 6 7 5 7 89 91 21 11 6 11 84 84 43 77 77 81 81 1 80 1 4 78 76 38 73 80 17 17 17 14 13 1 27 59 2 2 58 57 10 10 64 5 4 3 80 80 7 80 77 77 71 70 1 71 76 77 71 69 62 13 11 12 3 67 77 10 71 28 71 66 67 36 7 6 6 6 5 5 5 3 1 1 7 1 1 3 2 3 9 2 9 8 8 7 7 3 31 38 38 35 35 34 44 6 6 4 4 3 6 9 9 9 9 9 5 8 8 2 8 7 7 7 1 7 7 7 7 7 6 5 4 4 4 2 1 4 4 1 1 7 7 5 4 8 8 5 4 3 5 2 2 2 2 5 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080 1081 1082 1083 1084 1085 1086 1087 1088 1089 1090 1091 1092 1093 1094 1095 1096 1097 1098 1099 1100 1101 1102 1103 1104 1105 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125 1126 1127 1128 1129 1130 1131 1132 1133 1134 1135 1136 1137 1138 1139 1140 1141 1142 1143 1144 1145 1146 1147 1148 1149 1150 1151 1152 1153 1154 1155 1156 1157 1158 1159 1160 1161 1162 1163 1164 1165 1166 1167 1168 1169 1170 1171 1172 1173 1174 1175 1176 1177 1178 1179 1180 1181 1182 1183 1184 1185 1186 1187 1188 1189 1190 1191 1192 1193 1194 1195 1196 1197 1198 1199 1200 1201 1202 1203 1204 1205 1206 1207 1208 1209 1210 1211 1212 1213 1214 1215 1216 1217 1218 1219 1220 1221 1222 1223 1224 1225 1226 1227 1228 1229 1230 1231 1232 1233 1234 1235 1236 1237 1238 1239 1240 1241 1242 1243 1244 1245 1246 1247 1248 1249 1250 1251 1252 1253 1254 1255 1256 1257 1258 1259 1260 1261 1262 1263 1264 1265 1266 1267 1268 1269 1270 1271 1272 1273 1274 1275 1276 1277 1278 1279 1280 1281 1282 1283 1284 1285 1286 1287 1288 1289 1290 1291 1292 1293 1294 1295 1296 1297 1298 1299 1300 1301 1302 1303 1304 1305 1306 1307 1308 1309 1310 1311 1312 1313 1314 1315 1316 1317 1318 1319 1320 1321 1322 1323 1324 1325 1326 1327 1328 1329 1330 1331 1332 1333 1334 1335 1336 1337 1338 1339 1340 1341 1342 1343 1344 1345 1346 1347 1348 1349 1350 1351 1352 1353 1354 1355 1356 1357 1358 1359 1360 1361 1362 1363 1364 1365 1366 1367 1368 1369 1370 1371 1372 1373 1374 1375 1376 1377 1378 1379 1380 1381 1382 1383 1384 1385 1386 1387 1388 1389 1390 1391 1392 1393 1394 1395 1396 1397 1398 1399 1400 1401 1402 1403 1404 1405 1406 1407 1408 1409 1410 1411 1412 1413 1414 1415 1416 1417 1418 1419 1420 1421 1422 1423 1424 1425 1426 1427 1428 1429 1430 1431 1432 1433 1434 1435 1436 1437 1438 1439 1440 1441 1442 1443 1444 1445 1446 1447 1448 1449 1450 1451 1452 1453 1454 1455 1456 1457 1458 1459 1460 1461 1462 1463 1464 1465 1466 1467 1468 1469 1470 1471 1472 1473 1474 1475 1476 1477 1478 1479 1480 1481 1482 1483 1484 1485 1486 1487 1488 1489 1490 1491 1492 1493 1494 1495 1496 1497 1498 1499 1500 1501 1502 1503 1504 1505 1506 1507 1508 1509 1510 1511 1512 1513 1514 1515 1516 1517 1518 1519 1520 1521 1522 1523 1524 1525 1526 1527 1528 1529 1530 1531 1532 1533 1534 1535 1536 1537 1538 1539 1540 1541 1542 1543 1544 1545 1546 1547 1548 1549 1550 1551 1552 1553 1554 1555 1556 1557 1558 1559 1560 1561 1562 1563 1564 1565 1566 1567 1568 1569 1570 1571 1572 1573 1574 1575 1576 1577 1578 1579 1580 1581 1582 1583 1584 1585 1586 1587 1588 1589 1590 1591 1592 1593 1594 1595 1596 1597 1598 1599 1600 1601 1602 1603 1604 1605 1606 1607 1608 1609 1610 1611 1612 1613 1614 1615 1616 1617 1618 1619 1620 1621 1622 1623 1624 1625 1626 1627 1628 1629 1630 1631 1632 1633 1634 1635 1636 1637 1638 1639 1640 1641 1642 1643 1644 1645 1646 1647 1648 1649 1650 1651 1652 1653 1654 1655 1656 1657 1658 1659 1660 1661 1662 1663 1664 1665 1666 1667 1668 1669 1670 1671 1672 1673 1674 1675 1676 1677 1678 1679 1680 1681 1682 1683 1684 1685 1686 1687 1688 1689 1690 1691 1692 1693 1694 1695 1696 1697 1698 1699 1700 1701 1702 1703 1704 1705 1706 1707 1708 1709 1710 1711 1712 1713 1714 1715 1716 1717 1718 1719 1720 1721 1722 1723 1724 1725 1726 1727 1728 1729 1730 1731 1732 1733 1734 1735 1736 1737 1738 1739 1740 1741 1742 1743 1744 1745 1746 1747 1748 1749 1750 1751 1752 1753 1754 1755 1756 1757 1758 1759 1760 1761 1762 1763 1764 1765 1766 1767 1768 1769 1770 1771 1772 1773 1774 1775 1776 1777 1778 1779 1780 1781 1782 1783 1784 1785 1786 1787 1788 1789 1790 1791 1792 1793 1794 1795 1796 1797 1798 1799 1800 1801 1802 1803 1804 1805 1806 1807 1808 1809 1810 1811 1812 1813 1814 1815 1816 1817 1818 1819 1820 1821 1822 1823 1824 1825 1826 1827 1828 1829 1830 1831 1832 1833 1834 1835 1836 1837 1838 1839 1840 1841 1842 1843 1844 1845 1846 1847 1848 1849 1850 1851 1852 1853 1854 1855 1856 1857 1858 1859 1860 1861 1862 1863 1864 1865 1866 1867 1868 1869 1870 1871 1872 1873 1874 1875 1876 1877 1878 1879 1880 1881 1882 1883 1884 1885 1886 1887 1888 1889 1890 1891 1892 1893 1894 1895 1896 1897 1898 1899 1900 1901 1902 1903 1904 1905 1906 1907 1908 1909 1910 1911 1912 1913 1914 1915 1916 1917 1918 1919 1920 1921 1922 1923 1924 1925 1926 1927 1928 1929 1930 1931 1932 1933 1934 1935 1936 1937 1938 1939 1940 1941 1942 1943 1944 1945 1946 1947 1948 1949 1950 1951 1952 1953 1954 1955 1956 1957 1958 1959 1960 1961 1962 1963 1964 1965 1966 1967 1968 1969 1970 1971 1972 1973 1974 1975 1976 1977 1978 1979 1980 1981 1982 1983 1984 1985 1986 1987 1988 1989 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020 2021 2022 2023 2024 2025 2026 2027 2028 2029 2030 2031 2032 2033 2034 2035 2036 2037 2038 2039 2040 2041 2042 2043 2044 2045 2046 2047 2048 2049 2050 2051 2052 2053 2054 2055 2056 2057 2058 2059 2060 2061 2062 2063 2064 2065 2066 2067 2068 2069 2070 2071 2072 2073 2074 2075 2076 2077 2078 2079 2080 2081 2082 2083 2084 2085 2086 2087 2088 2089 2090 2091 2092 2093 2094 2095 2096 2097 2098 2099 2100 2101 2102 2103 2104 2105 2106 2107 2108 2109 2110 2111 2112 2113 2114 2115 2116 2117 2118 2119 2120 2121 2122 2123 2124 2125 2126 2127 2128 2129 2130 2131 2132 2133 2134 2135 2136 2137 2138 2139 2140 2141 2142 2143 2144 2145 2146 2147 2148 2149 2150 2151 2152 2153 2154 2155 2156 2157 2158 2159 2160 2161 2162 2163 2164 2165 2166 2167 2168 2169 2170 2171 2172 2173 2174 2175 2176 2177 2178 2179 2180 2181 2182 2183 2184 2185 2186 2187 2188 2189 2190 2191 2192 2193 2194 2195 2196 2197 2198 2199 2200 2201 2202 2203 2204 2205 2206 2207 2208 2209 2210 2211 2212 2213 2214 2215 2216 2217 2218 2219 2220 2221 2222 2223 2224 2225 2226 2227 2228 2229 2230 2231 2232 2233 2234 2235 2236 2237 2238 2239 2240 2241 2242 2243 2244 2245 2246 2247 2248 2249 2250 2251 2252 2253 2254 2255 2256 2257 2258 2259 2260 2261 2262 | // SPDX-License-Identifier: GPL-2.0-only /* * sysctl.c: General linux system control interface * * Begun 24 March 1995, Stephen Tweedie * Added /proc support, Dec 1995 * Added bdflush entry and intvec min/max checking, 2/23/96, Tom Dyas. * Added hooks for /proc/sys/net (minor, minor patch), 96/4/1, Mike Shaver. * Added kernel/java-{interpreter,appletviewer}, 96/5/10, Mike Shaver. * Dynamic registration fixes, Stephen Tweedie. * Added kswapd-interval, ctrl-alt-del, printk stuff, 1/8/97, Chris Horn. * Made sysctl support optional via CONFIG_SYSCTL, 1/10/97, Chris * Horn. * Added proc_doulongvec_ms_jiffies_minmax, 09/08/99, Carlos H. Bauer. * Added proc_doulongvec_minmax, 09/08/99, Carlos H. Bauer. * Changed linked lists to use list.h instead of lists.h, 02/24/00, Bill * Wendling. * The list_for_each() macro wasn't appropriate for the sysctl loop. * Removed it and replaced it with older style, 03/23/00, Bill Wendling */ #include <linux/module.h> #include <linux/mm.h> #include <linux/swap.h> #include <linux/slab.h> #include <linux/sysctl.h> #include <linux/bitmap.h> #include <linux/signal.h> #include <linux/panic.h> #include <linux/printk.h> #include <linux/proc_fs.h> #include <linux/security.h> #include <linux/ctype.h> #include <linux/kmemleak.h> #include <linux/filter.h> #include <linux/fs.h> #include <linux/init.h> #include <linux/kernel.h> #include <linux/kobject.h> #include <linux/net.h> #include <linux/sysrq.h> #include <linux/highuid.h> #include <linux/writeback.h> #include <linux/ratelimit.h> #include <linux/hugetlb.h> #include <linux/initrd.h> #include <linux/key.h> #include <linux/times.h> #include <linux/limits.h> #include <linux/dcache.h> #include <linux/syscalls.h> #include <linux/vmstat.h> #include <linux/nfs_fs.h> #include <linux/acpi.h> #include <linux/reboot.h> #include <linux/ftrace.h> #include <linux/perf_event.h> #include <linux/oom.h> #include <linux/kmod.h> #include <linux/capability.h> #include <linux/binfmts.h> #include <linux/sched/sysctl.h> #include <linux/mount.h> #include <linux/userfaultfd_k.h> #include <linux/pid.h> #include "../lib/kstrtox.h" #include <linux/uaccess.h> #include <asm/processor.h> #ifdef CONFIG_X86 #include <asm/nmi.h> #include <asm/stacktrace.h> #include <asm/io.h> #endif #ifdef CONFIG_SPARC #include <asm/setup.h> #endif #ifdef CONFIG_RT_MUTEXES #include <linux/rtmutex.h> #endif /* shared constants to be used in various sysctls */ const int sysctl_vals[] = { 0, 1, 2, 3, 4, 100, 200, 1000, 3000, INT_MAX, 65535, -1 }; EXPORT_SYMBOL(sysctl_vals); const unsigned long sysctl_long_vals[] = { 0, 1, LONG_MAX }; EXPORT_SYMBOL_GPL(sysctl_long_vals); #if defined(CONFIG_SYSCTL) /* Constants used for minimum and maximum */ #ifdef CONFIG_PERF_EVENTS static const int six_hundred_forty_kb = 640 * 1024; #endif static const int ngroups_max = NGROUPS_MAX; static const int cap_last_cap = CAP_LAST_CAP; #ifdef CONFIG_PROC_SYSCTL /** * enum sysctl_writes_mode - supported sysctl write modes * * @SYSCTL_WRITES_LEGACY: each write syscall must fully contain the sysctl value * to be written, and multiple writes on the same sysctl file descriptor * will rewrite the sysctl value, regardless of file position. No warning * is issued when the initial position is not 0. * @SYSCTL_WRITES_WARN: same as above but warn when the initial file position is * not 0. * @SYSCTL_WRITES_STRICT: writes to numeric sysctl entries must always be at * file position 0 and the value must be fully contained in the buffer * sent to the write syscall. If dealing with strings respect the file * position, but restrict this to the max length of the buffer, anything * passed the max length will be ignored. Multiple writes will append * to the buffer. * * These write modes control how current file position affects the behavior of * updating sysctl values through the proc interface on each write. */ enum sysctl_writes_mode { SYSCTL_WRITES_LEGACY = -1, SYSCTL_WRITES_WARN = 0, SYSCTL_WRITES_STRICT = 1, }; static enum sysctl_writes_mode sysctl_writes_strict = SYSCTL_WRITES_STRICT; #endif /* CONFIG_PROC_SYSCTL */ #if defined(HAVE_ARCH_PICK_MMAP_LAYOUT) || \ defined(CONFIG_ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT) int sysctl_legacy_va_layout; #endif #endif /* CONFIG_SYSCTL */ /* * /proc/sys support */ #ifdef CONFIG_PROC_SYSCTL static int _proc_do_string(char *data, int maxlen, int write, char *buffer, size_t *lenp, loff_t *ppos) { size_t len; char c, *p; if (!data || !maxlen || !*lenp) { *lenp = 0; return 0; } if (write) { if (sysctl_writes_strict == SYSCTL_WRITES_STRICT) { /* Only continue writes not past the end of buffer. */ len = strlen(data); if (len > maxlen - 1) len = maxlen - 1; if (*ppos > len) return 0; len = *ppos; } else { /* Start writing from beginning of buffer. */ len = 0; } *ppos += *lenp; p = buffer; while ((p - buffer) < *lenp && len < maxlen - 1) { c = *(p++); if (c == 0 || c == '\n') break; data[len++] = c; } data[len] = 0; } else { len = strlen(data); if (len > maxlen) len = maxlen; if (*ppos > len) { *lenp = 0; return 0; } data += *ppos; len -= *ppos; if (len > *lenp) len = *lenp; if (len) memcpy(buffer, data, len); if (len < *lenp) { buffer[len] = '\n'; len++; } *lenp = len; *ppos += len; } return 0; } static void warn_sysctl_write(const struct ctl_table *table) { pr_warn_once("%s wrote to %s when file position was not 0!\n" "This will not be supported in the future. To silence this\n" "warning, set kernel.sysctl_writes_strict = -1\n", current->comm, table->procname); } /** * proc_first_pos_non_zero_ignore - check if first position is allowed * @ppos: file position * @table: the sysctl table * * Returns true if the first position is non-zero and the sysctl_writes_strict * mode indicates this is not allowed for numeric input types. String proc * handlers can ignore the return value. */ static bool proc_first_pos_non_zero_ignore(loff_t *ppos, const struct ctl_table *table) { if (!*ppos) return false; switch (sysctl_writes_strict) { case SYSCTL_WRITES_STRICT: return true; case SYSCTL_WRITES_WARN: warn_sysctl_write(table); return false; default: return false; } } /** * proc_dostring - read a string sysctl * @table: the sysctl table * @write: %TRUE if this is a write to the sysctl file * @buffer: the user buffer * @lenp: the size of the user buffer * @ppos: file position * * Reads/writes a string from/to the user buffer. If the kernel * buffer provided is not large enough to hold the string, the * string is truncated. The copied string is %NULL-terminated. * If the string is being read by the user process, it is copied * and a newline '\n' is added. It is truncated if the buffer is * not large enough. * * Returns 0 on success. */ int proc_dostring(const struct ctl_table *table, int write, void *buffer, size_t *lenp, loff_t *ppos) { if (write) proc_first_pos_non_zero_ignore(ppos, table); return _proc_do_string(table->data, table->maxlen, write, buffer, lenp, ppos); } static void proc_skip_spaces(char **buf, size_t *size) { while (*size) { if (!isspace(**buf)) break; (*size)--; (*buf)++; } } static void proc_skip_char(char **buf, size_t *size, const char v) { while (*size) { if (**buf != v) break; (*size)--; (*buf)++; } } /** * strtoul_lenient - parse an ASCII formatted integer from a buffer and only * fail on overflow * * @cp: kernel buffer containing the string to parse * @endp: pointer to store the trailing characters * @base: the base to use * @res: where the parsed integer will be stored * * In case of success 0 is returned and @res will contain the parsed integer, * @endp will hold any trailing characters. * This function will fail the parse on overflow. If there wasn't an overflow * the function will defer the decision what characters count as invalid to the * caller. */ static int strtoul_lenient(const char *cp, char **endp, unsigned int base, unsigned long *res) { unsigned long long result; unsigned int rv; cp = _parse_integer_fixup_radix(cp, &base); rv = _parse_integer(cp, base, &result); if ((rv & KSTRTOX_OVERFLOW) || (result != (unsigned long)result)) return -ERANGE; cp += rv; if (endp) *endp = (char *)cp; *res = (unsigned long)result; return 0; } #define TMPBUFLEN 22 /** * proc_get_long - reads an ASCII formatted integer from a user buffer * * @buf: a kernel buffer * @size: size of the kernel buffer * @val: this is where the number will be stored * @neg: set to %TRUE if number is negative * @perm_tr: a vector which contains the allowed trailers * @perm_tr_len: size of the perm_tr vector * @tr: pointer to store the trailer character * * In case of success %0 is returned and @buf and @size are updated with * the amount of bytes read. If @tr is non-NULL and a trailing * character exists (size is non-zero after returning from this * function), @tr is updated with the trailing character. */ static int proc_get_long(char **buf, size_t *size, unsigned long *val, bool *neg, const char *perm_tr, unsigned perm_tr_len, char *tr) { char *p, tmp[TMPBUFLEN]; ssize_t len = *size; if (len <= 0) return -EINVAL; if (len > TMPBUFLEN - 1) len = TMPBUFLEN - 1; memcpy(tmp, *buf, len); tmp[len] = 0; p = tmp; if (*p == '-' && *size > 1) { *neg = true; p++; } else *neg = false; if (!isdigit(*p)) return -EINVAL; if (strtoul_lenient(p, &p, 0, val)) return -EINVAL; len = p - tmp; /* We don't know if the next char is whitespace thus we may accept * invalid integers (e.g. 1234...a) or two integers instead of one * (e.g. 123...1). So lets not allow such large numbers. */ if (len == TMPBUFLEN - 1) return -EINVAL; if (len < *size && perm_tr_len && !memchr(perm_tr, *p, perm_tr_len)) return -EINVAL; if (tr && (len < *size)) *tr = *p; *buf += len; *size -= len; return 0; } /** * proc_put_long - converts an integer to a decimal ASCII formatted string * * @buf: the user buffer * @size: the size of the user buffer * @val: the integer to be converted * @neg: sign of the number, %TRUE for negative * * In case of success @buf and @size are updated with the amount of bytes * written. */ static void proc_put_long(void **buf, size_t *size, unsigned long val, bool neg) { int len; char tmp[TMPBUFLEN], *p = tmp; sprintf(p, "%s%lu", neg ? "-" : "", val); len = strlen(tmp); if (len > *size) len = *size; memcpy(*buf, tmp, len); *size -= len; *buf += len; } #undef TMPBUFLEN static void proc_put_char(void **buf, size_t *size, char c) { if (*size) { char **buffer = (char **)buf; **buffer = c; (*size)--; (*buffer)++; *buf = *buffer; } } static int do_proc_dointvec_conv(bool *negp, unsigned long *lvalp, int *valp, int write, void *data) { if (write) { if (*negp) { if (*lvalp > (unsigned long) INT_MAX + 1) return -EINVAL; WRITE_ONCE(*valp, -*lvalp); } else { if (*lvalp > (unsigned long) INT_MAX) return -EINVAL; WRITE_ONCE(*valp, *lvalp); } } else { int val = READ_ONCE(*valp); if (val < 0) { *negp = true; *lvalp = -(unsigned long)val; } else { *negp = false; *lvalp = (unsigned long)val; } } return 0; } static int do_proc_douintvec_conv(unsigned long *lvalp, unsigned int *valp, int write, void *data) { if (write) { if (*lvalp > UINT_MAX) return -EINVAL; WRITE_ONCE(*valp, *lvalp); } else { unsigned int val = READ_ONCE(*valp); *lvalp = (unsigned long)val; } return 0; } static const char proc_wspace_sep[] = { ' ', '\t', '\n' }; static int __do_proc_dointvec(void *tbl_data, const struct ctl_table *table, int write, void *buffer, size_t *lenp, loff_t *ppos, int (*conv)(bool *negp, unsigned long *lvalp, int *valp, int write, void *data), void *data) { int *i, vleft, first = 1, err = 0; size_t left; char *p; if (!tbl_data || !table->maxlen || !*lenp || (*ppos && !write)) { *lenp = 0; return 0; } i = (int *) tbl_data; vleft = table->maxlen / sizeof(*i); left = *lenp; if (!conv) conv = do_proc_dointvec_conv; if (write) { if (proc_first_pos_non_zero_ignore(ppos, table)) goto out; if (left > PAGE_SIZE - 1) left = PAGE_SIZE - 1; p = buffer; } for (; left && vleft--; i++, first=0) { unsigned long lval; bool neg; if (write) { proc_skip_spaces(&p, &left); if (!left) break; err = proc_get_long(&p, &left, &lval, &neg, proc_wspace_sep, sizeof(proc_wspace_sep), NULL); if (err) break; if (conv(&neg, &lval, i, 1, data)) { err = -EINVAL; break; } } else { if (conv(&neg, &lval, i, 0, data)) { err = -EINVAL; break; } if (!first) proc_put_char(&buffer, &left, '\t'); proc_put_long(&buffer, &left, lval, neg); } } if (!write && !first && left && !err) proc_put_char(&buffer, &left, '\n'); if (write && !err && left) proc_skip_spaces(&p, &left); if (write && first) return err ? : -EINVAL; *lenp -= left; out: *ppos += *lenp; return err; } static int do_proc_dointvec(const struct ctl_table *table, int write, void *buffer, size_t *lenp, loff_t *ppos, int (*conv)(bool *negp, unsigned long *lvalp, int *valp, int write, void *data), void *data) { return __do_proc_dointvec(table->data, table, write, buffer, lenp, ppos, conv, data); } static int do_proc_douintvec_w(unsigned int *tbl_data, const struct ctl_table *table, void *buffer, size_t *lenp, loff_t *ppos, int (*conv)(unsigned long *lvalp, unsigned int *valp, int write, void *data), void *data) { unsigned long lval; int err = 0; size_t left; bool neg; char *p = buffer; left = *lenp; if (proc_first_pos_non_zero_ignore(ppos, table)) goto bail_early; if (left > PAGE_SIZE - 1) left = PAGE_SIZE - 1; proc_skip_spaces(&p, &left); if (!left) { err = -EINVAL; goto out_free; } err = proc_get_long(&p, &left, &lval, &neg, proc_wspace_sep, sizeof(proc_wspace_sep), NULL); if (err || neg) { err = -EINVAL; goto out_free; } if (conv(&lval, tbl_data, 1, data)) { err = -EINVAL; goto out_free; } if (!err && left) proc_skip_spaces(&p, &left); out_free: if (err) return -EINVAL; return 0; /* This is in keeping with old __do_proc_dointvec() */ bail_early: *ppos += *lenp; return err; } static int do_proc_douintvec_r(unsigned int *tbl_data, void *buffer, size_t *lenp, loff_t *ppos, int (*conv)(unsigned long *lvalp, unsigned int *valp, int write, void *data), void *data) { unsigned long lval; int err = 0; size_t left; left = *lenp; if (conv(&lval, tbl_data, 0, data)) { err = -EINVAL; goto out; } proc_put_long(&buffer, &left, lval, false); if (!left) goto out; proc_put_char(&buffer, &left, '\n'); out: *lenp -= left; *ppos += *lenp; return err; } static int __do_proc_douintvec(void *tbl_data, const struct ctl_table *table, int write, void *buffer, size_t *lenp, loff_t *ppos, int (*conv)(unsigned long *lvalp, unsigned int *valp, int write, void *data), void *data) { unsigned int *i, vleft; if (!tbl_data || !table->maxlen || !*lenp || (*ppos && !write)) { *lenp = 0; return 0; } i = (unsigned int *) tbl_data; vleft = table->maxlen / sizeof(*i); /* * Arrays are not supported, keep this simple. *Do not* add * support for them. */ if (vleft != 1) { *lenp = 0; return -EINVAL; } if (!conv) conv = do_proc_douintvec_conv; if (write) return do_proc_douintvec_w(i, table, buffer, lenp, ppos, conv, data); return do_proc_douintvec_r(i, buffer, lenp, ppos, conv, data); } int do_proc_douintvec(const struct ctl_table *table, int write, void *buffer, size_t *lenp, loff_t *ppos, int (*conv)(unsigned long *lvalp, unsigned int *valp, int write, void *data), void *data) { return __do_proc_douintvec(table->data, table, write, buffer, lenp, ppos, conv, data); } /** * proc_dobool - read/write a bool * @table: the sysctl table * @write: %TRUE if this is a write to the sysctl file * @buffer: the user buffer * @lenp: the size of the user buffer * @ppos: file position * * Reads/writes one integer value from/to the user buffer, * treated as an ASCII string. * * table->data must point to a bool variable and table->maxlen must * be sizeof(bool). * * Returns 0 on success. */ int proc_dobool(const struct ctl_table *table, int write, void *buffer, size_t *lenp, loff_t *ppos) { struct ctl_table tmp; bool *data = table->data; int res, val; /* Do not support arrays yet. */ if (table->maxlen != sizeof(bool)) return -EINVAL; tmp = *table; tmp.maxlen = sizeof(val); tmp.data = &val; val = READ_ONCE(*data); res = proc_dointvec(&tmp, write, buffer, lenp, ppos); if (res) return res; if (write) WRITE_ONCE(*data, val); return 0; } /** * proc_dointvec - read a vector of integers * @table: the sysctl table * @write: %TRUE if this is a write to the sysctl file * @buffer: the user buffer * @lenp: the size of the user buffer * @ppos: file position * * Reads/writes up to table->maxlen/sizeof(unsigned int) integer * values from/to the user buffer, treated as an ASCII string. * * Returns 0 on success. */ int proc_dointvec(const struct ctl_table *table, int write, void *buffer, size_t *lenp, loff_t *ppos) { return do_proc_dointvec(table, write, buffer, lenp, ppos, NULL, NULL); } /** * proc_douintvec - read a vector of unsigned integers * @table: the sysctl table * @write: %TRUE if this is a write to the sysctl file * @buffer: the user buffer * @lenp: the size of the user buffer * @ppos: file position * * Reads/writes up to table->maxlen/sizeof(unsigned int) unsigned integer * values from/to the user buffer, treated as an ASCII string. * * Returns 0 on success. */ int proc_douintvec(const struct ctl_table *table, int write, void *buffer, size_t *lenp, loff_t *ppos) { return do_proc_douintvec(table, write, buffer, lenp, ppos, do_proc_douintvec_conv, NULL); } /* * Taint values can only be increased * This means we can safely use a temporary. */ static int proc_taint(const struct ctl_table *table, int write, void *buffer, size_t *lenp, loff_t *ppos) { struct ctl_table t; unsigned long tmptaint = get_taint(); int err; if (write && !capable(CAP_SYS_ADMIN)) return -EPERM; t = *table; t.data = &tmptaint; err = proc_doulongvec_minmax(&t, write, buffer, lenp, ppos); if (err < 0) return err; if (write) { int i; /* * If we are relying on panic_on_taint not producing * false positives due to userspace input, bail out * before setting the requested taint flags. */ if (panic_on_taint_nousertaint && (tmptaint & panic_on_taint)) return -EINVAL; /* * Poor man's atomic or. Not worth adding a primitive * to everyone's atomic.h for this */ for (i = 0; i < TAINT_FLAGS_COUNT; i++) if ((1UL << i) & tmptaint) add_taint(i, LOCKDEP_STILL_OK); } return err; } /** * struct do_proc_dointvec_minmax_conv_param - proc_dointvec_minmax() range checking structure * @min: pointer to minimum allowable value * @max: pointer to maximum allowable value * * The do_proc_dointvec_minmax_conv_param structure provides the * minimum and maximum values for doing range checking for those sysctl * parameters that use the proc_dointvec_minmax() handler. */ struct do_proc_dointvec_minmax_conv_param { int *min; int *max; }; static int do_proc_dointvec_minmax_conv(bool *negp, unsigned long *lvalp, int *valp, int write, void *data) { int tmp, ret; struct do_proc_dointvec_minmax_conv_param *param = data; /* * If writing, first do so via a temporary local int so we can * bounds-check it before touching *valp. */ int *ip = write ? &tmp : valp; ret = do_proc_dointvec_conv(negp, lvalp, ip, write, data); if (ret) return ret; if (write) { if ((param->min && *param->min > tmp) || (param->max && *param->max < tmp)) return -EINVAL; WRITE_ONCE(*valp, tmp); } return 0; } /** * proc_dointvec_minmax - read a vector of integers with min/max values * @table: the sysctl table * @write: %TRUE if this is a write to the sysctl file * @buffer: the user buffer * @lenp: the size of the user buffer * @ppos: file position * * Reads/writes up to table->maxlen/sizeof(unsigned int) integer * values from/to the user buffer, treated as an ASCII string. * * This routine will ensure the values are within the range specified by * table->extra1 (min) and table->extra2 (max). * * Returns 0 on success or -EINVAL on write when the range check fails. */ int proc_dointvec_minmax(const struct ctl_table *table, int write, void *buffer, size_t *lenp, loff_t *ppos) { struct do_proc_dointvec_minmax_conv_param param = { .min = (int *) table->extra1, .max = (int *) table->extra2, }; return do_proc_dointvec(table, write, buffer, lenp, ppos, do_proc_dointvec_minmax_conv, ¶m); } /** * struct do_proc_douintvec_minmax_conv_param - proc_douintvec_minmax() range checking structure * @min: pointer to minimum allowable value * @max: pointer to maximum allowable value * * The do_proc_douintvec_minmax_conv_param structure provides the * minimum and maximum values for doing range checking for those sysctl * parameters that use the proc_douintvec_minmax() handler. */ struct do_proc_douintvec_minmax_conv_param { unsigned int *min; unsigned int *max; }; static int do_proc_douintvec_minmax_conv(unsigned long *lvalp, unsigned int *valp, int write, void *data) { int ret; unsigned int tmp; struct do_proc_douintvec_minmax_conv_param *param = data; /* write via temporary local uint for bounds-checking */ unsigned int *up = write ? &tmp : valp; ret = do_proc_douintvec_conv(lvalp, up, write, data); if (ret) return ret; if (write) { if ((param->min && *param->min > tmp) || (param->max && *param->max < tmp)) return -ERANGE; WRITE_ONCE(*valp, tmp); } return 0; } /** * proc_douintvec_minmax - read a vector of unsigned ints with min/max values * @table: the sysctl table * @write: %TRUE if this is a write to the sysctl file * @buffer: the user buffer * @lenp: the size of the user buffer * @ppos: file position * * Reads/writes up to table->maxlen/sizeof(unsigned int) unsigned integer * values from/to the user buffer, treated as an ASCII string. Negative * strings are not allowed. * * This routine will ensure the values are within the range specified by * table->extra1 (min) and table->extra2 (max). There is a final sanity * check for UINT_MAX to avoid having to support wrap around uses from * userspace. * * Returns 0 on success or -ERANGE on write when the range check fails. */ int proc_douintvec_minmax(const struct ctl_table *table, int write, void *buffer, size_t *lenp, loff_t *ppos) { struct do_proc_douintvec_minmax_conv_param param = { .min = (unsigned int *) table->extra1, .max = (unsigned int *) table->extra2, }; return do_proc_douintvec(table, write, buffer, lenp, ppos, do_proc_douintvec_minmax_conv, ¶m); } /** * proc_dou8vec_minmax - read a vector of unsigned chars with min/max values * @table: the sysctl table * @write: %TRUE if this is a write to the sysctl file * @buffer: the user buffer * @lenp: the size of the user buffer * @ppos: file position * * Reads/writes up to table->maxlen/sizeof(u8) unsigned chars * values from/to the user buffer, treated as an ASCII string. Negative * strings are not allowed. * * This routine will ensure the values are within the range specified by * table->extra1 (min) and table->extra2 (max). * * Returns 0 on success or an error on write when the range check fails. */ int proc_dou8vec_minmax(const struct ctl_table *table, int write, void *buffer, size_t *lenp, loff_t *ppos) { struct ctl_table tmp; unsigned int min = 0, max = 255U, val; u8 *data = table->data; struct do_proc_douintvec_minmax_conv_param param = { .min = &min, .max = &max, }; int res; /* Do not support arrays yet. */ if (table->maxlen != sizeof(u8)) return -EINVAL; if (table->extra1) min = *(unsigned int *) table->extra1; if (table->extra2) max = *(unsigned int *) table->extra2; tmp = *table; tmp.maxlen = sizeof(val); tmp.data = &val; val = READ_ONCE(*data); res = do_proc_douintvec(&tmp, write, buffer, lenp, ppos, do_proc_douintvec_minmax_conv, ¶m); if (res) return res; if (write) WRITE_ONCE(*data, val); return 0; } EXPORT_SYMBOL_GPL(proc_dou8vec_minmax); #ifdef CONFIG_MAGIC_SYSRQ static int sysrq_sysctl_handler(const struct ctl_table *table, int write, void *buffer, size_t *lenp, loff_t *ppos) { int tmp, ret; tmp = sysrq_mask(); ret = __do_proc_dointvec(&tmp, table, write, buffer, lenp, ppos, NULL, NULL); if (ret || !write) return ret; if (write) sysrq_toggle_support(tmp); return 0; } #endif static int __do_proc_doulongvec_minmax(void *data, const struct ctl_table *table, int write, void *buffer, size_t *lenp, loff_t *ppos, unsigned long convmul, unsigned long convdiv) { unsigned long *i, *min, *max; int vleft, first = 1, err = 0; size_t left; char *p; if (!data || !table->maxlen || !*lenp || (*ppos && !write)) { *lenp = 0; return 0; } i = data; min = table->extra1; max = table->extra2; vleft = table->maxlen / sizeof(unsigned long); left = *lenp; if (write) { if (proc_first_pos_non_zero_ignore(ppos, table)) goto out; if (left > PAGE_SIZE - 1) left = PAGE_SIZE - 1; p = buffer; } for (; left && vleft--; i++, first = 0) { unsigned long val; if (write) { bool neg; proc_skip_spaces(&p, &left); if (!left) break; err = proc_get_long(&p, &left, &val, &neg, proc_wspace_sep, sizeof(proc_wspace_sep), NULL); if (err || neg) { err = -EINVAL; break; } val = convmul * val / convdiv; if ((min && val < *min) || (max && val > *max)) { err = -EINVAL; break; } WRITE_ONCE(*i, val); } else { val = convdiv * READ_ONCE(*i) / convmul; if (!first) proc_put_char(&buffer, &left, '\t'); proc_put_long(&buffer, &left, val, false); } } if (!write && !first && left && !err) proc_put_char(&buffer, &left, '\n'); if (write && !err) proc_skip_spaces(&p, &left); if (write && first) return err ? : -EINVAL; *lenp -= left; out: *ppos += *lenp; return err; } static int do_proc_doulongvec_minmax(const struct ctl_table *table, int write, void *buffer, size_t *lenp, loff_t *ppos, unsigned long convmul, unsigned long convdiv) { return __do_proc_doulongvec_minmax(table->data, table, write, buffer, lenp, ppos, convmul, convdiv); } /** * proc_doulongvec_minmax - read a vector of long integers with min/max values * @table: the sysctl table * @write: %TRUE if this is a write to the sysctl file * @buffer: the user buffer * @lenp: the size of the user buffer * @ppos: file position * * Reads/writes up to table->maxlen/sizeof(unsigned long) unsigned long * values from/to the user buffer, treated as an ASCII string. * * This routine will ensure the values are within the range specified by * table->extra1 (min) and table->extra2 (max). * * Returns 0 on success. */ int proc_doulongvec_minmax(const struct ctl_table *table, int write, void *buffer, size_t *lenp, loff_t *ppos) { return do_proc_doulongvec_minmax(table, write, buffer, lenp, ppos, 1l, 1l); } /** * proc_doulongvec_ms_jiffies_minmax - read a vector of millisecond values with min/max values * @table: the sysctl table * @write: %TRUE if this is a write to the sysctl file * @buffer: the user buffer * @lenp: the size of the user buffer * @ppos: file position * * Reads/writes up to table->maxlen/sizeof(unsigned long) unsigned long * values from/to the user buffer, treated as an ASCII string. The values * are treated as milliseconds, and converted to jiffies when they are stored. * * This routine will ensure the values are within the range specified by * table->extra1 (min) and table->extra2 (max). * * Returns 0 on success. */ int proc_doulongvec_ms_jiffies_minmax(const struct ctl_table *table, int write, void *buffer, size_t *lenp, loff_t *ppos) { return do_proc_doulongvec_minmax(table, write, buffer, lenp, ppos, HZ, 1000l); } static int do_proc_dointvec_jiffies_conv(bool *negp, unsigned long *lvalp, int *valp, int write, void *data) { if (write) { if (*lvalp > INT_MAX / HZ) return 1; if (*negp) WRITE_ONCE(*valp, -*lvalp * HZ); else WRITE_ONCE(*valp, *lvalp * HZ); } else { int val = READ_ONCE(*valp); unsigned long lval; if (val < 0) { *negp = true; lval = -(unsigned long)val; } else { *negp = false; lval = (unsigned long)val; } *lvalp = lval / HZ; } return 0; } static int do_proc_dointvec_userhz_jiffies_conv(bool *negp, unsigned long *lvalp, int *valp, int write, void *data) { if (write) { if (USER_HZ < HZ && *lvalp > (LONG_MAX / HZ) * USER_HZ) return 1; *valp = clock_t_to_jiffies(*negp ? -*lvalp : *lvalp); } else { int val = *valp; unsigned long lval; if (val < 0) { *negp = true; lval = -(unsigned long)val; } else { *negp = false; lval = (unsigned long)val; } *lvalp = jiffies_to_clock_t(lval); } return 0; } static int do_proc_dointvec_ms_jiffies_conv(bool *negp, unsigned long *lvalp, int *valp, int write, void *data) { if (write) { unsigned long jif = msecs_to_jiffies(*negp ? -*lvalp : *lvalp); if (jif > INT_MAX) return 1; WRITE_ONCE(*valp, (int)jif); } else { int val = READ_ONCE(*valp); unsigned long lval; if (val < 0) { *negp = true; lval = -(unsigned long)val; } else { *negp = false; lval = (unsigned long)val; } *lvalp = jiffies_to_msecs(lval); } return 0; } static int do_proc_dointvec_ms_jiffies_minmax_conv(bool *negp, unsigned long *lvalp, int *valp, int write, void *data) { int tmp, ret; struct do_proc_dointvec_minmax_conv_param *param = data; /* * If writing, first do so via a temporary local int so we can * bounds-check it before touching *valp. */ int *ip = write ? &tmp : valp; ret = do_proc_dointvec_ms_jiffies_conv(negp, lvalp, ip, write, data); if (ret) return ret; if (write) { if ((param->min && *param->min > tmp) || (param->max && *param->max < tmp)) return -EINVAL; *valp = tmp; } return 0; } /** * proc_dointvec_jiffies - read a vector of integers as seconds * @table: the sysctl table * @write: %TRUE if this is a write to the sysctl file * @buffer: the user buffer * @lenp: the size of the user buffer * @ppos: file position * * Reads/writes up to table->maxlen/sizeof(unsigned int) integer * values from/to the user buffer, treated as an ASCII string. * The values read are assumed to be in seconds, and are converted into * jiffies. * * Returns 0 on success. */ int proc_dointvec_jiffies(const struct ctl_table *table, int write, void *buffer, size_t *lenp, loff_t *ppos) { return do_proc_dointvec(table,write,buffer,lenp,ppos, do_proc_dointvec_jiffies_conv,NULL); } int proc_dointvec_ms_jiffies_minmax(const struct ctl_table *table, int write, void *buffer, size_t *lenp, loff_t *ppos) { struct do_proc_dointvec_minmax_conv_param param = { .min = (int *) table->extra1, .max = (int *) table->extra2, }; return do_proc_dointvec(table, write, buffer, lenp, ppos, do_proc_dointvec_ms_jiffies_minmax_conv, ¶m); } /** * proc_dointvec_userhz_jiffies - read a vector of integers as 1/USER_HZ seconds * @table: the sysctl table * @write: %TRUE if this is a write to the sysctl file * @buffer: the user buffer * @lenp: the size of the user buffer * @ppos: pointer to the file position * * Reads/writes up to table->maxlen/sizeof(unsigned int) integer * values from/to the user buffer, treated as an ASCII string. * The values read are assumed to be in 1/USER_HZ seconds, and * are converted into jiffies. * * Returns 0 on success. */ int proc_dointvec_userhz_jiffies(const struct ctl_table *table, int write, void *buffer, size_t *lenp, loff_t *ppos) { return do_proc_dointvec(table, write, buffer, lenp, ppos, do_proc_dointvec_userhz_jiffies_conv, NULL); } /** * proc_dointvec_ms_jiffies - read a vector of integers as 1 milliseconds * @table: the sysctl table * @write: %TRUE if this is a write to the sysctl file * @buffer: the user buffer * @lenp: the size of the user buffer * @ppos: file position * @ppos: the current position in the file * * Reads/writes up to table->maxlen/sizeof(unsigned int) integer * values from/to the user buffer, treated as an ASCII string. * The values read are assumed to be in 1/1000 seconds, and * are converted into jiffies. * * Returns 0 on success. */ int proc_dointvec_ms_jiffies(const struct ctl_table *table, int write, void *buffer, size_t *lenp, loff_t *ppos) { return do_proc_dointvec(table, write, buffer, lenp, ppos, do_proc_dointvec_ms_jiffies_conv, NULL); } static int proc_do_cad_pid(const struct ctl_table *table, int write, void *buffer, size_t *lenp, loff_t *ppos) { struct pid *new_pid; pid_t tmp; int r; tmp = pid_vnr(cad_pid); r = __do_proc_dointvec(&tmp, table, write, buffer, lenp, ppos, NULL, NULL); if (r || !write) return r; new_pid = find_get_pid(tmp); if (!new_pid) return -ESRCH; put_pid(xchg(&cad_pid, new_pid)); return 0; } /** * proc_do_large_bitmap - read/write from/to a large bitmap * @table: the sysctl table * @write: %TRUE if this is a write to the sysctl file * @buffer: the user buffer * @lenp: the size of the user buffer * @ppos: file position * * The bitmap is stored at table->data and the bitmap length (in bits) * in table->maxlen. * * We use a range comma separated format (e.g. 1,3-4,10-10) so that * large bitmaps may be represented in a compact manner. Writing into * the file will clear the bitmap then update it with the given input. * * Returns 0 on success. */ int proc_do_large_bitmap(const struct ctl_table *table, int write, void *buffer, size_t *lenp, loff_t *ppos) { int err = 0; size_t left = *lenp; unsigned long bitmap_len = table->maxlen; unsigned long *bitmap = *(unsigned long **) table->data; unsigned long *tmp_bitmap = NULL; char tr_a[] = { '-', ',', '\n' }, tr_b[] = { ',', '\n', 0 }, c; if (!bitmap || !bitmap_len || !left || (*ppos && !write)) { *lenp = 0; return 0; } if (write) { char *p = buffer; size_t skipped = 0; if (left > PAGE_SIZE - 1) { left = PAGE_SIZE - 1; /* How much of the buffer we'll skip this pass */ skipped = *lenp - left; } tmp_bitmap = bitmap_zalloc(bitmap_len, GFP_KERNEL); if (!tmp_bitmap) return -ENOMEM; proc_skip_char(&p, &left, '\n'); while (!err && left) { unsigned long val_a, val_b; bool neg; size_t saved_left; /* In case we stop parsing mid-number, we can reset */ saved_left = left; err = proc_get_long(&p, &left, &val_a, &neg, tr_a, sizeof(tr_a), &c); /* * If we consumed the entirety of a truncated buffer or * only one char is left (may be a "-"), then stop here, * reset, & come back for more. */ if ((left <= 1) && skipped) { left = saved_left; break; } if (err) break; if (val_a >= bitmap_len || neg) { err = -EINVAL; break; } val_b = val_a; if (left) { p++; left--; } if (c == '-') { err = proc_get_long(&p, &left, &val_b, &neg, tr_b, sizeof(tr_b), &c); /* * If we consumed all of a truncated buffer or * then stop here, reset, & come back for more. */ if (!left && skipped) { left = saved_left; break; } if (err) break; if (val_b >= bitmap_len || neg || val_a > val_b) { err = -EINVAL; break; } if (left) { p++; left--; } } bitmap_set(tmp_bitmap, val_a, val_b - val_a + 1); proc_skip_char(&p, &left, '\n'); } left += skipped; } else { unsigned long bit_a, bit_b = 0; bool first = 1; while (left) { bit_a = find_next_bit(bitmap, bitmap_len, bit_b); if (bit_a >= bitmap_len) break; bit_b = find_next_zero_bit(bitmap, bitmap_len, bit_a + 1) - 1; if (!first) proc_put_char(&buffer, &left, ','); proc_put_long(&buffer, &left, bit_a, false); if (bit_a != bit_b) { proc_put_char(&buffer, &left, '-'); proc_put_long(&buffer, &left, bit_b, false); } first = 0; bit_b++; } proc_put_char(&buffer, &left, '\n'); } if (!err) { if (write) { if (*ppos) bitmap_or(bitmap, bitmap, tmp_bitmap, bitmap_len); else bitmap_copy(bitmap, tmp_bitmap, bitmap_len); } *lenp -= left; *ppos += *lenp; } bitmap_free(tmp_bitmap); return err; } #else /* CONFIG_PROC_SYSCTL */ int proc_dostring(const struct ctl_table *table, int write, void *buffer, size_t *lenp, loff_t *ppos) { return -ENOSYS; } int proc_dobool(const struct ctl_table *table, int write, void *buffer, size_t *lenp, loff_t *ppos) { return -ENOSYS; } int proc_dointvec(const struct ctl_table *table, int write, void *buffer, size_t *lenp, loff_t *ppos) { return -ENOSYS; } int proc_douintvec(const struct ctl_table *table, int write, void *buffer, size_t *lenp, loff_t *ppos) { return -ENOSYS; } int proc_dointvec_minmax(const struct ctl_table *table, int write, void *buffer, size_t *lenp, loff_t *ppos) { return -ENOSYS; } int proc_douintvec_minmax(const struct ctl_table *table, int write, void *buffer, size_t *lenp, loff_t *ppos) { return -ENOSYS; } int proc_dou8vec_minmax(const struct ctl_table *table, int write, void *buffer, size_t *lenp, loff_t *ppos) { return -ENOSYS; } int proc_dointvec_jiffies(const struct ctl_table *table, int write, void *buffer, size_t *lenp, loff_t *ppos) { return -ENOSYS; } int proc_dointvec_ms_jiffies_minmax(const struct ctl_table *table, int write, void *buffer, size_t *lenp, loff_t *ppos) { return -ENOSYS; } int proc_dointvec_userhz_jiffies(const struct ctl_table *table, int write, void *buffer, size_t *lenp, loff_t *ppos) { return -ENOSYS; } int proc_dointvec_ms_jiffies(const struct ctl_table *table, int write, void *buffer, size_t *lenp, loff_t *ppos) { return -ENOSYS; } int proc_doulongvec_minmax(const struct ctl_table *table, int write, void *buffer, size_t *lenp, loff_t *ppos) { return -ENOSYS; } int proc_doulongvec_ms_jiffies_minmax(const struct ctl_table *table, int write, void *buffer, size_t *lenp, loff_t *ppos) { return -ENOSYS; } int proc_do_large_bitmap(const struct ctl_table *table, int write, void *buffer, size_t *lenp, loff_t *ppos) { return -ENOSYS; } #endif /* CONFIG_PROC_SYSCTL */ #if defined(CONFIG_SYSCTL) int proc_do_static_key(const struct ctl_table *table, int write, void *buffer, size_t *lenp, loff_t *ppos) { struct static_key *key = (struct static_key *)table->data; static DEFINE_MUTEX(static_key_mutex); int val, ret; struct ctl_table tmp = { .data = &val, .maxlen = sizeof(val), .mode = table->mode, .extra1 = SYSCTL_ZERO, .extra2 = SYSCTL_ONE, }; if (write && !capable(CAP_SYS_ADMIN)) return -EPERM; mutex_lock(&static_key_mutex); val = static_key_enabled(key); ret = proc_dointvec_minmax(&tmp, write, buffer, lenp, ppos); if (write && !ret) { if (val) static_key_enable(key); else static_key_disable(key); } mutex_unlock(&static_key_mutex); return ret; } static struct ctl_table kern_table[] = { { .procname = "panic", .data = &panic_timeout, .maxlen = sizeof(int), .mode = 0644, .proc_handler = proc_dointvec, }, #ifdef CONFIG_PROC_SYSCTL { .procname = "tainted", .maxlen = sizeof(long), .mode = 0644, .proc_handler = proc_taint, }, { .procname = "sysctl_writes_strict", .data = &sysctl_writes_strict, .maxlen = sizeof(int), .mode = 0644, .proc_handler = proc_dointvec_minmax, .extra1 = SYSCTL_NEG_ONE, .extra2 = SYSCTL_ONE, }, #endif { .procname = "print-fatal-signals", .data = &print_fatal_signals, .maxlen = sizeof(int), .mode = 0644, .proc_handler = proc_dointvec, }, #ifdef CONFIG_SPARC { .procname = "reboot-cmd", .data = reboot_command, .maxlen = 256, .mode = 0644, .proc_handler = proc_dostring, }, { .procname = "stop-a", .data = &stop_a_enabled, .maxlen = sizeof (int), .mode = 0644, .proc_handler = proc_dointvec, }, { .procname = "scons-poweroff", .data = &scons_pwroff, .maxlen = sizeof (int), .mode = 0644, .proc_handler = proc_dointvec, }, #endif #ifdef CONFIG_SPARC64 { .procname = "tsb-ratio", .data = &sysctl_tsb_ratio, .maxlen = sizeof (int), .mode = 0644, .proc_handler = proc_dointvec, }, #endif #ifdef CONFIG_PARISC { .procname = "soft-power", .data = &pwrsw_enabled, .maxlen = sizeof (int), .mode = 0644, .proc_handler = proc_dointvec, }, #endif #ifdef CONFIG_SYSCTL_ARCH_UNALIGN_ALLOW { .procname = "unaligned-trap", .data = &unaligned_enabled, .maxlen = sizeof (int), .mode = 0644, .proc_handler = proc_dointvec, }, #endif #ifdef CONFIG_STACK_TRACER { .procname = "stack_tracer_enabled", .data = &stack_tracer_enabled, .maxlen = sizeof(int), .mode = 0644, .proc_handler = stack_trace_sysctl, }, #endif #ifdef CONFIG_TRACING { .procname = "ftrace_dump_on_oops", .data = &ftrace_dump_on_oops, .maxlen = MAX_TRACER_SIZE, .mode = 0644, .proc_handler = proc_dostring, }, { .procname = "traceoff_on_warning", .data = &__disable_trace_on_warning, .maxlen = sizeof(__disable_trace_on_warning), .mode = 0644, .proc_handler = proc_dointvec, }, { .procname = "tracepoint_printk", .data = &tracepoint_printk, .maxlen = sizeof(tracepoint_printk), .mode = 0644, .proc_handler = tracepoint_printk_sysctl, }, #endif #ifdef CONFIG_MODULES { .procname = "modprobe", .data = &modprobe_path, .maxlen = KMOD_PATH_LEN, .mode = 0644, .proc_handler = proc_dostring, }, { .procname = "modules_disabled", .data = &modules_disabled, .maxlen = sizeof(int), .mode = 0644, /* only handle a transition from default "0" to "1" */ .proc_handler = proc_dointvec_minmax, .extra1 = SYSCTL_ONE, .extra2 = SYSCTL_ONE, }, #endif #ifdef CONFIG_UEVENT_HELPER { .procname = "hotplug", .data = &uevent_helper, .maxlen = UEVENT_HELPER_PATH_LEN, .mode = 0644, .proc_handler = proc_dostring, }, #endif #ifdef CONFIG_MAGIC_SYSRQ { .procname = "sysrq", .data = NULL, .maxlen = sizeof (int), .mode = 0644, .proc_handler = sysrq_sysctl_handler, }, #endif #ifdef CONFIG_PROC_SYSCTL { .procname = "cad_pid", .data = NULL, .maxlen = sizeof (int), .mode = 0600, .proc_handler = proc_do_cad_pid, }, #endif { .procname = "threads-max", .data = NULL, .maxlen = sizeof(int), .mode = 0644, .proc_handler = sysctl_max_threads, }, { .procname = "overflowuid", .data = &overflowuid, .maxlen = sizeof(int), .mode = 0644, .proc_handler = proc_dointvec_minmax, .extra1 = SYSCTL_ZERO, .extra2 = SYSCTL_MAXOLDUID, }, { .procname = "overflowgid", .data = &overflowgid, .maxlen = sizeof(int), .mode = 0644, .proc_handler = proc_dointvec_minmax, .extra1 = SYSCTL_ZERO, .extra2 = SYSCTL_MAXOLDUID, }, #ifdef CONFIG_S390 { .procname = "userprocess_debug", .data = &show_unhandled_signals, .maxlen = sizeof(int), .mode = 0644, .proc_handler = proc_dointvec, }, #endif { .procname = "pid_max", .data = &pid_max, .maxlen = sizeof (int), .mode = 0644, .proc_handler = proc_dointvec_minmax, .extra1 = &pid_max_min, .extra2 = &pid_max_max, }, { .procname = "panic_on_oops", .data = &panic_on_oops, .maxlen = sizeof(int), .mode = 0644, .proc_handler = proc_dointvec, }, { .procname = "panic_print", .data = &panic_print, .maxlen = sizeof(unsigned long), .mode = 0644, .proc_handler = proc_doulongvec_minmax, }, { .procname = "ngroups_max", .data = (void *)&ngroups_max, .maxlen = sizeof (int), .mode = 0444, .proc_handler = proc_dointvec, }, { .procname = "cap_last_cap", .data = (void *)&cap_last_cap, .maxlen = sizeof(int), .mode = 0444, .proc_handler = proc_dointvec, }, #if defined(CONFIG_X86_LOCAL_APIC) && defined(CONFIG_X86) { .procname = "unknown_nmi_panic", .data = &unknown_nmi_panic, .maxlen = sizeof (int), .mode = 0644, .proc_handler = proc_dointvec, }, #endif #if (defined(CONFIG_X86_32) || defined(CONFIG_PARISC)) && \ defined(CONFIG_DEBUG_STACKOVERFLOW) { .procname = "panic_on_stackoverflow", .data = &sysctl_panic_on_stackoverflow, .maxlen = sizeof(int), .mode = 0644, .proc_handler = proc_dointvec, }, #endif #if defined(CONFIG_X86) { .procname = "panic_on_unrecovered_nmi", .data = &panic_on_unrecovered_nmi, .maxlen = sizeof(int), .mode = 0644, .proc_handler = proc_dointvec, }, { .procname = "panic_on_io_nmi", .data = &panic_on_io_nmi, .maxlen = sizeof(int), .mode = 0644, .proc_handler = proc_dointvec, }, { .procname = "bootloader_type", .data = &bootloader_type, .maxlen = sizeof (int), .mode = 0444, .proc_handler = proc_dointvec, }, { .procname = "bootloader_version", .data = &bootloader_version, .maxlen = sizeof (int), .mode = 0444, .proc_handler = proc_dointvec, }, { .procname = "io_delay_type", .data = &io_delay_type, .maxlen = sizeof(int), .mode = 0644, .proc_handler = proc_dointvec, }, #endif #if defined(CONFIG_MMU) { .procname = "randomize_va_space", .data = &randomize_va_space, .maxlen = sizeof(int), .mode = 0644, .proc_handler = proc_dointvec, }, #endif #if defined(CONFIG_S390) && defined(CONFIG_SMP) { .procname = "spin_retry", .data = &spin_retry, .maxlen = sizeof (int), .mode = 0644, .proc_handler = proc_dointvec, }, #endif #if defined(CONFIG_ACPI_SLEEP) && defined(CONFIG_X86) { .procname = "acpi_video_flags", .data = &acpi_realmode_flags, .maxlen = sizeof (unsigned long), .mode = 0644, .proc_handler = proc_doulongvec_minmax, }, #endif #ifdef CONFIG_SYSCTL_ARCH_UNALIGN_NO_WARN { .procname = "ignore-unaligned-usertrap", .data = &no_unaligned_warning, .maxlen = sizeof (int), .mode = 0644, .proc_handler = proc_dointvec, }, #endif #ifdef CONFIG_RT_MUTEXES { .procname = "max_lock_depth", .data = &max_lock_depth, .maxlen = sizeof(int), .mode = 0644, .proc_handler = proc_dointvec, }, #endif #ifdef CONFIG_PERF_EVENTS /* * User-space scripts rely on the existence of this file * as a feature check for perf_events being enabled. * * So it's an ABI, do not remove! */ { .procname = "perf_event_paranoid", .data = &sysctl_perf_event_paranoid, .maxlen = sizeof(sysctl_perf_event_paranoid), .mode = 0644, .proc_handler = proc_dointvec, }, { .procname = "perf_event_mlock_kb", .data = &sysctl_perf_event_mlock, .maxlen = sizeof(sysctl_perf_event_mlock), .mode = 0644, .proc_handler = proc_dointvec, }, { .procname = "perf_event_max_sample_rate", .data = &sysctl_perf_event_sample_rate, .maxlen = sizeof(sysctl_perf_event_sample_rate), .mode = 0644, .proc_handler = perf_event_max_sample_rate_handler, .extra1 = SYSCTL_ONE, }, { .procname = "perf_cpu_time_max_percent", .data = &sysctl_perf_cpu_time_max_percent, .maxlen = sizeof(sysctl_perf_cpu_time_max_percent), .mode = 0644, .proc_handler = perf_cpu_time_max_percent_handler, .extra1 = SYSCTL_ZERO, .extra2 = SYSCTL_ONE_HUNDRED, }, { .procname = "perf_event_max_stack", .data = &sysctl_perf_event_max_stack, .maxlen = sizeof(sysctl_perf_event_max_stack), .mode = 0644, .proc_handler = perf_event_max_stack_handler, .extra1 = SYSCTL_ZERO, .extra2 = (void *)&six_hundred_forty_kb, }, { .procname = "perf_event_max_contexts_per_stack", .data = &sysctl_perf_event_max_contexts_per_stack, .maxlen = sizeof(sysctl_perf_event_max_contexts_per_stack), .mode = 0644, .proc_handler = perf_event_max_stack_handler, .extra1 = SYSCTL_ZERO, .extra2 = SYSCTL_ONE_THOUSAND, }, #endif { .procname = "panic_on_warn", .data = &panic_on_warn, .maxlen = sizeof(int), .mode = 0644, .proc_handler = proc_dointvec_minmax, .extra1 = SYSCTL_ZERO, .extra2 = SYSCTL_ONE, }, #ifdef CONFIG_TREE_RCU { .procname = "panic_on_rcu_stall", .data = &sysctl_panic_on_rcu_stall, .maxlen = sizeof(sysctl_panic_on_rcu_stall), .mode = 0644, .proc_handler = proc_dointvec_minmax, .extra1 = SYSCTL_ZERO, .extra2 = SYSCTL_ONE, }, { .procname = "max_rcu_stall_to_panic", .data = &sysctl_max_rcu_stall_to_panic, .maxlen = sizeof(sysctl_max_rcu_stall_to_panic), .mode = 0644, .proc_handler = proc_dointvec_minmax, .extra1 = SYSCTL_ONE, .extra2 = SYSCTL_INT_MAX, }, #endif }; static struct ctl_table vm_table[] = { { .procname = "overcommit_memory", .data = &sysctl_overcommit_memory, .maxlen = sizeof(sysctl_overcommit_memory), .mode = 0644, .proc_handler = overcommit_policy_handler, .extra1 = SYSCTL_ZERO, .extra2 = SYSCTL_TWO, }, { .procname = "overcommit_ratio", .data = &sysctl_overcommit_ratio, .maxlen = sizeof(sysctl_overcommit_ratio), .mode = 0644, .proc_handler = overcommit_ratio_handler, }, { .procname = "overcommit_kbytes", .data = &sysctl_overcommit_kbytes, .maxlen = sizeof(sysctl_overcommit_kbytes), .mode = 0644, .proc_handler = overcommit_kbytes_handler, }, { .procname = "page-cluster", .data = &page_cluster, .maxlen = sizeof(int), .mode = 0644, .proc_handler = proc_dointvec_minmax, .extra1 = SYSCTL_ZERO, .extra2 = (void *)&page_cluster_max, }, { .procname = "dirtytime_expire_seconds", .data = &dirtytime_expire_interval, .maxlen = sizeof(dirtytime_expire_interval), .mode = 0644, .proc_handler = dirtytime_interval_handler, .extra1 = SYSCTL_ZERO, }, { .procname = "swappiness", .data = &vm_swappiness, .maxlen = sizeof(vm_swappiness), .mode = 0644, .proc_handler = proc_dointvec_minmax, .extra1 = SYSCTL_ZERO, .extra2 = SYSCTL_TWO_HUNDRED, }, #ifdef CONFIG_NUMA { .procname = "numa_stat", .data = &sysctl_vm_numa_stat, .maxlen = sizeof(int), .mode = 0644, .proc_handler = sysctl_vm_numa_stat_handler, .extra1 = SYSCTL_ZERO, .extra2 = SYSCTL_ONE, }, #endif { .procname = "drop_caches", .data = &sysctl_drop_caches, .maxlen = sizeof(int), .mode = 0200, .proc_handler = drop_caches_sysctl_handler, .extra1 = SYSCTL_ONE, .extra2 = SYSCTL_FOUR, }, { .procname = "page_lock_unfairness", .data = &sysctl_page_lock_unfairness, .maxlen = sizeof(sysctl_page_lock_unfairness), .mode = 0644, .proc_handler = proc_dointvec_minmax, .extra1 = SYSCTL_ZERO, }, #ifdef CONFIG_MMU { .procname = "max_map_count", .data = &sysctl_max_map_count, .maxlen = sizeof(sysctl_max_map_count), .mode = 0644, .proc_handler = proc_dointvec_minmax, .extra1 = SYSCTL_ZERO, }, #else { .procname = "nr_trim_pages", .data = &sysctl_nr_trim_pages, .maxlen = sizeof(sysctl_nr_trim_pages), .mode = 0644, .proc_handler = proc_dointvec_minmax, .extra1 = SYSCTL_ZERO, }, #endif { .procname = "vfs_cache_pressure", .data = &sysctl_vfs_cache_pressure, .maxlen = sizeof(sysctl_vfs_cache_pressure), .mode = 0644, .proc_handler = proc_dointvec_minmax, .extra1 = SYSCTL_ZERO, }, #if defined(HAVE_ARCH_PICK_MMAP_LAYOUT) || \ defined(CONFIG_ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT) { .procname = "legacy_va_layout", .data = &sysctl_legacy_va_layout, .maxlen = sizeof(sysctl_legacy_va_layout), .mode = 0644, .proc_handler = proc_dointvec_minmax, .extra1 = SYSCTL_ZERO, }, #endif #ifdef CONFIG_NUMA { .procname = "zone_reclaim_mode", .data = &node_reclaim_mode, .maxlen = sizeof(node_reclaim_mode), .mode = 0644, .proc_handler = proc_dointvec_minmax, .extra1 = SYSCTL_ZERO, }, #endif #ifdef CONFIG_SMP { .procname = "stat_interval", .data = &sysctl_stat_interval, .maxlen = sizeof(sysctl_stat_interval), .mode = 0644, .proc_handler = proc_dointvec_jiffies, }, { .procname = "stat_refresh", .data = NULL, .maxlen = 0, .mode = 0600, .proc_handler = vmstat_refresh, }, #endif #ifdef CONFIG_MMU { .procname = "mmap_min_addr", .data = &dac_mmap_min_addr, .maxlen = sizeof(unsigned long), .mode = 0644, .proc_handler = mmap_min_addr_handler, }, #endif #if (defined(CONFIG_X86_32) && !defined(CONFIG_UML))|| \ (defined(CONFIG_SUPERH) && defined(CONFIG_VSYSCALL)) { .procname = "vdso_enabled", #ifdef CONFIG_X86_32 .data = &vdso32_enabled, .maxlen = sizeof(vdso32_enabled), #else .data = &vdso_enabled, .maxlen = sizeof(vdso_enabled), #endif .mode = 0644, .proc_handler = proc_dointvec, .extra1 = SYSCTL_ZERO, }, #endif { .procname = "user_reserve_kbytes", .data = &sysctl_user_reserve_kbytes, .maxlen = sizeof(sysctl_user_reserve_kbytes), .mode = 0644, .proc_handler = proc_doulongvec_minmax, }, { .procname = "admin_reserve_kbytes", .data = &sysctl_admin_reserve_kbytes, .maxlen = sizeof(sysctl_admin_reserve_kbytes), .mode = 0644, .proc_handler = proc_doulongvec_minmax, }, #ifdef CONFIG_HAVE_ARCH_MMAP_RND_BITS { .procname = "mmap_rnd_bits", .data = &mmap_rnd_bits, .maxlen = sizeof(mmap_rnd_bits), .mode = 0600, .proc_handler = proc_dointvec_minmax, .extra1 = (void *)&mmap_rnd_bits_min, .extra2 = (void *)&mmap_rnd_bits_max, }, #endif #ifdef CONFIG_HAVE_ARCH_MMAP_RND_COMPAT_BITS { .procname = "mmap_rnd_compat_bits", .data = &mmap_rnd_compat_bits, .maxlen = sizeof(mmap_rnd_compat_bits), .mode = 0600, .proc_handler = proc_dointvec_minmax, .extra1 = (void *)&mmap_rnd_compat_bits_min, .extra2 = (void *)&mmap_rnd_compat_bits_max, }, #endif }; int __init sysctl_init_bases(void) { register_sysctl_init("kernel", kern_table); register_sysctl_init("vm", vm_table); return 0; } #endif /* CONFIG_SYSCTL */ /* * No sense putting this after each symbol definition, twice, * exception granted :-) */ EXPORT_SYMBOL(proc_dobool); EXPORT_SYMBOL(proc_dointvec); EXPORT_SYMBOL(proc_douintvec); EXPORT_SYMBOL(proc_dointvec_jiffies); EXPORT_SYMBOL(proc_dointvec_minmax); EXPORT_SYMBOL_GPL(proc_douintvec_minmax); EXPORT_SYMBOL(proc_dointvec_userhz_jiffies); EXPORT_SYMBOL(proc_dointvec_ms_jiffies); EXPORT_SYMBOL(proc_dostring); EXPORT_SYMBOL(proc_doulongvec_minmax); EXPORT_SYMBOL(proc_doulongvec_ms_jiffies_minmax); EXPORT_SYMBOL(proc_do_large_bitmap); |
| 409 410 399 397 201 202 410 418 405 215 419 418 420 420 865 864 419 420 418 418 411 419 419 388 420 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 | // SPDX-License-Identifier: GPL-2.0 #include <linux/kernel.h> #include <linux/bug.h> #include <linux/compiler.h> #include <linux/export.h> #include <linux/string.h> #include <linux/list_sort.h> #include <linux/list.h> /* * Returns a list organized in an intermediate format suited * to chaining of merge() calls: null-terminated, no reserved or * sentinel head node, "prev" links not maintained. */ __attribute__((nonnull(2,3,4))) static struct list_head *merge(void *priv, list_cmp_func_t cmp, struct list_head *a, struct list_head *b) { struct list_head *head, **tail = &head; for (;;) { /* if equal, take 'a' -- important for sort stability */ if (cmp(priv, a, b) <= 0) { *tail = a; tail = &a->next; a = a->next; if (!a) { *tail = b; break; } } else { *tail = b; tail = &b->next; b = b->next; if (!b) { *tail = a; break; } } } return head; } /* * Combine final list merge with restoration of standard doubly-linked * list structure. This approach duplicates code from merge(), but * runs faster than the tidier alternatives of either a separate final * prev-link restoration pass, or maintaining the prev links * throughout. */ __attribute__((nonnull(2,3,4,5))) static void merge_final(void *priv, list_cmp_func_t cmp, struct list_head *head, struct list_head *a, struct list_head *b) { struct list_head *tail = head; u8 count = 0; for (;;) { /* if equal, take 'a' -- important for sort stability */ if (cmp(priv, a, b) <= 0) { tail->next = a; a->prev = tail; tail = a; a = a->next; if (!a) break; } else { tail->next = b; b->prev = tail; tail = b; b = b->next; if (!b) { b = a; break; } } } /* Finish linking remainder of list b on to tail */ tail->next = b; do { /* * If the merge is highly unbalanced (e.g. the input is * already sorted), this loop may run many iterations. * Continue callbacks to the client even though no * element comparison is needed, so the client's cmp() * routine can invoke cond_resched() periodically. */ if (unlikely(!++count)) cmp(priv, b, b); b->prev = tail; tail = b; b = b->next; } while (b); /* And the final links to make a circular doubly-linked list */ tail->next = head; head->prev = tail; } /** * list_sort - sort a list * @priv: private data, opaque to list_sort(), passed to @cmp * @head: the list to sort * @cmp: the elements comparison function * * The comparison function @cmp must return > 0 if @a should sort after * @b ("@a > @b" if you want an ascending sort), and <= 0 if @a should * sort before @b *or* their original order should be preserved. It is * always called with the element that came first in the input in @a, * and list_sort is a stable sort, so it is not necessary to distinguish * the @a < @b and @a == @b cases. * * This is compatible with two styles of @cmp function: * - The traditional style which returns <0 / =0 / >0, or * - Returning a boolean 0/1. * The latter offers a chance to save a few cycles in the comparison * (which is used by e.g. plug_ctx_cmp() in block/blk-mq.c). * * A good way to write a multi-word comparison is:: * * if (a->high != b->high) * return a->high > b->high; * if (a->middle != b->middle) * return a->middle > b->middle; * return a->low > b->low; * * * This mergesort is as eager as possible while always performing at least * 2:1 balanced merges. Given two pending sublists of size 2^k, they are * merged to a size-2^(k+1) list as soon as we have 2^k following elements. * * Thus, it will avoid cache thrashing as long as 3*2^k elements can * fit into the cache. Not quite as good as a fully-eager bottom-up * mergesort, but it does use 0.2*n fewer comparisons, so is faster in * the common case that everything fits into L1. * * * The merging is controlled by "count", the number of elements in the * pending lists. This is beautifully simple code, but rather subtle. * * Each time we increment "count", we set one bit (bit k) and clear * bits k-1 .. 0. Each time this happens (except the very first time * for each bit, when count increments to 2^k), we merge two lists of * size 2^k into one list of size 2^(k+1). * * This merge happens exactly when the count reaches an odd multiple of * 2^k, which is when we have 2^k elements pending in smaller lists, * so it's safe to merge away two lists of size 2^k. * * After this happens twice, we have created two lists of size 2^(k+1), * which will be merged into a list of size 2^(k+2) before we create * a third list of size 2^(k+1), so there are never more than two pending. * * The number of pending lists of size 2^k is determined by the * state of bit k of "count" plus two extra pieces of information: * * - The state of bit k-1 (when k == 0, consider bit -1 always set), and * - Whether the higher-order bits are zero or non-zero (i.e. * is count >= 2^(k+1)). * * There are six states we distinguish. "x" represents some arbitrary * bits, and "y" represents some arbitrary non-zero bits: * 0: 00x: 0 pending of size 2^k; x pending of sizes < 2^k * 1: 01x: 0 pending of size 2^k; 2^(k-1) + x pending of sizes < 2^k * 2: x10x: 0 pending of size 2^k; 2^k + x pending of sizes < 2^k * 3: x11x: 1 pending of size 2^k; 2^(k-1) + x pending of sizes < 2^k * 4: y00x: 1 pending of size 2^k; 2^k + x pending of sizes < 2^k * 5: y01x: 2 pending of size 2^k; 2^(k-1) + x pending of sizes < 2^k * (merge and loop back to state 2) * * We gain lists of size 2^k in the 2->3 and 4->5 transitions (because * bit k-1 is set while the more significant bits are non-zero) and * merge them away in the 5->2 transition. Note in particular that just * before the 5->2 transition, all lower-order bits are 11 (state 3), * so there is one list of each smaller size. * * When we reach the end of the input, we merge all the pending * lists, from smallest to largest. If you work through cases 2 to * 5 above, you can see that the number of elements we merge with a list * of size 2^k varies from 2^(k-1) (cases 3 and 5 when x == 0) to * 2^(k+1) - 1 (second merge of case 5 when x == 2^(k-1) - 1). */ __attribute__((nonnull(2,3))) void list_sort(void *priv, struct list_head *head, list_cmp_func_t cmp) { struct list_head *list = head->next, *pending = NULL; size_t count = 0; /* Count of pending */ if (list == head->prev) /* Zero or one elements */ return; /* Convert to a null-terminated singly-linked list. */ head->prev->next = NULL; /* * Data structure invariants: * - All lists are singly linked and null-terminated; prev * pointers are not maintained. * - pending is a prev-linked "list of lists" of sorted * sublists awaiting further merging. * - Each of the sorted sublists is power-of-two in size. * - Sublists are sorted by size and age, smallest & newest at front. * - There are zero to two sublists of each size. * - A pair of pending sublists are merged as soon as the number * of following pending elements equals their size (i.e. * each time count reaches an odd multiple of that size). * That ensures each later final merge will be at worst 2:1. * - Each round consists of: * - Merging the two sublists selected by the highest bit * which flips when count is incremented, and * - Adding an element from the input as a size-1 sublist. */ do { size_t bits; struct list_head **tail = &pending; /* Find the least-significant clear bit in count */ for (bits = count; bits & 1; bits >>= 1) tail = &(*tail)->prev; /* Do the indicated merge */ if (likely(bits)) { struct list_head *a = *tail, *b = a->prev; a = merge(priv, cmp, b, a); /* Install the merged result in place of the inputs */ a->prev = b->prev; *tail = a; } /* Move one element from input list to pending */ list->prev = pending; pending = list; list = list->next; pending->next = NULL; count++; } while (list); /* End of input; merge together all the pending lists. */ list = pending; pending = pending->prev; for (;;) { struct list_head *next = pending->prev; if (!next) break; list = merge(priv, cmp, pending, list); pending = next; } /* The final merge, rebuilding prev links */ merge_final(priv, cmp, head, pending, list); } EXPORT_SYMBOL(list_sort); |
| 169 76 3 2 2 3 2 2 2 1 1 1 2 15 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 | // SPDX-License-Identifier: GPL-2.0-or-later /* * V4L2 controls framework Request API implementation. * * Copyright (C) 2018-2021 Hans Verkuil <hverkuil-cisco@xs4all.nl> */ #define pr_fmt(fmt) "v4l2-ctrls: " fmt #include <linux/export.h> #include <linux/slab.h> #include <media/v4l2-ctrls.h> #include <media/v4l2-dev.h> #include <media/v4l2-ioctl.h> #include "v4l2-ctrls-priv.h" /* Initialize the request-related fields in a control handler */ void v4l2_ctrl_handler_init_request(struct v4l2_ctrl_handler *hdl) { INIT_LIST_HEAD(&hdl->requests); INIT_LIST_HEAD(&hdl->requests_queued); hdl->request_is_queued = false; media_request_object_init(&hdl->req_obj); } /* Free the request-related fields in a control handler */ void v4l2_ctrl_handler_free_request(struct v4l2_ctrl_handler *hdl) { struct v4l2_ctrl_handler *req, *next_req; /* * Do nothing if this isn't the main handler or the main * handler is not used in any request. * * The main handler can be identified by having a NULL ops pointer in * the request object. */ if (hdl->req_obj.ops || list_empty(&hdl->requests)) return; /* * If the main handler is freed and it is used by handler objects in * outstanding requests, then unbind and put those objects before * freeing the main handler. */ list_for_each_entry_safe(req, next_req, &hdl->requests, requests) { media_request_object_unbind(&req->req_obj); media_request_object_put(&req->req_obj); } } static int v4l2_ctrl_request_clone(struct v4l2_ctrl_handler *hdl, const struct v4l2_ctrl_handler *from) { struct v4l2_ctrl_ref *ref; int err = 0; if (WARN_ON(!hdl || hdl == from)) return -EINVAL; if (hdl->error) return hdl->error; WARN_ON(hdl->lock != &hdl->_lock); mutex_lock(from->lock); list_for_each_entry(ref, &from->ctrl_refs, node) { struct v4l2_ctrl *ctrl = ref->ctrl; struct v4l2_ctrl_ref *new_ref; /* Skip refs inherited from other devices */ if (ref->from_other_dev) continue; err = handler_new_ref(hdl, ctrl, &new_ref, false, true); if (err) break; } mutex_unlock(from->lock); return err; } static void v4l2_ctrl_request_queue(struct media_request_object *obj) { struct v4l2_ctrl_handler *hdl = container_of(obj, struct v4l2_ctrl_handler, req_obj); struct v4l2_ctrl_handler *main_hdl = obj->priv; mutex_lock(main_hdl->lock); list_add_tail(&hdl->requests_queued, &main_hdl->requests_queued); hdl->request_is_queued = true; mutex_unlock(main_hdl->lock); } static void v4l2_ctrl_request_unbind(struct media_request_object *obj) { struct v4l2_ctrl_handler *hdl = container_of(obj, struct v4l2_ctrl_handler, req_obj); struct v4l2_ctrl_handler *main_hdl = obj->priv; mutex_lock(main_hdl->lock); list_del_init(&hdl->requests); if (hdl->request_is_queued) { list_del_init(&hdl->requests_queued); hdl->request_is_queued = false; } mutex_unlock(main_hdl->lock); } static void v4l2_ctrl_request_release(struct media_request_object *obj) { struct v4l2_ctrl_handler *hdl = container_of(obj, struct v4l2_ctrl_handler, req_obj); v4l2_ctrl_handler_free(hdl); kfree(hdl); } static const struct media_request_object_ops req_ops = { .queue = v4l2_ctrl_request_queue, .unbind = v4l2_ctrl_request_unbind, .release = v4l2_ctrl_request_release, }; struct v4l2_ctrl_handler *v4l2_ctrl_request_hdl_find(struct media_request *req, struct v4l2_ctrl_handler *parent) { struct media_request_object *obj; if (WARN_ON(req->state != MEDIA_REQUEST_STATE_VALIDATING && req->state != MEDIA_REQUEST_STATE_QUEUED)) return NULL; obj = media_request_object_find(req, &req_ops, parent); if (obj) return container_of(obj, struct v4l2_ctrl_handler, req_obj); return NULL; } EXPORT_SYMBOL_GPL(v4l2_ctrl_request_hdl_find); struct v4l2_ctrl * v4l2_ctrl_request_hdl_ctrl_find(struct v4l2_ctrl_handler *hdl, u32 id) { struct v4l2_ctrl_ref *ref = find_ref_lock(hdl, id); return (ref && ref->p_req_valid) ? ref->ctrl : NULL; } EXPORT_SYMBOL_GPL(v4l2_ctrl_request_hdl_ctrl_find); static int v4l2_ctrl_request_bind(struct media_request *req, struct v4l2_ctrl_handler *hdl, struct v4l2_ctrl_handler *from) { int ret; ret = v4l2_ctrl_request_clone(hdl, from); if (!ret) { ret = media_request_object_bind(req, &req_ops, from, false, &hdl->req_obj); if (!ret) { mutex_lock(from->lock); list_add_tail(&hdl->requests, &from->requests); mutex_unlock(from->lock); } } return ret; } static struct media_request_object * v4l2_ctrls_find_req_obj(struct v4l2_ctrl_handler *hdl, struct media_request *req, bool set) { struct media_request_object *obj; struct v4l2_ctrl_handler *new_hdl; int ret; if (IS_ERR(req)) return ERR_CAST(req); if (set && WARN_ON(req->state != MEDIA_REQUEST_STATE_UPDATING)) return ERR_PTR(-EBUSY); obj = media_request_object_find(req, &req_ops, hdl); if (obj) return obj; /* * If there are no controls in this completed request, * then that can only happen if: * * 1) no controls were present in the queued request, and * 2) v4l2_ctrl_request_complete() could not allocate a * control handler object to store the completed state in. * * So return ENOMEM to indicate that there was an out-of-memory * error. */ if (!set) return ERR_PTR(-ENOMEM); new_hdl = kzalloc(sizeof(*new_hdl), GFP_KERNEL); if (!new_hdl) return ERR_PTR(-ENOMEM); obj = &new_hdl->req_obj; ret = v4l2_ctrl_handler_init(new_hdl, (hdl->nr_of_buckets - 1) * 8); if (!ret) ret = v4l2_ctrl_request_bind(req, new_hdl, hdl); if (ret) { v4l2_ctrl_handler_free(new_hdl); kfree(new_hdl); return ERR_PTR(ret); } media_request_object_get(obj); return obj; } int v4l2_g_ext_ctrls_request(struct v4l2_ctrl_handler *hdl, struct video_device *vdev, struct media_device *mdev, struct v4l2_ext_controls *cs) { struct media_request_object *obj = NULL; struct media_request *req = NULL; int ret; if (!mdev || cs->request_fd < 0) return -EINVAL; req = media_request_get_by_fd(mdev, cs->request_fd); if (IS_ERR(req)) return PTR_ERR(req); if (req->state != MEDIA_REQUEST_STATE_COMPLETE) { media_request_put(req); return -EACCES; } ret = media_request_lock_for_access(req); if (ret) { media_request_put(req); return ret; } obj = v4l2_ctrls_find_req_obj(hdl, req, false); if (IS_ERR(obj)) { media_request_unlock_for_access(req); media_request_put(req); return PTR_ERR(obj); } hdl = container_of(obj, struct v4l2_ctrl_handler, req_obj); ret = v4l2_g_ext_ctrls_common(hdl, cs, vdev); media_request_unlock_for_access(req); media_request_object_put(obj); media_request_put(req); return ret; } int try_set_ext_ctrls_request(struct v4l2_fh *fh, struct v4l2_ctrl_handler *hdl, struct video_device *vdev, struct media_device *mdev, struct v4l2_ext_controls *cs, bool set) { struct media_request_object *obj = NULL; struct media_request *req = NULL; int ret; if (!mdev) { dprintk(vdev, "%s: missing media device\n", video_device_node_name(vdev)); return -EINVAL; } if (cs->request_fd < 0) { dprintk(vdev, "%s: invalid request fd %d\n", video_device_node_name(vdev), cs->request_fd); return -EINVAL; } req = media_request_get_by_fd(mdev, cs->request_fd); if (IS_ERR(req)) { dprintk(vdev, "%s: cannot find request fd %d\n", video_device_node_name(vdev), cs->request_fd); return PTR_ERR(req); } ret = media_request_lock_for_update(req); if (ret) { dprintk(vdev, "%s: cannot lock request fd %d\n", video_device_node_name(vdev), cs->request_fd); media_request_put(req); return ret; } obj = v4l2_ctrls_find_req_obj(hdl, req, set); if (IS_ERR(obj)) { dprintk(vdev, "%s: cannot find request object for request fd %d\n", video_device_node_name(vdev), cs->request_fd); media_request_unlock_for_update(req); media_request_put(req); return PTR_ERR(obj); } hdl = container_of(obj, struct v4l2_ctrl_handler, req_obj); ret = try_set_ext_ctrls_common(fh, hdl, cs, vdev, set); if (ret) dprintk(vdev, "%s: try_set_ext_ctrls_common failed (%d)\n", video_device_node_name(vdev), ret); media_request_unlock_for_update(req); media_request_object_put(obj); media_request_put(req); return ret; } void v4l2_ctrl_request_complete(struct media_request *req, struct v4l2_ctrl_handler *main_hdl) { struct media_request_object *obj; struct v4l2_ctrl_handler *hdl; struct v4l2_ctrl_ref *ref; if (!req || !main_hdl) return; /* * Note that it is valid if nothing was found. It means * that this request doesn't have any controls and so just * wants to leave the controls unchanged. */ obj = media_request_object_find(req, &req_ops, main_hdl); if (!obj) { int ret; /* Create a new request so the driver can return controls */ hdl = kzalloc(sizeof(*hdl), GFP_KERNEL); if (!hdl) return; ret = v4l2_ctrl_handler_init(hdl, (main_hdl->nr_of_buckets - 1) * 8); if (!ret) ret = v4l2_ctrl_request_bind(req, hdl, main_hdl); if (ret) { v4l2_ctrl_handler_free(hdl); kfree(hdl); return; } hdl->request_is_queued = true; obj = media_request_object_find(req, &req_ops, main_hdl); } hdl = container_of(obj, struct v4l2_ctrl_handler, req_obj); list_for_each_entry(ref, &hdl->ctrl_refs, node) { struct v4l2_ctrl *ctrl = ref->ctrl; struct v4l2_ctrl *master = ctrl->cluster[0]; unsigned int i; if (ctrl->flags & V4L2_CTRL_FLAG_VOLATILE) { v4l2_ctrl_lock(master); /* g_volatile_ctrl will update the current control values */ for (i = 0; i < master->ncontrols; i++) cur_to_new(master->cluster[i]); call_op(master, g_volatile_ctrl); new_to_req(ref); v4l2_ctrl_unlock(master); continue; } if (ref->p_req_valid) continue; /* Copy the current control value into the request */ v4l2_ctrl_lock(ctrl); cur_to_req(ref); v4l2_ctrl_unlock(ctrl); } mutex_lock(main_hdl->lock); WARN_ON(!hdl->request_is_queued); list_del_init(&hdl->requests_queued); hdl->request_is_queued = false; mutex_unlock(main_hdl->lock); media_request_object_complete(obj); media_request_object_put(obj); } EXPORT_SYMBOL(v4l2_ctrl_request_complete); int v4l2_ctrl_request_setup(struct media_request *req, struct v4l2_ctrl_handler *main_hdl) { struct media_request_object *obj; struct v4l2_ctrl_handler *hdl; struct v4l2_ctrl_ref *ref; int ret = 0; if (!req || !main_hdl) return 0; if (WARN_ON(req->state != MEDIA_REQUEST_STATE_QUEUED)) return -EBUSY; /* * Note that it is valid if nothing was found. It means * that this request doesn't have any controls and so just * wants to leave the controls unchanged. */ obj = media_request_object_find(req, &req_ops, main_hdl); if (!obj) return 0; if (obj->completed) { media_request_object_put(obj); return -EBUSY; } hdl = container_of(obj, struct v4l2_ctrl_handler, req_obj); list_for_each_entry(ref, &hdl->ctrl_refs, node) ref->req_done = false; list_for_each_entry(ref, &hdl->ctrl_refs, node) { struct v4l2_ctrl *ctrl = ref->ctrl; struct v4l2_ctrl *master = ctrl->cluster[0]; bool have_new_data = false; int i; /* * Skip if this control was already handled by a cluster. * Skip button controls and read-only controls. */ if (ref->req_done || (ctrl->flags & V4L2_CTRL_FLAG_READ_ONLY)) continue; v4l2_ctrl_lock(master); for (i = 0; i < master->ncontrols; i++) { if (master->cluster[i]) { struct v4l2_ctrl_ref *r = find_ref(hdl, master->cluster[i]->id); if (r->p_req_valid) { have_new_data = true; break; } } } if (!have_new_data) { v4l2_ctrl_unlock(master); continue; } for (i = 0; i < master->ncontrols; i++) { if (master->cluster[i]) { struct v4l2_ctrl_ref *r = find_ref(hdl, master->cluster[i]->id); ret = req_to_new(r); if (ret) { v4l2_ctrl_unlock(master); goto error; } master->cluster[i]->is_new = 1; r->req_done = true; } } /* * For volatile autoclusters that are currently in auto mode * we need to discover if it will be set to manual mode. * If so, then we have to copy the current volatile values * first since those will become the new manual values (which * may be overwritten by explicit new values from this set * of controls). */ if (master->is_auto && master->has_volatiles && !is_cur_manual(master)) { s32 new_auto_val = *master->p_new.p_s32; /* * If the new value == the manual value, then copy * the current volatile values. */ if (new_auto_val == master->manual_mode_value) update_from_auto_cluster(master); } ret = try_or_set_cluster(NULL, master, true, 0); v4l2_ctrl_unlock(master); if (ret) break; } error: media_request_object_put(obj); return ret; } EXPORT_SYMBOL(v4l2_ctrl_request_setup); |
| 146 23 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 | #ifndef INTERNAL_IO_WQ_H #define INTERNAL_IO_WQ_H #include <linux/refcount.h> #include <linux/io_uring_types.h> struct io_wq; enum { IO_WQ_WORK_CANCEL = 1, IO_WQ_WORK_HASHED = 2, IO_WQ_WORK_UNBOUND = 4, IO_WQ_WORK_CONCURRENT = 16, IO_WQ_HASH_SHIFT = 24, /* upper 8 bits are used for hash key */ }; enum io_wq_cancel { IO_WQ_CANCEL_OK, /* cancelled before started */ IO_WQ_CANCEL_RUNNING, /* found, running, and attempted cancelled */ IO_WQ_CANCEL_NOTFOUND, /* work not found */ }; typedef struct io_wq_work *(free_work_fn)(struct io_wq_work *); typedef void (io_wq_work_fn)(struct io_wq_work *); struct io_wq_hash { refcount_t refs; unsigned long map; struct wait_queue_head wait; }; static inline void io_wq_put_hash(struct io_wq_hash *hash) { if (refcount_dec_and_test(&hash->refs)) kfree(hash); } struct io_wq_data { struct io_wq_hash *hash; struct task_struct *task; io_wq_work_fn *do_work; free_work_fn *free_work; }; struct io_wq *io_wq_create(unsigned bounded, struct io_wq_data *data); void io_wq_exit_start(struct io_wq *wq); void io_wq_put_and_exit(struct io_wq *wq); void io_wq_enqueue(struct io_wq *wq, struct io_wq_work *work); void io_wq_hash_work(struct io_wq_work *work, void *val); int io_wq_cpu_affinity(struct io_uring_task *tctx, cpumask_var_t mask); int io_wq_max_workers(struct io_wq *wq, int *new_count); bool io_wq_worker_stopped(void); static inline bool io_wq_is_hashed(struct io_wq_work *work) { return atomic_read(&work->flags) & IO_WQ_WORK_HASHED; } typedef bool (work_cancel_fn)(struct io_wq_work *, void *); enum io_wq_cancel io_wq_cancel_cb(struct io_wq *wq, work_cancel_fn *cancel, void *data, bool cancel_all); #if defined(CONFIG_IO_WQ) extern void io_wq_worker_sleeping(struct task_struct *); extern void io_wq_worker_running(struct task_struct *); #else static inline void io_wq_worker_sleeping(struct task_struct *tsk) { } static inline void io_wq_worker_running(struct task_struct *tsk) { } #endif static inline bool io_wq_current_is_worker(void) { return in_task() && (current->flags & PF_IO_WORKER) && current->worker_private; } #endif |
| 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080 1081 1082 1083 1084 1085 1086 1087 1088 1089 1090 1091 1092 1093 1094 1095 1096 1097 1098 1099 1100 1101 1102 1103 1104 1105 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125 1126 1127 1128 1129 1130 1131 1132 1133 1134 1135 1136 1137 1138 1139 1140 1141 1142 1143 1144 1145 1146 1147 1148 1149 1150 1151 1152 1153 1154 1155 1156 1157 1158 1159 1160 1161 1162 1163 1164 1165 1166 1167 1168 1169 1170 1171 1172 1173 1174 1175 1176 1177 1178 1179 1180 1181 1182 1183 1184 1185 1186 1187 1188 1189 1190 1191 1192 1193 1194 1195 1196 1197 1198 1199 1200 1201 1202 1203 1204 1205 1206 | /* SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) */ /* * cec - HDMI Consumer Electronics Control public header * * Copyright 2016 Cisco Systems, Inc. and/or its affiliates. All rights reserved. */ #ifndef _CEC_UAPI_H #define _CEC_UAPI_H #include <linux/types.h> #include <linux/string.h> #define CEC_MAX_MSG_SIZE 16 /** * struct cec_msg - CEC message structure. * @tx_ts: Timestamp in nanoseconds using CLOCK_MONOTONIC. Set by the * driver when the message transmission has finished. * @rx_ts: Timestamp in nanoseconds using CLOCK_MONOTONIC. Set by the * driver when the message was received. * @len: Length in bytes of the message. * @timeout: The timeout (in ms) that is used to timeout CEC_RECEIVE. * Set to 0 if you want to wait forever. This timeout can also be * used with CEC_TRANSMIT as the timeout for waiting for a reply. * If 0, then it will use a 1 second timeout instead of waiting * forever as is done with CEC_RECEIVE. * @sequence: The framework assigns a sequence number to messages that are * sent. This can be used to track replies to previously sent * messages. * @flags: Set to 0. * @msg: The message payload. * @reply: This field is ignored with CEC_RECEIVE and is only used by * CEC_TRANSMIT. If non-zero, then wait for a reply with this * opcode. Set to CEC_MSG_FEATURE_ABORT if you want to wait for * a possible ABORT reply. If there was an error when sending the * msg or FeatureAbort was returned, then reply is set to 0. * If reply is non-zero upon return, then len/msg are set to * the received message. * If reply is zero upon return and status has the * CEC_TX_STATUS_FEATURE_ABORT bit set, then len/msg are set to * the received feature abort message. * If reply is zero upon return and status has the * CEC_TX_STATUS_MAX_RETRIES bit set, then no reply was seen at * all. If reply is non-zero for CEC_TRANSMIT and the message is a * broadcast, then -EINVAL is returned. * if reply is non-zero, then timeout is set to 1000 (the required * maximum response time). * @rx_status: The message receive status bits. Set by the driver. * @tx_status: The message transmit status bits. Set by the driver. * @tx_arb_lost_cnt: The number of 'Arbitration Lost' events. Set by the driver. * @tx_nack_cnt: The number of 'Not Acknowledged' events. Set by the driver. * @tx_low_drive_cnt: The number of 'Low Drive Detected' events. Set by the * driver. * @tx_error_cnt: The number of 'Error' events. Set by the driver. */ struct cec_msg { __u64 tx_ts; __u64 rx_ts; __u32 len; __u32 timeout; __u32 sequence; __u32 flags; __u8 msg[CEC_MAX_MSG_SIZE]; __u8 reply; __u8 rx_status; __u8 tx_status; __u8 tx_arb_lost_cnt; __u8 tx_nack_cnt; __u8 tx_low_drive_cnt; __u8 tx_error_cnt; }; /** * cec_msg_initiator - return the initiator's logical address. * @msg: the message structure */ static inline __u8 cec_msg_initiator(const struct cec_msg *msg) { return msg->msg[0] >> 4; } /** * cec_msg_destination - return the destination's logical address. * @msg: the message structure */ static inline __u8 cec_msg_destination(const struct cec_msg *msg) { return msg->msg[0] & 0xf; } /** * cec_msg_opcode - return the opcode of the message, -1 for poll * @msg: the message structure */ static inline int cec_msg_opcode(const struct cec_msg *msg) { return msg->len > 1 ? msg->msg[1] : -1; } /** * cec_msg_is_broadcast - return true if this is a broadcast message. * @msg: the message structure */ static inline int cec_msg_is_broadcast(const struct cec_msg *msg) { return (msg->msg[0] & 0xf) == 0xf; } /** * cec_msg_init - initialize the message structure. * @msg: the message structure * @initiator: the logical address of the initiator * @destination:the logical address of the destination (0xf for broadcast) * * The whole structure is zeroed, the len field is set to 1 (i.e. a poll * message) and the initiator and destination are filled in. */ static inline void cec_msg_init(struct cec_msg *msg, __u8 initiator, __u8 destination) { memset(msg, 0, sizeof(*msg)); msg->msg[0] = (initiator << 4) | destination; msg->len = 1; } /** * cec_msg_set_reply_to - fill in destination/initiator in a reply message. * @msg: the message structure for the reply * @orig: the original message structure * * Set the msg destination to the orig initiator and the msg initiator to the * orig destination. Note that msg and orig may be the same pointer, in which * case the change is done in place. */ static inline void cec_msg_set_reply_to(struct cec_msg *msg, struct cec_msg *orig) { /* The destination becomes the initiator and vice versa */ msg->msg[0] = (cec_msg_destination(orig) << 4) | cec_msg_initiator(orig); msg->reply = msg->timeout = 0; } /** * cec_msg_recv_is_tx_result - return true if this message contains the * result of an earlier non-blocking transmit * @msg: the message structure from CEC_RECEIVE */ static inline int cec_msg_recv_is_tx_result(const struct cec_msg *msg) { return msg->sequence && msg->tx_status && !msg->rx_status; } /** * cec_msg_recv_is_rx_result - return true if this message contains the * reply of an earlier non-blocking transmit * @msg: the message structure from CEC_RECEIVE */ static inline int cec_msg_recv_is_rx_result(const struct cec_msg *msg) { return msg->sequence && !msg->tx_status && msg->rx_status; } /* cec_msg flags field */ #define CEC_MSG_FL_REPLY_TO_FOLLOWERS (1 << 0) #define CEC_MSG_FL_RAW (1 << 1) /* cec_msg tx/rx_status field */ #define CEC_TX_STATUS_OK (1 << 0) #define CEC_TX_STATUS_ARB_LOST (1 << 1) #define CEC_TX_STATUS_NACK (1 << 2) #define CEC_TX_STATUS_LOW_DRIVE (1 << 3) #define CEC_TX_STATUS_ERROR (1 << 4) #define CEC_TX_STATUS_MAX_RETRIES (1 << 5) #define CEC_TX_STATUS_ABORTED (1 << 6) #define CEC_TX_STATUS_TIMEOUT (1 << 7) #define CEC_RX_STATUS_OK (1 << 0) #define CEC_RX_STATUS_TIMEOUT (1 << 1) #define CEC_RX_STATUS_FEATURE_ABORT (1 << 2) #define CEC_RX_STATUS_ABORTED (1 << 3) static inline int cec_msg_status_is_ok(const struct cec_msg *msg) { if (msg->tx_status && !(msg->tx_status & CEC_TX_STATUS_OK)) return 0; if (msg->rx_status && !(msg->rx_status & CEC_RX_STATUS_OK)) return 0; if (!msg->tx_status && !msg->rx_status) return 0; return !(msg->rx_status & CEC_RX_STATUS_FEATURE_ABORT); } #define CEC_LOG_ADDR_INVALID 0xff #define CEC_PHYS_ADDR_INVALID 0xffff /* * The maximum number of logical addresses one device can be assigned to. * The CEC 2.0 spec allows for only 2 logical addresses at the moment. The * Analog Devices CEC hardware supports 3. So let's go wild and go for 4. */ #define CEC_MAX_LOG_ADDRS 4 /* The logical addresses defined by CEC 2.0 */ #define CEC_LOG_ADDR_TV 0 #define CEC_LOG_ADDR_RECORD_1 1 #define CEC_LOG_ADDR_RECORD_2 2 #define CEC_LOG_ADDR_TUNER_1 3 #define CEC_LOG_ADDR_PLAYBACK_1 4 #define CEC_LOG_ADDR_AUDIOSYSTEM 5 #define CEC_LOG_ADDR_TUNER_2 6 #define CEC_LOG_ADDR_TUNER_3 7 #define CEC_LOG_ADDR_PLAYBACK_2 8 #define CEC_LOG_ADDR_RECORD_3 9 #define CEC_LOG_ADDR_TUNER_4 10 #define CEC_LOG_ADDR_PLAYBACK_3 11 #define CEC_LOG_ADDR_BACKUP_1 12 #define CEC_LOG_ADDR_BACKUP_2 13 #define CEC_LOG_ADDR_SPECIFIC 14 #define CEC_LOG_ADDR_UNREGISTERED 15 /* as initiator address */ #define CEC_LOG_ADDR_BROADCAST 15 /* as destination address */ /* The logical address types that the CEC device wants to claim */ #define CEC_LOG_ADDR_TYPE_TV 0 #define CEC_LOG_ADDR_TYPE_RECORD 1 #define CEC_LOG_ADDR_TYPE_TUNER 2 #define CEC_LOG_ADDR_TYPE_PLAYBACK 3 #define CEC_LOG_ADDR_TYPE_AUDIOSYSTEM 4 #define CEC_LOG_ADDR_TYPE_SPECIFIC 5 #define CEC_LOG_ADDR_TYPE_UNREGISTERED 6 /* * Switches should use UNREGISTERED. * Processors should use SPECIFIC. */ #define CEC_LOG_ADDR_MASK_TV (1 << CEC_LOG_ADDR_TV) #define CEC_LOG_ADDR_MASK_RECORD ((1 << CEC_LOG_ADDR_RECORD_1) | \ (1 << CEC_LOG_ADDR_RECORD_2) | \ (1 << CEC_LOG_ADDR_RECORD_3)) #define CEC_LOG_ADDR_MASK_TUNER ((1 << CEC_LOG_ADDR_TUNER_1) | \ (1 << CEC_LOG_ADDR_TUNER_2) | \ (1 << CEC_LOG_ADDR_TUNER_3) | \ (1 << CEC_LOG_ADDR_TUNER_4)) #define CEC_LOG_ADDR_MASK_PLAYBACK ((1 << CEC_LOG_ADDR_PLAYBACK_1) | \ (1 << CEC_LOG_ADDR_PLAYBACK_2) | \ (1 << CEC_LOG_ADDR_PLAYBACK_3)) #define CEC_LOG_ADDR_MASK_AUDIOSYSTEM (1 << CEC_LOG_ADDR_AUDIOSYSTEM) #define CEC_LOG_ADDR_MASK_BACKUP ((1 << CEC_LOG_ADDR_BACKUP_1) | \ (1 << CEC_LOG_ADDR_BACKUP_2)) #define CEC_LOG_ADDR_MASK_SPECIFIC (1 << CEC_LOG_ADDR_SPECIFIC) #define CEC_LOG_ADDR_MASK_UNREGISTERED (1 << CEC_LOG_ADDR_UNREGISTERED) static inline int cec_has_tv(__u16 log_addr_mask) { return log_addr_mask & CEC_LOG_ADDR_MASK_TV; } static inline int cec_has_record(__u16 log_addr_mask) { return log_addr_mask & CEC_LOG_ADDR_MASK_RECORD; } static inline int cec_has_tuner(__u16 log_addr_mask) { return log_addr_mask & CEC_LOG_ADDR_MASK_TUNER; } static inline int cec_has_playback(__u16 log_addr_mask) { return log_addr_mask & CEC_LOG_ADDR_MASK_PLAYBACK; } static inline int cec_has_audiosystem(__u16 log_addr_mask) { return log_addr_mask & CEC_LOG_ADDR_MASK_AUDIOSYSTEM; } static inline int cec_has_backup(__u16 log_addr_mask) { return log_addr_mask & CEC_LOG_ADDR_MASK_BACKUP; } static inline int cec_has_specific(__u16 log_addr_mask) { return log_addr_mask & CEC_LOG_ADDR_MASK_SPECIFIC; } static inline int cec_is_unregistered(__u16 log_addr_mask) { return log_addr_mask & CEC_LOG_ADDR_MASK_UNREGISTERED; } static inline int cec_is_unconfigured(__u16 log_addr_mask) { return log_addr_mask == 0; } /* * Use this if there is no vendor ID (CEC_G_VENDOR_ID) or if the vendor ID * should be disabled (CEC_S_VENDOR_ID) */ #define CEC_VENDOR_ID_NONE 0xffffffff /* The message handling modes */ /* Modes for initiator */ #define CEC_MODE_NO_INITIATOR (0x0 << 0) #define CEC_MODE_INITIATOR (0x1 << 0) #define CEC_MODE_EXCL_INITIATOR (0x2 << 0) #define CEC_MODE_INITIATOR_MSK 0x0f /* Modes for follower */ #define CEC_MODE_NO_FOLLOWER (0x0 << 4) #define CEC_MODE_FOLLOWER (0x1 << 4) #define CEC_MODE_EXCL_FOLLOWER (0x2 << 4) #define CEC_MODE_EXCL_FOLLOWER_PASSTHRU (0x3 << 4) #define CEC_MODE_MONITOR_PIN (0xd << 4) #define CEC_MODE_MONITOR (0xe << 4) #define CEC_MODE_MONITOR_ALL (0xf << 4) #define CEC_MODE_FOLLOWER_MSK 0xf0 /* Userspace has to configure the physical address */ #define CEC_CAP_PHYS_ADDR (1 << 0) /* Userspace has to configure the logical addresses */ #define CEC_CAP_LOG_ADDRS (1 << 1) /* Userspace can transmit messages (and thus become follower as well) */ #define CEC_CAP_TRANSMIT (1 << 2) /* * Passthrough all messages instead of processing them. */ #define CEC_CAP_PASSTHROUGH (1 << 3) /* Supports remote control */ #define CEC_CAP_RC (1 << 4) /* Hardware can monitor all messages, not just directed and broadcast. */ #define CEC_CAP_MONITOR_ALL (1 << 5) /* Hardware can use CEC only if the HDMI HPD pin is high. */ #define CEC_CAP_NEEDS_HPD (1 << 6) /* Hardware can monitor CEC pin transitions */ #define CEC_CAP_MONITOR_PIN (1 << 7) /* CEC_ADAP_G_CONNECTOR_INFO is available */ #define CEC_CAP_CONNECTOR_INFO (1 << 8) /** * struct cec_caps - CEC capabilities structure. * @driver: name of the CEC device driver. * @name: name of the CEC device. @driver + @name must be unique. * @available_log_addrs: number of available logical addresses. * @capabilities: capabilities of the CEC adapter. * @version: version of the CEC adapter framework. */ struct cec_caps { char driver[32]; char name[32]; __u32 available_log_addrs; __u32 capabilities; __u32 version; }; /** * struct cec_log_addrs - CEC logical addresses structure. * @log_addr: the claimed logical addresses. Set by the driver. * @log_addr_mask: current logical address mask. Set by the driver. * @cec_version: the CEC version that the adapter should implement. Set by the * caller. * @num_log_addrs: how many logical addresses should be claimed. Set by the * caller. * @vendor_id: the vendor ID of the device. Set by the caller. * @flags: flags. * @osd_name: the OSD name of the device. Set by the caller. * @primary_device_type: the primary device type for each logical address. * Set by the caller. * @log_addr_type: the logical address types. Set by the caller. * @all_device_types: CEC 2.0: all device types represented by the logical * address. Set by the caller. * @features: CEC 2.0: The logical address features. Set by the caller. */ struct cec_log_addrs { __u8 log_addr[CEC_MAX_LOG_ADDRS]; __u16 log_addr_mask; __u8 cec_version; __u8 num_log_addrs; __u32 vendor_id; __u32 flags; char osd_name[15]; __u8 primary_device_type[CEC_MAX_LOG_ADDRS]; __u8 log_addr_type[CEC_MAX_LOG_ADDRS]; /* CEC 2.0 */ __u8 all_device_types[CEC_MAX_LOG_ADDRS]; __u8 features[CEC_MAX_LOG_ADDRS][12]; }; /* Allow a fallback to unregistered */ #define CEC_LOG_ADDRS_FL_ALLOW_UNREG_FALLBACK (1 << 0) /* Passthrough RC messages to the input subsystem */ #define CEC_LOG_ADDRS_FL_ALLOW_RC_PASSTHRU (1 << 1) /* CDC-Only device: supports only CDC messages */ #define CEC_LOG_ADDRS_FL_CDC_ONLY (1 << 2) /** * struct cec_drm_connector_info - tells which drm connector is * associated with the CEC adapter. * @card_no: drm card number * @connector_id: drm connector ID */ struct cec_drm_connector_info { __u32 card_no; __u32 connector_id; }; #define CEC_CONNECTOR_TYPE_NO_CONNECTOR 0 #define CEC_CONNECTOR_TYPE_DRM 1 /** * struct cec_connector_info - tells if and which connector is * associated with the CEC adapter. * @type: connector type (if any) * @drm: drm connector info * @raw: array to pad the union */ struct cec_connector_info { __u32 type; union { struct cec_drm_connector_info drm; __u32 raw[16]; }; }; /* Events */ /* Event that occurs when the adapter state changes */ #define CEC_EVENT_STATE_CHANGE 1 /* * This event is sent when messages are lost because the application * didn't empty the message queue in time */ #define CEC_EVENT_LOST_MSGS 2 #define CEC_EVENT_PIN_CEC_LOW 3 #define CEC_EVENT_PIN_CEC_HIGH 4 #define CEC_EVENT_PIN_HPD_LOW 5 #define CEC_EVENT_PIN_HPD_HIGH 6 #define CEC_EVENT_PIN_5V_LOW 7 #define CEC_EVENT_PIN_5V_HIGH 8 #define CEC_EVENT_FL_INITIAL_STATE (1 << 0) #define CEC_EVENT_FL_DROPPED_EVENTS (1 << 1) /** * struct cec_event_state_change - used when the CEC adapter changes state. * @phys_addr: the current physical address * @log_addr_mask: the current logical address mask * @have_conn_info: if non-zero, then HDMI connector information is available. * This field is only valid if CEC_CAP_CONNECTOR_INFO is set. If that * capability is set and @have_conn_info is zero, then that indicates * that the HDMI connector device is not instantiated, either because * the HDMI driver is still configuring the device or because the HDMI * device was unbound. */ struct cec_event_state_change { __u16 phys_addr; __u16 log_addr_mask; __u16 have_conn_info; }; /** * struct cec_event_lost_msgs - tells you how many messages were lost. * @lost_msgs: how many messages were lost. */ struct cec_event_lost_msgs { __u32 lost_msgs; }; /** * struct cec_event - CEC event structure * @ts: the timestamp of when the event was sent. * @event: the event. * @flags: event flags. * @state_change: the event payload for CEC_EVENT_STATE_CHANGE. * @lost_msgs: the event payload for CEC_EVENT_LOST_MSGS. * @raw: array to pad the union. */ struct cec_event { __u64 ts; __u32 event; __u32 flags; union { struct cec_event_state_change state_change; struct cec_event_lost_msgs lost_msgs; __u32 raw[16]; }; }; /* ioctls */ /* Adapter capabilities */ #define CEC_ADAP_G_CAPS _IOWR('a', 0, struct cec_caps) /* * phys_addr is either 0 (if this is the CEC root device) * or a valid physical address obtained from the sink's EDID * as read by this CEC device (if this is a source device) * or a physical address obtained and modified from a sink * EDID and used for a sink CEC device. * If nothing is connected, then phys_addr is 0xffff. * See HDMI 1.4b, section 8.7 (Physical Address). * * The CEC_ADAP_S_PHYS_ADDR ioctl may not be available if that is handled * internally. */ #define CEC_ADAP_G_PHYS_ADDR _IOR('a', 1, __u16) #define CEC_ADAP_S_PHYS_ADDR _IOW('a', 2, __u16) /* * Configure the CEC adapter. It sets the device type and which * logical types it will try to claim. It will return which * logical addresses it could actually claim. * An error is returned if the adapter is disabled or if there * is no physical address assigned. */ #define CEC_ADAP_G_LOG_ADDRS _IOR('a', 3, struct cec_log_addrs) #define CEC_ADAP_S_LOG_ADDRS _IOWR('a', 4, struct cec_log_addrs) /* Transmit/receive a CEC command */ #define CEC_TRANSMIT _IOWR('a', 5, struct cec_msg) #define CEC_RECEIVE _IOWR('a', 6, struct cec_msg) /* Dequeue CEC events */ #define CEC_DQEVENT _IOWR('a', 7, struct cec_event) /* * Get and set the message handling mode for this filehandle. */ #define CEC_G_MODE _IOR('a', 8, __u32) #define CEC_S_MODE _IOW('a', 9, __u32) /* Get the connector info */ #define CEC_ADAP_G_CONNECTOR_INFO _IOR('a', 10, struct cec_connector_info) /* * The remainder of this header defines all CEC messages and operands. * The format matters since it the cec-ctl utility parses it to generate * code for implementing all these messages. * * Comments ending with 'Feature' group messages for each feature. * If messages are part of multiple features, then the "Has also" * comment is used to list the previously defined messages that are * supported by the feature. * * Before operands are defined a comment is added that gives the * name of the operand and in brackets the variable name of the * corresponding argument in the cec-funcs.h function. */ /* Messages */ /* One Touch Play Feature */ #define CEC_MSG_ACTIVE_SOURCE 0x82 #define CEC_MSG_IMAGE_VIEW_ON 0x04 #define CEC_MSG_TEXT_VIEW_ON 0x0d /* Routing Control Feature */ /* * Has also: * CEC_MSG_ACTIVE_SOURCE */ #define CEC_MSG_INACTIVE_SOURCE 0x9d #define CEC_MSG_REQUEST_ACTIVE_SOURCE 0x85 #define CEC_MSG_ROUTING_CHANGE 0x80 #define CEC_MSG_ROUTING_INFORMATION 0x81 #define CEC_MSG_SET_STREAM_PATH 0x86 /* Standby Feature */ #define CEC_MSG_STANDBY 0x36 /* One Touch Record Feature */ #define CEC_MSG_RECORD_OFF 0x0b #define CEC_MSG_RECORD_ON 0x09 /* Record Source Type Operand (rec_src_type) */ #define CEC_OP_RECORD_SRC_OWN 1 #define CEC_OP_RECORD_SRC_DIGITAL 2 #define CEC_OP_RECORD_SRC_ANALOG 3 #define CEC_OP_RECORD_SRC_EXT_PLUG 4 #define CEC_OP_RECORD_SRC_EXT_PHYS_ADDR 5 /* Service Identification Method Operand (service_id_method) */ #define CEC_OP_SERVICE_ID_METHOD_BY_DIG_ID 0 #define CEC_OP_SERVICE_ID_METHOD_BY_CHANNEL 1 /* Digital Service Broadcast System Operand (dig_bcast_system) */ #define CEC_OP_DIG_SERVICE_BCAST_SYSTEM_ARIB_GEN 0x00 #define CEC_OP_DIG_SERVICE_BCAST_SYSTEM_ATSC_GEN 0x01 #define CEC_OP_DIG_SERVICE_BCAST_SYSTEM_DVB_GEN 0x02 #define CEC_OP_DIG_SERVICE_BCAST_SYSTEM_ARIB_BS 0x08 #define CEC_OP_DIG_SERVICE_BCAST_SYSTEM_ARIB_CS 0x09 #define CEC_OP_DIG_SERVICE_BCAST_SYSTEM_ARIB_T 0x0a #define CEC_OP_DIG_SERVICE_BCAST_SYSTEM_ATSC_CABLE 0x10 #define CEC_OP_DIG_SERVICE_BCAST_SYSTEM_ATSC_SAT 0x11 #define CEC_OP_DIG_SERVICE_BCAST_SYSTEM_ATSC_T 0x12 #define CEC_OP_DIG_SERVICE_BCAST_SYSTEM_DVB_C 0x18 #define CEC_OP_DIG_SERVICE_BCAST_SYSTEM_DVB_S 0x19 #define CEC_OP_DIG_SERVICE_BCAST_SYSTEM_DVB_S2 0x1a #define CEC_OP_DIG_SERVICE_BCAST_SYSTEM_DVB_T 0x1b /* Analogue Broadcast Type Operand (ana_bcast_type) */ #define CEC_OP_ANA_BCAST_TYPE_CABLE 0 #define CEC_OP_ANA_BCAST_TYPE_SATELLITE 1 #define CEC_OP_ANA_BCAST_TYPE_TERRESTRIAL 2 /* Broadcast System Operand (bcast_system) */ #define CEC_OP_BCAST_SYSTEM_PAL_BG 0x00 #define CEC_OP_BCAST_SYSTEM_SECAM_LQ 0x01 /* SECAM L' */ #define CEC_OP_BCAST_SYSTEM_PAL_M 0x02 #define CEC_OP_BCAST_SYSTEM_NTSC_M 0x03 #define CEC_OP_BCAST_SYSTEM_PAL_I 0x04 #define CEC_OP_BCAST_SYSTEM_SECAM_DK 0x05 #define CEC_OP_BCAST_SYSTEM_SECAM_BG 0x06 #define CEC_OP_BCAST_SYSTEM_SECAM_L 0x07 #define CEC_OP_BCAST_SYSTEM_PAL_DK 0x08 #define CEC_OP_BCAST_SYSTEM_OTHER 0x1f /* Channel Number Format Operand (channel_number_fmt) */ #define CEC_OP_CHANNEL_NUMBER_FMT_1_PART 0x01 #define CEC_OP_CHANNEL_NUMBER_FMT_2_PART 0x02 #define CEC_MSG_RECORD_STATUS 0x0a /* Record Status Operand (rec_status) */ #define CEC_OP_RECORD_STATUS_CUR_SRC 0x01 #define CEC_OP_RECORD_STATUS_DIG_SERVICE 0x02 #define CEC_OP_RECORD_STATUS_ANA_SERVICE 0x03 #define CEC_OP_RECORD_STATUS_EXT_INPUT 0x04 #define CEC_OP_RECORD_STATUS_NO_DIG_SERVICE 0x05 #define CEC_OP_RECORD_STATUS_NO_ANA_SERVICE 0x06 #define CEC_OP_RECORD_STATUS_NO_SERVICE 0x07 #define CEC_OP_RECORD_STATUS_INVALID_EXT_PLUG 0x09 #define CEC_OP_RECORD_STATUS_INVALID_EXT_PHYS_ADDR 0x0a #define CEC_OP_RECORD_STATUS_UNSUP_CA 0x0b #define CEC_OP_RECORD_STATUS_NO_CA_ENTITLEMENTS 0x0c #define CEC_OP_RECORD_STATUS_CANT_COPY_SRC 0x0d #define CEC_OP_RECORD_STATUS_NO_MORE_COPIES 0x0e #define CEC_OP_RECORD_STATUS_NO_MEDIA 0x10 #define CEC_OP_RECORD_STATUS_PLAYING 0x11 #define CEC_OP_RECORD_STATUS_ALREADY_RECORDING 0x12 #define CEC_OP_RECORD_STATUS_MEDIA_PROT 0x13 #define CEC_OP_RECORD_STATUS_NO_SIGNAL 0x14 #define CEC_OP_RECORD_STATUS_MEDIA_PROBLEM 0x15 #define CEC_OP_RECORD_STATUS_NO_SPACE 0x16 #define CEC_OP_RECORD_STATUS_PARENTAL_LOCK 0x17 #define CEC_OP_RECORD_STATUS_TERMINATED_OK 0x1a #define CEC_OP_RECORD_STATUS_ALREADY_TERM 0x1b #define CEC_OP_RECORD_STATUS_OTHER 0x1f #define CEC_MSG_RECORD_TV_SCREEN 0x0f /* Timer Programming Feature */ #define CEC_MSG_CLEAR_ANALOGUE_TIMER 0x33 /* Recording Sequence Operand (recording_seq) */ #define CEC_OP_REC_SEQ_SUNDAY 0x01 #define CEC_OP_REC_SEQ_MONDAY 0x02 #define CEC_OP_REC_SEQ_TUESDAY 0x04 #define CEC_OP_REC_SEQ_WEDNESDAY 0x08 #define CEC_OP_REC_SEQ_THURSDAY 0x10 #define CEC_OP_REC_SEQ_FRIDAY 0x20 #define CEC_OP_REC_SEQ_SATURDAY 0x40 #define CEC_OP_REC_SEQ_ONCE_ONLY 0x00 #define CEC_MSG_CLEAR_DIGITAL_TIMER 0x99 #define CEC_MSG_CLEAR_EXT_TIMER 0xa1 /* External Source Specifier Operand (ext_src_spec) */ #define CEC_OP_EXT_SRC_PLUG 0x04 #define CEC_OP_EXT_SRC_PHYS_ADDR 0x05 #define CEC_MSG_SET_ANALOGUE_TIMER 0x34 #define CEC_MSG_SET_DIGITAL_TIMER 0x97 #define CEC_MSG_SET_EXT_TIMER 0xa2 #define CEC_MSG_SET_TIMER_PROGRAM_TITLE 0x67 #define CEC_MSG_TIMER_CLEARED_STATUS 0x43 /* Timer Cleared Status Data Operand (timer_cleared_status) */ #define CEC_OP_TIMER_CLR_STAT_RECORDING 0x00 #define CEC_OP_TIMER_CLR_STAT_NO_MATCHING 0x01 #define CEC_OP_TIMER_CLR_STAT_NO_INFO 0x02 #define CEC_OP_TIMER_CLR_STAT_CLEARED 0x80 #define CEC_MSG_TIMER_STATUS 0x35 /* Timer Overlap Warning Operand (timer_overlap_warning) */ #define CEC_OP_TIMER_OVERLAP_WARNING_NO_OVERLAP 0 #define CEC_OP_TIMER_OVERLAP_WARNING_OVERLAP 1 /* Media Info Operand (media_info) */ #define CEC_OP_MEDIA_INFO_UNPROT_MEDIA 0 #define CEC_OP_MEDIA_INFO_PROT_MEDIA 1 #define CEC_OP_MEDIA_INFO_NO_MEDIA 2 /* Programmed Indicator Operand (prog_indicator) */ #define CEC_OP_PROG_IND_NOT_PROGRAMMED 0 #define CEC_OP_PROG_IND_PROGRAMMED 1 /* Programmed Info Operand (prog_info) */ #define CEC_OP_PROG_INFO_ENOUGH_SPACE 0x08 #define CEC_OP_PROG_INFO_NOT_ENOUGH_SPACE 0x09 #define CEC_OP_PROG_INFO_MIGHT_NOT_BE_ENOUGH_SPACE 0x0b #define CEC_OP_PROG_INFO_NONE_AVAILABLE 0x0a /* Not Programmed Error Info Operand (prog_error) */ #define CEC_OP_PROG_ERROR_NO_FREE_TIMER 0x01 #define CEC_OP_PROG_ERROR_DATE_OUT_OF_RANGE 0x02 #define CEC_OP_PROG_ERROR_REC_SEQ_ERROR 0x03 #define CEC_OP_PROG_ERROR_INV_EXT_PLUG 0x04 #define CEC_OP_PROG_ERROR_INV_EXT_PHYS_ADDR 0x05 #define CEC_OP_PROG_ERROR_CA_UNSUPP 0x06 #define CEC_OP_PROG_ERROR_INSUF_CA_ENTITLEMENTS 0x07 #define CEC_OP_PROG_ERROR_RESOLUTION_UNSUPP 0x08 #define CEC_OP_PROG_ERROR_PARENTAL_LOCK 0x09 #define CEC_OP_PROG_ERROR_CLOCK_FAILURE 0x0a #define CEC_OP_PROG_ERROR_DUPLICATE 0x0e /* System Information Feature */ #define CEC_MSG_CEC_VERSION 0x9e /* CEC Version Operand (cec_version) */ #define CEC_OP_CEC_VERSION_1_3A 4 #define CEC_OP_CEC_VERSION_1_4 5 #define CEC_OP_CEC_VERSION_2_0 6 #define CEC_MSG_GET_CEC_VERSION 0x9f #define CEC_MSG_GIVE_PHYSICAL_ADDR 0x83 #define CEC_MSG_GET_MENU_LANGUAGE 0x91 #define CEC_MSG_REPORT_PHYSICAL_ADDR 0x84 /* Primary Device Type Operand (prim_devtype) */ #define CEC_OP_PRIM_DEVTYPE_TV 0 #define CEC_OP_PRIM_DEVTYPE_RECORD 1 #define CEC_OP_PRIM_DEVTYPE_TUNER 3 #define CEC_OP_PRIM_DEVTYPE_PLAYBACK 4 #define CEC_OP_PRIM_DEVTYPE_AUDIOSYSTEM 5 #define CEC_OP_PRIM_DEVTYPE_SWITCH 6 #define CEC_OP_PRIM_DEVTYPE_PROCESSOR 7 #define CEC_MSG_SET_MENU_LANGUAGE 0x32 #define CEC_MSG_REPORT_FEATURES 0xa6 /* HDMI 2.0 */ /* All Device Types Operand (all_device_types) */ #define CEC_OP_ALL_DEVTYPE_TV 0x80 #define CEC_OP_ALL_DEVTYPE_RECORD 0x40 #define CEC_OP_ALL_DEVTYPE_TUNER 0x20 #define CEC_OP_ALL_DEVTYPE_PLAYBACK 0x10 #define CEC_OP_ALL_DEVTYPE_AUDIOSYSTEM 0x08 #define CEC_OP_ALL_DEVTYPE_SWITCH 0x04 /* * And if you wondering what happened to PROCESSOR devices: those should * be mapped to a SWITCH. */ /* Valid for RC Profile and Device Feature operands */ #define CEC_OP_FEAT_EXT 0x80 /* Extension bit */ /* RC Profile Operand (rc_profile) */ #define CEC_OP_FEAT_RC_TV_PROFILE_NONE 0x00 #define CEC_OP_FEAT_RC_TV_PROFILE_1 0x02 #define CEC_OP_FEAT_RC_TV_PROFILE_2 0x06 #define CEC_OP_FEAT_RC_TV_PROFILE_3 0x0a #define CEC_OP_FEAT_RC_TV_PROFILE_4 0x0e #define CEC_OP_FEAT_RC_SRC_HAS_DEV_ROOT_MENU 0x50 #define CEC_OP_FEAT_RC_SRC_HAS_DEV_SETUP_MENU 0x48 #define CEC_OP_FEAT_RC_SRC_HAS_CONTENTS_MENU 0x44 #define CEC_OP_FEAT_RC_SRC_HAS_MEDIA_TOP_MENU 0x42 #define CEC_OP_FEAT_RC_SRC_HAS_MEDIA_CONTEXT_MENU 0x41 /* Device Feature Operand (dev_features) */ #define CEC_OP_FEAT_DEV_HAS_RECORD_TV_SCREEN 0x40 #define CEC_OP_FEAT_DEV_HAS_SET_OSD_STRING 0x20 #define CEC_OP_FEAT_DEV_HAS_DECK_CONTROL 0x10 #define CEC_OP_FEAT_DEV_HAS_SET_AUDIO_RATE 0x08 #define CEC_OP_FEAT_DEV_SINK_HAS_ARC_TX 0x04 #define CEC_OP_FEAT_DEV_SOURCE_HAS_ARC_RX 0x02 #define CEC_OP_FEAT_DEV_HAS_SET_AUDIO_VOLUME_LEVEL 0x01 #define CEC_MSG_GIVE_FEATURES 0xa5 /* HDMI 2.0 */ /* Deck Control Feature */ #define CEC_MSG_DECK_CONTROL 0x42 /* Deck Control Mode Operand (deck_control_mode) */ #define CEC_OP_DECK_CTL_MODE_SKIP_FWD 1 #define CEC_OP_DECK_CTL_MODE_SKIP_REV 2 #define CEC_OP_DECK_CTL_MODE_STOP 3 #define CEC_OP_DECK_CTL_MODE_EJECT 4 #define CEC_MSG_DECK_STATUS 0x1b /* Deck Info Operand (deck_info) */ #define CEC_OP_DECK_INFO_PLAY 0x11 #define CEC_OP_DECK_INFO_RECORD 0x12 #define CEC_OP_DECK_INFO_PLAY_REV 0x13 #define CEC_OP_DECK_INFO_STILL 0x14 #define CEC_OP_DECK_INFO_SLOW 0x15 #define CEC_OP_DECK_INFO_SLOW_REV 0x16 #define CEC_OP_DECK_INFO_FAST_FWD 0x17 #define CEC_OP_DECK_INFO_FAST_REV 0x18 #define CEC_OP_DECK_INFO_NO_MEDIA 0x19 #define CEC_OP_DECK_INFO_STOP 0x1a #define CEC_OP_DECK_INFO_SKIP_FWD 0x1b #define CEC_OP_DECK_INFO_SKIP_REV 0x1c #define CEC_OP_DECK_INFO_INDEX_SEARCH_FWD 0x1d #define CEC_OP_DECK_INFO_INDEX_SEARCH_REV 0x1e #define CEC_OP_DECK_INFO_OTHER 0x1f #define CEC_MSG_GIVE_DECK_STATUS 0x1a /* Status Request Operand (status_req) */ #define CEC_OP_STATUS_REQ_ON 1 #define CEC_OP_STATUS_REQ_OFF 2 #define CEC_OP_STATUS_REQ_ONCE 3 #define CEC_MSG_PLAY 0x41 /* Play Mode Operand (play_mode) */ #define CEC_OP_PLAY_MODE_PLAY_FWD 0x24 #define CEC_OP_PLAY_MODE_PLAY_REV 0x20 #define CEC_OP_PLAY_MODE_PLAY_STILL 0x25 #define CEC_OP_PLAY_MODE_PLAY_FAST_FWD_MIN 0x05 #define CEC_OP_PLAY_MODE_PLAY_FAST_FWD_MED 0x06 #define CEC_OP_PLAY_MODE_PLAY_FAST_FWD_MAX 0x07 #define CEC_OP_PLAY_MODE_PLAY_FAST_REV_MIN 0x09 #define CEC_OP_PLAY_MODE_PLAY_FAST_REV_MED 0x0a #define CEC_OP_PLAY_MODE_PLAY_FAST_REV_MAX 0x0b #define CEC_OP_PLAY_MODE_PLAY_SLOW_FWD_MIN 0x15 #define CEC_OP_PLAY_MODE_PLAY_SLOW_FWD_MED 0x16 #define CEC_OP_PLAY_MODE_PLAY_SLOW_FWD_MAX 0x17 #define CEC_OP_PLAY_MODE_PLAY_SLOW_REV_MIN 0x19 #define CEC_OP_PLAY_MODE_PLAY_SLOW_REV_MED 0x1a #define CEC_OP_PLAY_MODE_PLAY_SLOW_REV_MAX 0x1b /* Tuner Control Feature */ #define CEC_MSG_GIVE_TUNER_DEVICE_STATUS 0x08 #define CEC_MSG_SELECT_ANALOGUE_SERVICE 0x92 #define CEC_MSG_SELECT_DIGITAL_SERVICE 0x93 #define CEC_MSG_TUNER_DEVICE_STATUS 0x07 /* Recording Flag Operand (rec_flag) */ #define CEC_OP_REC_FLAG_NOT_USED 0 #define CEC_OP_REC_FLAG_USED 1 /* Tuner Display Info Operand (tuner_display_info) */ #define CEC_OP_TUNER_DISPLAY_INFO_DIGITAL 0 #define CEC_OP_TUNER_DISPLAY_INFO_NONE 1 #define CEC_OP_TUNER_DISPLAY_INFO_ANALOGUE 2 #define CEC_MSG_TUNER_STEP_DECREMENT 0x06 #define CEC_MSG_TUNER_STEP_INCREMENT 0x05 /* Vendor Specific Commands Feature */ /* * Has also: * CEC_MSG_CEC_VERSION * CEC_MSG_GET_CEC_VERSION */ #define CEC_MSG_DEVICE_VENDOR_ID 0x87 #define CEC_MSG_GIVE_DEVICE_VENDOR_ID 0x8c #define CEC_MSG_VENDOR_COMMAND 0x89 #define CEC_MSG_VENDOR_COMMAND_WITH_ID 0xa0 #define CEC_MSG_VENDOR_REMOTE_BUTTON_DOWN 0x8a #define CEC_MSG_VENDOR_REMOTE_BUTTON_UP 0x8b /* OSD Display Feature */ #define CEC_MSG_SET_OSD_STRING 0x64 /* Display Control Operand (disp_ctl) */ #define CEC_OP_DISP_CTL_DEFAULT 0x00 #define CEC_OP_DISP_CTL_UNTIL_CLEARED 0x40 #define CEC_OP_DISP_CTL_CLEAR 0x80 /* Device OSD Transfer Feature */ #define CEC_MSG_GIVE_OSD_NAME 0x46 #define CEC_MSG_SET_OSD_NAME 0x47 /* Device Menu Control Feature */ #define CEC_MSG_MENU_REQUEST 0x8d /* Menu Request Type Operand (menu_req) */ #define CEC_OP_MENU_REQUEST_ACTIVATE 0x00 #define CEC_OP_MENU_REQUEST_DEACTIVATE 0x01 #define CEC_OP_MENU_REQUEST_QUERY 0x02 #define CEC_MSG_MENU_STATUS 0x8e /* Menu State Operand (menu_state) */ #define CEC_OP_MENU_STATE_ACTIVATED 0x00 #define CEC_OP_MENU_STATE_DEACTIVATED 0x01 #define CEC_MSG_USER_CONTROL_PRESSED 0x44 /* UI Command Operand (ui_cmd) */ #define CEC_OP_UI_CMD_SELECT 0x00 #define CEC_OP_UI_CMD_UP 0x01 #define CEC_OP_UI_CMD_DOWN 0x02 #define CEC_OP_UI_CMD_LEFT 0x03 #define CEC_OP_UI_CMD_RIGHT 0x04 #define CEC_OP_UI_CMD_RIGHT_UP 0x05 #define CEC_OP_UI_CMD_RIGHT_DOWN 0x06 #define CEC_OP_UI_CMD_LEFT_UP 0x07 #define CEC_OP_UI_CMD_LEFT_DOWN 0x08 #define CEC_OP_UI_CMD_DEVICE_ROOT_MENU 0x09 #define CEC_OP_UI_CMD_DEVICE_SETUP_MENU 0x0a #define CEC_OP_UI_CMD_CONTENTS_MENU 0x0b #define CEC_OP_UI_CMD_FAVORITE_MENU 0x0c #define CEC_OP_UI_CMD_BACK 0x0d #define CEC_OP_UI_CMD_MEDIA_TOP_MENU 0x10 #define CEC_OP_UI_CMD_MEDIA_CONTEXT_SENSITIVE_MENU 0x11 #define CEC_OP_UI_CMD_NUMBER_ENTRY_MODE 0x1d #define CEC_OP_UI_CMD_NUMBER_11 0x1e #define CEC_OP_UI_CMD_NUMBER_12 0x1f #define CEC_OP_UI_CMD_NUMBER_0_OR_NUMBER_10 0x20 #define CEC_OP_UI_CMD_NUMBER_1 0x21 #define CEC_OP_UI_CMD_NUMBER_2 0x22 #define CEC_OP_UI_CMD_NUMBER_3 0x23 #define CEC_OP_UI_CMD_NUMBER_4 0x24 #define CEC_OP_UI_CMD_NUMBER_5 0x25 #define CEC_OP_UI_CMD_NUMBER_6 0x26 #define CEC_OP_UI_CMD_NUMBER_7 0x27 #define CEC_OP_UI_CMD_NUMBER_8 0x28 #define CEC_OP_UI_CMD_NUMBER_9 0x29 #define CEC_OP_UI_CMD_DOT 0x2a #define CEC_OP_UI_CMD_ENTER 0x2b #define CEC_OP_UI_CMD_CLEAR 0x2c #define CEC_OP_UI_CMD_NEXT_FAVORITE 0x2f #define CEC_OP_UI_CMD_CHANNEL_UP 0x30 #define CEC_OP_UI_CMD_CHANNEL_DOWN 0x31 #define CEC_OP_UI_CMD_PREVIOUS_CHANNEL 0x32 #define CEC_OP_UI_CMD_SOUND_SELECT 0x33 #define CEC_OP_UI_CMD_INPUT_SELECT 0x34 #define CEC_OP_UI_CMD_DISPLAY_INFORMATION 0x35 #define CEC_OP_UI_CMD_HELP 0x36 #define CEC_OP_UI_CMD_PAGE_UP 0x37 #define CEC_OP_UI_CMD_PAGE_DOWN 0x38 #define CEC_OP_UI_CMD_POWER 0x40 #define CEC_OP_UI_CMD_VOLUME_UP 0x41 #define CEC_OP_UI_CMD_VOLUME_DOWN 0x42 #define CEC_OP_UI_CMD_MUTE 0x43 #define CEC_OP_UI_CMD_PLAY 0x44 #define CEC_OP_UI_CMD_STOP 0x45 #define CEC_OP_UI_CMD_PAUSE 0x46 #define CEC_OP_UI_CMD_RECORD 0x47 #define CEC_OP_UI_CMD_REWIND 0x48 #define CEC_OP_UI_CMD_FAST_FORWARD 0x49 #define CEC_OP_UI_CMD_EJECT 0x4a #define CEC_OP_UI_CMD_SKIP_FORWARD 0x4b #define CEC_OP_UI_CMD_SKIP_BACKWARD 0x4c #define CEC_OP_UI_CMD_STOP_RECORD 0x4d #define CEC_OP_UI_CMD_PAUSE_RECORD 0x4e #define CEC_OP_UI_CMD_ANGLE 0x50 #define CEC_OP_UI_CMD_SUB_PICTURE 0x51 #define CEC_OP_UI_CMD_VIDEO_ON_DEMAND 0x52 #define CEC_OP_UI_CMD_ELECTRONIC_PROGRAM_GUIDE 0x53 #define CEC_OP_UI_CMD_TIMER_PROGRAMMING 0x54 #define CEC_OP_UI_CMD_INITIAL_CONFIGURATION 0x55 #define CEC_OP_UI_CMD_SELECT_BROADCAST_TYPE 0x56 #define CEC_OP_UI_CMD_SELECT_SOUND_PRESENTATION 0x57 #define CEC_OP_UI_CMD_AUDIO_DESCRIPTION 0x58 #define CEC_OP_UI_CMD_INTERNET 0x59 #define CEC_OP_UI_CMD_3D_MODE 0x5a #define CEC_OP_UI_CMD_PLAY_FUNCTION 0x60 #define CEC_OP_UI_CMD_PAUSE_PLAY_FUNCTION 0x61 #define CEC_OP_UI_CMD_RECORD_FUNCTION 0x62 #define CEC_OP_UI_CMD_PAUSE_RECORD_FUNCTION 0x63 #define CEC_OP_UI_CMD_STOP_FUNCTION 0x64 #define CEC_OP_UI_CMD_MUTE_FUNCTION 0x65 #define CEC_OP_UI_CMD_RESTORE_VOLUME_FUNCTION 0x66 #define CEC_OP_UI_CMD_TUNE_FUNCTION 0x67 #define CEC_OP_UI_CMD_SELECT_MEDIA_FUNCTION 0x68 #define CEC_OP_UI_CMD_SELECT_AV_INPUT_FUNCTION 0x69 #define CEC_OP_UI_CMD_SELECT_AUDIO_INPUT_FUNCTION 0x6a #define CEC_OP_UI_CMD_POWER_TOGGLE_FUNCTION 0x6b #define CEC_OP_UI_CMD_POWER_OFF_FUNCTION 0x6c #define CEC_OP_UI_CMD_POWER_ON_FUNCTION 0x6d #define CEC_OP_UI_CMD_F1_BLUE 0x71 #define CEC_OP_UI_CMD_F2_RED 0x72 #define CEC_OP_UI_CMD_F3_GREEN 0x73 #define CEC_OP_UI_CMD_F4_YELLOW 0x74 #define CEC_OP_UI_CMD_F5 0x75 #define CEC_OP_UI_CMD_DATA 0x76 /* UI Broadcast Type Operand (ui_bcast_type) */ #define CEC_OP_UI_BCAST_TYPE_TOGGLE_ALL 0x00 #define CEC_OP_UI_BCAST_TYPE_TOGGLE_DIG_ANA 0x01 #define CEC_OP_UI_BCAST_TYPE_ANALOGUE 0x10 #define CEC_OP_UI_BCAST_TYPE_ANALOGUE_T 0x20 #define CEC_OP_UI_BCAST_TYPE_ANALOGUE_CABLE 0x30 #define CEC_OP_UI_BCAST_TYPE_ANALOGUE_SAT 0x40 #define CEC_OP_UI_BCAST_TYPE_DIGITAL 0x50 #define CEC_OP_UI_BCAST_TYPE_DIGITAL_T 0x60 #define CEC_OP_UI_BCAST_TYPE_DIGITAL_CABLE 0x70 #define CEC_OP_UI_BCAST_TYPE_DIGITAL_SAT 0x80 #define CEC_OP_UI_BCAST_TYPE_DIGITAL_COM_SAT 0x90 #define CEC_OP_UI_BCAST_TYPE_DIGITAL_COM_SAT2 0x91 #define CEC_OP_UI_BCAST_TYPE_IP 0xa0 /* UI Sound Presentation Control Operand (ui_snd_pres_ctl) */ #define CEC_OP_UI_SND_PRES_CTL_DUAL_MONO 0x10 #define CEC_OP_UI_SND_PRES_CTL_KARAOKE 0x20 #define CEC_OP_UI_SND_PRES_CTL_DOWNMIX 0x80 #define CEC_OP_UI_SND_PRES_CTL_REVERB 0x90 #define CEC_OP_UI_SND_PRES_CTL_EQUALIZER 0xa0 #define CEC_OP_UI_SND_PRES_CTL_BASS_UP 0xb1 #define CEC_OP_UI_SND_PRES_CTL_BASS_NEUTRAL 0xb2 #define CEC_OP_UI_SND_PRES_CTL_BASS_DOWN 0xb3 #define CEC_OP_UI_SND_PRES_CTL_TREBLE_UP 0xc1 #define CEC_OP_UI_SND_PRES_CTL_TREBLE_NEUTRAL 0xc2 #define CEC_OP_UI_SND_PRES_CTL_TREBLE_DOWN 0xc3 #define CEC_MSG_USER_CONTROL_RELEASED 0x45 /* Remote Control Passthrough Feature */ /* * Has also: * CEC_MSG_USER_CONTROL_PRESSED * CEC_MSG_USER_CONTROL_RELEASED */ /* Power Status Feature */ #define CEC_MSG_GIVE_DEVICE_POWER_STATUS 0x8f #define CEC_MSG_REPORT_POWER_STATUS 0x90 /* Power Status Operand (pwr_state) */ #define CEC_OP_POWER_STATUS_ON 0 #define CEC_OP_POWER_STATUS_STANDBY 1 #define CEC_OP_POWER_STATUS_TO_ON 2 #define CEC_OP_POWER_STATUS_TO_STANDBY 3 /* General Protocol Messages */ #define CEC_MSG_FEATURE_ABORT 0x00 /* Abort Reason Operand (reason) */ #define CEC_OP_ABORT_UNRECOGNIZED_OP 0 #define CEC_OP_ABORT_INCORRECT_MODE 1 #define CEC_OP_ABORT_NO_SOURCE 2 #define CEC_OP_ABORT_INVALID_OP 3 #define CEC_OP_ABORT_REFUSED 4 #define CEC_OP_ABORT_UNDETERMINED 5 #define CEC_MSG_ABORT 0xff /* System Audio Control Feature */ /* * Has also: * CEC_MSG_USER_CONTROL_PRESSED * CEC_MSG_USER_CONTROL_RELEASED */ #define CEC_MSG_GIVE_AUDIO_STATUS 0x71 #define CEC_MSG_GIVE_SYSTEM_AUDIO_MODE_STATUS 0x7d #define CEC_MSG_REPORT_AUDIO_STATUS 0x7a /* Audio Mute Status Operand (aud_mute_status) */ #define CEC_OP_AUD_MUTE_STATUS_OFF 0 #define CEC_OP_AUD_MUTE_STATUS_ON 1 #define CEC_MSG_REPORT_SHORT_AUDIO_DESCRIPTOR 0xa3 #define CEC_MSG_REQUEST_SHORT_AUDIO_DESCRIPTOR 0xa4 #define CEC_MSG_SET_SYSTEM_AUDIO_MODE 0x72 /* System Audio Status Operand (sys_aud_status) */ #define CEC_OP_SYS_AUD_STATUS_OFF 0 #define CEC_OP_SYS_AUD_STATUS_ON 1 #define CEC_MSG_SYSTEM_AUDIO_MODE_REQUEST 0x70 #define CEC_MSG_SYSTEM_AUDIO_MODE_STATUS 0x7e /* Audio Format ID Operand (audio_format_id) */ #define CEC_OP_AUD_FMT_ID_CEA861 0 #define CEC_OP_AUD_FMT_ID_CEA861_CXT 1 #define CEC_MSG_SET_AUDIO_VOLUME_LEVEL 0x73 /* Audio Rate Control Feature */ #define CEC_MSG_SET_AUDIO_RATE 0x9a /* Audio Rate Operand (audio_rate) */ #define CEC_OP_AUD_RATE_OFF 0 #define CEC_OP_AUD_RATE_WIDE_STD 1 #define CEC_OP_AUD_RATE_WIDE_FAST 2 #define CEC_OP_AUD_RATE_WIDE_SLOW 3 #define CEC_OP_AUD_RATE_NARROW_STD 4 #define CEC_OP_AUD_RATE_NARROW_FAST 5 #define CEC_OP_AUD_RATE_NARROW_SLOW 6 /* Audio Return Channel Control Feature */ #define CEC_MSG_INITIATE_ARC 0xc0 #define CEC_MSG_REPORT_ARC_INITIATED 0xc1 #define CEC_MSG_REPORT_ARC_TERMINATED 0xc2 #define CEC_MSG_REQUEST_ARC_INITIATION 0xc3 #define CEC_MSG_REQUEST_ARC_TERMINATION 0xc4 #define CEC_MSG_TERMINATE_ARC 0xc5 /* Dynamic Audio Lipsync Feature */ /* Only for CEC 2.0 and up */ #define CEC_MSG_REQUEST_CURRENT_LATENCY 0xa7 #define CEC_MSG_REPORT_CURRENT_LATENCY 0xa8 /* Low Latency Mode Operand (low_latency_mode) */ #define CEC_OP_LOW_LATENCY_MODE_OFF 0 #define CEC_OP_LOW_LATENCY_MODE_ON 1 /* Audio Output Compensated Operand (audio_out_compensated) */ #define CEC_OP_AUD_OUT_COMPENSATED_NA 0 #define CEC_OP_AUD_OUT_COMPENSATED_DELAY 1 #define CEC_OP_AUD_OUT_COMPENSATED_NO_DELAY 2 #define CEC_OP_AUD_OUT_COMPENSATED_PARTIAL_DELAY 3 /* Capability Discovery and Control Feature */ #define CEC_MSG_CDC_MESSAGE 0xf8 /* Ethernet-over-HDMI: nobody ever does this... */ #define CEC_MSG_CDC_HEC_INQUIRE_STATE 0x00 #define CEC_MSG_CDC_HEC_REPORT_STATE 0x01 /* HEC Functionality State Operand (hec_func_state) */ #define CEC_OP_HEC_FUNC_STATE_NOT_SUPPORTED 0 #define CEC_OP_HEC_FUNC_STATE_INACTIVE 1 #define CEC_OP_HEC_FUNC_STATE_ACTIVE 2 #define CEC_OP_HEC_FUNC_STATE_ACTIVATION_FIELD 3 /* Host Functionality State Operand (host_func_state) */ #define CEC_OP_HOST_FUNC_STATE_NOT_SUPPORTED 0 #define CEC_OP_HOST_FUNC_STATE_INACTIVE 1 #define CEC_OP_HOST_FUNC_STATE_ACTIVE 2 /* ENC Functionality State Operand (enc_func_state) */ #define CEC_OP_ENC_FUNC_STATE_EXT_CON_NOT_SUPPORTED 0 #define CEC_OP_ENC_FUNC_STATE_EXT_CON_INACTIVE 1 #define CEC_OP_ENC_FUNC_STATE_EXT_CON_ACTIVE 2 /* CDC Error Code Operand (cdc_errcode) */ #define CEC_OP_CDC_ERROR_CODE_NONE 0 #define CEC_OP_CDC_ERROR_CODE_CAP_UNSUPPORTED 1 #define CEC_OP_CDC_ERROR_CODE_WRONG_STATE 2 #define CEC_OP_CDC_ERROR_CODE_OTHER 3 /* HEC Support Operand (hec_support) */ #define CEC_OP_HEC_SUPPORT_NO 0 #define CEC_OP_HEC_SUPPORT_YES 1 /* HEC Activation Operand (hec_activation) */ #define CEC_OP_HEC_ACTIVATION_ON 0 #define CEC_OP_HEC_ACTIVATION_OFF 1 #define CEC_MSG_CDC_HEC_SET_STATE_ADJACENT 0x02 #define CEC_MSG_CDC_HEC_SET_STATE 0x03 /* HEC Set State Operand (hec_set_state) */ #define CEC_OP_HEC_SET_STATE_DEACTIVATE 0 #define CEC_OP_HEC_SET_STATE_ACTIVATE 1 #define CEC_MSG_CDC_HEC_REQUEST_DEACTIVATION 0x04 #define CEC_MSG_CDC_HEC_NOTIFY_ALIVE 0x05 #define CEC_MSG_CDC_HEC_DISCOVER 0x06 /* Hotplug Detect messages */ #define CEC_MSG_CDC_HPD_SET_STATE 0x10 /* HPD State Operand (hpd_state) */ #define CEC_OP_HPD_STATE_CP_EDID_DISABLE 0 #define CEC_OP_HPD_STATE_CP_EDID_ENABLE 1 #define CEC_OP_HPD_STATE_CP_EDID_DISABLE_ENABLE 2 #define CEC_OP_HPD_STATE_EDID_DISABLE 3 #define CEC_OP_HPD_STATE_EDID_ENABLE 4 #define CEC_OP_HPD_STATE_EDID_DISABLE_ENABLE 5 #define CEC_MSG_CDC_HPD_REPORT_STATE 0x11 /* HPD Error Code Operand (hpd_error) */ #define CEC_OP_HPD_ERROR_NONE 0 #define CEC_OP_HPD_ERROR_INITIATOR_NOT_CAPABLE 1 #define CEC_OP_HPD_ERROR_INITIATOR_WRONG_STATE 2 #define CEC_OP_HPD_ERROR_OTHER 3 #define CEC_OP_HPD_ERROR_NONE_NO_VIDEO 4 /* End of Messages */ /* Helper functions to identify the 'special' CEC devices */ static inline int cec_is_2nd_tv(const struct cec_log_addrs *las) { /* * It is a second TV if the logical address is 14 or 15 and the * primary device type is a TV. */ return las->num_log_addrs && las->log_addr[0] >= CEC_LOG_ADDR_SPECIFIC && las->primary_device_type[0] == CEC_OP_PRIM_DEVTYPE_TV; } static inline int cec_is_processor(const struct cec_log_addrs *las) { /* * It is a processor if the logical address is 12-15 and the * primary device type is a Processor. */ return las->num_log_addrs && las->log_addr[0] >= CEC_LOG_ADDR_BACKUP_1 && las->primary_device_type[0] == CEC_OP_PRIM_DEVTYPE_PROCESSOR; } static inline int cec_is_switch(const struct cec_log_addrs *las) { /* * It is a switch if the logical address is 15 and the * primary device type is a Switch and the CDC-Only flag is not set. */ return las->num_log_addrs == 1 && las->log_addr[0] == CEC_LOG_ADDR_UNREGISTERED && las->primary_device_type[0] == CEC_OP_PRIM_DEVTYPE_SWITCH && !(las->flags & CEC_LOG_ADDRS_FL_CDC_ONLY); } static inline int cec_is_cdc_only(const struct cec_log_addrs *las) { /* * It is a CDC-only device if the logical address is 15 and the * primary device type is a Switch and the CDC-Only flag is set. */ return las->num_log_addrs == 1 && las->log_addr[0] == CEC_LOG_ADDR_UNREGISTERED && las->primary_device_type[0] == CEC_OP_PRIM_DEVTYPE_SWITCH && (las->flags & CEC_LOG_ADDRS_FL_CDC_ONLY); } #endif |
| 1 1 1 3 9 10 10 9 9 8 9 8 9 2 8 7 7 7 7 6 6 5 5 5 10 5 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 | // SPDX-License-Identifier: GPL-2.0-or-later /* * Squashfs - a compressed read only filesystem for Linux * * Copyright (c) 2002, 2003, 2004, 2005, 2006, 2007, 2008 * Phillip Lougher <phillip@squashfs.org.uk> * * dir.c */ /* * This file implements code to read directories from disk. * * See namei.c for a description of directory organisation on disk. */ #include <linux/fs.h> #include <linux/vfs.h> #include <linux/slab.h> #include "squashfs_fs.h" #include "squashfs_fs_sb.h" #include "squashfs_fs_i.h" #include "squashfs.h" static const unsigned char squashfs_filetype_table[] = { DT_UNKNOWN, DT_DIR, DT_REG, DT_LNK, DT_BLK, DT_CHR, DT_FIFO, DT_SOCK }; /* * Lookup offset (f_pos) in the directory index, returning the * metadata block containing it. * * If we get an error reading the index then return the part of the index * (if any) we have managed to read - the index isn't essential, just * quicker. */ static int get_dir_index_using_offset(struct super_block *sb, u64 *next_block, int *next_offset, u64 index_start, int index_offset, int i_count, u64 f_pos) { struct squashfs_sb_info *msblk = sb->s_fs_info; int err, i, index, length = 0; unsigned int size; struct squashfs_dir_index dir_index; TRACE("Entered get_dir_index_using_offset, i_count %d, f_pos %lld\n", i_count, f_pos); /* * Translate from external f_pos to the internal f_pos. This * is offset by 3 because we invent "." and ".." entries which are * not actually stored in the directory. */ if (f_pos <= 3) return f_pos; f_pos -= 3; for (i = 0; i < i_count; i++) { err = squashfs_read_metadata(sb, &dir_index, &index_start, &index_offset, sizeof(dir_index)); if (err < 0) break; index = le32_to_cpu(dir_index.index); if (index > f_pos) /* * Found the index we're looking for. */ break; size = le32_to_cpu(dir_index.size) + 1; /* size should never be larger than SQUASHFS_NAME_LEN */ if (size > SQUASHFS_NAME_LEN) break; err = squashfs_read_metadata(sb, NULL, &index_start, &index_offset, size); if (err < 0) break; length = index; *next_block = le32_to_cpu(dir_index.start_block) + msblk->directory_table; } *next_offset = (length + *next_offset) % SQUASHFS_METADATA_SIZE; /* * Translate back from internal f_pos to external f_pos. */ return length + 3; } static int squashfs_readdir(struct file *file, struct dir_context *ctx) { struct inode *inode = file_inode(file); struct squashfs_sb_info *msblk = inode->i_sb->s_fs_info; u64 block = squashfs_i(inode)->start + msblk->directory_table; int offset = squashfs_i(inode)->offset, length, err; unsigned int inode_number, dir_count, size, type; struct squashfs_dir_header dirh; struct squashfs_dir_entry *dire; TRACE("Entered squashfs_readdir [%llx:%x]\n", block, offset); dire = kmalloc(sizeof(*dire) + SQUASHFS_NAME_LEN + 1, GFP_KERNEL); if (dire == NULL) { ERROR("Failed to allocate squashfs_dir_entry\n"); goto finish; } /* * Return "." and ".." entries as the first two filenames in the * directory. To maximise compression these two entries are not * stored in the directory, and so we invent them here. * * It also means that the external f_pos is offset by 3 from the * on-disk directory f_pos. */ while (ctx->pos < 3) { char *name; int i_ino; if (ctx->pos == 0) { name = "."; size = 1; i_ino = inode->i_ino; } else { name = ".."; size = 2; i_ino = squashfs_i(inode)->parent; } if (!dir_emit(ctx, name, size, i_ino, squashfs_filetype_table[1])) goto finish; ctx->pos += size; } length = get_dir_index_using_offset(inode->i_sb, &block, &offset, squashfs_i(inode)->dir_idx_start, squashfs_i(inode)->dir_idx_offset, squashfs_i(inode)->dir_idx_cnt, ctx->pos); while (length < i_size_read(inode)) { /* * Read directory header */ err = squashfs_read_metadata(inode->i_sb, &dirh, &block, &offset, sizeof(dirh)); if (err < 0) goto failed_read; length += sizeof(dirh); dir_count = le32_to_cpu(dirh.count) + 1; if (dir_count > SQUASHFS_DIR_COUNT) goto failed_read; while (dir_count--) { /* * Read directory entry. */ err = squashfs_read_metadata(inode->i_sb, dire, &block, &offset, sizeof(*dire)); if (err < 0) goto failed_read; size = le16_to_cpu(dire->size) + 1; /* size should never be larger than SQUASHFS_NAME_LEN */ if (size > SQUASHFS_NAME_LEN) goto failed_read; err = squashfs_read_metadata(inode->i_sb, dire->name, &block, &offset, size); if (err < 0) goto failed_read; length += sizeof(*dire) + size; if (ctx->pos >= length) continue; dire->name[size] = '\0'; inode_number = le32_to_cpu(dirh.inode_number) + ((short) le16_to_cpu(dire->inode_number)); type = le16_to_cpu(dire->type); if (type > SQUASHFS_MAX_DIR_TYPE) goto failed_read; if (!dir_emit(ctx, dire->name, size, inode_number, squashfs_filetype_table[type])) goto finish; ctx->pos = length; } } finish: kfree(dire); return 0; failed_read: ERROR("Unable to read directory block [%llx:%x]\n", block, offset); kfree(dire); return 0; } const struct file_operations squashfs_dir_ops = { .read = generic_read_dir, .iterate_shared = squashfs_readdir, .llseek = generic_file_llseek, }; |
| 77 68 23 17 72 72 68 65 65 65 68 65 98 102 98 68 68 68 19 16 14 3 6 6 3 3 3 6 6 6 3 2 2 2 3 123 122 8 187 187 186 186 186 187 6 32 9 8 9 2 9 9 8 3 9 32 2 1 1 1 2 5 9 5 6 9 6 167 166 166 166 166 166 22 144 166 166 166 166 167 23 2 2 45 3 3 4 2 3 32 11 10 10 1 9 4 6 11 5 203 16 3 5 4 33 3 155 37 37 144 36 34 8 36 36 145 145 145 2 144 144 144 144 144 144 2 2 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 | // SPDX-License-Identifier: GPL-2.0 /* * Copyright (C) 1991, 1992 Linus Torvalds * * Added support for a Unix98-style ptmx device. * -- C. Scott Ananian <cananian@alumni.princeton.edu>, 14-Jan-1998 * */ #include <linux/module.h> #include <linux/errno.h> #include <linux/interrupt.h> #include <linux/tty.h> #include <linux/tty_flip.h> #include <linux/fcntl.h> #include <linux/sched/signal.h> #include <linux/string.h> #include <linux/major.h> #include <linux/mm.h> #include <linux/init.h> #include <linux/device.h> #include <linux/uaccess.h> #include <linux/bitops.h> #include <linux/devpts_fs.h> #include <linux/slab.h> #include <linux/mutex.h> #include <linux/poll.h> #include <linux/mount.h> #include <linux/file.h> #include <linux/ioctl.h> #include <linux/compat.h> #include "tty.h" #undef TTY_DEBUG_HANGUP #ifdef TTY_DEBUG_HANGUP # define tty_debug_hangup(tty, f, args...) tty_debug(tty, f, ##args) #else # define tty_debug_hangup(tty, f, args...) do {} while (0) #endif #ifdef CONFIG_UNIX98_PTYS static struct tty_driver *ptm_driver; static struct tty_driver *pts_driver; static DEFINE_MUTEX(devpts_mutex); #endif static void pty_close(struct tty_struct *tty, struct file *filp) { if (tty->driver->subtype == PTY_TYPE_MASTER) WARN_ON(tty->count > 1); else { if (tty_io_error(tty)) return; if (tty->count > 2) return; } set_bit(TTY_IO_ERROR, &tty->flags); wake_up_interruptible(&tty->read_wait); wake_up_interruptible(&tty->write_wait); spin_lock_irq(&tty->ctrl.lock); tty->ctrl.packet = false; spin_unlock_irq(&tty->ctrl.lock); /* Review - krefs on tty_link ?? */ if (!tty->link) return; set_bit(TTY_OTHER_CLOSED, &tty->link->flags); wake_up_interruptible(&tty->link->read_wait); wake_up_interruptible(&tty->link->write_wait); if (tty->driver->subtype == PTY_TYPE_MASTER) { set_bit(TTY_OTHER_CLOSED, &tty->flags); #ifdef CONFIG_UNIX98_PTYS if (tty->driver == ptm_driver) { mutex_lock(&devpts_mutex); if (tty->link->driver_data) devpts_pty_kill(tty->link->driver_data); mutex_unlock(&devpts_mutex); } #endif tty_vhangup(tty->link); } } /* * The unthrottle routine is called by the line discipline to signal * that it can receive more characters. For PTY's, the TTY_THROTTLED * flag is always set, to force the line discipline to always call the * unthrottle routine when there are fewer than TTY_THRESHOLD_UNTHROTTLE * characters in the queue. This is necessary since each time this * happens, we need to wake up any sleeping processes that could be * (1) trying to send data to the pty, or (2) waiting in wait_until_sent() * for the pty buffer to be drained. */ static void pty_unthrottle(struct tty_struct *tty) { tty_wakeup(tty->link); set_bit(TTY_THROTTLED, &tty->flags); } /** * pty_write - write to a pty * @tty: the tty we write from * @buf: kernel buffer of data * @c: bytes to write * * Our "hardware" write method. Data is coming from the ldisc which * may be in a non sleeping state. We simply throw this at the other * end of the link as if we were an IRQ handler receiving stuff for * the other side of the pty/tty pair. */ static ssize_t pty_write(struct tty_struct *tty, const u8 *buf, size_t c) { struct tty_struct *to = tty->link; if (tty->flow.stopped || !c) return 0; return tty_insert_flip_string_and_push_buffer(to->port, buf, c); } /** * pty_write_room - write space * @tty: tty we are writing from * * Report how many bytes the ldisc can send into the queue for * the other device. */ static unsigned int pty_write_room(struct tty_struct *tty) { if (tty->flow.stopped) return 0; return tty_buffer_space_avail(tty->link->port); } /* Set the lock flag on a pty */ static int pty_set_lock(struct tty_struct *tty, int __user *arg) { int val; if (get_user(val, arg)) return -EFAULT; if (val) set_bit(TTY_PTY_LOCK, &tty->flags); else clear_bit(TTY_PTY_LOCK, &tty->flags); return 0; } static int pty_get_lock(struct tty_struct *tty, int __user *arg) { int locked = test_bit(TTY_PTY_LOCK, &tty->flags); return put_user(locked, arg); } /* Set the packet mode on a pty */ static int pty_set_pktmode(struct tty_struct *tty, int __user *arg) { int pktmode; if (get_user(pktmode, arg)) return -EFAULT; spin_lock_irq(&tty->ctrl.lock); if (pktmode) { if (!tty->ctrl.packet) { tty->link->ctrl.pktstatus = 0; smp_mb(); tty->ctrl.packet = true; } } else tty->ctrl.packet = false; spin_unlock_irq(&tty->ctrl.lock); return 0; } /* Get the packet mode of a pty */ static int pty_get_pktmode(struct tty_struct *tty, int __user *arg) { int pktmode = tty->ctrl.packet; return put_user(pktmode, arg); } /* Send a signal to the slave */ static int pty_signal(struct tty_struct *tty, int sig) { struct pid *pgrp; if (sig != SIGINT && sig != SIGQUIT && sig != SIGTSTP) return -EINVAL; if (tty->link) { pgrp = tty_get_pgrp(tty->link); if (pgrp) kill_pgrp(pgrp, sig, 1); put_pid(pgrp); } return 0; } static void pty_flush_buffer(struct tty_struct *tty) { struct tty_struct *to = tty->link; if (!to) return; tty_buffer_flush(to, NULL); if (to->ctrl.packet) { spin_lock_irq(&tty->ctrl.lock); tty->ctrl.pktstatus |= TIOCPKT_FLUSHWRITE; wake_up_interruptible(&to->read_wait); spin_unlock_irq(&tty->ctrl.lock); } } static int pty_open(struct tty_struct *tty, struct file *filp) { if (!tty || !tty->link) return -ENODEV; if (test_bit(TTY_OTHER_CLOSED, &tty->flags)) goto out; if (test_bit(TTY_PTY_LOCK, &tty->link->flags)) goto out; if (tty->driver->subtype == PTY_TYPE_SLAVE && tty->link->count != 1) goto out; clear_bit(TTY_IO_ERROR, &tty->flags); clear_bit(TTY_OTHER_CLOSED, &tty->link->flags); set_bit(TTY_THROTTLED, &tty->flags); return 0; out: set_bit(TTY_IO_ERROR, &tty->flags); return -EIO; } static void pty_set_termios(struct tty_struct *tty, const struct ktermios *old_termios) { /* See if packet mode change of state. */ if (tty->link && tty->link->ctrl.packet) { int extproc = (old_termios->c_lflag & EXTPROC) | L_EXTPROC(tty); int old_flow = ((old_termios->c_iflag & IXON) && (old_termios->c_cc[VSTOP] == '\023') && (old_termios->c_cc[VSTART] == '\021')); int new_flow = (I_IXON(tty) && STOP_CHAR(tty) == '\023' && START_CHAR(tty) == '\021'); if ((old_flow != new_flow) || extproc) { spin_lock_irq(&tty->ctrl.lock); if (old_flow != new_flow) { tty->ctrl.pktstatus &= ~(TIOCPKT_DOSTOP | TIOCPKT_NOSTOP); if (new_flow) tty->ctrl.pktstatus |= TIOCPKT_DOSTOP; else tty->ctrl.pktstatus |= TIOCPKT_NOSTOP; } if (extproc) tty->ctrl.pktstatus |= TIOCPKT_IOCTL; spin_unlock_irq(&tty->ctrl.lock); wake_up_interruptible(&tty->link->read_wait); } } tty->termios.c_cflag &= ~(CSIZE | PARENB); tty->termios.c_cflag |= (CS8 | CREAD); } /** * pty_resize - resize event * @tty: tty being resized * @ws: window size being set. * * Update the termios variables and send the necessary signals to * peform a terminal resize correctly */ static int pty_resize(struct tty_struct *tty, struct winsize *ws) { struct pid *pgrp, *rpgrp; struct tty_struct *pty = tty->link; /* For a PTY we need to lock the tty side */ mutex_lock(&tty->winsize_mutex); if (!memcmp(ws, &tty->winsize, sizeof(*ws))) goto done; /* Signal the foreground process group of both ptys */ pgrp = tty_get_pgrp(tty); rpgrp = tty_get_pgrp(pty); if (pgrp) kill_pgrp(pgrp, SIGWINCH, 1); if (rpgrp != pgrp && rpgrp) kill_pgrp(rpgrp, SIGWINCH, 1); put_pid(pgrp); put_pid(rpgrp); tty->winsize = *ws; pty->winsize = *ws; /* Never used so will go away soon */ done: mutex_unlock(&tty->winsize_mutex); return 0; } /** * pty_start - start() handler * pty_stop - stop() handler * @tty: tty being flow-controlled * * Propagates the TIOCPKT status to the master pty. * * NB: only the master pty can be in packet mode so only the slave * needs start()/stop() handlers */ static void pty_start(struct tty_struct *tty) { unsigned long flags; if (tty->link && tty->link->ctrl.packet) { spin_lock_irqsave(&tty->ctrl.lock, flags); tty->ctrl.pktstatus &= ~TIOCPKT_STOP; tty->ctrl.pktstatus |= TIOCPKT_START; spin_unlock_irqrestore(&tty->ctrl.lock, flags); wake_up_interruptible_poll(&tty->link->read_wait, EPOLLIN); } } static void pty_stop(struct tty_struct *tty) { unsigned long flags; if (tty->link && tty->link->ctrl.packet) { spin_lock_irqsave(&tty->ctrl.lock, flags); tty->ctrl.pktstatus &= ~TIOCPKT_START; tty->ctrl.pktstatus |= TIOCPKT_STOP; spin_unlock_irqrestore(&tty->ctrl.lock, flags); wake_up_interruptible_poll(&tty->link->read_wait, EPOLLIN); } } /** * pty_common_install - set up the pty pair * @driver: the pty driver * @tty: the tty being instantiated * @legacy: true if this is BSD style * * Perform the initial set up for the tty/pty pair. Called from the * tty layer when the port is first opened. * * Locking: the caller must hold the tty_mutex */ static int pty_common_install(struct tty_driver *driver, struct tty_struct *tty, bool legacy) { struct tty_struct *o_tty; struct tty_port *ports[2]; int idx = tty->index; int retval = -ENOMEM; /* Opening the slave first has always returned -EIO */ if (driver->subtype != PTY_TYPE_MASTER) return -EIO; ports[0] = kmalloc(sizeof **ports, GFP_KERNEL); ports[1] = kmalloc(sizeof **ports, GFP_KERNEL); if (!ports[0] || !ports[1]) goto err; if (!try_module_get(driver->other->owner)) { /* This cannot in fact currently happen */ goto err; } o_tty = alloc_tty_struct(driver->other, idx); if (!o_tty) goto err_put_module; tty_set_lock_subclass(o_tty); lockdep_set_subclass(&o_tty->termios_rwsem, TTY_LOCK_SLAVE); if (legacy) { /* We always use new tty termios data so we can do this the easy way .. */ tty_init_termios(tty); tty_init_termios(o_tty); driver->other->ttys[idx] = o_tty; driver->ttys[idx] = tty; } else { memset(&tty->termios_locked, 0, sizeof(tty->termios_locked)); tty->termios = driver->init_termios; memset(&o_tty->termios_locked, 0, sizeof(tty->termios_locked)); o_tty->termios = driver->other->init_termios; } /* * Everything allocated ... set up the o_tty structure. */ tty_driver_kref_get(driver->other); /* Establish the links in both directions */ tty->link = o_tty; o_tty->link = tty; tty_port_init(ports[0]); tty_port_init(ports[1]); tty_buffer_set_limit(ports[0], 8192); tty_buffer_set_limit(ports[1], 8192); o_tty->port = ports[0]; tty->port = ports[1]; o_tty->port->itty = o_tty; tty_buffer_set_lock_subclass(o_tty->port); tty_driver_kref_get(driver); tty->count++; o_tty->count++; return 0; err_put_module: module_put(driver->other->owner); err: kfree(ports[0]); kfree(ports[1]); return retval; } static void pty_cleanup(struct tty_struct *tty) { tty_port_put(tty->port); } /* Traditional BSD devices */ #ifdef CONFIG_LEGACY_PTYS static int pty_install(struct tty_driver *driver, struct tty_struct *tty) { return pty_common_install(driver, tty, true); } static void pty_remove(struct tty_driver *driver, struct tty_struct *tty) { struct tty_struct *pair = tty->link; driver->ttys[tty->index] = NULL; if (pair) pair->driver->ttys[pair->index] = NULL; } static int pty_bsd_ioctl(struct tty_struct *tty, unsigned int cmd, unsigned long arg) { switch (cmd) { case TIOCSPTLCK: /* Set PT Lock (disallow slave open) */ return pty_set_lock(tty, (int __user *) arg); case TIOCGPTLCK: /* Get PT Lock status */ return pty_get_lock(tty, (int __user *)arg); case TIOCPKT: /* Set PT packet mode */ return pty_set_pktmode(tty, (int __user *)arg); case TIOCGPKT: /* Get PT packet mode */ return pty_get_pktmode(tty, (int __user *)arg); case TIOCSIG: /* Send signal to other side of pty */ return pty_signal(tty, (int) arg); case TIOCGPTN: /* TTY returns ENOTTY, but glibc expects EINVAL here */ return -EINVAL; } return -ENOIOCTLCMD; } #ifdef CONFIG_COMPAT static long pty_bsd_compat_ioctl(struct tty_struct *tty, unsigned int cmd, unsigned long arg) { /* * PTY ioctls don't require any special translation between 32-bit and * 64-bit userspace, they are already compatible. */ return pty_bsd_ioctl(tty, cmd, (unsigned long)compat_ptr(arg)); } #else #define pty_bsd_compat_ioctl NULL #endif static int legacy_count = CONFIG_LEGACY_PTY_COUNT; /* * not really modular, but the easiest way to keep compat with existing * bootargs behaviour is to continue using module_param here. */ module_param(legacy_count, int, 0); /* * The master side of a pty can do TIOCSPTLCK and thus * has pty_bsd_ioctl. */ static const struct tty_operations master_pty_ops_bsd = { .install = pty_install, .open = pty_open, .close = pty_close, .write = pty_write, .write_room = pty_write_room, .flush_buffer = pty_flush_buffer, .unthrottle = pty_unthrottle, .ioctl = pty_bsd_ioctl, .compat_ioctl = pty_bsd_compat_ioctl, .cleanup = pty_cleanup, .resize = pty_resize, .remove = pty_remove }; static const struct tty_operations slave_pty_ops_bsd = { .install = pty_install, .open = pty_open, .close = pty_close, .write = pty_write, .write_room = pty_write_room, .flush_buffer = pty_flush_buffer, .unthrottle = pty_unthrottle, .set_termios = pty_set_termios, .cleanup = pty_cleanup, .resize = pty_resize, .start = pty_start, .stop = pty_stop, .remove = pty_remove }; static void __init legacy_pty_init(void) { struct tty_driver *pty_driver, *pty_slave_driver; if (legacy_count <= 0) return; pty_driver = tty_alloc_driver(legacy_count, TTY_DRIVER_RESET_TERMIOS | TTY_DRIVER_REAL_RAW | TTY_DRIVER_DYNAMIC_ALLOC); if (IS_ERR(pty_driver)) panic("Couldn't allocate pty driver"); pty_slave_driver = tty_alloc_driver(legacy_count, TTY_DRIVER_RESET_TERMIOS | TTY_DRIVER_REAL_RAW | TTY_DRIVER_DYNAMIC_ALLOC); if (IS_ERR(pty_slave_driver)) panic("Couldn't allocate pty slave driver"); pty_driver->driver_name = "pty_master"; pty_driver->name = "pty"; pty_driver->major = PTY_MASTER_MAJOR; pty_driver->minor_start = 0; pty_driver->type = TTY_DRIVER_TYPE_PTY; pty_driver->subtype = PTY_TYPE_MASTER; pty_driver->init_termios = tty_std_termios; pty_driver->init_termios.c_iflag = 0; pty_driver->init_termios.c_oflag = 0; pty_driver->init_termios.c_cflag = B38400 | CS8 | CREAD; pty_driver->init_termios.c_lflag = 0; pty_driver->init_termios.c_ispeed = 38400; pty_driver->init_termios.c_ospeed = 38400; pty_driver->other = pty_slave_driver; tty_set_operations(pty_driver, &master_pty_ops_bsd); pty_slave_driver->driver_name = "pty_slave"; pty_slave_driver->name = "ttyp"; pty_slave_driver->major = PTY_SLAVE_MAJOR; pty_slave_driver->minor_start = 0; pty_slave_driver->type = TTY_DRIVER_TYPE_PTY; pty_slave_driver->subtype = PTY_TYPE_SLAVE; pty_slave_driver->init_termios = tty_std_termios; pty_slave_driver->init_termios.c_cflag = B38400 | CS8 | CREAD; pty_slave_driver->init_termios.c_ispeed = 38400; pty_slave_driver->init_termios.c_ospeed = 38400; pty_slave_driver->other = pty_driver; tty_set_operations(pty_slave_driver, &slave_pty_ops_bsd); if (tty_register_driver(pty_driver)) panic("Couldn't register pty driver"); if (tty_register_driver(pty_slave_driver)) panic("Couldn't register pty slave driver"); } #else static inline void legacy_pty_init(void) { } #endif /* Unix98 devices */ #ifdef CONFIG_UNIX98_PTYS static struct cdev ptmx_cdev; /** * ptm_open_peer - open the peer of a pty * @master: the open struct file of the ptmx device node * @tty: the master of the pty being opened * @flags: the flags for open * * Provide a race free way for userspace to open the slave end of a pty * (where they have the master fd and cannot access or trust the mount * namespace /dev/pts was mounted inside). */ int ptm_open_peer(struct file *master, struct tty_struct *tty, int flags) { int fd; struct file *filp; int retval = -EINVAL; struct path path; if (tty->driver != ptm_driver) return -EIO; fd = get_unused_fd_flags(flags); if (fd < 0) { retval = fd; goto err; } /* Compute the slave's path */ path.mnt = devpts_mntget(master, tty->driver_data); if (IS_ERR(path.mnt)) { retval = PTR_ERR(path.mnt); goto err_put; } path.dentry = tty->link->driver_data; filp = dentry_open(&path, flags, current_cred()); mntput(path.mnt); if (IS_ERR(filp)) { retval = PTR_ERR(filp); goto err_put; } fd_install(fd, filp); return fd; err_put: put_unused_fd(fd); err: return retval; } static int pty_unix98_ioctl(struct tty_struct *tty, unsigned int cmd, unsigned long arg) { switch (cmd) { case TIOCSPTLCK: /* Set PT Lock (disallow slave open) */ return pty_set_lock(tty, (int __user *)arg); case TIOCGPTLCK: /* Get PT Lock status */ return pty_get_lock(tty, (int __user *)arg); case TIOCPKT: /* Set PT packet mode */ return pty_set_pktmode(tty, (int __user *)arg); case TIOCGPKT: /* Get PT packet mode */ return pty_get_pktmode(tty, (int __user *)arg); case TIOCGPTN: /* Get PT Number */ return put_user(tty->index, (unsigned int __user *)arg); case TIOCSIG: /* Send signal to other side of pty */ return pty_signal(tty, (int) arg); } return -ENOIOCTLCMD; } #ifdef CONFIG_COMPAT static long pty_unix98_compat_ioctl(struct tty_struct *tty, unsigned int cmd, unsigned long arg) { /* * PTY ioctls don't require any special translation between 32-bit and * 64-bit userspace, they are already compatible. */ return pty_unix98_ioctl(tty, cmd, cmd == TIOCSIG ? arg : (unsigned long)compat_ptr(arg)); } #else #define pty_unix98_compat_ioctl NULL #endif /** * ptm_unix98_lookup - find a pty master * @driver: ptm driver * @file: unused * @idx: tty index * * Look up a pty master device. Called under the tty_mutex for now. * This provides our locking. */ static struct tty_struct *ptm_unix98_lookup(struct tty_driver *driver, struct file *file, int idx) { /* Master must be open via /dev/ptmx */ return ERR_PTR(-EIO); } /** * pts_unix98_lookup - find a pty slave * @driver: pts driver * @file: file pointer to tty * @idx: tty index * * Look up a pty master device. Called under the tty_mutex for now. * This provides our locking for the tty pointer. */ static struct tty_struct *pts_unix98_lookup(struct tty_driver *driver, struct file *file, int idx) { struct tty_struct *tty; mutex_lock(&devpts_mutex); tty = devpts_get_priv(file->f_path.dentry); mutex_unlock(&devpts_mutex); /* Master must be open before slave */ if (!tty) return ERR_PTR(-EIO); return tty; } static int pty_unix98_install(struct tty_driver *driver, struct tty_struct *tty) { return pty_common_install(driver, tty, false); } /* this is called once with whichever end is closed last */ static void pty_unix98_remove(struct tty_driver *driver, struct tty_struct *tty) { struct pts_fs_info *fsi; if (tty->driver->subtype == PTY_TYPE_MASTER) fsi = tty->driver_data; else fsi = tty->link->driver_data; if (fsi) { devpts_kill_index(fsi, tty->index); devpts_release(fsi); } } static void pty_show_fdinfo(struct tty_struct *tty, struct seq_file *m) { seq_printf(m, "tty-index:\t%d\n", tty->index); } static const struct tty_operations ptm_unix98_ops = { .lookup = ptm_unix98_lookup, .install = pty_unix98_install, .remove = pty_unix98_remove, .open = pty_open, .close = pty_close, .write = pty_write, .write_room = pty_write_room, .flush_buffer = pty_flush_buffer, .unthrottle = pty_unthrottle, .ioctl = pty_unix98_ioctl, .compat_ioctl = pty_unix98_compat_ioctl, .resize = pty_resize, .cleanup = pty_cleanup, .show_fdinfo = pty_show_fdinfo, }; static const struct tty_operations pty_unix98_ops = { .lookup = pts_unix98_lookup, .install = pty_unix98_install, .remove = pty_unix98_remove, .open = pty_open, .close = pty_close, .write = pty_write, .write_room = pty_write_room, .flush_buffer = pty_flush_buffer, .unthrottle = pty_unthrottle, .set_termios = pty_set_termios, .start = pty_start, .stop = pty_stop, .cleanup = pty_cleanup, }; /** * ptmx_open - open a unix 98 pty master * @inode: inode of device file * @filp: file pointer to tty * * Allocate a unix98 pty master device from the ptmx driver. * * Locking: tty_mutex protects the init_dev work. tty->count should * protect the rest. * allocated_ptys_lock handles the list of free pty numbers */ static int ptmx_open(struct inode *inode, struct file *filp) { struct pts_fs_info *fsi; struct tty_struct *tty; struct dentry *dentry; int retval; int index; nonseekable_open(inode, filp); /* We refuse fsnotify events on ptmx, since it's a shared resource */ filp->f_mode |= FMODE_NONOTIFY; retval = tty_alloc_file(filp); if (retval) return retval; fsi = devpts_acquire(filp); if (IS_ERR(fsi)) { retval = PTR_ERR(fsi); goto out_free_file; } /* find a device that is not in use. */ mutex_lock(&devpts_mutex); index = devpts_new_index(fsi); mutex_unlock(&devpts_mutex); retval = index; if (index < 0) goto out_put_fsi; mutex_lock(&tty_mutex); tty = tty_init_dev(ptm_driver, index); /* The tty returned here is locked so we can safely drop the mutex */ mutex_unlock(&tty_mutex); retval = PTR_ERR(tty); if (IS_ERR(tty)) goto out; /* * From here on out, the tty is "live", and the index and * fsi will be killed/put by the tty_release() */ set_bit(TTY_PTY_LOCK, &tty->flags); /* LOCK THE SLAVE */ tty->driver_data = fsi; tty_add_file(tty, filp); dentry = devpts_pty_new(fsi, index, tty->link); if (IS_ERR(dentry)) { retval = PTR_ERR(dentry); goto err_release; } tty->link->driver_data = dentry; retval = ptm_driver->ops->open(tty, filp); if (retval) goto err_release; tty_debug_hangup(tty, "opening (count=%d)\n", tty->count); tty_unlock(tty); return 0; err_release: tty_unlock(tty); // This will also put-ref the fsi tty_release(inode, filp); return retval; out: devpts_kill_index(fsi, index); out_put_fsi: devpts_release(fsi); out_free_file: tty_free_file(filp); return retval; } static struct file_operations ptmx_fops __ro_after_init; static void __init unix98_pty_init(void) { ptm_driver = tty_alloc_driver(NR_UNIX98_PTY_MAX, TTY_DRIVER_RESET_TERMIOS | TTY_DRIVER_REAL_RAW | TTY_DRIVER_DYNAMIC_DEV | TTY_DRIVER_DEVPTS_MEM | TTY_DRIVER_DYNAMIC_ALLOC); if (IS_ERR(ptm_driver)) panic("Couldn't allocate Unix98 ptm driver"); pts_driver = tty_alloc_driver(NR_UNIX98_PTY_MAX, TTY_DRIVER_RESET_TERMIOS | TTY_DRIVER_REAL_RAW | TTY_DRIVER_DYNAMIC_DEV | TTY_DRIVER_DEVPTS_MEM | TTY_DRIVER_DYNAMIC_ALLOC); if (IS_ERR(pts_driver)) panic("Couldn't allocate Unix98 pts driver"); ptm_driver->driver_name = "pty_master"; ptm_driver->name = "ptm"; ptm_driver->major = UNIX98_PTY_MASTER_MAJOR; ptm_driver->minor_start = 0; ptm_driver->type = TTY_DRIVER_TYPE_PTY; ptm_driver->subtype = PTY_TYPE_MASTER; ptm_driver->init_termios = tty_std_termios; ptm_driver->init_termios.c_iflag = 0; ptm_driver->init_termios.c_oflag = 0; ptm_driver->init_termios.c_cflag = B38400 | CS8 | CREAD; ptm_driver->init_termios.c_lflag = 0; ptm_driver->init_termios.c_ispeed = 38400; ptm_driver->init_termios.c_ospeed = 38400; ptm_driver->other = pts_driver; tty_set_operations(ptm_driver, &ptm_unix98_ops); pts_driver->driver_name = "pty_slave"; pts_driver->name = "pts"; pts_driver->major = UNIX98_PTY_SLAVE_MAJOR; pts_driver->minor_start = 0; pts_driver->type = TTY_DRIVER_TYPE_PTY; pts_driver->subtype = PTY_TYPE_SLAVE; pts_driver->init_termios = tty_std_termios; pts_driver->init_termios.c_cflag = B38400 | CS8 | CREAD; pts_driver->init_termios.c_ispeed = 38400; pts_driver->init_termios.c_ospeed = 38400; pts_driver->other = ptm_driver; tty_set_operations(pts_driver, &pty_unix98_ops); if (tty_register_driver(ptm_driver)) panic("Couldn't register Unix98 ptm driver"); if (tty_register_driver(pts_driver)) panic("Couldn't register Unix98 pts driver"); /* Now create the /dev/ptmx special device */ tty_default_fops(&ptmx_fops); ptmx_fops.open = ptmx_open; cdev_init(&ptmx_cdev, &ptmx_fops); if (cdev_add(&ptmx_cdev, MKDEV(TTYAUX_MAJOR, 2), 1) || register_chrdev_region(MKDEV(TTYAUX_MAJOR, 2), 1, "/dev/ptmx") < 0) panic("Couldn't register /dev/ptmx driver"); device_create(&tty_class, NULL, MKDEV(TTYAUX_MAJOR, 2), NULL, "ptmx"); } #else static inline void unix98_pty_init(void) { } #endif static int __init pty_init(void) { legacy_pty_init(); unix98_pty_init(); return 0; } device_initcall(pty_init); |
| 13 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 | /* SPDX-License-Identifier: GPL-2.0 */ #ifndef _LINUX_INTERVAL_TREE_H #define _LINUX_INTERVAL_TREE_H #include <linux/rbtree.h> struct interval_tree_node { struct rb_node rb; unsigned long start; /* Start of interval */ unsigned long last; /* Last location _in_ interval */ unsigned long __subtree_last; }; extern void interval_tree_insert(struct interval_tree_node *node, struct rb_root_cached *root); extern void interval_tree_remove(struct interval_tree_node *node, struct rb_root_cached *root); extern struct interval_tree_node * interval_tree_iter_first(struct rb_root_cached *root, unsigned long start, unsigned long last); extern struct interval_tree_node * interval_tree_iter_next(struct interval_tree_node *node, unsigned long start, unsigned long last); /** * struct interval_tree_span_iter - Find used and unused spans. * @start_hole: Start of an interval for a hole when is_hole == 1 * @last_hole: Inclusive end of an interval for a hole when is_hole == 1 * @start_used: Start of a used interval when is_hole == 0 * @last_used: Inclusive end of a used interval when is_hole == 0 * @is_hole: 0 == used, 1 == is_hole, -1 == done iteration * * This iterator travels over spans in an interval tree. It does not return * nodes but classifies each span as either a hole, where no nodes intersect, or * a used, which is fully covered by nodes. Each iteration step toggles between * hole and used until the entire range is covered. The returned spans always * fully cover the requested range. * * The iterator is greedy, it always returns the largest hole or used possible, * consolidating all consecutive nodes. * * Use interval_tree_span_iter_done() to detect end of iteration. */ struct interval_tree_span_iter { /* private: not for use by the caller */ struct interval_tree_node *nodes[2]; unsigned long first_index; unsigned long last_index; /* public: */ union { unsigned long start_hole; unsigned long start_used; }; union { unsigned long last_hole; unsigned long last_used; }; int is_hole; }; void interval_tree_span_iter_first(struct interval_tree_span_iter *state, struct rb_root_cached *itree, unsigned long first_index, unsigned long last_index); void interval_tree_span_iter_advance(struct interval_tree_span_iter *iter, struct rb_root_cached *itree, unsigned long new_index); void interval_tree_span_iter_next(struct interval_tree_span_iter *state); static inline bool interval_tree_span_iter_done(struct interval_tree_span_iter *state) { return state->is_hole == -1; } #define interval_tree_for_each_span(span, itree, first_index, last_index) \ for (interval_tree_span_iter_first(span, itree, \ first_index, last_index); \ !interval_tree_span_iter_done(span); \ interval_tree_span_iter_next(span)) #endif /* _LINUX_INTERVAL_TREE_H */ |
| 1 1 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 | // SPDX-License-Identifier: GPL-2.0-only /* * phonet.c -- USB CDC Phonet host driver * * Copyright (C) 2008-2009 Nokia Corporation. All rights reserved. * * Author: Rémi Denis-Courmont */ #include <linux/kernel.h> #include <linux/mm.h> #include <linux/module.h> #include <linux/gfp.h> #include <linux/usb.h> #include <linux/usb/cdc.h> #include <linux/netdevice.h> #include <linux/if_arp.h> #include <linux/if_phonet.h> #include <linux/phonet.h> #define PN_MEDIA_USB 0x1B static const unsigned rxq_size = 17; struct usbpn_dev { struct net_device *dev; struct usb_interface *intf, *data_intf; struct usb_device *usb; unsigned int tx_pipe, rx_pipe; u8 active_setting; u8 disconnected; unsigned tx_queue; spinlock_t tx_lock; spinlock_t rx_lock; struct sk_buff *rx_skb; struct urb *urbs[]; }; static void tx_complete(struct urb *req); static void rx_complete(struct urb *req); /* * Network device callbacks */ static netdev_tx_t usbpn_xmit(struct sk_buff *skb, struct net_device *dev) { struct usbpn_dev *pnd = netdev_priv(dev); struct urb *req = NULL; unsigned long flags; int err; if (skb->protocol != htons(ETH_P_PHONET)) goto drop; req = usb_alloc_urb(0, GFP_ATOMIC); if (!req) goto drop; usb_fill_bulk_urb(req, pnd->usb, pnd->tx_pipe, skb->data, skb->len, tx_complete, skb); req->transfer_flags = URB_ZERO_PACKET; err = usb_submit_urb(req, GFP_ATOMIC); if (err) { usb_free_urb(req); goto drop; } spin_lock_irqsave(&pnd->tx_lock, flags); pnd->tx_queue++; if (pnd->tx_queue >= dev->tx_queue_len) netif_stop_queue(dev); spin_unlock_irqrestore(&pnd->tx_lock, flags); return NETDEV_TX_OK; drop: dev_kfree_skb(skb); dev->stats.tx_dropped++; return NETDEV_TX_OK; } static void tx_complete(struct urb *req) { struct sk_buff *skb = req->context; struct net_device *dev = skb->dev; struct usbpn_dev *pnd = netdev_priv(dev); int status = req->status; unsigned long flags; switch (status) { case 0: dev->stats.tx_bytes += skb->len; break; case -ENOENT: case -ECONNRESET: case -ESHUTDOWN: dev->stats.tx_aborted_errors++; fallthrough; default: dev->stats.tx_errors++; dev_dbg(&dev->dev, "TX error (%d)\n", status); } dev->stats.tx_packets++; spin_lock_irqsave(&pnd->tx_lock, flags); pnd->tx_queue--; netif_wake_queue(dev); spin_unlock_irqrestore(&pnd->tx_lock, flags); dev_kfree_skb_any(skb); usb_free_urb(req); } static int rx_submit(struct usbpn_dev *pnd, struct urb *req, gfp_t gfp_flags) { struct net_device *dev = pnd->dev; struct page *page; int err; page = __dev_alloc_page(gfp_flags | __GFP_NOMEMALLOC); if (!page) return -ENOMEM; usb_fill_bulk_urb(req, pnd->usb, pnd->rx_pipe, page_address(page), PAGE_SIZE, rx_complete, dev); req->transfer_flags = 0; err = usb_submit_urb(req, gfp_flags); if (unlikely(err)) { dev_dbg(&dev->dev, "RX submit error (%d)\n", err); put_page(page); } return err; } static void rx_complete(struct urb *req) { struct net_device *dev = req->context; struct usbpn_dev *pnd = netdev_priv(dev); struct page *page = virt_to_page(req->transfer_buffer); struct sk_buff *skb; unsigned long flags; int status = req->status; switch (status) { case 0: spin_lock_irqsave(&pnd->rx_lock, flags); skb = pnd->rx_skb; if (!skb) { skb = pnd->rx_skb = netdev_alloc_skb(dev, 12); if (likely(skb)) { /* Can't use pskb_pull() on page in IRQ */ skb_put_data(skb, page_address(page), 1); skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags, page, 1, req->actual_length, PAGE_SIZE); page = NULL; } } else { skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags, page, 0, req->actual_length, PAGE_SIZE); page = NULL; } if (req->actual_length < PAGE_SIZE) pnd->rx_skb = NULL; /* Last fragment */ else skb = NULL; spin_unlock_irqrestore(&pnd->rx_lock, flags); if (skb) { skb->protocol = htons(ETH_P_PHONET); skb_reset_mac_header(skb); __skb_pull(skb, 1); skb->dev = dev; dev->stats.rx_packets++; dev->stats.rx_bytes += skb->len; netif_rx(skb); } goto resubmit; case -ENOENT: case -ECONNRESET: case -ESHUTDOWN: req = NULL; break; case -EOVERFLOW: dev->stats.rx_over_errors++; dev_dbg(&dev->dev, "RX overflow\n"); break; case -EILSEQ: dev->stats.rx_crc_errors++; break; } dev->stats.rx_errors++; resubmit: if (page) put_page(page); if (req) rx_submit(pnd, req, GFP_ATOMIC); } static int usbpn_close(struct net_device *dev); static int usbpn_open(struct net_device *dev) { struct usbpn_dev *pnd = netdev_priv(dev); int err; unsigned i; unsigned num = pnd->data_intf->cur_altsetting->desc.bInterfaceNumber; err = usb_set_interface(pnd->usb, num, pnd->active_setting); if (err) return err; for (i = 0; i < rxq_size; i++) { struct urb *req = usb_alloc_urb(0, GFP_KERNEL); if (!req || rx_submit(pnd, req, GFP_KERNEL)) { usb_free_urb(req); usbpn_close(dev); return -ENOMEM; } pnd->urbs[i] = req; } netif_wake_queue(dev); return 0; } static int usbpn_close(struct net_device *dev) { struct usbpn_dev *pnd = netdev_priv(dev); unsigned i; unsigned num = pnd->data_intf->cur_altsetting->desc.bInterfaceNumber; netif_stop_queue(dev); for (i = 0; i < rxq_size; i++) { struct urb *req = pnd->urbs[i]; if (!req) continue; usb_kill_urb(req); usb_free_urb(req); pnd->urbs[i] = NULL; } return usb_set_interface(pnd->usb, num, !pnd->active_setting); } static int usbpn_siocdevprivate(struct net_device *dev, struct ifreq *ifr, void __user *data, int cmd) { struct if_phonet_req *req = (struct if_phonet_req *)ifr; switch (cmd) { case SIOCPNGAUTOCONF: req->ifr_phonet_autoconf.device = PN_DEV_PC; return 0; } return -ENOIOCTLCMD; } static const struct net_device_ops usbpn_ops = { .ndo_open = usbpn_open, .ndo_stop = usbpn_close, .ndo_start_xmit = usbpn_xmit, .ndo_siocdevprivate = usbpn_siocdevprivate, }; static void usbpn_setup(struct net_device *dev) { const u8 addr = PN_MEDIA_USB; dev->features = 0; dev->netdev_ops = &usbpn_ops; dev->header_ops = &phonet_header_ops; dev->type = ARPHRD_PHONET; dev->flags = IFF_POINTOPOINT | IFF_NOARP; dev->mtu = PHONET_MAX_MTU; dev->min_mtu = PHONET_MIN_MTU; dev->max_mtu = PHONET_MAX_MTU; dev->hard_header_len = 1; dev->addr_len = 1; dev_addr_set(dev, &addr); dev->tx_queue_len = 3; dev->needs_free_netdev = true; } /* * USB driver callbacks */ static const struct usb_device_id usbpn_ids[] = { { .match_flags = USB_DEVICE_ID_MATCH_VENDOR | USB_DEVICE_ID_MATCH_INT_CLASS | USB_DEVICE_ID_MATCH_INT_SUBCLASS, .idVendor = 0x0421, /* Nokia */ .bInterfaceClass = USB_CLASS_COMM, .bInterfaceSubClass = 0xFE, }, { }, }; MODULE_DEVICE_TABLE(usb, usbpn_ids); static struct usb_driver usbpn_driver; static int usbpn_probe(struct usb_interface *intf, const struct usb_device_id *id) { static const char ifname[] = "usbpn%d"; const struct usb_cdc_union_desc *union_header = NULL; const struct usb_host_interface *data_desc; struct usb_interface *data_intf; struct usb_device *usbdev = interface_to_usbdev(intf); struct net_device *dev; struct usbpn_dev *pnd; u8 *data; int phonet = 0; int len, err; struct usb_cdc_parsed_header hdr; data = intf->altsetting->extra; len = intf->altsetting->extralen; cdc_parse_cdc_header(&hdr, intf, data, len); union_header = hdr.usb_cdc_union_desc; phonet = hdr.phonet_magic_present; if (!union_header || !phonet) return -EINVAL; data_intf = usb_ifnum_to_if(usbdev, union_header->bSlaveInterface0); if (data_intf == NULL) return -ENODEV; /* Data interface has one inactive and one active setting */ if (data_intf->num_altsetting != 2) return -EINVAL; if (data_intf->altsetting[0].desc.bNumEndpoints == 0 && data_intf->altsetting[1].desc.bNumEndpoints == 2) data_desc = data_intf->altsetting + 1; else if (data_intf->altsetting[0].desc.bNumEndpoints == 2 && data_intf->altsetting[1].desc.bNumEndpoints == 0) data_desc = data_intf->altsetting; else return -EINVAL; dev = alloc_netdev(struct_size(pnd, urbs, rxq_size), ifname, NET_NAME_UNKNOWN, usbpn_setup); if (!dev) return -ENOMEM; pnd = netdev_priv(dev); SET_NETDEV_DEV(dev, &intf->dev); pnd->dev = dev; pnd->usb = usbdev; pnd->intf = intf; pnd->data_intf = data_intf; spin_lock_init(&pnd->tx_lock); spin_lock_init(&pnd->rx_lock); /* Endpoints */ if (usb_pipein(data_desc->endpoint[0].desc.bEndpointAddress)) { pnd->rx_pipe = usb_rcvbulkpipe(usbdev, data_desc->endpoint[0].desc.bEndpointAddress); pnd->tx_pipe = usb_sndbulkpipe(usbdev, data_desc->endpoint[1].desc.bEndpointAddress); } else { pnd->rx_pipe = usb_rcvbulkpipe(usbdev, data_desc->endpoint[1].desc.bEndpointAddress); pnd->tx_pipe = usb_sndbulkpipe(usbdev, data_desc->endpoint[0].desc.bEndpointAddress); } pnd->active_setting = data_desc - data_intf->altsetting; err = usb_driver_claim_interface(&usbpn_driver, data_intf, pnd); if (err) goto out; /* Force inactive mode until the network device is brought UP */ usb_set_interface(usbdev, union_header->bSlaveInterface0, !pnd->active_setting); usb_set_intfdata(intf, pnd); err = register_netdev(dev); if (err) { /* Set disconnected flag so that disconnect() returns early. */ pnd->disconnected = 1; usb_driver_release_interface(&usbpn_driver, data_intf); goto out; } dev_dbg(&dev->dev, "USB CDC Phonet device found\n"); return 0; out: usb_set_intfdata(intf, NULL); free_netdev(dev); return err; } static void usbpn_disconnect(struct usb_interface *intf) { struct usbpn_dev *pnd = usb_get_intfdata(intf); if (pnd->disconnected) return; pnd->disconnected = 1; usb_driver_release_interface(&usbpn_driver, (pnd->intf == intf) ? pnd->data_intf : pnd->intf); unregister_netdev(pnd->dev); } static struct usb_driver usbpn_driver = { .name = "cdc_phonet", .probe = usbpn_probe, .disconnect = usbpn_disconnect, .id_table = usbpn_ids, .disable_hub_initiated_lpm = 1, }; module_usb_driver(usbpn_driver); MODULE_AUTHOR("Remi Denis-Courmont"); MODULE_DESCRIPTION("USB CDC Phonet host interface"); MODULE_LICENSE("GPL"); |
| 18 8 18 8 17 17 17 17 17 17 17 8 7 7 1 7 1 7 1 7 1 7 2 7 1 7 1 7 1 7 6 7 7 18 18 16 10 18 16 9 8 1 10 8 9 10 11 11 11 11 11 11 11 11 11 11 11 11 1 11 17 17 8 8 7 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 | // SPDX-License-Identifier: GPL-2.0-only /* Copyright (C) 2013 Cisco Systems, Inc, 2013. * * Author: Vijay Subramanian <vijaynsu@cisco.com> * Author: Mythili Prabhu <mysuryan@cisco.com> * * ECN support is added by Naeem Khademi <naeemk@ifi.uio.no> * University of Oslo, Norway. * * References: * RFC 8033: https://tools.ietf.org/html/rfc8033 */ #include <linux/module.h> #include <linux/slab.h> #include <linux/types.h> #include <linux/kernel.h> #include <linux/errno.h> #include <linux/skbuff.h> #include <net/pkt_sched.h> #include <net/inet_ecn.h> #include <net/pie.h> /* private data for the Qdisc */ struct pie_sched_data { struct pie_vars vars; struct pie_params params; struct pie_stats stats; struct timer_list adapt_timer; struct Qdisc *sch; }; bool pie_drop_early(struct Qdisc *sch, struct pie_params *params, struct pie_vars *vars, u32 backlog, u32 packet_size) { u64 rnd; u64 local_prob = vars->prob; u32 mtu = psched_mtu(qdisc_dev(sch)); /* If there is still burst allowance left skip random early drop */ if (vars->burst_time > 0) return false; /* If current delay is less than half of target, and * if drop prob is low already, disable early_drop */ if ((vars->qdelay < params->target / 2) && (vars->prob < MAX_PROB / 5)) return false; /* If we have fewer than 2 mtu-sized packets, disable pie_drop_early, * similar to min_th in RED */ if (backlog < 2 * mtu) return false; /* If bytemode is turned on, use packet size to compute new * probablity. Smaller packets will have lower drop prob in this case */ if (params->bytemode && packet_size <= mtu) local_prob = (u64)packet_size * div_u64(local_prob, mtu); else local_prob = vars->prob; if (local_prob == 0) vars->accu_prob = 0; else vars->accu_prob += local_prob; if (vars->accu_prob < (MAX_PROB / 100) * 85) return false; if (vars->accu_prob >= (MAX_PROB / 2) * 17) return true; get_random_bytes(&rnd, 8); if ((rnd >> BITS_PER_BYTE) < local_prob) { vars->accu_prob = 0; return true; } return false; } EXPORT_SYMBOL_GPL(pie_drop_early); static int pie_qdisc_enqueue(struct sk_buff *skb, struct Qdisc *sch, struct sk_buff **to_free) { struct pie_sched_data *q = qdisc_priv(sch); bool enqueue = false; if (unlikely(qdisc_qlen(sch) >= sch->limit)) { q->stats.overlimit++; goto out; } if (!pie_drop_early(sch, &q->params, &q->vars, sch->qstats.backlog, skb->len)) { enqueue = true; } else if (q->params.ecn && (q->vars.prob <= MAX_PROB / 10) && INET_ECN_set_ce(skb)) { /* If packet is ecn capable, mark it if drop probability * is lower than 10%, else drop it. */ q->stats.ecn_mark++; enqueue = true; } /* we can enqueue the packet */ if (enqueue) { /* Set enqueue time only when dq_rate_estimator is disabled. */ if (!q->params.dq_rate_estimator) pie_set_enqueue_time(skb); q->stats.packets_in++; if (qdisc_qlen(sch) > q->stats.maxq) q->stats.maxq = qdisc_qlen(sch); return qdisc_enqueue_tail(skb, sch); } out: q->stats.dropped++; q->vars.accu_prob = 0; return qdisc_drop(skb, sch, to_free); } static const struct nla_policy pie_policy[TCA_PIE_MAX + 1] = { [TCA_PIE_TARGET] = {.type = NLA_U32}, [TCA_PIE_LIMIT] = {.type = NLA_U32}, [TCA_PIE_TUPDATE] = {.type = NLA_U32}, [TCA_PIE_ALPHA] = {.type = NLA_U32}, [TCA_PIE_BETA] = {.type = NLA_U32}, [TCA_PIE_ECN] = {.type = NLA_U32}, [TCA_PIE_BYTEMODE] = {.type = NLA_U32}, [TCA_PIE_DQ_RATE_ESTIMATOR] = {.type = NLA_U32}, }; static int pie_change(struct Qdisc *sch, struct nlattr *opt, struct netlink_ext_ack *extack) { struct pie_sched_data *q = qdisc_priv(sch); struct nlattr *tb[TCA_PIE_MAX + 1]; unsigned int qlen, dropped = 0; int err; err = nla_parse_nested_deprecated(tb, TCA_PIE_MAX, opt, pie_policy, NULL); if (err < 0) return err; sch_tree_lock(sch); /* convert from microseconds to pschedtime */ if (tb[TCA_PIE_TARGET]) { /* target is in us */ u32 target = nla_get_u32(tb[TCA_PIE_TARGET]); /* convert to pschedtime */ WRITE_ONCE(q->params.target, PSCHED_NS2TICKS((u64)target * NSEC_PER_USEC)); } /* tupdate is in jiffies */ if (tb[TCA_PIE_TUPDATE]) WRITE_ONCE(q->params.tupdate, usecs_to_jiffies(nla_get_u32(tb[TCA_PIE_TUPDATE]))); if (tb[TCA_PIE_LIMIT]) { u32 limit = nla_get_u32(tb[TCA_PIE_LIMIT]); WRITE_ONCE(q->params.limit, limit); WRITE_ONCE(sch->limit, limit); } if (tb[TCA_PIE_ALPHA]) WRITE_ONCE(q->params.alpha, nla_get_u32(tb[TCA_PIE_ALPHA])); if (tb[TCA_PIE_BETA]) WRITE_ONCE(q->params.beta, nla_get_u32(tb[TCA_PIE_BETA])); if (tb[TCA_PIE_ECN]) WRITE_ONCE(q->params.ecn, nla_get_u32(tb[TCA_PIE_ECN])); if (tb[TCA_PIE_BYTEMODE]) WRITE_ONCE(q->params.bytemode, nla_get_u32(tb[TCA_PIE_BYTEMODE])); if (tb[TCA_PIE_DQ_RATE_ESTIMATOR]) WRITE_ONCE(q->params.dq_rate_estimator, nla_get_u32(tb[TCA_PIE_DQ_RATE_ESTIMATOR])); /* Drop excess packets if new limit is lower */ qlen = sch->q.qlen; while (sch->q.qlen > sch->limit) { struct sk_buff *skb = __qdisc_dequeue_head(&sch->q); dropped += qdisc_pkt_len(skb); qdisc_qstats_backlog_dec(sch, skb); rtnl_qdisc_drop(skb, sch); } qdisc_tree_reduce_backlog(sch, qlen - sch->q.qlen, dropped); sch_tree_unlock(sch); return 0; } void pie_process_dequeue(struct sk_buff *skb, struct pie_params *params, struct pie_vars *vars, u32 backlog) { psched_time_t now = psched_get_time(); u32 dtime = 0; /* If dq_rate_estimator is disabled, calculate qdelay using the * packet timestamp. */ if (!params->dq_rate_estimator) { vars->qdelay = now - pie_get_enqueue_time(skb); if (vars->dq_tstamp != DTIME_INVALID) dtime = now - vars->dq_tstamp; vars->dq_tstamp = now; if (backlog == 0) vars->qdelay = 0; if (dtime == 0) return; goto burst_allowance_reduction; } /* If current queue is about 10 packets or more and dq_count is unset * we have enough packets to calculate the drain rate. Save * current time as dq_tstamp and start measurement cycle. */ if (backlog >= QUEUE_THRESHOLD && vars->dq_count == DQCOUNT_INVALID) { vars->dq_tstamp = psched_get_time(); vars->dq_count = 0; } /* Calculate the average drain rate from this value. If queue length * has receded to a small value viz., <= QUEUE_THRESHOLD bytes, reset * the dq_count to -1 as we don't have enough packets to calculate the * drain rate anymore. The following if block is entered only when we * have a substantial queue built up (QUEUE_THRESHOLD bytes or more) * and we calculate the drain rate for the threshold here. dq_count is * in bytes, time difference in psched_time, hence rate is in * bytes/psched_time. */ if (vars->dq_count != DQCOUNT_INVALID) { vars->dq_count += skb->len; if (vars->dq_count >= QUEUE_THRESHOLD) { u32 count = vars->dq_count << PIE_SCALE; dtime = now - vars->dq_tstamp; if (dtime == 0) return; count = count / dtime; if (vars->avg_dq_rate == 0) vars->avg_dq_rate = count; else vars->avg_dq_rate = (vars->avg_dq_rate - (vars->avg_dq_rate >> 3)) + (count >> 3); /* If the queue has receded below the threshold, we hold * on to the last drain rate calculated, else we reset * dq_count to 0 to re-enter the if block when the next * packet is dequeued */ if (backlog < QUEUE_THRESHOLD) { vars->dq_count = DQCOUNT_INVALID; } else { vars->dq_count = 0; vars->dq_tstamp = psched_get_time(); } goto burst_allowance_reduction; } } return; burst_allowance_reduction: if (vars->burst_time > 0) { if (vars->burst_time > dtime) vars->burst_time -= dtime; else vars->burst_time = 0; } } EXPORT_SYMBOL_GPL(pie_process_dequeue); void pie_calculate_probability(struct pie_params *params, struct pie_vars *vars, u32 backlog) { psched_time_t qdelay = 0; /* in pschedtime */ psched_time_t qdelay_old = 0; /* in pschedtime */ s64 delta = 0; /* determines the change in probability */ u64 oldprob; u64 alpha, beta; u32 power; bool update_prob = true; if (params->dq_rate_estimator) { qdelay_old = vars->qdelay; vars->qdelay_old = vars->qdelay; if (vars->avg_dq_rate > 0) qdelay = (backlog << PIE_SCALE) / vars->avg_dq_rate; else qdelay = 0; } else { qdelay = vars->qdelay; qdelay_old = vars->qdelay_old; } /* If qdelay is zero and backlog is not, it means backlog is very small, * so we do not update probability in this round. */ if (qdelay == 0 && backlog != 0) update_prob = false; /* In the algorithm, alpha and beta are between 0 and 2 with typical * value for alpha as 0.125. In this implementation, we use values 0-32 * passed from user space to represent this. Also, alpha and beta have * unit of HZ and need to be scaled before they can used to update * probability. alpha/beta are updated locally below by scaling down * by 16 to come to 0-2 range. */ alpha = ((u64)params->alpha * (MAX_PROB / PSCHED_TICKS_PER_SEC)) >> 4; beta = ((u64)params->beta * (MAX_PROB / PSCHED_TICKS_PER_SEC)) >> 4; /* We scale alpha and beta differently depending on how heavy the * congestion is. Please see RFC 8033 for details. */ if (vars->prob < MAX_PROB / 10) { alpha >>= 1; beta >>= 1; power = 100; while (vars->prob < div_u64(MAX_PROB, power) && power <= 1000000) { alpha >>= 2; beta >>= 2; power *= 10; } } /* alpha and beta should be between 0 and 32, in multiples of 1/16 */ delta += alpha * (qdelay - params->target); delta += beta * (qdelay - qdelay_old); oldprob = vars->prob; /* to ensure we increase probability in steps of no more than 2% */ if (delta > (s64)(MAX_PROB / (100 / 2)) && vars->prob >= MAX_PROB / 10) delta = (MAX_PROB / 100) * 2; /* Non-linear drop: * Tune drop probability to increase quickly for high delays(>= 250ms) * 250ms is derived through experiments and provides error protection */ if (qdelay > (PSCHED_NS2TICKS(250 * NSEC_PER_MSEC))) delta += MAX_PROB / (100 / 2); vars->prob += delta; if (delta > 0) { /* prevent overflow */ if (vars->prob < oldprob) { vars->prob = MAX_PROB; /* Prevent normalization error. If probability is at * maximum value already, we normalize it here, and * skip the check to do a non-linear drop in the next * section. */ update_prob = false; } } else { /* prevent underflow */ if (vars->prob > oldprob) vars->prob = 0; } /* Non-linear drop in probability: Reduce drop probability quickly if * delay is 0 for 2 consecutive Tupdate periods. */ if (qdelay == 0 && qdelay_old == 0 && update_prob) /* Reduce drop probability to 98.4% */ vars->prob -= vars->prob / 64; vars->qdelay = qdelay; vars->backlog_old = backlog; /* We restart the measurement cycle if the following conditions are met * 1. If the delay has been low for 2 consecutive Tupdate periods * 2. Calculated drop probability is zero * 3. If average dq_rate_estimator is enabled, we have at least one * estimate for the avg_dq_rate ie., is a non-zero value */ if ((vars->qdelay < params->target / 2) && (vars->qdelay_old < params->target / 2) && vars->prob == 0 && (!params->dq_rate_estimator || vars->avg_dq_rate > 0)) { pie_vars_init(vars); } if (!params->dq_rate_estimator) vars->qdelay_old = qdelay; } EXPORT_SYMBOL_GPL(pie_calculate_probability); static void pie_timer(struct timer_list *t) { struct pie_sched_data *q = from_timer(q, t, adapt_timer); struct Qdisc *sch = q->sch; spinlock_t *root_lock; rcu_read_lock(); root_lock = qdisc_lock(qdisc_root_sleeping(sch)); spin_lock(root_lock); pie_calculate_probability(&q->params, &q->vars, sch->qstats.backlog); /* reset the timer to fire after 'tupdate'. tupdate is in jiffies. */ if (q->params.tupdate) mod_timer(&q->adapt_timer, jiffies + q->params.tupdate); spin_unlock(root_lock); rcu_read_unlock(); } static int pie_init(struct Qdisc *sch, struct nlattr *opt, struct netlink_ext_ack *extack) { struct pie_sched_data *q = qdisc_priv(sch); pie_params_init(&q->params); pie_vars_init(&q->vars); sch->limit = q->params.limit; q->sch = sch; timer_setup(&q->adapt_timer, pie_timer, 0); if (opt) { int err = pie_change(sch, opt, extack); if (err) return err; } mod_timer(&q->adapt_timer, jiffies + HZ / 2); return 0; } static int pie_dump(struct Qdisc *sch, struct sk_buff *skb) { struct pie_sched_data *q = qdisc_priv(sch); struct nlattr *opts; opts = nla_nest_start_noflag(skb, TCA_OPTIONS); if (!opts) goto nla_put_failure; /* convert target from pschedtime to us */ if (nla_put_u32(skb, TCA_PIE_TARGET, ((u32)PSCHED_TICKS2NS(READ_ONCE(q->params.target))) / NSEC_PER_USEC) || nla_put_u32(skb, TCA_PIE_LIMIT, READ_ONCE(sch->limit)) || nla_put_u32(skb, TCA_PIE_TUPDATE, jiffies_to_usecs(READ_ONCE(q->params.tupdate))) || nla_put_u32(skb, TCA_PIE_ALPHA, READ_ONCE(q->params.alpha)) || nla_put_u32(skb, TCA_PIE_BETA, READ_ONCE(q->params.beta)) || nla_put_u32(skb, TCA_PIE_ECN, q->params.ecn) || nla_put_u32(skb, TCA_PIE_BYTEMODE, READ_ONCE(q->params.bytemode)) || nla_put_u32(skb, TCA_PIE_DQ_RATE_ESTIMATOR, READ_ONCE(q->params.dq_rate_estimator))) goto nla_put_failure; return nla_nest_end(skb, opts); nla_put_failure: nla_nest_cancel(skb, opts); return -1; } static int pie_dump_stats(struct Qdisc *sch, struct gnet_dump *d) { struct pie_sched_data *q = qdisc_priv(sch); struct tc_pie_xstats st = { .prob = q->vars.prob << BITS_PER_BYTE, .delay = ((u32)PSCHED_TICKS2NS(q->vars.qdelay)) / NSEC_PER_USEC, .packets_in = q->stats.packets_in, .overlimit = q->stats.overlimit, .maxq = q->stats.maxq, .dropped = q->stats.dropped, .ecn_mark = q->stats.ecn_mark, }; /* avg_dq_rate is only valid if dq_rate_estimator is enabled */ st.dq_rate_estimating = q->params.dq_rate_estimator; /* unscale and return dq_rate in bytes per sec */ if (q->params.dq_rate_estimator) st.avg_dq_rate = q->vars.avg_dq_rate * (PSCHED_TICKS_PER_SEC) >> PIE_SCALE; return gnet_stats_copy_app(d, &st, sizeof(st)); } static struct sk_buff *pie_qdisc_dequeue(struct Qdisc *sch) { struct pie_sched_data *q = qdisc_priv(sch); struct sk_buff *skb = qdisc_dequeue_head(sch); if (!skb) return NULL; pie_process_dequeue(skb, &q->params, &q->vars, sch->qstats.backlog); return skb; } static void pie_reset(struct Qdisc *sch) { struct pie_sched_data *q = qdisc_priv(sch); qdisc_reset_queue(sch); pie_vars_init(&q->vars); } static void pie_destroy(struct Qdisc *sch) { struct pie_sched_data *q = qdisc_priv(sch); q->params.tupdate = 0; del_timer_sync(&q->adapt_timer); } static struct Qdisc_ops pie_qdisc_ops __read_mostly = { .id = "pie", .priv_size = sizeof(struct pie_sched_data), .enqueue = pie_qdisc_enqueue, .dequeue = pie_qdisc_dequeue, .peek = qdisc_peek_dequeued, .init = pie_init, .destroy = pie_destroy, .reset = pie_reset, .change = pie_change, .dump = pie_dump, .dump_stats = pie_dump_stats, .owner = THIS_MODULE, }; MODULE_ALIAS_NET_SCH("pie"); static int __init pie_module_init(void) { return register_qdisc(&pie_qdisc_ops); } static void __exit pie_module_exit(void) { unregister_qdisc(&pie_qdisc_ops); } module_init(pie_module_init); module_exit(pie_module_exit); MODULE_DESCRIPTION("Proportional Integral controller Enhanced (PIE) scheduler"); MODULE_AUTHOR("Vijay Subramanian"); MODULE_AUTHOR("Mythili Prabhu"); MODULE_LICENSE("GPL"); |
| 364 366 364 365 365 16 2 5 56 9 9 1 6 6 6 73 73 6 1 1 6 17 17 17 1 1 1 1 1 1 1 1 1 1 1 96 96 96 73 4 3 2 3 3 2 1 4 4 2 2 2 2 1 2 1 1 3 2 3 1 3 2 2 1 1 2 1 2 17 16 16 16 16 16 2 9 7 14 14 14 14 1 14 6 8 9 9 9 7 7 6 7 7 7 5 7 6 11 30 19 30 30 29 28 28 21 5 5 4 2 10 18 30 228 118 367 265 41 41 41 41 41 8 5 41 228 2 11 11 1 11 224 11 11 3 2 9 11 11 228 281 6 6 6 281 16 16 2 281 368 368 368 53 52 54 54 45 43 2 1 41 41 357 227 228 228 57 53 57 57 57 3 56 41 55 1 227 1 1 1 1 1 1 235 236 66 11 227 227 227 227 227 227 227 235 6 1 39 39 1 39 36 36 35 34 34 34 2 2 4 2 1 1 3 2 11 2 4 13 13 14 14 14 13 11 10 39 39 14 11 504 504 44 370 369 363 363 334 360 15 15 3 12 12 2 487 487 355 487 40 486 1 489 184 630 630 184 184 184 277 279 14 99 57 49 45 5 2 42 101 20 20 20 20 19 16 19 15 1 15 15 9 8 9 40 38 8 38 38 6 33 32 31 32 20 20 32 28 14 6 6 3 28 18 33 12 14 14 37 37 12 32 37 37 53 29 29 13 13 52 40 40 25 13 22 31 29 5 5 3 2 3 29 19 7 3 3 3 7 7 7 52 45 45 5 45 45 2 43 30 43 37 37 43 6 6 6 5 6 40 26 43 45 364 7 365 61 61 60 59 9 5 5 57 57 57 61 37 26 23 10 13 17 17 4 16 3 13 13 11 3 3 11 10 11 17 44 42 42 42 40 40 39 40 37 13 27 3 34 1 8 35 8 7 4 3 4 44 29 3 2 27 27 2 2 2 25 2 3 3 1 26 3 29 26 24 25 23 23 22 21 22 19 15 15 1 4 18 9 9 18 30 1 7 6 6 4 4 4 4 3 2 2 3 3 3 3 1 7 7 7 6 5 5 4 5 4 3 1 2 2 2 1 1 7 68 19 4 7 5 19 4 1 3 3 3 1 19 68 12 499 12 11 12 12 270 271 267 267 137 101 1 2 2 1 1 100 101 271 8 1 8 8 8 8 8 8 2 8 8 8 4 4 4 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 5 5 5 5 5 2 5 2 2 2 2 2 2 2 2 2 2 4 1 1 4 5 5 5 5 2 5 4 1 5 5 5 2 5 65 65 65 64 1440 1441 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080 1081 1082 1083 1084 1085 1086 1087 1088 1089 1090 1091 1092 1093 1094 1095 1096 1097 1098 1099 1100 1101 1102 1103 1104 1105 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125 1126 1127 1128 1129 1130 1131 1132 1133 1134 1135 1136 1137 1138 1139 1140 1141 1142 1143 1144 1145 1146 1147 1148 1149 1150 1151 1152 1153 1154 1155 1156 1157 1158 1159 1160 1161 1162 1163 1164 1165 1166 1167 1168 1169 1170 1171 1172 1173 1174 1175 1176 1177 1178 1179 1180 1181 1182 1183 1184 1185 1186 1187 1188 1189 1190 1191 1192 1193 1194 1195 1196 1197 1198 1199 1200 1201 1202 1203 1204 1205 1206 1207 1208 1209 1210 1211 1212 1213 1214 1215 1216 1217 1218 1219 1220 1221 1222 1223 1224 1225 1226 1227 1228 1229 1230 1231 1232 1233 1234 1235 1236 1237 1238 1239 1240 1241 1242 1243 1244 1245 1246 1247 1248 1249 1250 1251 1252 1253 1254 1255 1256 1257 1258 1259 1260 1261 1262 1263 1264 1265 1266 1267 1268 1269 1270 1271 1272 1273 1274 1275 1276 1277 1278 1279 1280 1281 1282 1283 1284 1285 1286 1287 1288 1289 1290 1291 1292 1293 1294 1295 1296 1297 1298 1299 1300 1301 1302 1303 1304 1305 1306 1307 1308 1309 1310 1311 1312 1313 1314 1315 1316 1317 1318 1319 1320 1321 1322 1323 1324 1325 1326 1327 1328 1329 1330 1331 1332 1333 1334 1335 1336 1337 1338 1339 1340 1341 1342 1343 1344 1345 1346 1347 1348 1349 1350 1351 1352 1353 1354 1355 1356 1357 1358 1359 1360 1361 1362 1363 1364 1365 1366 1367 1368 1369 1370 1371 1372 1373 1374 1375 1376 1377 1378 1379 1380 1381 1382 1383 1384 1385 1386 1387 1388 1389 1390 1391 1392 1393 1394 1395 1396 1397 1398 1399 1400 1401 1402 1403 1404 1405 1406 1407 1408 1409 1410 1411 1412 1413 1414 1415 1416 1417 1418 1419 1420 1421 1422 1423 1424 1425 1426 1427 1428 1429 1430 1431 1432 1433 1434 1435 1436 1437 1438 1439 1440 1441 1442 1443 1444 1445 1446 1447 1448 1449 1450 1451 1452 1453 1454 1455 1456 1457 1458 1459 1460 1461 1462 1463 1464 1465 1466 1467 1468 1469 1470 1471 1472 1473 1474 1475 1476 1477 1478 1479 1480 1481 1482 1483 1484 1485 1486 1487 1488 1489 1490 1491 1492 1493 1494 1495 1496 1497 1498 1499 1500 1501 1502 1503 1504 1505 1506 1507 1508 1509 1510 1511 1512 1513 1514 1515 1516 1517 1518 1519 1520 1521 1522 1523 1524 1525 1526 1527 1528 1529 1530 1531 1532 1533 1534 1535 1536 1537 1538 1539 1540 1541 1542 1543 1544 1545 1546 1547 1548 1549 1550 1551 1552 1553 1554 1555 1556 1557 1558 1559 1560 1561 1562 1563 1564 1565 1566 1567 1568 1569 1570 1571 1572 1573 1574 1575 1576 1577 1578 1579 1580 1581 1582 1583 1584 1585 1586 1587 1588 1589 1590 1591 1592 1593 1594 1595 1596 1597 1598 1599 1600 1601 1602 1603 1604 1605 1606 1607 1608 1609 1610 1611 1612 1613 1614 1615 1616 1617 1618 1619 1620 1621 1622 1623 1624 1625 1626 1627 1628 1629 1630 1631 1632 1633 1634 1635 1636 1637 1638 1639 1640 1641 1642 1643 1644 1645 1646 1647 1648 1649 1650 1651 1652 1653 1654 1655 1656 1657 1658 1659 1660 1661 1662 1663 1664 1665 1666 1667 1668 1669 1670 1671 1672 1673 1674 1675 1676 1677 1678 1679 1680 1681 1682 1683 1684 1685 1686 1687 1688 1689 1690 1691 1692 1693 1694 1695 1696 1697 1698 1699 1700 1701 1702 1703 1704 1705 1706 1707 1708 1709 1710 1711 1712 1713 1714 1715 1716 1717 1718 1719 1720 1721 1722 1723 1724 1725 1726 1727 1728 1729 1730 1731 1732 1733 1734 1735 1736 1737 1738 1739 1740 1741 1742 1743 1744 1745 1746 1747 1748 1749 1750 1751 1752 1753 1754 1755 1756 1757 1758 1759 1760 1761 1762 1763 1764 1765 1766 1767 1768 1769 1770 1771 1772 1773 1774 1775 1776 1777 1778 1779 1780 1781 1782 1783 1784 1785 1786 1787 1788 1789 1790 1791 1792 1793 1794 1795 1796 1797 1798 1799 1800 1801 1802 1803 1804 1805 1806 1807 1808 1809 1810 1811 1812 1813 1814 1815 1816 1817 1818 1819 1820 1821 1822 1823 1824 1825 1826 1827 1828 1829 1830 1831 1832 1833 1834 1835 1836 1837 1838 1839 1840 1841 1842 1843 1844 1845 1846 1847 1848 1849 1850 1851 1852 1853 1854 1855 1856 1857 1858 1859 1860 1861 1862 1863 1864 1865 1866 1867 1868 1869 1870 1871 1872 1873 1874 1875 1876 1877 1878 1879 1880 1881 1882 1883 1884 1885 1886 1887 1888 1889 1890 1891 1892 1893 1894 1895 1896 1897 1898 1899 1900 1901 1902 1903 1904 1905 1906 1907 1908 1909 1910 1911 1912 1913 1914 1915 1916 1917 1918 1919 1920 1921 1922 1923 1924 1925 1926 1927 1928 1929 1930 1931 1932 1933 1934 1935 1936 1937 1938 1939 1940 1941 1942 1943 1944 1945 1946 1947 1948 1949 1950 1951 1952 1953 1954 1955 1956 1957 1958 1959 1960 1961 1962 1963 1964 1965 1966 1967 1968 1969 1970 1971 1972 1973 1974 1975 1976 1977 1978 1979 1980 1981 1982 1983 1984 1985 1986 1987 1988 1989 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020 2021 2022 2023 2024 2025 2026 2027 2028 2029 2030 2031 2032 2033 2034 2035 2036 2037 2038 2039 2040 2041 2042 2043 2044 2045 2046 2047 2048 2049 2050 2051 2052 2053 2054 2055 2056 2057 2058 2059 2060 2061 2062 2063 2064 2065 2066 2067 2068 2069 2070 2071 2072 2073 2074 2075 2076 2077 2078 2079 2080 2081 2082 2083 2084 2085 2086 2087 2088 2089 2090 2091 2092 2093 2094 2095 2096 2097 2098 2099 2100 2101 2102 2103 2104 2105 2106 2107 2108 2109 2110 2111 2112 2113 2114 2115 2116 2117 2118 2119 2120 2121 2122 2123 2124 2125 2126 2127 2128 2129 2130 2131 2132 2133 2134 2135 2136 2137 2138 2139 2140 2141 2142 2143 2144 2145 2146 2147 2148 2149 2150 2151 2152 2153 2154 2155 2156 2157 2158 2159 2160 2161 2162 2163 2164 2165 2166 2167 2168 2169 2170 2171 2172 2173 2174 2175 2176 2177 2178 2179 2180 2181 2182 2183 2184 2185 2186 2187 2188 2189 2190 2191 2192 2193 2194 2195 2196 2197 2198 2199 2200 2201 2202 2203 2204 2205 2206 2207 2208 2209 2210 2211 2212 2213 2214 2215 2216 2217 2218 2219 2220 2221 2222 2223 2224 2225 2226 2227 2228 2229 2230 2231 2232 2233 2234 2235 2236 2237 2238 2239 2240 2241 2242 2243 2244 2245 2246 2247 2248 2249 2250 2251 2252 2253 2254 2255 2256 2257 2258 2259 2260 2261 2262 2263 2264 2265 2266 2267 2268 2269 2270 2271 2272 2273 2274 2275 2276 2277 2278 2279 2280 2281 2282 2283 2284 2285 2286 2287 2288 2289 2290 2291 2292 2293 2294 2295 2296 2297 2298 2299 2300 2301 2302 2303 2304 2305 2306 2307 2308 2309 2310 2311 2312 2313 2314 2315 2316 2317 2318 2319 2320 2321 2322 2323 2324 2325 2326 2327 2328 2329 2330 2331 2332 2333 2334 2335 2336 2337 2338 2339 2340 2341 2342 2343 2344 2345 2346 2347 2348 2349 2350 2351 2352 2353 2354 2355 2356 2357 2358 2359 2360 2361 2362 2363 2364 2365 2366 2367 2368 2369 2370 2371 2372 2373 2374 2375 2376 2377 2378 2379 2380 2381 2382 2383 2384 2385 2386 2387 2388 2389 2390 2391 2392 2393 2394 2395 2396 2397 2398 2399 2400 2401 2402 2403 2404 2405 2406 2407 2408 2409 2410 2411 2412 2413 2414 2415 2416 2417 2418 2419 2420 2421 2422 2423 2424 2425 2426 2427 2428 2429 2430 2431 2432 2433 2434 2435 2436 2437 2438 2439 2440 2441 2442 2443 2444 2445 2446 2447 2448 2449 2450 2451 2452 2453 2454 2455 2456 2457 2458 2459 2460 2461 2462 2463 2464 2465 2466 2467 2468 2469 2470 2471 2472 2473 2474 2475 2476 2477 2478 2479 2480 2481 2482 2483 2484 2485 2486 2487 2488 2489 2490 2491 2492 2493 2494 2495 2496 2497 2498 2499 2500 2501 2502 2503 2504 2505 2506 2507 2508 2509 2510 2511 2512 2513 2514 2515 2516 2517 2518 2519 2520 2521 2522 2523 2524 2525 2526 2527 2528 2529 2530 2531 2532 2533 2534 2535 2536 2537 2538 2539 2540 2541 2542 2543 2544 2545 2546 2547 2548 2549 2550 2551 2552 2553 2554 2555 2556 2557 2558 2559 2560 2561 2562 2563 2564 2565 2566 2567 2568 2569 2570 2571 2572 2573 2574 2575 2576 2577 2578 2579 2580 2581 2582 2583 2584 2585 2586 2587 2588 2589 2590 2591 2592 2593 2594 2595 2596 2597 2598 2599 2600 2601 2602 2603 2604 2605 2606 2607 2608 2609 2610 2611 2612 2613 2614 2615 2616 2617 2618 2619 2620 2621 2622 2623 2624 2625 2626 2627 2628 2629 2630 2631 2632 2633 2634 2635 2636 2637 2638 2639 2640 2641 2642 2643 2644 2645 2646 2647 2648 2649 2650 2651 2652 2653 2654 2655 2656 2657 2658 2659 2660 2661 2662 2663 2664 2665 2666 2667 2668 2669 2670 2671 2672 2673 2674 2675 2676 2677 2678 2679 2680 2681 2682 2683 2684 2685 2686 2687 2688 2689 2690 2691 2692 2693 2694 2695 2696 2697 2698 2699 2700 2701 2702 2703 2704 2705 2706 2707 2708 2709 2710 2711 2712 2713 2714 2715 2716 2717 2718 2719 2720 2721 2722 2723 2724 2725 2726 2727 2728 2729 2730 2731 2732 2733 2734 2735 2736 2737 2738 2739 2740 2741 2742 2743 2744 2745 2746 2747 2748 2749 2750 2751 2752 2753 2754 2755 2756 2757 2758 2759 2760 2761 2762 2763 2764 2765 2766 2767 2768 2769 2770 2771 2772 2773 2774 2775 2776 2777 2778 2779 2780 2781 2782 2783 2784 2785 2786 2787 2788 2789 2790 2791 2792 2793 2794 2795 2796 2797 2798 2799 2800 2801 2802 2803 2804 2805 2806 2807 2808 2809 2810 2811 2812 2813 2814 2815 2816 2817 2818 2819 2820 2821 2822 2823 2824 2825 2826 2827 2828 2829 2830 2831 2832 2833 2834 2835 2836 2837 2838 2839 2840 2841 2842 2843 2844 2845 2846 2847 2848 2849 2850 2851 2852 2853 2854 2855 2856 2857 2858 2859 2860 2861 2862 2863 2864 2865 2866 2867 2868 2869 2870 2871 2872 2873 2874 2875 2876 2877 2878 2879 2880 2881 2882 2883 2884 2885 2886 2887 2888 2889 2890 2891 2892 2893 2894 2895 2896 2897 2898 2899 2900 2901 2902 2903 2904 2905 2906 2907 2908 2909 2910 2911 2912 2913 2914 2915 2916 2917 2918 2919 2920 2921 2922 2923 2924 2925 2926 2927 2928 2929 2930 2931 2932 2933 2934 2935 2936 2937 2938 2939 2940 2941 2942 2943 2944 2945 2946 2947 2948 2949 2950 2951 2952 2953 2954 2955 2956 2957 2958 2959 2960 2961 2962 2963 2964 2965 2966 2967 2968 2969 2970 2971 2972 2973 2974 2975 2976 2977 2978 2979 2980 2981 2982 2983 2984 2985 2986 2987 2988 2989 2990 2991 2992 2993 2994 2995 2996 2997 2998 2999 3000 3001 3002 3003 3004 3005 3006 3007 3008 3009 3010 3011 3012 3013 3014 3015 3016 3017 3018 3019 3020 3021 3022 3023 3024 3025 3026 3027 3028 3029 3030 3031 3032 3033 3034 3035 3036 3037 3038 3039 3040 3041 3042 3043 3044 3045 3046 3047 3048 3049 3050 3051 3052 3053 3054 3055 3056 3057 3058 3059 3060 3061 3062 3063 3064 3065 3066 3067 3068 3069 3070 3071 3072 3073 3074 3075 3076 3077 3078 3079 3080 3081 3082 3083 3084 3085 3086 3087 3088 3089 3090 3091 3092 3093 3094 3095 3096 3097 3098 3099 3100 3101 3102 3103 3104 3105 3106 3107 3108 3109 | // SPDX-License-Identifier: GPL-2.0-or-later /* * Linux NET3: Internet Group Management Protocol [IGMP] * * This code implements the IGMP protocol as defined in RFC1112. There has * been a further revision of this protocol since which is now supported. * * If you have trouble with this module be careful what gcc you have used, * the older version didn't come out right using gcc 2.5.8, the newer one * seems to fall out with gcc 2.6.2. * * Authors: * Alan Cox <alan@lxorguk.ukuu.org.uk> * * Fixes: * * Alan Cox : Added lots of __inline__ to optimise * the memory usage of all the tiny little * functions. * Alan Cox : Dumped the header building experiment. * Alan Cox : Minor tweaks ready for multicast routing * and extended IGMP protocol. * Alan Cox : Removed a load of inline directives. Gcc 2.5.8 * writes utterly bogus code otherwise (sigh) * fixed IGMP loopback to behave in the manner * desired by mrouted, fixed the fact it has been * broken since 1.3.6 and cleaned up a few minor * points. * * Chih-Jen Chang : Tried to revise IGMP to Version 2 * Tsu-Sheng Tsao E-mail: chihjenc@scf.usc.edu and tsusheng@scf.usc.edu * The enhancements are mainly based on Steve Deering's * ipmulti-3.5 source code. * Chih-Jen Chang : Added the igmp_get_mrouter_info and * Tsu-Sheng Tsao igmp_set_mrouter_info to keep track of * the mrouted version on that device. * Chih-Jen Chang : Added the max_resp_time parameter to * Tsu-Sheng Tsao igmp_heard_query(). Using this parameter * to identify the multicast router version * and do what the IGMP version 2 specified. * Chih-Jen Chang : Added a timer to revert to IGMP V2 router * Tsu-Sheng Tsao if the specified time expired. * Alan Cox : Stop IGMP from 0.0.0.0 being accepted. * Alan Cox : Use GFP_ATOMIC in the right places. * Christian Daudt : igmp timer wasn't set for local group * memberships but was being deleted, * which caused a "del_timer() called * from %p with timer not initialized\n" * message (960131). * Christian Daudt : removed del_timer from * igmp_timer_expire function (960205). * Christian Daudt : igmp_heard_report now only calls * igmp_timer_expire if tm->running is * true (960216). * Malcolm Beattie : ttl comparison wrong in igmp_rcv made * igmp_heard_query never trigger. Expiry * miscalculation fixed in igmp_heard_query * and random() made to return unsigned to * prevent negative expiry times. * Alexey Kuznetsov: Wrong group leaving behaviour, backport * fix from pending 2.1.x patches. * Alan Cox: Forget to enable FDDI support earlier. * Alexey Kuznetsov: Fixed leaving groups on device down. * Alexey Kuznetsov: Accordance to igmp-v2-06 draft. * David L Stevens: IGMPv3 support, with help from * Vinay Kulkarni */ #include <linux/module.h> #include <linux/slab.h> #include <linux/uaccess.h> #include <linux/types.h> #include <linux/kernel.h> #include <linux/jiffies.h> #include <linux/string.h> #include <linux/socket.h> #include <linux/sockios.h> #include <linux/in.h> #include <linux/inet.h> #include <linux/netdevice.h> #include <linux/skbuff.h> #include <linux/inetdevice.h> #include <linux/igmp.h> #include <linux/if_arp.h> #include <linux/rtnetlink.h> #include <linux/times.h> #include <linux/pkt_sched.h> #include <linux/byteorder/generic.h> #include <net/net_namespace.h> #include <net/arp.h> #include <net/ip.h> #include <net/protocol.h> #include <net/route.h> #include <net/sock.h> #include <net/checksum.h> #include <net/inet_common.h> #include <linux/netfilter_ipv4.h> #ifdef CONFIG_IP_MROUTE #include <linux/mroute.h> #endif #ifdef CONFIG_PROC_FS #include <linux/proc_fs.h> #include <linux/seq_file.h> #endif #ifdef CONFIG_IP_MULTICAST /* Parameter names and values are taken from igmp-v2-06 draft */ #define IGMP_QUERY_INTERVAL (125*HZ) #define IGMP_QUERY_RESPONSE_INTERVAL (10*HZ) #define IGMP_INITIAL_REPORT_DELAY (1) /* IGMP_INITIAL_REPORT_DELAY is not from IGMP specs! * IGMP specs require to report membership immediately after * joining a group, but we delay the first report by a * small interval. It seems more natural and still does not * contradict to specs provided this delay is small enough. */ #define IGMP_V1_SEEN(in_dev) \ (IPV4_DEVCONF_ALL_RO(dev_net(in_dev->dev), FORCE_IGMP_VERSION) == 1 || \ IN_DEV_CONF_GET((in_dev), FORCE_IGMP_VERSION) == 1 || \ ((in_dev)->mr_v1_seen && \ time_before(jiffies, (in_dev)->mr_v1_seen))) #define IGMP_V2_SEEN(in_dev) \ (IPV4_DEVCONF_ALL_RO(dev_net(in_dev->dev), FORCE_IGMP_VERSION) == 2 || \ IN_DEV_CONF_GET((in_dev), FORCE_IGMP_VERSION) == 2 || \ ((in_dev)->mr_v2_seen && \ time_before(jiffies, (in_dev)->mr_v2_seen))) static int unsolicited_report_interval(struct in_device *in_dev) { int interval_ms, interval_jiffies; if (IGMP_V1_SEEN(in_dev) || IGMP_V2_SEEN(in_dev)) interval_ms = IN_DEV_CONF_GET( in_dev, IGMPV2_UNSOLICITED_REPORT_INTERVAL); else /* v3 */ interval_ms = IN_DEV_CONF_GET( in_dev, IGMPV3_UNSOLICITED_REPORT_INTERVAL); interval_jiffies = msecs_to_jiffies(interval_ms); /* _timer functions can't handle a delay of 0 jiffies so ensure * we always return a positive value. */ if (interval_jiffies <= 0) interval_jiffies = 1; return interval_jiffies; } static void igmpv3_add_delrec(struct in_device *in_dev, struct ip_mc_list *im, gfp_t gfp); static void igmpv3_del_delrec(struct in_device *in_dev, struct ip_mc_list *im); static void igmpv3_clear_delrec(struct in_device *in_dev); static int sf_setstate(struct ip_mc_list *pmc); static void sf_markstate(struct ip_mc_list *pmc); #endif static void ip_mc_clear_src(struct ip_mc_list *pmc); static int ip_mc_add_src(struct in_device *in_dev, __be32 *pmca, int sfmode, int sfcount, __be32 *psfsrc, int delta); static void ip_ma_put(struct ip_mc_list *im) { if (refcount_dec_and_test(&im->refcnt)) { in_dev_put(im->interface); kfree_rcu(im, rcu); } } #define for_each_pmc_rcu(in_dev, pmc) \ for (pmc = rcu_dereference(in_dev->mc_list); \ pmc != NULL; \ pmc = rcu_dereference(pmc->next_rcu)) #define for_each_pmc_rtnl(in_dev, pmc) \ for (pmc = rtnl_dereference(in_dev->mc_list); \ pmc != NULL; \ pmc = rtnl_dereference(pmc->next_rcu)) static void ip_sf_list_clear_all(struct ip_sf_list *psf) { struct ip_sf_list *next; while (psf) { next = psf->sf_next; kfree(psf); psf = next; } } #ifdef CONFIG_IP_MULTICAST /* * Timer management */ static void igmp_stop_timer(struct ip_mc_list *im) { spin_lock_bh(&im->lock); if (del_timer(&im->timer)) refcount_dec(&im->refcnt); im->tm_running = 0; im->reporter = 0; im->unsolicit_count = 0; spin_unlock_bh(&im->lock); } /* It must be called with locked im->lock */ static void igmp_start_timer(struct ip_mc_list *im, int max_delay) { int tv = get_random_u32_below(max_delay); im->tm_running = 1; if (refcount_inc_not_zero(&im->refcnt)) { if (mod_timer(&im->timer, jiffies + tv + 2)) ip_ma_put(im); } } static void igmp_gq_start_timer(struct in_device *in_dev) { int tv = get_random_u32_below(in_dev->mr_maxdelay); unsigned long exp = jiffies + tv + 2; if (in_dev->mr_gq_running && time_after_eq(exp, (in_dev->mr_gq_timer).expires)) return; in_dev->mr_gq_running = 1; if (!mod_timer(&in_dev->mr_gq_timer, exp)) in_dev_hold(in_dev); } static void igmp_ifc_start_timer(struct in_device *in_dev, int delay) { int tv = get_random_u32_below(delay); if (!mod_timer(&in_dev->mr_ifc_timer, jiffies+tv+2)) in_dev_hold(in_dev); } static void igmp_mod_timer(struct ip_mc_list *im, int max_delay) { spin_lock_bh(&im->lock); im->unsolicit_count = 0; if (del_timer(&im->timer)) { if ((long)(im->timer.expires-jiffies) < max_delay) { add_timer(&im->timer); im->tm_running = 1; spin_unlock_bh(&im->lock); return; } refcount_dec(&im->refcnt); } igmp_start_timer(im, max_delay); spin_unlock_bh(&im->lock); } /* * Send an IGMP report. */ #define IGMP_SIZE (sizeof(struct igmphdr)+sizeof(struct iphdr)+4) static int is_in(struct ip_mc_list *pmc, struct ip_sf_list *psf, int type, int gdeleted, int sdeleted) { switch (type) { case IGMPV3_MODE_IS_INCLUDE: case IGMPV3_MODE_IS_EXCLUDE: if (gdeleted || sdeleted) return 0; if (!(pmc->gsquery && !psf->sf_gsresp)) { if (pmc->sfmode == MCAST_INCLUDE) return 1; /* don't include if this source is excluded * in all filters */ if (psf->sf_count[MCAST_INCLUDE]) return type == IGMPV3_MODE_IS_INCLUDE; return pmc->sfcount[MCAST_EXCLUDE] == psf->sf_count[MCAST_EXCLUDE]; } return 0; case IGMPV3_CHANGE_TO_INCLUDE: if (gdeleted || sdeleted) return 0; return psf->sf_count[MCAST_INCLUDE] != 0; case IGMPV3_CHANGE_TO_EXCLUDE: if (gdeleted || sdeleted) return 0; if (pmc->sfcount[MCAST_EXCLUDE] == 0 || psf->sf_count[MCAST_INCLUDE]) return 0; return pmc->sfcount[MCAST_EXCLUDE] == psf->sf_count[MCAST_EXCLUDE]; case IGMPV3_ALLOW_NEW_SOURCES: if (gdeleted || !psf->sf_crcount) return 0; return (pmc->sfmode == MCAST_INCLUDE) ^ sdeleted; case IGMPV3_BLOCK_OLD_SOURCES: if (pmc->sfmode == MCAST_INCLUDE) return gdeleted || (psf->sf_crcount && sdeleted); return psf->sf_crcount && !gdeleted && !sdeleted; } return 0; } static int igmp_scount(struct ip_mc_list *pmc, int type, int gdeleted, int sdeleted) { struct ip_sf_list *psf; int scount = 0; for (psf = pmc->sources; psf; psf = psf->sf_next) { if (!is_in(pmc, psf, type, gdeleted, sdeleted)) continue; scount++; } return scount; } /* source address selection per RFC 3376 section 4.2.13 */ static __be32 igmpv3_get_srcaddr(struct net_device *dev, const struct flowi4 *fl4) { struct in_device *in_dev = __in_dev_get_rcu(dev); const struct in_ifaddr *ifa; if (!in_dev) return htonl(INADDR_ANY); in_dev_for_each_ifa_rcu(ifa, in_dev) { if (fl4->saddr == ifa->ifa_local) return fl4->saddr; } return htonl(INADDR_ANY); } static struct sk_buff *igmpv3_newpack(struct net_device *dev, unsigned int mtu) { struct sk_buff *skb; struct rtable *rt; struct iphdr *pip; struct igmpv3_report *pig; struct net *net = dev_net(dev); struct flowi4 fl4; int hlen = LL_RESERVED_SPACE(dev); int tlen = dev->needed_tailroom; unsigned int size; size = min(mtu, IP_MAX_MTU); while (1) { skb = alloc_skb(size + hlen + tlen, GFP_ATOMIC | __GFP_NOWARN); if (skb) break; size >>= 1; if (size < 256) return NULL; } skb->priority = TC_PRIO_CONTROL; rt = ip_route_output_ports(net, &fl4, NULL, IGMPV3_ALL_MCR, 0, 0, 0, IPPROTO_IGMP, 0, dev->ifindex); if (IS_ERR(rt)) { kfree_skb(skb); return NULL; } skb_dst_set(skb, &rt->dst); skb->dev = dev; skb_reserve(skb, hlen); skb_tailroom_reserve(skb, mtu, tlen); skb_reset_network_header(skb); pip = ip_hdr(skb); skb_put(skb, sizeof(struct iphdr) + 4); pip->version = 4; pip->ihl = (sizeof(struct iphdr)+4)>>2; pip->tos = 0xc0; pip->frag_off = htons(IP_DF); pip->ttl = 1; pip->daddr = fl4.daddr; rcu_read_lock(); pip->saddr = igmpv3_get_srcaddr(dev, &fl4); rcu_read_unlock(); pip->protocol = IPPROTO_IGMP; pip->tot_len = 0; /* filled in later */ ip_select_ident(net, skb, NULL); ((u8 *)&pip[1])[0] = IPOPT_RA; ((u8 *)&pip[1])[1] = 4; ((u8 *)&pip[1])[2] = 0; ((u8 *)&pip[1])[3] = 0; skb->transport_header = skb->network_header + sizeof(struct iphdr) + 4; skb_put(skb, sizeof(*pig)); pig = igmpv3_report_hdr(skb); pig->type = IGMPV3_HOST_MEMBERSHIP_REPORT; pig->resv1 = 0; pig->csum = 0; pig->resv2 = 0; pig->ngrec = 0; return skb; } static int igmpv3_sendpack(struct sk_buff *skb) { struct igmphdr *pig = igmp_hdr(skb); const int igmplen = skb_tail_pointer(skb) - skb_transport_header(skb); pig->csum = ip_compute_csum(igmp_hdr(skb), igmplen); return ip_local_out(dev_net(skb_dst(skb)->dev), skb->sk, skb); } static int grec_size(struct ip_mc_list *pmc, int type, int gdel, int sdel) { return sizeof(struct igmpv3_grec) + 4*igmp_scount(pmc, type, gdel, sdel); } static struct sk_buff *add_grhead(struct sk_buff *skb, struct ip_mc_list *pmc, int type, struct igmpv3_grec **ppgr, unsigned int mtu) { struct net_device *dev = pmc->interface->dev; struct igmpv3_report *pih; struct igmpv3_grec *pgr; if (!skb) { skb = igmpv3_newpack(dev, mtu); if (!skb) return NULL; } pgr = skb_put(skb, sizeof(struct igmpv3_grec)); pgr->grec_type = type; pgr->grec_auxwords = 0; pgr->grec_nsrcs = 0; pgr->grec_mca = pmc->multiaddr; pih = igmpv3_report_hdr(skb); pih->ngrec = htons(ntohs(pih->ngrec)+1); *ppgr = pgr; return skb; } #define AVAILABLE(skb) ((skb) ? skb_availroom(skb) : 0) static struct sk_buff *add_grec(struct sk_buff *skb, struct ip_mc_list *pmc, int type, int gdeleted, int sdeleted) { struct net_device *dev = pmc->interface->dev; struct net *net = dev_net(dev); struct igmpv3_report *pih; struct igmpv3_grec *pgr = NULL; struct ip_sf_list *psf, *psf_next, *psf_prev, **psf_list; int scount, stotal, first, isquery, truncate; unsigned int mtu; if (pmc->multiaddr == IGMP_ALL_HOSTS) return skb; if (ipv4_is_local_multicast(pmc->multiaddr) && !READ_ONCE(net->ipv4.sysctl_igmp_llm_reports)) return skb; mtu = READ_ONCE(dev->mtu); if (mtu < IPV4_MIN_MTU) return skb; isquery = type == IGMPV3_MODE_IS_INCLUDE || type == IGMPV3_MODE_IS_EXCLUDE; truncate = type == IGMPV3_MODE_IS_EXCLUDE || type == IGMPV3_CHANGE_TO_EXCLUDE; stotal = scount = 0; psf_list = sdeleted ? &pmc->tomb : &pmc->sources; if (!*psf_list) goto empty_source; pih = skb ? igmpv3_report_hdr(skb) : NULL; /* EX and TO_EX get a fresh packet, if needed */ if (truncate) { if (pih && pih->ngrec && AVAILABLE(skb) < grec_size(pmc, type, gdeleted, sdeleted)) { if (skb) igmpv3_sendpack(skb); skb = igmpv3_newpack(dev, mtu); } } first = 1; psf_prev = NULL; for (psf = *psf_list; psf; psf = psf_next) { __be32 *psrc; psf_next = psf->sf_next; if (!is_in(pmc, psf, type, gdeleted, sdeleted)) { psf_prev = psf; continue; } /* Based on RFC3376 5.1. Should not send source-list change * records when there is a filter mode change. */ if (((gdeleted && pmc->sfmode == MCAST_EXCLUDE) || (!gdeleted && pmc->crcount)) && (type == IGMPV3_ALLOW_NEW_SOURCES || type == IGMPV3_BLOCK_OLD_SOURCES) && psf->sf_crcount) goto decrease_sf_crcount; /* clear marks on query responses */ if (isquery) psf->sf_gsresp = 0; if (AVAILABLE(skb) < sizeof(__be32) + first*sizeof(struct igmpv3_grec)) { if (truncate && !first) break; /* truncate these */ if (pgr) pgr->grec_nsrcs = htons(scount); if (skb) igmpv3_sendpack(skb); skb = igmpv3_newpack(dev, mtu); first = 1; scount = 0; } if (first) { skb = add_grhead(skb, pmc, type, &pgr, mtu); first = 0; } if (!skb) return NULL; psrc = skb_put(skb, sizeof(__be32)); *psrc = psf->sf_inaddr; scount++; stotal++; if ((type == IGMPV3_ALLOW_NEW_SOURCES || type == IGMPV3_BLOCK_OLD_SOURCES) && psf->sf_crcount) { decrease_sf_crcount: psf->sf_crcount--; if ((sdeleted || gdeleted) && psf->sf_crcount == 0) { if (psf_prev) psf_prev->sf_next = psf->sf_next; else *psf_list = psf->sf_next; kfree(psf); continue; } } psf_prev = psf; } empty_source: if (!stotal) { if (type == IGMPV3_ALLOW_NEW_SOURCES || type == IGMPV3_BLOCK_OLD_SOURCES) return skb; if (pmc->crcount || isquery) { /* make sure we have room for group header */ if (skb && AVAILABLE(skb) < sizeof(struct igmpv3_grec)) { igmpv3_sendpack(skb); skb = NULL; /* add_grhead will get a new one */ } skb = add_grhead(skb, pmc, type, &pgr, mtu); } } if (pgr) pgr->grec_nsrcs = htons(scount); if (isquery) pmc->gsquery = 0; /* clear query state on report */ return skb; } static int igmpv3_send_report(struct in_device *in_dev, struct ip_mc_list *pmc) { struct sk_buff *skb = NULL; struct net *net = dev_net(in_dev->dev); int type; if (!pmc) { rcu_read_lock(); for_each_pmc_rcu(in_dev, pmc) { if (pmc->multiaddr == IGMP_ALL_HOSTS) continue; if (ipv4_is_local_multicast(pmc->multiaddr) && !READ_ONCE(net->ipv4.sysctl_igmp_llm_reports)) continue; spin_lock_bh(&pmc->lock); if (pmc->sfcount[MCAST_EXCLUDE]) type = IGMPV3_MODE_IS_EXCLUDE; else type = IGMPV3_MODE_IS_INCLUDE; skb = add_grec(skb, pmc, type, 0, 0); spin_unlock_bh(&pmc->lock); } rcu_read_unlock(); } else { spin_lock_bh(&pmc->lock); if (pmc->sfcount[MCAST_EXCLUDE]) type = IGMPV3_MODE_IS_EXCLUDE; else type = IGMPV3_MODE_IS_INCLUDE; skb = add_grec(skb, pmc, type, 0, 0); spin_unlock_bh(&pmc->lock); } if (!skb) return 0; return igmpv3_sendpack(skb); } /* * remove zero-count source records from a source filter list */ static void igmpv3_clear_zeros(struct ip_sf_list **ppsf) { struct ip_sf_list *psf_prev, *psf_next, *psf; psf_prev = NULL; for (psf = *ppsf; psf; psf = psf_next) { psf_next = psf->sf_next; if (psf->sf_crcount == 0) { if (psf_prev) psf_prev->sf_next = psf->sf_next; else *ppsf = psf->sf_next; kfree(psf); } else psf_prev = psf; } } static void kfree_pmc(struct ip_mc_list *pmc) { ip_sf_list_clear_all(pmc->sources); ip_sf_list_clear_all(pmc->tomb); kfree(pmc); } static void igmpv3_send_cr(struct in_device *in_dev) { struct ip_mc_list *pmc, *pmc_prev, *pmc_next; struct sk_buff *skb = NULL; int type, dtype; rcu_read_lock(); spin_lock_bh(&in_dev->mc_tomb_lock); /* deleted MCA's */ pmc_prev = NULL; for (pmc = in_dev->mc_tomb; pmc; pmc = pmc_next) { pmc_next = pmc->next; if (pmc->sfmode == MCAST_INCLUDE) { type = IGMPV3_BLOCK_OLD_SOURCES; dtype = IGMPV3_BLOCK_OLD_SOURCES; skb = add_grec(skb, pmc, type, 1, 0); skb = add_grec(skb, pmc, dtype, 1, 1); } if (pmc->crcount) { if (pmc->sfmode == MCAST_EXCLUDE) { type = IGMPV3_CHANGE_TO_INCLUDE; skb = add_grec(skb, pmc, type, 1, 0); } pmc->crcount--; if (pmc->crcount == 0) { igmpv3_clear_zeros(&pmc->tomb); igmpv3_clear_zeros(&pmc->sources); } } if (pmc->crcount == 0 && !pmc->tomb && !pmc->sources) { if (pmc_prev) pmc_prev->next = pmc_next; else in_dev->mc_tomb = pmc_next; in_dev_put(pmc->interface); kfree_pmc(pmc); } else pmc_prev = pmc; } spin_unlock_bh(&in_dev->mc_tomb_lock); /* change recs */ for_each_pmc_rcu(in_dev, pmc) { spin_lock_bh(&pmc->lock); if (pmc->sfcount[MCAST_EXCLUDE]) { type = IGMPV3_BLOCK_OLD_SOURCES; dtype = IGMPV3_ALLOW_NEW_SOURCES; } else { type = IGMPV3_ALLOW_NEW_SOURCES; dtype = IGMPV3_BLOCK_OLD_SOURCES; } skb = add_grec(skb, pmc, type, 0, 0); skb = add_grec(skb, pmc, dtype, 0, 1); /* deleted sources */ /* filter mode changes */ if (pmc->crcount) { if (pmc->sfmode == MCAST_EXCLUDE) type = IGMPV3_CHANGE_TO_EXCLUDE; else type = IGMPV3_CHANGE_TO_INCLUDE; skb = add_grec(skb, pmc, type, 0, 0); pmc->crcount--; } spin_unlock_bh(&pmc->lock); } rcu_read_unlock(); if (!skb) return; (void) igmpv3_sendpack(skb); } static int igmp_send_report(struct in_device *in_dev, struct ip_mc_list *pmc, int type) { struct sk_buff *skb; struct iphdr *iph; struct igmphdr *ih; struct rtable *rt; struct net_device *dev = in_dev->dev; struct net *net = dev_net(dev); __be32 group = pmc ? pmc->multiaddr : 0; struct flowi4 fl4; __be32 dst; int hlen, tlen; if (type == IGMPV3_HOST_MEMBERSHIP_REPORT) return igmpv3_send_report(in_dev, pmc); if (ipv4_is_local_multicast(group) && !READ_ONCE(net->ipv4.sysctl_igmp_llm_reports)) return 0; if (type == IGMP_HOST_LEAVE_MESSAGE) dst = IGMP_ALL_ROUTER; else dst = group; rt = ip_route_output_ports(net, &fl4, NULL, dst, 0, 0, 0, IPPROTO_IGMP, 0, dev->ifindex); if (IS_ERR(rt)) return -1; hlen = LL_RESERVED_SPACE(dev); tlen = dev->needed_tailroom; skb = alloc_skb(IGMP_SIZE + hlen + tlen, GFP_ATOMIC); if (!skb) { ip_rt_put(rt); return -1; } skb->priority = TC_PRIO_CONTROL; skb_dst_set(skb, &rt->dst); skb_reserve(skb, hlen); skb_reset_network_header(skb); iph = ip_hdr(skb); skb_put(skb, sizeof(struct iphdr) + 4); iph->version = 4; iph->ihl = (sizeof(struct iphdr)+4)>>2; iph->tos = 0xc0; iph->frag_off = htons(IP_DF); iph->ttl = 1; iph->daddr = dst; iph->saddr = fl4.saddr; iph->protocol = IPPROTO_IGMP; ip_select_ident(net, skb, NULL); ((u8 *)&iph[1])[0] = IPOPT_RA; ((u8 *)&iph[1])[1] = 4; ((u8 *)&iph[1])[2] = 0; ((u8 *)&iph[1])[3] = 0; ih = skb_put(skb, sizeof(struct igmphdr)); ih->type = type; ih->code = 0; ih->csum = 0; ih->group = group; ih->csum = ip_compute_csum((void *)ih, sizeof(struct igmphdr)); return ip_local_out(net, skb->sk, skb); } static void igmp_gq_timer_expire(struct timer_list *t) { struct in_device *in_dev = from_timer(in_dev, t, mr_gq_timer); in_dev->mr_gq_running = 0; igmpv3_send_report(in_dev, NULL); in_dev_put(in_dev); } static void igmp_ifc_timer_expire(struct timer_list *t) { struct in_device *in_dev = from_timer(in_dev, t, mr_ifc_timer); u32 mr_ifc_count; igmpv3_send_cr(in_dev); restart: mr_ifc_count = READ_ONCE(in_dev->mr_ifc_count); if (mr_ifc_count) { if (cmpxchg(&in_dev->mr_ifc_count, mr_ifc_count, mr_ifc_count - 1) != mr_ifc_count) goto restart; igmp_ifc_start_timer(in_dev, unsolicited_report_interval(in_dev)); } in_dev_put(in_dev); } static void igmp_ifc_event(struct in_device *in_dev) { struct net *net = dev_net(in_dev->dev); if (IGMP_V1_SEEN(in_dev) || IGMP_V2_SEEN(in_dev)) return; WRITE_ONCE(in_dev->mr_ifc_count, in_dev->mr_qrv ?: READ_ONCE(net->ipv4.sysctl_igmp_qrv)); igmp_ifc_start_timer(in_dev, 1); } static void igmp_timer_expire(struct timer_list *t) { struct ip_mc_list *im = from_timer(im, t, timer); struct in_device *in_dev = im->interface; spin_lock(&im->lock); im->tm_running = 0; if (im->unsolicit_count && --im->unsolicit_count) igmp_start_timer(im, unsolicited_report_interval(in_dev)); im->reporter = 1; spin_unlock(&im->lock); if (IGMP_V1_SEEN(in_dev)) igmp_send_report(in_dev, im, IGMP_HOST_MEMBERSHIP_REPORT); else if (IGMP_V2_SEEN(in_dev)) igmp_send_report(in_dev, im, IGMPV2_HOST_MEMBERSHIP_REPORT); else igmp_send_report(in_dev, im, IGMPV3_HOST_MEMBERSHIP_REPORT); ip_ma_put(im); } /* mark EXCLUDE-mode sources */ static int igmp_xmarksources(struct ip_mc_list *pmc, int nsrcs, __be32 *srcs) { struct ip_sf_list *psf; int i, scount; scount = 0; for (psf = pmc->sources; psf; psf = psf->sf_next) { if (scount == nsrcs) break; for (i = 0; i < nsrcs; i++) { /* skip inactive filters */ if (psf->sf_count[MCAST_INCLUDE] || pmc->sfcount[MCAST_EXCLUDE] != psf->sf_count[MCAST_EXCLUDE]) break; if (srcs[i] == psf->sf_inaddr) { scount++; break; } } } pmc->gsquery = 0; if (scount == nsrcs) /* all sources excluded */ return 0; return 1; } static int igmp_marksources(struct ip_mc_list *pmc, int nsrcs, __be32 *srcs) { struct ip_sf_list *psf; int i, scount; if (pmc->sfmode == MCAST_EXCLUDE) return igmp_xmarksources(pmc, nsrcs, srcs); /* mark INCLUDE-mode sources */ scount = 0; for (psf = pmc->sources; psf; psf = psf->sf_next) { if (scount == nsrcs) break; for (i = 0; i < nsrcs; i++) if (srcs[i] == psf->sf_inaddr) { psf->sf_gsresp = 1; scount++; break; } } if (!scount) { pmc->gsquery = 0; return 0; } pmc->gsquery = 1; return 1; } /* return true if packet was dropped */ static bool igmp_heard_report(struct in_device *in_dev, __be32 group) { struct ip_mc_list *im; struct net *net = dev_net(in_dev->dev); /* Timers are only set for non-local groups */ if (group == IGMP_ALL_HOSTS) return false; if (ipv4_is_local_multicast(group) && !READ_ONCE(net->ipv4.sysctl_igmp_llm_reports)) return false; rcu_read_lock(); for_each_pmc_rcu(in_dev, im) { if (im->multiaddr == group) { igmp_stop_timer(im); break; } } rcu_read_unlock(); return false; } /* return true if packet was dropped */ static bool igmp_heard_query(struct in_device *in_dev, struct sk_buff *skb, int len) { struct igmphdr *ih = igmp_hdr(skb); struct igmpv3_query *ih3 = igmpv3_query_hdr(skb); struct ip_mc_list *im; __be32 group = ih->group; int max_delay; int mark = 0; struct net *net = dev_net(in_dev->dev); if (len == 8) { if (ih->code == 0) { /* Alas, old v1 router presents here. */ max_delay = IGMP_QUERY_RESPONSE_INTERVAL; in_dev->mr_v1_seen = jiffies + (in_dev->mr_qrv * in_dev->mr_qi) + in_dev->mr_qri; group = 0; } else { /* v2 router present */ max_delay = ih->code*(HZ/IGMP_TIMER_SCALE); in_dev->mr_v2_seen = jiffies + (in_dev->mr_qrv * in_dev->mr_qi) + in_dev->mr_qri; } /* cancel the interface change timer */ WRITE_ONCE(in_dev->mr_ifc_count, 0); if (del_timer(&in_dev->mr_ifc_timer)) __in_dev_put(in_dev); /* clear deleted report items */ igmpv3_clear_delrec(in_dev); } else if (len < 12) { return true; /* ignore bogus packet; freed by caller */ } else if (IGMP_V1_SEEN(in_dev)) { /* This is a v3 query with v1 queriers present */ max_delay = IGMP_QUERY_RESPONSE_INTERVAL; group = 0; } else if (IGMP_V2_SEEN(in_dev)) { /* this is a v3 query with v2 queriers present; * Interpretation of the max_delay code is problematic here. * A real v2 host would use ih_code directly, while v3 has a * different encoding. We use the v3 encoding as more likely * to be intended in a v3 query. */ max_delay = IGMPV3_MRC(ih3->code)*(HZ/IGMP_TIMER_SCALE); if (!max_delay) max_delay = 1; /* can't mod w/ 0 */ } else { /* v3 */ if (!pskb_may_pull(skb, sizeof(struct igmpv3_query))) return true; ih3 = igmpv3_query_hdr(skb); if (ih3->nsrcs) { if (!pskb_may_pull(skb, sizeof(struct igmpv3_query) + ntohs(ih3->nsrcs)*sizeof(__be32))) return true; ih3 = igmpv3_query_hdr(skb); } max_delay = IGMPV3_MRC(ih3->code)*(HZ/IGMP_TIMER_SCALE); if (!max_delay) max_delay = 1; /* can't mod w/ 0 */ in_dev->mr_maxdelay = max_delay; /* RFC3376, 4.1.6. QRV and 4.1.7. QQIC, when the most recently * received value was zero, use the default or statically * configured value. */ in_dev->mr_qrv = ih3->qrv ?: READ_ONCE(net->ipv4.sysctl_igmp_qrv); in_dev->mr_qi = IGMPV3_QQIC(ih3->qqic)*HZ ?: IGMP_QUERY_INTERVAL; /* RFC3376, 8.3. Query Response Interval: * The number of seconds represented by the [Query Response * Interval] must be less than the [Query Interval]. */ if (in_dev->mr_qri >= in_dev->mr_qi) in_dev->mr_qri = (in_dev->mr_qi/HZ - 1)*HZ; if (!group) { /* general query */ if (ih3->nsrcs) return true; /* no sources allowed */ igmp_gq_start_timer(in_dev); return false; } /* mark sources to include, if group & source-specific */ mark = ih3->nsrcs != 0; } /* * - Start the timers in all of our membership records * that the query applies to for the interface on * which the query arrived excl. those that belong * to a "local" group (224.0.0.X) * - For timers already running check if they need to * be reset. * - Use the igmp->igmp_code field as the maximum * delay possible */ rcu_read_lock(); for_each_pmc_rcu(in_dev, im) { int changed; if (group && group != im->multiaddr) continue; if (im->multiaddr == IGMP_ALL_HOSTS) continue; if (ipv4_is_local_multicast(im->multiaddr) && !READ_ONCE(net->ipv4.sysctl_igmp_llm_reports)) continue; spin_lock_bh(&im->lock); if (im->tm_running) im->gsquery = im->gsquery && mark; else im->gsquery = mark; changed = !im->gsquery || igmp_marksources(im, ntohs(ih3->nsrcs), ih3->srcs); spin_unlock_bh(&im->lock); if (changed) igmp_mod_timer(im, max_delay); } rcu_read_unlock(); return false; } /* called in rcu_read_lock() section */ int igmp_rcv(struct sk_buff *skb) { /* This basically follows the spec line by line -- see RFC1112 */ struct igmphdr *ih; struct net_device *dev = skb->dev; struct in_device *in_dev; int len = skb->len; bool dropped = true; if (netif_is_l3_master(dev)) { dev = dev_get_by_index_rcu(dev_net(dev), IPCB(skb)->iif); if (!dev) goto drop; } in_dev = __in_dev_get_rcu(dev); if (!in_dev) goto drop; if (!pskb_may_pull(skb, sizeof(struct igmphdr))) goto drop; if (skb_checksum_simple_validate(skb)) goto drop; ih = igmp_hdr(skb); switch (ih->type) { case IGMP_HOST_MEMBERSHIP_QUERY: dropped = igmp_heard_query(in_dev, skb, len); break; case IGMP_HOST_MEMBERSHIP_REPORT: case IGMPV2_HOST_MEMBERSHIP_REPORT: /* Is it our report looped back? */ if (rt_is_output_route(skb_rtable(skb))) break; /* don't rely on MC router hearing unicast reports */ if (skb->pkt_type == PACKET_MULTICAST || skb->pkt_type == PACKET_BROADCAST) dropped = igmp_heard_report(in_dev, ih->group); break; case IGMP_PIM: #ifdef CONFIG_IP_PIMSM_V1 return pim_rcv_v1(skb); #endif case IGMPV3_HOST_MEMBERSHIP_REPORT: case IGMP_DVMRP: case IGMP_TRACE: case IGMP_HOST_LEAVE_MESSAGE: case IGMP_MTRACE: case IGMP_MTRACE_RESP: break; default: break; } drop: if (dropped) kfree_skb(skb); else consume_skb(skb); return 0; } #endif /* * Add a filter to a device */ static void ip_mc_filter_add(struct in_device *in_dev, __be32 addr) { char buf[MAX_ADDR_LEN]; struct net_device *dev = in_dev->dev; /* Checking for IFF_MULTICAST here is WRONG-WRONG-WRONG. We will get multicast token leakage, when IFF_MULTICAST is changed. This check should be done in ndo_set_rx_mode routine. Something sort of: if (dev->mc_list && dev->flags&IFF_MULTICAST) { do it; } --ANK */ if (arp_mc_map(addr, buf, dev, 0) == 0) dev_mc_add(dev, buf); } /* * Remove a filter from a device */ static void ip_mc_filter_del(struct in_device *in_dev, __be32 addr) { char buf[MAX_ADDR_LEN]; struct net_device *dev = in_dev->dev; if (arp_mc_map(addr, buf, dev, 0) == 0) dev_mc_del(dev, buf); } #ifdef CONFIG_IP_MULTICAST /* * deleted ip_mc_list manipulation */ static void igmpv3_add_delrec(struct in_device *in_dev, struct ip_mc_list *im, gfp_t gfp) { struct ip_mc_list *pmc; struct net *net = dev_net(in_dev->dev); /* this is an "ip_mc_list" for convenience; only the fields below * are actually used. In particular, the refcnt and users are not * used for management of the delete list. Using the same structure * for deleted items allows change reports to use common code with * non-deleted or query-response MCA's. */ pmc = kzalloc(sizeof(*pmc), gfp); if (!pmc) return; spin_lock_init(&pmc->lock); spin_lock_bh(&im->lock); pmc->interface = im->interface; in_dev_hold(in_dev); pmc->multiaddr = im->multiaddr; pmc->crcount = in_dev->mr_qrv ?: READ_ONCE(net->ipv4.sysctl_igmp_qrv); pmc->sfmode = im->sfmode; if (pmc->sfmode == MCAST_INCLUDE) { struct ip_sf_list *psf; pmc->tomb = im->tomb; pmc->sources = im->sources; im->tomb = im->sources = NULL; for (psf = pmc->sources; psf; psf = psf->sf_next) psf->sf_crcount = pmc->crcount; } spin_unlock_bh(&im->lock); spin_lock_bh(&in_dev->mc_tomb_lock); pmc->next = in_dev->mc_tomb; in_dev->mc_tomb = pmc; spin_unlock_bh(&in_dev->mc_tomb_lock); } /* * restore ip_mc_list deleted records */ static void igmpv3_del_delrec(struct in_device *in_dev, struct ip_mc_list *im) { struct ip_mc_list *pmc, *pmc_prev; struct ip_sf_list *psf; struct net *net = dev_net(in_dev->dev); __be32 multiaddr = im->multiaddr; spin_lock_bh(&in_dev->mc_tomb_lock); pmc_prev = NULL; for (pmc = in_dev->mc_tomb; pmc; pmc = pmc->next) { if (pmc->multiaddr == multiaddr) break; pmc_prev = pmc; } if (pmc) { if (pmc_prev) pmc_prev->next = pmc->next; else in_dev->mc_tomb = pmc->next; } spin_unlock_bh(&in_dev->mc_tomb_lock); spin_lock_bh(&im->lock); if (pmc) { im->interface = pmc->interface; if (im->sfmode == MCAST_INCLUDE) { swap(im->tomb, pmc->tomb); swap(im->sources, pmc->sources); for (psf = im->sources; psf; psf = psf->sf_next) psf->sf_crcount = in_dev->mr_qrv ?: READ_ONCE(net->ipv4.sysctl_igmp_qrv); } else { im->crcount = in_dev->mr_qrv ?: READ_ONCE(net->ipv4.sysctl_igmp_qrv); } in_dev_put(pmc->interface); kfree_pmc(pmc); } spin_unlock_bh(&im->lock); } /* * flush ip_mc_list deleted records */ static void igmpv3_clear_delrec(struct in_device *in_dev) { struct ip_mc_list *pmc, *nextpmc; spin_lock_bh(&in_dev->mc_tomb_lock); pmc = in_dev->mc_tomb; in_dev->mc_tomb = NULL; spin_unlock_bh(&in_dev->mc_tomb_lock); for (; pmc; pmc = nextpmc) { nextpmc = pmc->next; ip_mc_clear_src(pmc); in_dev_put(pmc->interface); kfree_pmc(pmc); } /* clear dead sources, too */ rcu_read_lock(); for_each_pmc_rcu(in_dev, pmc) { struct ip_sf_list *psf; spin_lock_bh(&pmc->lock); psf = pmc->tomb; pmc->tomb = NULL; spin_unlock_bh(&pmc->lock); ip_sf_list_clear_all(psf); } rcu_read_unlock(); } #endif static void __igmp_group_dropped(struct ip_mc_list *im, gfp_t gfp) { struct in_device *in_dev = im->interface; #ifdef CONFIG_IP_MULTICAST struct net *net = dev_net(in_dev->dev); int reporter; #endif if (im->loaded) { im->loaded = 0; ip_mc_filter_del(in_dev, im->multiaddr); } #ifdef CONFIG_IP_MULTICAST if (im->multiaddr == IGMP_ALL_HOSTS) return; if (ipv4_is_local_multicast(im->multiaddr) && !READ_ONCE(net->ipv4.sysctl_igmp_llm_reports)) return; reporter = im->reporter; igmp_stop_timer(im); if (!in_dev->dead) { if (IGMP_V1_SEEN(in_dev)) return; if (IGMP_V2_SEEN(in_dev)) { if (reporter) igmp_send_report(in_dev, im, IGMP_HOST_LEAVE_MESSAGE); return; } /* IGMPv3 */ igmpv3_add_delrec(in_dev, im, gfp); igmp_ifc_event(in_dev); } #endif } static void igmp_group_dropped(struct ip_mc_list *im) { __igmp_group_dropped(im, GFP_KERNEL); } static void igmp_group_added(struct ip_mc_list *im) { struct in_device *in_dev = im->interface; #ifdef CONFIG_IP_MULTICAST struct net *net = dev_net(in_dev->dev); #endif if (im->loaded == 0) { im->loaded = 1; ip_mc_filter_add(in_dev, im->multiaddr); } #ifdef CONFIG_IP_MULTICAST if (im->multiaddr == IGMP_ALL_HOSTS) return; if (ipv4_is_local_multicast(im->multiaddr) && !READ_ONCE(net->ipv4.sysctl_igmp_llm_reports)) return; if (in_dev->dead) return; im->unsolicit_count = READ_ONCE(net->ipv4.sysctl_igmp_qrv); if (IGMP_V1_SEEN(in_dev) || IGMP_V2_SEEN(in_dev)) { spin_lock_bh(&im->lock); igmp_start_timer(im, IGMP_INITIAL_REPORT_DELAY); spin_unlock_bh(&im->lock); return; } /* else, v3 */ /* Based on RFC3376 5.1, for newly added INCLUDE SSM, we should * not send filter-mode change record as the mode should be from * IN() to IN(A). */ if (im->sfmode == MCAST_EXCLUDE) im->crcount = in_dev->mr_qrv ?: READ_ONCE(net->ipv4.sysctl_igmp_qrv); igmp_ifc_event(in_dev); #endif } /* * Multicast list managers */ static u32 ip_mc_hash(const struct ip_mc_list *im) { return hash_32((__force u32)im->multiaddr, MC_HASH_SZ_LOG); } static void ip_mc_hash_add(struct in_device *in_dev, struct ip_mc_list *im) { struct ip_mc_list __rcu **mc_hash; u32 hash; mc_hash = rtnl_dereference(in_dev->mc_hash); if (mc_hash) { hash = ip_mc_hash(im); im->next_hash = mc_hash[hash]; rcu_assign_pointer(mc_hash[hash], im); return; } /* do not use a hash table for small number of items */ if (in_dev->mc_count < 4) return; mc_hash = kzalloc(sizeof(struct ip_mc_list *) << MC_HASH_SZ_LOG, GFP_KERNEL); if (!mc_hash) return; for_each_pmc_rtnl(in_dev, im) { hash = ip_mc_hash(im); im->next_hash = mc_hash[hash]; RCU_INIT_POINTER(mc_hash[hash], im); } rcu_assign_pointer(in_dev->mc_hash, mc_hash); } static void ip_mc_hash_remove(struct in_device *in_dev, struct ip_mc_list *im) { struct ip_mc_list __rcu **mc_hash = rtnl_dereference(in_dev->mc_hash); struct ip_mc_list *aux; if (!mc_hash) return; mc_hash += ip_mc_hash(im); while ((aux = rtnl_dereference(*mc_hash)) != im) mc_hash = &aux->next_hash; *mc_hash = im->next_hash; } /* * A socket has joined a multicast group on device dev. */ static void ____ip_mc_inc_group(struct in_device *in_dev, __be32 addr, unsigned int mode, gfp_t gfp) { struct ip_mc_list *im; ASSERT_RTNL(); for_each_pmc_rtnl(in_dev, im) { if (im->multiaddr == addr) { im->users++; ip_mc_add_src(in_dev, &addr, mode, 0, NULL, 0); goto out; } } im = kzalloc(sizeof(*im), gfp); if (!im) goto out; im->users = 1; im->interface = in_dev; in_dev_hold(in_dev); im->multiaddr = addr; /* initial mode is (EX, empty) */ im->sfmode = mode; im->sfcount[mode] = 1; refcount_set(&im->refcnt, 1); spin_lock_init(&im->lock); #ifdef CONFIG_IP_MULTICAST timer_setup(&im->timer, igmp_timer_expire, 0); #endif im->next_rcu = in_dev->mc_list; in_dev->mc_count++; rcu_assign_pointer(in_dev->mc_list, im); ip_mc_hash_add(in_dev, im); #ifdef CONFIG_IP_MULTICAST igmpv3_del_delrec(in_dev, im); #endif igmp_group_added(im); if (!in_dev->dead) ip_rt_multicast_event(in_dev); out: return; } void __ip_mc_inc_group(struct in_device *in_dev, __be32 addr, gfp_t gfp) { ____ip_mc_inc_group(in_dev, addr, MCAST_EXCLUDE, gfp); } EXPORT_SYMBOL(__ip_mc_inc_group); void ip_mc_inc_group(struct in_device *in_dev, __be32 addr) { __ip_mc_inc_group(in_dev, addr, GFP_KERNEL); } EXPORT_SYMBOL(ip_mc_inc_group); static int ip_mc_check_iphdr(struct sk_buff *skb) { const struct iphdr *iph; unsigned int len; unsigned int offset = skb_network_offset(skb) + sizeof(*iph); if (!pskb_may_pull(skb, offset)) return -EINVAL; iph = ip_hdr(skb); if (iph->version != 4 || ip_hdrlen(skb) < sizeof(*iph)) return -EINVAL; offset += ip_hdrlen(skb) - sizeof(*iph); if (!pskb_may_pull(skb, offset)) return -EINVAL; iph = ip_hdr(skb); if (unlikely(ip_fast_csum((u8 *)iph, iph->ihl))) return -EINVAL; len = skb_network_offset(skb) + ntohs(iph->tot_len); if (skb->len < len || len < offset) return -EINVAL; skb_set_transport_header(skb, offset); return 0; } static int ip_mc_check_igmp_reportv3(struct sk_buff *skb) { unsigned int len = skb_transport_offset(skb); len += sizeof(struct igmpv3_report); return ip_mc_may_pull(skb, len) ? 0 : -EINVAL; } static int ip_mc_check_igmp_query(struct sk_buff *skb) { unsigned int transport_len = ip_transport_len(skb); unsigned int len; /* IGMPv{1,2}? */ if (transport_len != sizeof(struct igmphdr)) { /* or IGMPv3? */ if (transport_len < sizeof(struct igmpv3_query)) return -EINVAL; len = skb_transport_offset(skb) + sizeof(struct igmpv3_query); if (!ip_mc_may_pull(skb, len)) return -EINVAL; } /* RFC2236+RFC3376 (IGMPv2+IGMPv3) require the multicast link layer * all-systems destination addresses (224.0.0.1) for general queries */ if (!igmp_hdr(skb)->group && ip_hdr(skb)->daddr != htonl(INADDR_ALLHOSTS_GROUP)) return -EINVAL; return 0; } static int ip_mc_check_igmp_msg(struct sk_buff *skb) { switch (igmp_hdr(skb)->type) { case IGMP_HOST_LEAVE_MESSAGE: case IGMP_HOST_MEMBERSHIP_REPORT: case IGMPV2_HOST_MEMBERSHIP_REPORT: return 0; case IGMPV3_HOST_MEMBERSHIP_REPORT: return ip_mc_check_igmp_reportv3(skb); case IGMP_HOST_MEMBERSHIP_QUERY: return ip_mc_check_igmp_query(skb); default: return -ENOMSG; } } static __sum16 ip_mc_validate_checksum(struct sk_buff *skb) { return skb_checksum_simple_validate(skb); } static int ip_mc_check_igmp_csum(struct sk_buff *skb) { unsigned int len = skb_transport_offset(skb) + sizeof(struct igmphdr); unsigned int transport_len = ip_transport_len(skb); struct sk_buff *skb_chk; if (!ip_mc_may_pull(skb, len)) return -EINVAL; skb_chk = skb_checksum_trimmed(skb, transport_len, ip_mc_validate_checksum); if (!skb_chk) return -EINVAL; if (skb_chk != skb) kfree_skb(skb_chk); return 0; } /** * ip_mc_check_igmp - checks whether this is a sane IGMP packet * @skb: the skb to validate * * Checks whether an IPv4 packet is a valid IGMP packet. If so sets * skb transport header accordingly and returns zero. * * -EINVAL: A broken packet was detected, i.e. it violates some internet * standard * -ENOMSG: IP header validation succeeded but it is not an IGMP packet. * -ENOMEM: A memory allocation failure happened. * * Caller needs to set the skb network header and free any returned skb if it * differs from the provided skb. */ int ip_mc_check_igmp(struct sk_buff *skb) { int ret = ip_mc_check_iphdr(skb); if (ret < 0) return ret; if (ip_hdr(skb)->protocol != IPPROTO_IGMP) return -ENOMSG; ret = ip_mc_check_igmp_csum(skb); if (ret < 0) return ret; return ip_mc_check_igmp_msg(skb); } EXPORT_SYMBOL(ip_mc_check_igmp); /* * Resend IGMP JOIN report; used by netdev notifier. */ static void ip_mc_rejoin_groups(struct in_device *in_dev) { #ifdef CONFIG_IP_MULTICAST struct ip_mc_list *im; int type; struct net *net = dev_net(in_dev->dev); ASSERT_RTNL(); for_each_pmc_rtnl(in_dev, im) { if (im->multiaddr == IGMP_ALL_HOSTS) continue; if (ipv4_is_local_multicast(im->multiaddr) && !READ_ONCE(net->ipv4.sysctl_igmp_llm_reports)) continue; /* a failover is happening and switches * must be notified immediately */ if (IGMP_V1_SEEN(in_dev)) type = IGMP_HOST_MEMBERSHIP_REPORT; else if (IGMP_V2_SEEN(in_dev)) type = IGMPV2_HOST_MEMBERSHIP_REPORT; else type = IGMPV3_HOST_MEMBERSHIP_REPORT; igmp_send_report(in_dev, im, type); } #endif } /* * A socket has left a multicast group on device dev */ void __ip_mc_dec_group(struct in_device *in_dev, __be32 addr, gfp_t gfp) { struct ip_mc_list *i; struct ip_mc_list __rcu **ip; ASSERT_RTNL(); for (ip = &in_dev->mc_list; (i = rtnl_dereference(*ip)) != NULL; ip = &i->next_rcu) { if (i->multiaddr == addr) { if (--i->users == 0) { ip_mc_hash_remove(in_dev, i); *ip = i->next_rcu; in_dev->mc_count--; __igmp_group_dropped(i, gfp); ip_mc_clear_src(i); if (!in_dev->dead) ip_rt_multicast_event(in_dev); ip_ma_put(i); return; } break; } } } EXPORT_SYMBOL(__ip_mc_dec_group); /* Device changing type */ void ip_mc_unmap(struct in_device *in_dev) { struct ip_mc_list *pmc; ASSERT_RTNL(); for_each_pmc_rtnl(in_dev, pmc) igmp_group_dropped(pmc); } void ip_mc_remap(struct in_device *in_dev) { struct ip_mc_list *pmc; ASSERT_RTNL(); for_each_pmc_rtnl(in_dev, pmc) { #ifdef CONFIG_IP_MULTICAST igmpv3_del_delrec(in_dev, pmc); #endif igmp_group_added(pmc); } } /* Device going down */ void ip_mc_down(struct in_device *in_dev) { struct ip_mc_list *pmc; ASSERT_RTNL(); for_each_pmc_rtnl(in_dev, pmc) igmp_group_dropped(pmc); #ifdef CONFIG_IP_MULTICAST WRITE_ONCE(in_dev->mr_ifc_count, 0); if (del_timer(&in_dev->mr_ifc_timer)) __in_dev_put(in_dev); in_dev->mr_gq_running = 0; if (del_timer(&in_dev->mr_gq_timer)) __in_dev_put(in_dev); #endif ip_mc_dec_group(in_dev, IGMP_ALL_HOSTS); } #ifdef CONFIG_IP_MULTICAST static void ip_mc_reset(struct in_device *in_dev) { struct net *net = dev_net(in_dev->dev); in_dev->mr_qi = IGMP_QUERY_INTERVAL; in_dev->mr_qri = IGMP_QUERY_RESPONSE_INTERVAL; in_dev->mr_qrv = READ_ONCE(net->ipv4.sysctl_igmp_qrv); } #else static void ip_mc_reset(struct in_device *in_dev) { } #endif void ip_mc_init_dev(struct in_device *in_dev) { ASSERT_RTNL(); #ifdef CONFIG_IP_MULTICAST timer_setup(&in_dev->mr_gq_timer, igmp_gq_timer_expire, 0); timer_setup(&in_dev->mr_ifc_timer, igmp_ifc_timer_expire, 0); #endif ip_mc_reset(in_dev); spin_lock_init(&in_dev->mc_tomb_lock); } /* Device going up */ void ip_mc_up(struct in_device *in_dev) { struct ip_mc_list *pmc; ASSERT_RTNL(); ip_mc_reset(in_dev); ip_mc_inc_group(in_dev, IGMP_ALL_HOSTS); for_each_pmc_rtnl(in_dev, pmc) { #ifdef CONFIG_IP_MULTICAST igmpv3_del_delrec(in_dev, pmc); #endif igmp_group_added(pmc); } } /* * Device is about to be destroyed: clean up. */ void ip_mc_destroy_dev(struct in_device *in_dev) { struct ip_mc_list *i; ASSERT_RTNL(); /* Deactivate timers */ ip_mc_down(in_dev); #ifdef CONFIG_IP_MULTICAST igmpv3_clear_delrec(in_dev); #endif while ((i = rtnl_dereference(in_dev->mc_list)) != NULL) { in_dev->mc_list = i->next_rcu; in_dev->mc_count--; ip_mc_clear_src(i); ip_ma_put(i); } } /* RTNL is locked */ static struct in_device *ip_mc_find_dev(struct net *net, struct ip_mreqn *imr) { struct net_device *dev = NULL; struct in_device *idev = NULL; if (imr->imr_ifindex) { idev = inetdev_by_index(net, imr->imr_ifindex); return idev; } if (imr->imr_address.s_addr) { dev = __ip_dev_find(net, imr->imr_address.s_addr, false); if (!dev) return NULL; } if (!dev) { struct rtable *rt = ip_route_output(net, imr->imr_multiaddr.s_addr, 0, 0, 0, RT_SCOPE_UNIVERSE); if (!IS_ERR(rt)) { dev = rt->dst.dev; ip_rt_put(rt); } } if (dev) { imr->imr_ifindex = dev->ifindex; idev = __in_dev_get_rtnl(dev); } return idev; } /* * Join a socket to a group */ static int ip_mc_del1_src(struct ip_mc_list *pmc, int sfmode, __be32 *psfsrc) { struct ip_sf_list *psf, *psf_prev; int rv = 0; psf_prev = NULL; for (psf = pmc->sources; psf; psf = psf->sf_next) { if (psf->sf_inaddr == *psfsrc) break; psf_prev = psf; } if (!psf || psf->sf_count[sfmode] == 0) { /* source filter not found, or count wrong => bug */ return -ESRCH; } psf->sf_count[sfmode]--; if (psf->sf_count[sfmode] == 0) { ip_rt_multicast_event(pmc->interface); } if (!psf->sf_count[MCAST_INCLUDE] && !psf->sf_count[MCAST_EXCLUDE]) { #ifdef CONFIG_IP_MULTICAST struct in_device *in_dev = pmc->interface; struct net *net = dev_net(in_dev->dev); #endif /* no more filters for this source */ if (psf_prev) psf_prev->sf_next = psf->sf_next; else pmc->sources = psf->sf_next; #ifdef CONFIG_IP_MULTICAST if (psf->sf_oldin && !IGMP_V1_SEEN(in_dev) && !IGMP_V2_SEEN(in_dev)) { psf->sf_crcount = in_dev->mr_qrv ?: READ_ONCE(net->ipv4.sysctl_igmp_qrv); psf->sf_next = pmc->tomb; pmc->tomb = psf; rv = 1; } else #endif kfree(psf); } return rv; } #ifndef CONFIG_IP_MULTICAST #define igmp_ifc_event(x) do { } while (0) #endif static int ip_mc_del_src(struct in_device *in_dev, __be32 *pmca, int sfmode, int sfcount, __be32 *psfsrc, int delta) { struct ip_mc_list *pmc; int changerec = 0; int i, err; if (!in_dev) return -ENODEV; rcu_read_lock(); for_each_pmc_rcu(in_dev, pmc) { if (*pmca == pmc->multiaddr) break; } if (!pmc) { /* MCA not found?? bug */ rcu_read_unlock(); return -ESRCH; } spin_lock_bh(&pmc->lock); rcu_read_unlock(); #ifdef CONFIG_IP_MULTICAST sf_markstate(pmc); #endif if (!delta) { err = -EINVAL; if (!pmc->sfcount[sfmode]) goto out_unlock; pmc->sfcount[sfmode]--; } err = 0; for (i = 0; i < sfcount; i++) { int rv = ip_mc_del1_src(pmc, sfmode, &psfsrc[i]); changerec |= rv > 0; if (!err && rv < 0) err = rv; } if (pmc->sfmode == MCAST_EXCLUDE && pmc->sfcount[MCAST_EXCLUDE] == 0 && pmc->sfcount[MCAST_INCLUDE]) { #ifdef CONFIG_IP_MULTICAST struct ip_sf_list *psf; struct net *net = dev_net(in_dev->dev); #endif /* filter mode change */ pmc->sfmode = MCAST_INCLUDE; #ifdef CONFIG_IP_MULTICAST pmc->crcount = in_dev->mr_qrv ?: READ_ONCE(net->ipv4.sysctl_igmp_qrv); WRITE_ONCE(in_dev->mr_ifc_count, pmc->crcount); for (psf = pmc->sources; psf; psf = psf->sf_next) psf->sf_crcount = 0; igmp_ifc_event(pmc->interface); } else if (sf_setstate(pmc) || changerec) { igmp_ifc_event(pmc->interface); #endif } out_unlock: spin_unlock_bh(&pmc->lock); return err; } /* * Add multicast single-source filter to the interface list */ static int ip_mc_add1_src(struct ip_mc_list *pmc, int sfmode, __be32 *psfsrc) { struct ip_sf_list *psf, *psf_prev; psf_prev = NULL; for (psf = pmc->sources; psf; psf = psf->sf_next) { if (psf->sf_inaddr == *psfsrc) break; psf_prev = psf; } if (!psf) { psf = kzalloc(sizeof(*psf), GFP_ATOMIC); if (!psf) return -ENOBUFS; psf->sf_inaddr = *psfsrc; if (psf_prev) { psf_prev->sf_next = psf; } else pmc->sources = psf; } psf->sf_count[sfmode]++; if (psf->sf_count[sfmode] == 1) { ip_rt_multicast_event(pmc->interface); } return 0; } #ifdef CONFIG_IP_MULTICAST static void sf_markstate(struct ip_mc_list *pmc) { struct ip_sf_list *psf; int mca_xcount = pmc->sfcount[MCAST_EXCLUDE]; for (psf = pmc->sources; psf; psf = psf->sf_next) if (pmc->sfcount[MCAST_EXCLUDE]) { psf->sf_oldin = mca_xcount == psf->sf_count[MCAST_EXCLUDE] && !psf->sf_count[MCAST_INCLUDE]; } else psf->sf_oldin = psf->sf_count[MCAST_INCLUDE] != 0; } static int sf_setstate(struct ip_mc_list *pmc) { struct ip_sf_list *psf, *dpsf; int mca_xcount = pmc->sfcount[MCAST_EXCLUDE]; int qrv = pmc->interface->mr_qrv; int new_in, rv; rv = 0; for (psf = pmc->sources; psf; psf = psf->sf_next) { if (pmc->sfcount[MCAST_EXCLUDE]) { new_in = mca_xcount == psf->sf_count[MCAST_EXCLUDE] && !psf->sf_count[MCAST_INCLUDE]; } else new_in = psf->sf_count[MCAST_INCLUDE] != 0; if (new_in) { if (!psf->sf_oldin) { struct ip_sf_list *prev = NULL; for (dpsf = pmc->tomb; dpsf; dpsf = dpsf->sf_next) { if (dpsf->sf_inaddr == psf->sf_inaddr) break; prev = dpsf; } if (dpsf) { if (prev) prev->sf_next = dpsf->sf_next; else pmc->tomb = dpsf->sf_next; kfree(dpsf); } psf->sf_crcount = qrv; rv++; } } else if (psf->sf_oldin) { psf->sf_crcount = 0; /* * add or update "delete" records if an active filter * is now inactive */ for (dpsf = pmc->tomb; dpsf; dpsf = dpsf->sf_next) if (dpsf->sf_inaddr == psf->sf_inaddr) break; if (!dpsf) { dpsf = kmalloc(sizeof(*dpsf), GFP_ATOMIC); if (!dpsf) continue; *dpsf = *psf; /* pmc->lock held by callers */ dpsf->sf_next = pmc->tomb; pmc->tomb = dpsf; } dpsf->sf_crcount = qrv; rv++; } } return rv; } #endif /* * Add multicast source filter list to the interface list */ static int ip_mc_add_src(struct in_device *in_dev, __be32 *pmca, int sfmode, int sfcount, __be32 *psfsrc, int delta) { struct ip_mc_list *pmc; int isexclude; int i, err; if (!in_dev) return -ENODEV; rcu_read_lock(); for_each_pmc_rcu(in_dev, pmc) { if (*pmca == pmc->multiaddr) break; } if (!pmc) { /* MCA not found?? bug */ rcu_read_unlock(); return -ESRCH; } spin_lock_bh(&pmc->lock); rcu_read_unlock(); #ifdef CONFIG_IP_MULTICAST sf_markstate(pmc); #endif isexclude = pmc->sfmode == MCAST_EXCLUDE; if (!delta) pmc->sfcount[sfmode]++; err = 0; for (i = 0; i < sfcount; i++) { err = ip_mc_add1_src(pmc, sfmode, &psfsrc[i]); if (err) break; } if (err) { int j; if (!delta) pmc->sfcount[sfmode]--; for (j = 0; j < i; j++) (void) ip_mc_del1_src(pmc, sfmode, &psfsrc[j]); } else if (isexclude != (pmc->sfcount[MCAST_EXCLUDE] != 0)) { #ifdef CONFIG_IP_MULTICAST struct ip_sf_list *psf; struct net *net = dev_net(pmc->interface->dev); in_dev = pmc->interface; #endif /* filter mode change */ if (pmc->sfcount[MCAST_EXCLUDE]) pmc->sfmode = MCAST_EXCLUDE; else if (pmc->sfcount[MCAST_INCLUDE]) pmc->sfmode = MCAST_INCLUDE; #ifdef CONFIG_IP_MULTICAST /* else no filters; keep old mode for reports */ pmc->crcount = in_dev->mr_qrv ?: READ_ONCE(net->ipv4.sysctl_igmp_qrv); WRITE_ONCE(in_dev->mr_ifc_count, pmc->crcount); for (psf = pmc->sources; psf; psf = psf->sf_next) psf->sf_crcount = 0; igmp_ifc_event(in_dev); } else if (sf_setstate(pmc)) { igmp_ifc_event(in_dev); #endif } spin_unlock_bh(&pmc->lock); return err; } static void ip_mc_clear_src(struct ip_mc_list *pmc) { struct ip_sf_list *tomb, *sources; spin_lock_bh(&pmc->lock); tomb = pmc->tomb; pmc->tomb = NULL; sources = pmc->sources; pmc->sources = NULL; pmc->sfmode = MCAST_EXCLUDE; pmc->sfcount[MCAST_INCLUDE] = 0; pmc->sfcount[MCAST_EXCLUDE] = 1; spin_unlock_bh(&pmc->lock); ip_sf_list_clear_all(tomb); ip_sf_list_clear_all(sources); } /* Join a multicast group */ static int __ip_mc_join_group(struct sock *sk, struct ip_mreqn *imr, unsigned int mode) { __be32 addr = imr->imr_multiaddr.s_addr; struct ip_mc_socklist *iml, *i; struct in_device *in_dev; struct inet_sock *inet = inet_sk(sk); struct net *net = sock_net(sk); int ifindex; int count = 0; int err; ASSERT_RTNL(); if (!ipv4_is_multicast(addr)) return -EINVAL; in_dev = ip_mc_find_dev(net, imr); if (!in_dev) { err = -ENODEV; goto done; } err = -EADDRINUSE; ifindex = imr->imr_ifindex; for_each_pmc_rtnl(inet, i) { if (i->multi.imr_multiaddr.s_addr == addr && i->multi.imr_ifindex == ifindex) goto done; count++; } err = -ENOBUFS; if (count >= READ_ONCE(net->ipv4.sysctl_igmp_max_memberships)) goto done; iml = sock_kmalloc(sk, sizeof(*iml), GFP_KERNEL); if (!iml) goto done; memcpy(&iml->multi, imr, sizeof(*imr)); iml->next_rcu = inet->mc_list; iml->sflist = NULL; iml->sfmode = mode; rcu_assign_pointer(inet->mc_list, iml); ____ip_mc_inc_group(in_dev, addr, mode, GFP_KERNEL); err = 0; done: return err; } /* Join ASM (Any-Source Multicast) group */ int ip_mc_join_group(struct sock *sk, struct ip_mreqn *imr) { return __ip_mc_join_group(sk, imr, MCAST_EXCLUDE); } EXPORT_SYMBOL(ip_mc_join_group); /* Join SSM (Source-Specific Multicast) group */ int ip_mc_join_group_ssm(struct sock *sk, struct ip_mreqn *imr, unsigned int mode) { return __ip_mc_join_group(sk, imr, mode); } static int ip_mc_leave_src(struct sock *sk, struct ip_mc_socklist *iml, struct in_device *in_dev) { struct ip_sf_socklist *psf = rtnl_dereference(iml->sflist); int err; if (!psf) { /* any-source empty exclude case */ return ip_mc_del_src(in_dev, &iml->multi.imr_multiaddr.s_addr, iml->sfmode, 0, NULL, 0); } err = ip_mc_del_src(in_dev, &iml->multi.imr_multiaddr.s_addr, iml->sfmode, psf->sl_count, psf->sl_addr, 0); RCU_INIT_POINTER(iml->sflist, NULL); /* decrease mem now to avoid the memleak warning */ atomic_sub(struct_size(psf, sl_addr, psf->sl_max), &sk->sk_omem_alloc); kfree_rcu(psf, rcu); return err; } int ip_mc_leave_group(struct sock *sk, struct ip_mreqn *imr) { struct inet_sock *inet = inet_sk(sk); struct ip_mc_socklist *iml; struct ip_mc_socklist __rcu **imlp; struct in_device *in_dev; struct net *net = sock_net(sk); __be32 group = imr->imr_multiaddr.s_addr; u32 ifindex; int ret = -EADDRNOTAVAIL; ASSERT_RTNL(); in_dev = ip_mc_find_dev(net, imr); if (!imr->imr_ifindex && !imr->imr_address.s_addr && !in_dev) { ret = -ENODEV; goto out; } ifindex = imr->imr_ifindex; for (imlp = &inet->mc_list; (iml = rtnl_dereference(*imlp)) != NULL; imlp = &iml->next_rcu) { if (iml->multi.imr_multiaddr.s_addr != group) continue; if (ifindex) { if (iml->multi.imr_ifindex != ifindex) continue; } else if (imr->imr_address.s_addr && imr->imr_address.s_addr != iml->multi.imr_address.s_addr) continue; (void) ip_mc_leave_src(sk, iml, in_dev); *imlp = iml->next_rcu; if (in_dev) ip_mc_dec_group(in_dev, group); /* decrease mem now to avoid the memleak warning */ atomic_sub(sizeof(*iml), &sk->sk_omem_alloc); kfree_rcu(iml, rcu); return 0; } out: return ret; } EXPORT_SYMBOL(ip_mc_leave_group); int ip_mc_source(int add, int omode, struct sock *sk, struct ip_mreq_source *mreqs, int ifindex) { int err; struct ip_mreqn imr; __be32 addr = mreqs->imr_multiaddr; struct ip_mc_socklist *pmc; struct in_device *in_dev = NULL; struct inet_sock *inet = inet_sk(sk); struct ip_sf_socklist *psl; struct net *net = sock_net(sk); int leavegroup = 0; int i, j, rv; if (!ipv4_is_multicast(addr)) return -EINVAL; ASSERT_RTNL(); imr.imr_multiaddr.s_addr = mreqs->imr_multiaddr; imr.imr_address.s_addr = mreqs->imr_interface; imr.imr_ifindex = ifindex; in_dev = ip_mc_find_dev(net, &imr); if (!in_dev) { err = -ENODEV; goto done; } err = -EADDRNOTAVAIL; for_each_pmc_rtnl(inet, pmc) { if ((pmc->multi.imr_multiaddr.s_addr == imr.imr_multiaddr.s_addr) && (pmc->multi.imr_ifindex == imr.imr_ifindex)) break; } if (!pmc) { /* must have a prior join */ err = -EINVAL; goto done; } /* if a source filter was set, must be the same mode as before */ if (pmc->sflist) { if (pmc->sfmode != omode) { err = -EINVAL; goto done; } } else if (pmc->sfmode != omode) { /* allow mode switches for empty-set filters */ ip_mc_add_src(in_dev, &mreqs->imr_multiaddr, omode, 0, NULL, 0); ip_mc_del_src(in_dev, &mreqs->imr_multiaddr, pmc->sfmode, 0, NULL, 0); pmc->sfmode = omode; } psl = rtnl_dereference(pmc->sflist); if (!add) { if (!psl) goto done; /* err = -EADDRNOTAVAIL */ rv = !0; for (i = 0; i < psl->sl_count; i++) { rv = memcmp(&psl->sl_addr[i], &mreqs->imr_sourceaddr, sizeof(__be32)); if (rv == 0) break; } if (rv) /* source not found */ goto done; /* err = -EADDRNOTAVAIL */ /* special case - (INCLUDE, empty) == LEAVE_GROUP */ if (psl->sl_count == 1 && omode == MCAST_INCLUDE) { leavegroup = 1; goto done; } /* update the interface filter */ ip_mc_del_src(in_dev, &mreqs->imr_multiaddr, omode, 1, &mreqs->imr_sourceaddr, 1); for (j = i+1; j < psl->sl_count; j++) psl->sl_addr[j-1] = psl->sl_addr[j]; psl->sl_count--; err = 0; goto done; } /* else, add a new source to the filter */ if (psl && psl->sl_count >= READ_ONCE(net->ipv4.sysctl_igmp_max_msf)) { err = -ENOBUFS; goto done; } if (!psl || psl->sl_count == psl->sl_max) { struct ip_sf_socklist *newpsl; int count = IP_SFBLOCK; if (psl) count += psl->sl_max; newpsl = sock_kmalloc(sk, struct_size(newpsl, sl_addr, count), GFP_KERNEL); if (!newpsl) { err = -ENOBUFS; goto done; } newpsl->sl_max = count; newpsl->sl_count = count - IP_SFBLOCK; if (psl) { for (i = 0; i < psl->sl_count; i++) newpsl->sl_addr[i] = psl->sl_addr[i]; /* decrease mem now to avoid the memleak warning */ atomic_sub(struct_size(psl, sl_addr, psl->sl_max), &sk->sk_omem_alloc); } rcu_assign_pointer(pmc->sflist, newpsl); if (psl) kfree_rcu(psl, rcu); psl = newpsl; } rv = 1; /* > 0 for insert logic below if sl_count is 0 */ for (i = 0; i < psl->sl_count; i++) { rv = memcmp(&psl->sl_addr[i], &mreqs->imr_sourceaddr, sizeof(__be32)); if (rv == 0) break; } if (rv == 0) /* address already there is an error */ goto done; for (j = psl->sl_count-1; j >= i; j--) psl->sl_addr[j+1] = psl->sl_addr[j]; psl->sl_addr[i] = mreqs->imr_sourceaddr; psl->sl_count++; err = 0; /* update the interface list */ ip_mc_add_src(in_dev, &mreqs->imr_multiaddr, omode, 1, &mreqs->imr_sourceaddr, 1); done: if (leavegroup) err = ip_mc_leave_group(sk, &imr); return err; } int ip_mc_msfilter(struct sock *sk, struct ip_msfilter *msf, int ifindex) { int err = 0; struct ip_mreqn imr; __be32 addr = msf->imsf_multiaddr; struct ip_mc_socklist *pmc; struct in_device *in_dev; struct inet_sock *inet = inet_sk(sk); struct ip_sf_socklist *newpsl, *psl; struct net *net = sock_net(sk); int leavegroup = 0; if (!ipv4_is_multicast(addr)) return -EINVAL; if (msf->imsf_fmode != MCAST_INCLUDE && msf->imsf_fmode != MCAST_EXCLUDE) return -EINVAL; ASSERT_RTNL(); imr.imr_multiaddr.s_addr = msf->imsf_multiaddr; imr.imr_address.s_addr = msf->imsf_interface; imr.imr_ifindex = ifindex; in_dev = ip_mc_find_dev(net, &imr); if (!in_dev) { err = -ENODEV; goto done; } /* special case - (INCLUDE, empty) == LEAVE_GROUP */ if (msf->imsf_fmode == MCAST_INCLUDE && msf->imsf_numsrc == 0) { leavegroup = 1; goto done; } for_each_pmc_rtnl(inet, pmc) { if (pmc->multi.imr_multiaddr.s_addr == msf->imsf_multiaddr && pmc->multi.imr_ifindex == imr.imr_ifindex) break; } if (!pmc) { /* must have a prior join */ err = -EINVAL; goto done; } if (msf->imsf_numsrc) { newpsl = sock_kmalloc(sk, struct_size(newpsl, sl_addr, msf->imsf_numsrc), GFP_KERNEL); if (!newpsl) { err = -ENOBUFS; goto done; } newpsl->sl_max = newpsl->sl_count = msf->imsf_numsrc; memcpy(newpsl->sl_addr, msf->imsf_slist_flex, flex_array_size(msf, imsf_slist_flex, msf->imsf_numsrc)); err = ip_mc_add_src(in_dev, &msf->imsf_multiaddr, msf->imsf_fmode, newpsl->sl_count, newpsl->sl_addr, 0); if (err) { sock_kfree_s(sk, newpsl, struct_size(newpsl, sl_addr, newpsl->sl_max)); goto done; } } else { newpsl = NULL; (void) ip_mc_add_src(in_dev, &msf->imsf_multiaddr, msf->imsf_fmode, 0, NULL, 0); } psl = rtnl_dereference(pmc->sflist); if (psl) { (void) ip_mc_del_src(in_dev, &msf->imsf_multiaddr, pmc->sfmode, psl->sl_count, psl->sl_addr, 0); /* decrease mem now to avoid the memleak warning */ atomic_sub(struct_size(psl, sl_addr, psl->sl_max), &sk->sk_omem_alloc); } else { (void) ip_mc_del_src(in_dev, &msf->imsf_multiaddr, pmc->sfmode, 0, NULL, 0); } rcu_assign_pointer(pmc->sflist, newpsl); if (psl) kfree_rcu(psl, rcu); pmc->sfmode = msf->imsf_fmode; err = 0; done: if (leavegroup) err = ip_mc_leave_group(sk, &imr); return err; } int ip_mc_msfget(struct sock *sk, struct ip_msfilter *msf, sockptr_t optval, sockptr_t optlen) { int err, len, count, copycount, msf_size; struct ip_mreqn imr; __be32 addr = msf->imsf_multiaddr; struct ip_mc_socklist *pmc; struct in_device *in_dev; struct inet_sock *inet = inet_sk(sk); struct ip_sf_socklist *psl; struct net *net = sock_net(sk); ASSERT_RTNL(); if (!ipv4_is_multicast(addr)) return -EINVAL; imr.imr_multiaddr.s_addr = msf->imsf_multiaddr; imr.imr_address.s_addr = msf->imsf_interface; imr.imr_ifindex = 0; in_dev = ip_mc_find_dev(net, &imr); if (!in_dev) { err = -ENODEV; goto done; } err = -EADDRNOTAVAIL; for_each_pmc_rtnl(inet, pmc) { if (pmc->multi.imr_multiaddr.s_addr == msf->imsf_multiaddr && pmc->multi.imr_ifindex == imr.imr_ifindex) break; } if (!pmc) /* must have a prior join */ goto done; msf->imsf_fmode = pmc->sfmode; psl = rtnl_dereference(pmc->sflist); if (!psl) { count = 0; } else { count = psl->sl_count; } copycount = count < msf->imsf_numsrc ? count : msf->imsf_numsrc; len = flex_array_size(psl, sl_addr, copycount); msf->imsf_numsrc = count; msf_size = IP_MSFILTER_SIZE(copycount); if (copy_to_sockptr(optlen, &msf_size, sizeof(int)) || copy_to_sockptr(optval, msf, IP_MSFILTER_SIZE(0))) { return -EFAULT; } if (len && copy_to_sockptr_offset(optval, offsetof(struct ip_msfilter, imsf_slist_flex), psl->sl_addr, len)) return -EFAULT; return 0; done: return err; } int ip_mc_gsfget(struct sock *sk, struct group_filter *gsf, sockptr_t optval, size_t ss_offset) { int i, count, copycount; struct sockaddr_in *psin; __be32 addr; struct ip_mc_socklist *pmc; struct inet_sock *inet = inet_sk(sk); struct ip_sf_socklist *psl; ASSERT_RTNL(); psin = (struct sockaddr_in *)&gsf->gf_group; if (psin->sin_family != AF_INET) return -EINVAL; addr = psin->sin_addr.s_addr; if (!ipv4_is_multicast(addr)) return -EINVAL; for_each_pmc_rtnl(inet, pmc) { if (pmc->multi.imr_multiaddr.s_addr == addr && pmc->multi.imr_ifindex == gsf->gf_interface) break; } if (!pmc) /* must have a prior join */ return -EADDRNOTAVAIL; gsf->gf_fmode = pmc->sfmode; psl = rtnl_dereference(pmc->sflist); count = psl ? psl->sl_count : 0; copycount = count < gsf->gf_numsrc ? count : gsf->gf_numsrc; gsf->gf_numsrc = count; for (i = 0; i < copycount; i++) { struct sockaddr_storage ss; psin = (struct sockaddr_in *)&ss; memset(&ss, 0, sizeof(ss)); psin->sin_family = AF_INET; psin->sin_addr.s_addr = psl->sl_addr[i]; if (copy_to_sockptr_offset(optval, ss_offset, &ss, sizeof(ss))) return -EFAULT; ss_offset += sizeof(ss); } return 0; } /* * check if a multicast source filter allows delivery for a given <src,dst,intf> */ int ip_mc_sf_allow(const struct sock *sk, __be32 loc_addr, __be32 rmt_addr, int dif, int sdif) { const struct inet_sock *inet = inet_sk(sk); struct ip_mc_socklist *pmc; struct ip_sf_socklist *psl; int i; int ret; ret = 1; if (!ipv4_is_multicast(loc_addr)) goto out; rcu_read_lock(); for_each_pmc_rcu(inet, pmc) { if (pmc->multi.imr_multiaddr.s_addr == loc_addr && (pmc->multi.imr_ifindex == dif || (sdif && pmc->multi.imr_ifindex == sdif))) break; } ret = inet_test_bit(MC_ALL, sk); if (!pmc) goto unlock; psl = rcu_dereference(pmc->sflist); ret = (pmc->sfmode == MCAST_EXCLUDE); if (!psl) goto unlock; for (i = 0; i < psl->sl_count; i++) { if (psl->sl_addr[i] == rmt_addr) break; } ret = 0; if (pmc->sfmode == MCAST_INCLUDE && i >= psl->sl_count) goto unlock; if (pmc->sfmode == MCAST_EXCLUDE && i < psl->sl_count) goto unlock; ret = 1; unlock: rcu_read_unlock(); out: return ret; } /* * A socket is closing. */ void ip_mc_drop_socket(struct sock *sk) { struct inet_sock *inet = inet_sk(sk); struct ip_mc_socklist *iml; struct net *net = sock_net(sk); if (!inet->mc_list) return; rtnl_lock(); while ((iml = rtnl_dereference(inet->mc_list)) != NULL) { struct in_device *in_dev; inet->mc_list = iml->next_rcu; in_dev = inetdev_by_index(net, iml->multi.imr_ifindex); (void) ip_mc_leave_src(sk, iml, in_dev); if (in_dev) ip_mc_dec_group(in_dev, iml->multi.imr_multiaddr.s_addr); /* decrease mem now to avoid the memleak warning */ atomic_sub(sizeof(*iml), &sk->sk_omem_alloc); kfree_rcu(iml, rcu); } rtnl_unlock(); } /* called with rcu_read_lock() */ int ip_check_mc_rcu(struct in_device *in_dev, __be32 mc_addr, __be32 src_addr, u8 proto) { struct ip_mc_list *im; struct ip_mc_list __rcu **mc_hash; struct ip_sf_list *psf; int rv = 0; mc_hash = rcu_dereference(in_dev->mc_hash); if (mc_hash) { u32 hash = hash_32((__force u32)mc_addr, MC_HASH_SZ_LOG); for (im = rcu_dereference(mc_hash[hash]); im != NULL; im = rcu_dereference(im->next_hash)) { if (im->multiaddr == mc_addr) break; } } else { for_each_pmc_rcu(in_dev, im) { if (im->multiaddr == mc_addr) break; } } if (im && proto == IPPROTO_IGMP) { rv = 1; } else if (im) { if (src_addr) { spin_lock_bh(&im->lock); for (psf = im->sources; psf; psf = psf->sf_next) { if (psf->sf_inaddr == src_addr) break; } if (psf) rv = psf->sf_count[MCAST_INCLUDE] || psf->sf_count[MCAST_EXCLUDE] != im->sfcount[MCAST_EXCLUDE]; else rv = im->sfcount[MCAST_EXCLUDE] != 0; spin_unlock_bh(&im->lock); } else rv = 1; /* unspecified source; tentatively allow */ } return rv; } #if defined(CONFIG_PROC_FS) struct igmp_mc_iter_state { struct seq_net_private p; struct net_device *dev; struct in_device *in_dev; }; #define igmp_mc_seq_private(seq) ((struct igmp_mc_iter_state *)(seq)->private) static inline struct ip_mc_list *igmp_mc_get_first(struct seq_file *seq) { struct net *net = seq_file_net(seq); struct ip_mc_list *im = NULL; struct igmp_mc_iter_state *state = igmp_mc_seq_private(seq); state->in_dev = NULL; for_each_netdev_rcu(net, state->dev) { struct in_device *in_dev; in_dev = __in_dev_get_rcu(state->dev); if (!in_dev) continue; im = rcu_dereference(in_dev->mc_list); if (im) { state->in_dev = in_dev; break; } } return im; } static struct ip_mc_list *igmp_mc_get_next(struct seq_file *seq, struct ip_mc_list *im) { struct igmp_mc_iter_state *state = igmp_mc_seq_private(seq); im = rcu_dereference(im->next_rcu); while (!im) { state->dev = next_net_device_rcu(state->dev); if (!state->dev) { state->in_dev = NULL; break; } state->in_dev = __in_dev_get_rcu(state->dev); if (!state->in_dev) continue; im = rcu_dereference(state->in_dev->mc_list); } return im; } static struct ip_mc_list *igmp_mc_get_idx(struct seq_file *seq, loff_t pos) { struct ip_mc_list *im = igmp_mc_get_first(seq); if (im) while (pos && (im = igmp_mc_get_next(seq, im)) != NULL) --pos; return pos ? NULL : im; } static void *igmp_mc_seq_start(struct seq_file *seq, loff_t *pos) __acquires(rcu) { rcu_read_lock(); return *pos ? igmp_mc_get_idx(seq, *pos - 1) : SEQ_START_TOKEN; } static void *igmp_mc_seq_next(struct seq_file *seq, void *v, loff_t *pos) { struct ip_mc_list *im; if (v == SEQ_START_TOKEN) im = igmp_mc_get_first(seq); else im = igmp_mc_get_next(seq, v); ++*pos; return im; } static void igmp_mc_seq_stop(struct seq_file *seq, void *v) __releases(rcu) { struct igmp_mc_iter_state *state = igmp_mc_seq_private(seq); state->in_dev = NULL; state->dev = NULL; rcu_read_unlock(); } static int igmp_mc_seq_show(struct seq_file *seq, void *v) { if (v == SEQ_START_TOKEN) seq_puts(seq, "Idx\tDevice : Count Querier\tGroup Users Timer\tReporter\n"); else { struct ip_mc_list *im = v; struct igmp_mc_iter_state *state = igmp_mc_seq_private(seq); char *querier; long delta; #ifdef CONFIG_IP_MULTICAST querier = IGMP_V1_SEEN(state->in_dev) ? "V1" : IGMP_V2_SEEN(state->in_dev) ? "V2" : "V3"; #else querier = "NONE"; #endif if (rcu_access_pointer(state->in_dev->mc_list) == im) { seq_printf(seq, "%d\t%-10s: %5d %7s\n", state->dev->ifindex, state->dev->name, state->in_dev->mc_count, querier); } delta = im->timer.expires - jiffies; seq_printf(seq, "\t\t\t\t%08X %5d %d:%08lX\t\t%d\n", im->multiaddr, im->users, im->tm_running, im->tm_running ? jiffies_delta_to_clock_t(delta) : 0, im->reporter); } return 0; } static const struct seq_operations igmp_mc_seq_ops = { .start = igmp_mc_seq_start, .next = igmp_mc_seq_next, .stop = igmp_mc_seq_stop, .show = igmp_mc_seq_show, }; struct igmp_mcf_iter_state { struct seq_net_private p; struct net_device *dev; struct in_device *idev; struct ip_mc_list *im; }; #define igmp_mcf_seq_private(seq) ((struct igmp_mcf_iter_state *)(seq)->private) static inline struct ip_sf_list *igmp_mcf_get_first(struct seq_file *seq) { struct net *net = seq_file_net(seq); struct ip_sf_list *psf = NULL; struct ip_mc_list *im = NULL; struct igmp_mcf_iter_state *state = igmp_mcf_seq_private(seq); state->idev = NULL; state->im = NULL; for_each_netdev_rcu(net, state->dev) { struct in_device *idev; idev = __in_dev_get_rcu(state->dev); if (unlikely(!idev)) continue; im = rcu_dereference(idev->mc_list); if (likely(im)) { spin_lock_bh(&im->lock); psf = im->sources; if (likely(psf)) { state->im = im; state->idev = idev; break; } spin_unlock_bh(&im->lock); } } return psf; } static struct ip_sf_list *igmp_mcf_get_next(struct seq_file *seq, struct ip_sf_list *psf) { struct igmp_mcf_iter_state *state = igmp_mcf_seq_private(seq); psf = psf->sf_next; while (!psf) { spin_unlock_bh(&state->im->lock); state->im = state->im->next; while (!state->im) { state->dev = next_net_device_rcu(state->dev); if (!state->dev) { state->idev = NULL; goto out; } state->idev = __in_dev_get_rcu(state->dev); if (!state->idev) continue; state->im = rcu_dereference(state->idev->mc_list); } spin_lock_bh(&state->im->lock); psf = state->im->sources; } out: return psf; } static struct ip_sf_list *igmp_mcf_get_idx(struct seq_file *seq, loff_t pos) { struct ip_sf_list *psf = igmp_mcf_get_first(seq); if (psf) while (pos && (psf = igmp_mcf_get_next(seq, psf)) != NULL) --pos; return pos ? NULL : psf; } static void *igmp_mcf_seq_start(struct seq_file *seq, loff_t *pos) __acquires(rcu) { rcu_read_lock(); return *pos ? igmp_mcf_get_idx(seq, *pos - 1) : SEQ_START_TOKEN; } static void *igmp_mcf_seq_next(struct seq_file *seq, void *v, loff_t *pos) { struct ip_sf_list *psf; if (v == SEQ_START_TOKEN) psf = igmp_mcf_get_first(seq); else psf = igmp_mcf_get_next(seq, v); ++*pos; return psf; } static void igmp_mcf_seq_stop(struct seq_file *seq, void *v) __releases(rcu) { struct igmp_mcf_iter_state *state = igmp_mcf_seq_private(seq); if (likely(state->im)) { spin_unlock_bh(&state->im->lock); state->im = NULL; } state->idev = NULL; state->dev = NULL; rcu_read_unlock(); } static int igmp_mcf_seq_show(struct seq_file *seq, void *v) { struct ip_sf_list *psf = v; struct igmp_mcf_iter_state *state = igmp_mcf_seq_private(seq); if (v == SEQ_START_TOKEN) { seq_puts(seq, "Idx Device MCA SRC INC EXC\n"); } else { seq_printf(seq, "%3d %6.6s 0x%08x " "0x%08x %6lu %6lu\n", state->dev->ifindex, state->dev->name, ntohl(state->im->multiaddr), ntohl(psf->sf_inaddr), psf->sf_count[MCAST_INCLUDE], psf->sf_count[MCAST_EXCLUDE]); } return 0; } static const struct seq_operations igmp_mcf_seq_ops = { .start = igmp_mcf_seq_start, .next = igmp_mcf_seq_next, .stop = igmp_mcf_seq_stop, .show = igmp_mcf_seq_show, }; static int __net_init igmp_net_init(struct net *net) { struct proc_dir_entry *pde; int err; pde = proc_create_net("igmp", 0444, net->proc_net, &igmp_mc_seq_ops, sizeof(struct igmp_mc_iter_state)); if (!pde) goto out_igmp; pde = proc_create_net("mcfilter", 0444, net->proc_net, &igmp_mcf_seq_ops, sizeof(struct igmp_mcf_iter_state)); if (!pde) goto out_mcfilter; err = inet_ctl_sock_create(&net->ipv4.mc_autojoin_sk, AF_INET, SOCK_DGRAM, 0, net); if (err < 0) { pr_err("Failed to initialize the IGMP autojoin socket (err %d)\n", err); goto out_sock; } return 0; out_sock: remove_proc_entry("mcfilter", net->proc_net); out_mcfilter: remove_proc_entry("igmp", net->proc_net); out_igmp: return -ENOMEM; } static void __net_exit igmp_net_exit(struct net *net) { remove_proc_entry("mcfilter", net->proc_net); remove_proc_entry("igmp", net->proc_net); inet_ctl_sock_destroy(net->ipv4.mc_autojoin_sk); } static struct pernet_operations igmp_net_ops = { .init = igmp_net_init, .exit = igmp_net_exit, }; #endif static int igmp_netdev_event(struct notifier_block *this, unsigned long event, void *ptr) { struct net_device *dev = netdev_notifier_info_to_dev(ptr); struct in_device *in_dev; switch (event) { case NETDEV_RESEND_IGMP: in_dev = __in_dev_get_rtnl(dev); if (in_dev) ip_mc_rejoin_groups(in_dev); break; default: break; } return NOTIFY_DONE; } static struct notifier_block igmp_notifier = { .notifier_call = igmp_netdev_event, }; int __init igmp_mc_init(void) { #if defined(CONFIG_PROC_FS) int err; err = register_pernet_subsys(&igmp_net_ops); if (err) return err; err = register_netdevice_notifier(&igmp_notifier); if (err) goto reg_notif_fail; return 0; reg_notif_fail: unregister_pernet_subsys(&igmp_net_ops); return err; #else return register_netdevice_notifier(&igmp_notifier); #endif } |
| 26 25 26 26 22 26 76 77 60 59 59 59 59 59 2 66 7 1 65 77 4 77 78 17 348 336 172 337 170 336 337 197 336 220 220 220 219 220 62 133 256 147 117 147 52 256 133 62 62 62 62 62 45 45 45 45 45 45 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 | // SPDX-License-Identifier: GPL-2.0-only /* * linux/fs/fat/misc.c * * Written 1992,1993 by Werner Almesberger * 22/11/2000 - Fixed fat_date_unix2dos for dates earlier than 01/01/1980 * and date_dos2unix for date==0 by Igor Zhbanov(bsg@uniyar.ac.ru) */ #include "fat.h" #include <linux/iversion.h> /* * fat_fs_error reports a file system problem that might indicate fa data * corruption/inconsistency. Depending on 'errors' mount option the * panic() is called, or error message is printed FAT and nothing is done, * or filesystem is remounted read-only (default behavior). * In case the file system is remounted read-only, it can be made writable * again by remounting it. */ void __fat_fs_error(struct super_block *sb, int report, const char *fmt, ...) { struct fat_mount_options *opts = &MSDOS_SB(sb)->options; va_list args; struct va_format vaf; if (report) { va_start(args, fmt); vaf.fmt = fmt; vaf.va = &args; fat_msg(sb, KERN_ERR, "error, %pV", &vaf); va_end(args); } if (opts->errors == FAT_ERRORS_PANIC) panic("FAT-fs (%s): fs panic from previous error\n", sb->s_id); else if (opts->errors == FAT_ERRORS_RO && !sb_rdonly(sb)) { sb->s_flags |= SB_RDONLY; fat_msg(sb, KERN_ERR, "Filesystem has been set read-only"); } } EXPORT_SYMBOL_GPL(__fat_fs_error); /** * _fat_msg() - Print a preformatted FAT message based on a superblock. * @sb: A pointer to a &struct super_block * @level: A Kernel printk level constant * @fmt: The printf-style format string to print. * * Everything that is not fat_fs_error() should be fat_msg(). * * fat_msg() wraps _fat_msg() for printk indexing. */ void _fat_msg(struct super_block *sb, const char *level, const char *fmt, ...) { struct va_format vaf; va_list args; va_start(args, fmt); vaf.fmt = fmt; vaf.va = &args; _printk(FAT_PRINTK_PREFIX "%pV\n", level, sb->s_id, &vaf); va_end(args); } /* Flushes the number of free clusters on FAT32 */ /* XXX: Need to write one per FSINFO block. Currently only writes 1 */ int fat_clusters_flush(struct super_block *sb) { struct msdos_sb_info *sbi = MSDOS_SB(sb); struct buffer_head *bh; struct fat_boot_fsinfo *fsinfo; if (!is_fat32(sbi)) return 0; bh = sb_bread(sb, sbi->fsinfo_sector); if (bh == NULL) { fat_msg(sb, KERN_ERR, "bread failed in fat_clusters_flush"); return -EIO; } fsinfo = (struct fat_boot_fsinfo *)bh->b_data; /* Sanity check */ if (!IS_FSINFO(fsinfo)) { fat_msg(sb, KERN_ERR, "Invalid FSINFO signature: " "0x%08x, 0x%08x (sector = %lu)", le32_to_cpu(fsinfo->signature1), le32_to_cpu(fsinfo->signature2), sbi->fsinfo_sector); } else { if (sbi->free_clusters != -1) fsinfo->free_clusters = cpu_to_le32(sbi->free_clusters); if (sbi->prev_free != -1) fsinfo->next_cluster = cpu_to_le32(sbi->prev_free); mark_buffer_dirty(bh); } brelse(bh); return 0; } /* * fat_chain_add() adds a new cluster to the chain of clusters represented * by inode. */ int fat_chain_add(struct inode *inode, int new_dclus, int nr_cluster) { struct super_block *sb = inode->i_sb; struct msdos_sb_info *sbi = MSDOS_SB(sb); int ret, new_fclus, last; /* * We must locate the last cluster of the file to add this new * one (new_dclus) to the end of the link list (the FAT). */ last = new_fclus = 0; if (MSDOS_I(inode)->i_start) { int fclus, dclus; ret = fat_get_cluster(inode, FAT_ENT_EOF, &fclus, &dclus); if (ret < 0) return ret; new_fclus = fclus + 1; last = dclus; } /* add new one to the last of the cluster chain */ if (last) { struct fat_entry fatent; fatent_init(&fatent); ret = fat_ent_read(inode, &fatent, last); if (ret >= 0) { int wait = inode_needs_sync(inode); ret = fat_ent_write(inode, &fatent, new_dclus, wait); fatent_brelse(&fatent); } if (ret < 0) return ret; /* * FIXME:Although we can add this cache, fat_cache_add() is * assuming to be called after linear search with fat_cache_id. */ // fat_cache_add(inode, new_fclus, new_dclus); } else { MSDOS_I(inode)->i_start = new_dclus; MSDOS_I(inode)->i_logstart = new_dclus; /* * Since generic_write_sync() synchronizes regular files later, * we sync here only directories. */ if (S_ISDIR(inode->i_mode) && IS_DIRSYNC(inode)) { ret = fat_sync_inode(inode); if (ret) return ret; } else mark_inode_dirty(inode); } if (new_fclus != (inode->i_blocks >> (sbi->cluster_bits - 9))) { fat_fs_error(sb, "clusters badly computed (%d != %llu)", new_fclus, (llu)(inode->i_blocks >> (sbi->cluster_bits - 9))); fat_cache_inval_inode(inode); } inode->i_blocks += nr_cluster << (sbi->cluster_bits - 9); return 0; } /* * The epoch of FAT timestamp is 1980. * : bits : value * date: 0 - 4: day (1 - 31) * date: 5 - 8: month (1 - 12) * date: 9 - 15: year (0 - 127) from 1980 * time: 0 - 4: sec (0 - 29) 2sec counts * time: 5 - 10: min (0 - 59) * time: 11 - 15: hour (0 - 23) */ #define SECS_PER_MIN 60 #define SECS_PER_HOUR (60 * 60) #define SECS_PER_DAY (SECS_PER_HOUR * 24) /* days between 1.1.70 and 1.1.80 (2 leap days) */ #define DAYS_DELTA (365 * 10 + 2) /* 120 (2100 - 1980) isn't leap year */ #define YEAR_2100 120 #define IS_LEAP_YEAR(y) (!((y) & 3) && (y) != YEAR_2100) /* Linear day numbers of the respective 1sts in non-leap years. */ static long days_in_year[] = { /* Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec */ 0, 0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334, 0, 0, 0, }; static inline int fat_tz_offset(const struct msdos_sb_info *sbi) { return (sbi->options.tz_set ? -sbi->options.time_offset : sys_tz.tz_minuteswest) * SECS_PER_MIN; } /* Convert a FAT time/date pair to a UNIX date (seconds since 1 1 70). */ void fat_time_fat2unix(struct msdos_sb_info *sbi, struct timespec64 *ts, __le16 __time, __le16 __date, u8 time_cs) { u16 time = le16_to_cpu(__time), date = le16_to_cpu(__date); time64_t second; long day, leap_day, month, year; year = date >> 9; month = max(1, (date >> 5) & 0xf); day = max(1, date & 0x1f) - 1; leap_day = (year + 3) / 4; if (year > YEAR_2100) /* 2100 isn't leap year */ leap_day--; if (IS_LEAP_YEAR(year) && month > 2) leap_day++; second = (time & 0x1f) << 1; second += ((time >> 5) & 0x3f) * SECS_PER_MIN; second += (time >> 11) * SECS_PER_HOUR; second += (time64_t)(year * 365 + leap_day + days_in_year[month] + day + DAYS_DELTA) * SECS_PER_DAY; second += fat_tz_offset(sbi); if (time_cs) { ts->tv_sec = second + (time_cs / 100); ts->tv_nsec = (time_cs % 100) * 10000000; } else { ts->tv_sec = second; ts->tv_nsec = 0; } } /* Export fat_time_fat2unix() for the fat_test KUnit tests. */ EXPORT_SYMBOL_GPL(fat_time_fat2unix); /* Convert linear UNIX date to a FAT time/date pair. */ void fat_time_unix2fat(struct msdos_sb_info *sbi, struct timespec64 *ts, __le16 *time, __le16 *date, u8 *time_cs) { struct tm tm; time64_to_tm(ts->tv_sec, -fat_tz_offset(sbi), &tm); /* FAT can only support year between 1980 to 2107 */ if (tm.tm_year < 1980 - 1900) { *time = 0; *date = cpu_to_le16((0 << 9) | (1 << 5) | 1); if (time_cs) *time_cs = 0; return; } if (tm.tm_year > 2107 - 1900) { *time = cpu_to_le16((23 << 11) | (59 << 5) | 29); *date = cpu_to_le16((127 << 9) | (12 << 5) | 31); if (time_cs) *time_cs = 199; return; } /* from 1900 -> from 1980 */ tm.tm_year -= 80; /* 0~11 -> 1~12 */ tm.tm_mon++; /* 0~59 -> 0~29(2sec counts) */ tm.tm_sec >>= 1; *time = cpu_to_le16(tm.tm_hour << 11 | tm.tm_min << 5 | tm.tm_sec); *date = cpu_to_le16(tm.tm_year << 9 | tm.tm_mon << 5 | tm.tm_mday); if (time_cs) *time_cs = (ts->tv_sec & 1) * 100 + ts->tv_nsec / 10000000; } EXPORT_SYMBOL_GPL(fat_time_unix2fat); static inline struct timespec64 fat_timespec64_trunc_2secs(struct timespec64 ts) { return (struct timespec64){ ts.tv_sec & ~1ULL, 0 }; } /* * truncate atime to 24 hour granularity (00:00:00 in local timezone) */ struct timespec64 fat_truncate_atime(const struct msdos_sb_info *sbi, const struct timespec64 *ts) { /* to localtime */ time64_t seconds = ts->tv_sec - fat_tz_offset(sbi); s32 remainder; div_s64_rem(seconds, SECS_PER_DAY, &remainder); /* to day boundary, and back to unix time */ seconds = seconds + fat_tz_offset(sbi) - remainder; return (struct timespec64){ seconds, 0 }; } /* * truncate mtime to 2 second granularity */ struct timespec64 fat_truncate_mtime(const struct msdos_sb_info *sbi, const struct timespec64 *ts) { return fat_timespec64_trunc_2secs(*ts); } /* * truncate the various times with appropriate granularity: * all times in root node are always 0 */ int fat_truncate_time(struct inode *inode, struct timespec64 *now, int flags) { struct msdos_sb_info *sbi = MSDOS_SB(inode->i_sb); struct timespec64 ts; if (inode->i_ino == MSDOS_ROOT_INO) return 0; if (now == NULL) { now = &ts; ts = current_time(inode); } if (flags & S_ATIME) inode_set_atime_to_ts(inode, fat_truncate_atime(sbi, now)); /* * ctime and mtime share the same on-disk field, and should be * identical in memory. all mtime updates will be applied to ctime, * but ctime updates are ignored. */ if (flags & S_MTIME) inode_set_mtime_to_ts(inode, inode_set_ctime_to_ts(inode, fat_truncate_mtime(sbi, now))); return 0; } EXPORT_SYMBOL_GPL(fat_truncate_time); int fat_update_time(struct inode *inode, int flags) { int dirty_flags = 0; if (inode->i_ino == MSDOS_ROOT_INO) return 0; if (flags & (S_ATIME | S_CTIME | S_MTIME)) { fat_truncate_time(inode, NULL, flags); if (inode->i_sb->s_flags & SB_LAZYTIME) dirty_flags |= I_DIRTY_TIME; else dirty_flags |= I_DIRTY_SYNC; } __mark_inode_dirty(inode, dirty_flags); return 0; } EXPORT_SYMBOL_GPL(fat_update_time); int fat_sync_bhs(struct buffer_head **bhs, int nr_bhs) { int i, err = 0; for (i = 0; i < nr_bhs; i++) write_dirty_buffer(bhs[i], 0); for (i = 0; i < nr_bhs; i++) { wait_on_buffer(bhs[i]); if (!err && !buffer_uptodate(bhs[i])) err = -EIO; } return err; } |
| 75 114 24 113 95 114 110 110 114 48 157 41 10 109 110 31 26 26 45 42 45 8 4 118 25 23 24 11 17 12 16 24 16 16 14 9 7 2 7 6 2 7 4 4 4 7 6 3 5 180 178 173 119 118 118 102 80 80 1 1 80 111 112 115 114 106 99 98 1 98 97 97 1 98 75 1 15 25 25 25 25 25 12 12 12 12 12 10 2 1 12 12 11 8 7 6 6 6 6 6 6 1 5 3 5 11 51 51 50 5 46 47 10 10 1 9 10 1 50 15 15 15 14 15 15 12 9 8 8 1 7 7 7 3 7 15 7 5 7 3 2 2 1 5 4 5 2 10 3 5 15 5 15 15 7 2 5 1 7 7 7 7 4 7 1 3 3 1 3 1 7 4 4 4 4 2 3 2 1 7 7 7 7 1 1 7 1 6 1 1 1 3 1 2 3 2 2 2 65 45 39 37 65 49 47 4 45 3 57 64 64 1 64 55 54 53 50 50 50 2 1 49 19 15 59 42 13 7 6 6 4 4 4 2 2 1 1 1 1 1 1 4 6 12 282 276 276 274 272 286 4 282 9 286 276 10 10 10 10 1 14 14 240 240 238 238 231 16 15 13 12 7 3 2 1 229 10 224 5 1 1 1 6 5 6 3 3 3 6 7 7 5 2 2 2 1 1 7 40 39 30 39 26 6 6 3 3 3 1 3 3 12 12 9 8 6 4 25 4 25 4 25 10 25 6 5 25 7 4 19 4 38 38 38 4 2 36 32 34 31 6 6 6 1 6 30 11 30 27 30 30 15 15 30 7 7 5 25 6 2 2 2 25 10 3 1 25 25 25 6 25 4 1 25 10 2 1 25 7 15 12 5 10 2 1 2 1 10 2 8 2 8 9 2 10 15 12 8 12 28 27 8 26 26 25 3 2 1 1 24 25 24 15 2 15 1 29 28 2 1 24 26 145 111 4 2 4 10 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080 1081 1082 1083 1084 1085 1086 1087 1088 1089 1090 1091 1092 1093 1094 1095 1096 1097 1098 1099 1100 1101 1102 1103 1104 1105 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125 1126 1127 1128 1129 1130 1131 1132 1133 1134 1135 1136 1137 1138 1139 1140 1141 1142 1143 1144 1145 1146 1147 1148 1149 1150 1151 1152 1153 1154 1155 1156 1157 1158 1159 1160 1161 1162 1163 1164 1165 1166 1167 1168 1169 1170 1171 1172 1173 1174 1175 1176 1177 1178 1179 1180 1181 1182 1183 1184 1185 1186 1187 1188 1189 1190 1191 1192 1193 1194 1195 1196 1197 1198 1199 1200 1201 1202 1203 1204 1205 1206 1207 1208 1209 1210 1211 1212 1213 1214 1215 1216 1217 1218 1219 1220 1221 1222 1223 1224 1225 1226 1227 1228 1229 1230 1231 1232 1233 1234 1235 1236 1237 1238 1239 1240 1241 1242 1243 1244 1245 1246 1247 1248 1249 1250 1251 1252 1253 1254 1255 1256 1257 1258 1259 1260 1261 1262 1263 1264 1265 1266 1267 1268 1269 1270 1271 1272 1273 1274 1275 1276 1277 1278 1279 1280 1281 1282 1283 1284 1285 1286 1287 1288 1289 1290 1291 1292 1293 1294 1295 1296 1297 1298 1299 1300 1301 1302 1303 1304 1305 1306 1307 1308 1309 1310 1311 1312 1313 1314 1315 1316 1317 1318 1319 1320 1321 1322 1323 1324 1325 1326 1327 1328 1329 1330 1331 1332 1333 1334 1335 1336 1337 1338 1339 1340 1341 1342 1343 1344 1345 1346 1347 1348 1349 1350 1351 1352 1353 1354 1355 1356 1357 1358 1359 1360 1361 1362 1363 1364 1365 1366 1367 1368 1369 1370 1371 1372 1373 1374 1375 1376 1377 1378 1379 1380 1381 1382 1383 1384 1385 1386 1387 1388 1389 1390 1391 1392 1393 1394 1395 1396 1397 1398 1399 1400 1401 1402 1403 1404 1405 1406 1407 1408 1409 1410 1411 1412 1413 1414 1415 1416 1417 1418 1419 1420 1421 1422 1423 1424 1425 1426 1427 1428 1429 1430 1431 1432 1433 1434 1435 1436 1437 1438 1439 1440 1441 1442 1443 1444 1445 1446 1447 1448 1449 1450 1451 1452 1453 1454 1455 1456 1457 1458 1459 1460 1461 1462 1463 1464 1465 1466 1467 1468 1469 1470 1471 1472 1473 1474 1475 1476 1477 1478 1479 1480 1481 1482 1483 1484 1485 1486 1487 1488 1489 1490 1491 1492 1493 1494 1495 1496 1497 1498 1499 1500 1501 1502 1503 1504 1505 1506 1507 1508 1509 1510 1511 1512 1513 1514 1515 1516 1517 1518 1519 1520 1521 1522 1523 1524 1525 1526 1527 1528 1529 1530 1531 1532 1533 1534 1535 1536 1537 1538 1539 1540 1541 1542 1543 1544 1545 1546 1547 1548 1549 1550 1551 1552 1553 1554 1555 1556 1557 1558 1559 1560 1561 1562 1563 1564 1565 1566 1567 1568 1569 1570 1571 1572 1573 1574 1575 1576 1577 1578 1579 1580 1581 1582 1583 1584 1585 1586 1587 1588 1589 1590 1591 1592 1593 1594 1595 1596 1597 1598 1599 1600 1601 1602 1603 1604 1605 1606 1607 1608 1609 1610 1611 1612 1613 1614 1615 1616 1617 1618 1619 1620 1621 1622 1623 1624 1625 1626 1627 1628 1629 1630 1631 1632 1633 1634 1635 1636 1637 1638 1639 1640 1641 1642 1643 1644 1645 1646 1647 1648 1649 1650 1651 1652 1653 1654 1655 1656 1657 1658 1659 1660 1661 1662 1663 1664 1665 1666 1667 1668 1669 1670 1671 1672 1673 1674 1675 1676 1677 1678 1679 1680 1681 1682 1683 1684 1685 1686 1687 1688 1689 1690 1691 1692 1693 1694 1695 1696 1697 1698 1699 1700 1701 1702 1703 1704 1705 1706 1707 1708 1709 1710 1711 1712 1713 1714 1715 1716 1717 1718 1719 1720 1721 1722 1723 1724 1725 1726 1727 1728 1729 1730 1731 1732 1733 1734 1735 1736 1737 1738 1739 1740 1741 1742 1743 1744 1745 1746 1747 1748 1749 1750 1751 1752 1753 1754 1755 1756 1757 1758 1759 1760 1761 1762 1763 1764 1765 1766 1767 1768 1769 1770 1771 1772 1773 1774 1775 1776 1777 1778 1779 1780 1781 1782 1783 1784 1785 1786 1787 1788 1789 1790 1791 1792 1793 1794 1795 1796 1797 1798 1799 1800 1801 1802 1803 1804 1805 1806 1807 1808 1809 1810 1811 1812 1813 1814 1815 1816 1817 1818 1819 1820 1821 1822 1823 1824 1825 1826 1827 1828 1829 1830 1831 1832 1833 1834 1835 1836 1837 1838 1839 1840 1841 1842 1843 1844 1845 1846 1847 1848 1849 1850 1851 1852 1853 1854 1855 1856 1857 1858 1859 1860 1861 1862 1863 1864 1865 1866 1867 1868 1869 1870 1871 1872 1873 1874 1875 1876 1877 1878 1879 1880 1881 1882 1883 1884 1885 1886 1887 1888 1889 1890 1891 1892 1893 1894 1895 1896 1897 1898 1899 1900 1901 1902 1903 1904 1905 1906 1907 1908 1909 1910 1911 1912 1913 1914 1915 1916 1917 1918 1919 1920 1921 1922 1923 1924 1925 1926 1927 1928 1929 1930 1931 1932 1933 1934 1935 1936 1937 1938 1939 1940 1941 1942 1943 1944 1945 1946 1947 1948 1949 1950 1951 1952 1953 1954 1955 1956 1957 1958 1959 1960 1961 1962 1963 1964 1965 1966 1967 1968 1969 1970 1971 1972 1973 1974 1975 1976 1977 1978 1979 1980 1981 1982 1983 1984 1985 1986 1987 1988 1989 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020 2021 2022 2023 2024 2025 2026 2027 2028 2029 2030 2031 2032 2033 2034 2035 2036 2037 2038 2039 2040 2041 2042 2043 2044 2045 2046 2047 2048 2049 2050 2051 2052 2053 2054 2055 2056 2057 2058 2059 2060 2061 2062 2063 2064 2065 2066 2067 2068 2069 2070 2071 2072 2073 2074 2075 2076 2077 2078 2079 2080 2081 2082 2083 2084 2085 2086 2087 2088 2089 2090 2091 2092 2093 2094 2095 2096 2097 2098 2099 2100 2101 2102 2103 2104 2105 2106 2107 2108 2109 2110 2111 2112 2113 2114 2115 2116 2117 2118 2119 2120 2121 2122 2123 2124 2125 2126 2127 2128 2129 2130 2131 2132 2133 2134 2135 2136 2137 2138 2139 2140 2141 2142 2143 2144 2145 2146 2147 2148 2149 2150 2151 2152 2153 2154 2155 2156 2157 2158 2159 2160 2161 2162 2163 2164 2165 2166 2167 2168 2169 2170 2171 2172 2173 2174 2175 2176 2177 2178 2179 2180 2181 2182 2183 2184 2185 2186 2187 2188 2189 2190 2191 2192 2193 2194 2195 2196 2197 2198 2199 2200 2201 2202 2203 2204 2205 2206 2207 2208 2209 2210 2211 2212 2213 2214 2215 2216 2217 2218 2219 2220 2221 | /* FUSE: Filesystem in Userspace Copyright (C) 2001-2008 Miklos Szeredi <miklos@szeredi.hu> This program can be distributed under the terms of the GNU GPL. See the file COPYING. */ #include "fuse_i.h" #include <linux/pagemap.h> #include <linux/file.h> #include <linux/fs_context.h> #include <linux/moduleparam.h> #include <linux/sched.h> #include <linux/namei.h> #include <linux/slab.h> #include <linux/xattr.h> #include <linux/iversion.h> #include <linux/posix_acl.h> #include <linux/security.h> #include <linux/types.h> #include <linux/kernel.h> static bool __read_mostly allow_sys_admin_access; module_param(allow_sys_admin_access, bool, 0644); MODULE_PARM_DESC(allow_sys_admin_access, "Allow users with CAP_SYS_ADMIN in initial userns to bypass allow_other access check"); static void fuse_advise_use_readdirplus(struct inode *dir) { struct fuse_inode *fi = get_fuse_inode(dir); set_bit(FUSE_I_ADVISE_RDPLUS, &fi->state); } #if BITS_PER_LONG >= 64 static inline void __fuse_dentry_settime(struct dentry *entry, u64 time) { entry->d_fsdata = (void *) time; } static inline u64 fuse_dentry_time(const struct dentry *entry) { return (u64)entry->d_fsdata; } #else union fuse_dentry { u64 time; struct rcu_head rcu; }; static inline void __fuse_dentry_settime(struct dentry *dentry, u64 time) { ((union fuse_dentry *) dentry->d_fsdata)->time = time; } static inline u64 fuse_dentry_time(const struct dentry *entry) { return ((union fuse_dentry *) entry->d_fsdata)->time; } #endif static void fuse_dentry_settime(struct dentry *dentry, u64 time) { struct fuse_conn *fc = get_fuse_conn_super(dentry->d_sb); bool delete = !time && fc->delete_stale; /* * Mess with DCACHE_OP_DELETE because dput() will be faster without it. * Don't care about races, either way it's just an optimization */ if ((!delete && (dentry->d_flags & DCACHE_OP_DELETE)) || (delete && !(dentry->d_flags & DCACHE_OP_DELETE))) { spin_lock(&dentry->d_lock); if (!delete) dentry->d_flags &= ~DCACHE_OP_DELETE; else dentry->d_flags |= DCACHE_OP_DELETE; spin_unlock(&dentry->d_lock); } __fuse_dentry_settime(dentry, time); } /* * FUSE caches dentries and attributes with separate timeout. The * time in jiffies until the dentry/attributes are valid is stored in * dentry->d_fsdata and fuse_inode->i_time respectively. */ /* * Calculate the time in jiffies until a dentry/attributes are valid */ u64 fuse_time_to_jiffies(u64 sec, u32 nsec) { if (sec || nsec) { struct timespec64 ts = { sec, min_t(u32, nsec, NSEC_PER_SEC - 1) }; return get_jiffies_64() + timespec64_to_jiffies(&ts); } else return 0; } /* * Set dentry and possibly attribute timeouts from the lookup/mk* * replies */ void fuse_change_entry_timeout(struct dentry *entry, struct fuse_entry_out *o) { fuse_dentry_settime(entry, fuse_time_to_jiffies(o->entry_valid, o->entry_valid_nsec)); } void fuse_invalidate_attr_mask(struct inode *inode, u32 mask) { set_mask_bits(&get_fuse_inode(inode)->inval_mask, 0, mask); } /* * Mark the attributes as stale, so that at the next call to * ->getattr() they will be fetched from userspace */ void fuse_invalidate_attr(struct inode *inode) { fuse_invalidate_attr_mask(inode, STATX_BASIC_STATS); } static void fuse_dir_changed(struct inode *dir) { fuse_invalidate_attr(dir); inode_maybe_inc_iversion(dir, false); } /* * Mark the attributes as stale due to an atime change. Avoid the invalidate if * atime is not used. */ void fuse_invalidate_atime(struct inode *inode) { if (!IS_RDONLY(inode)) fuse_invalidate_attr_mask(inode, STATX_ATIME); } /* * Just mark the entry as stale, so that a next attempt to look it up * will result in a new lookup call to userspace * * This is called when a dentry is about to become negative and the * timeout is unknown (unlink, rmdir, rename and in some cases * lookup) */ void fuse_invalidate_entry_cache(struct dentry *entry) { fuse_dentry_settime(entry, 0); } /* * Same as fuse_invalidate_entry_cache(), but also try to remove the * dentry from the hash */ static void fuse_invalidate_entry(struct dentry *entry) { d_invalidate(entry); fuse_invalidate_entry_cache(entry); } static void fuse_lookup_init(struct fuse_conn *fc, struct fuse_args *args, u64 nodeid, const struct qstr *name, struct fuse_entry_out *outarg) { memset(outarg, 0, sizeof(struct fuse_entry_out)); args->opcode = FUSE_LOOKUP; args->nodeid = nodeid; args->in_numargs = 1; args->in_args[0].size = name->len + 1; args->in_args[0].value = name->name; args->out_numargs = 1; args->out_args[0].size = sizeof(struct fuse_entry_out); args->out_args[0].value = outarg; } /* * Check whether the dentry is still valid * * If the entry validity timeout has expired and the dentry is * positive, try to redo the lookup. If the lookup results in a * different inode, then let the VFS invalidate the dentry and redo * the lookup once more. If the lookup results in the same inode, * then refresh the attributes, timeouts and mark the dentry valid. */ static int fuse_dentry_revalidate(struct dentry *entry, unsigned int flags) { struct inode *inode; struct dentry *parent; struct fuse_mount *fm; struct fuse_inode *fi; int ret; inode = d_inode_rcu(entry); if (inode && fuse_is_bad(inode)) goto invalid; else if (time_before64(fuse_dentry_time(entry), get_jiffies_64()) || (flags & (LOOKUP_EXCL | LOOKUP_REVAL | LOOKUP_RENAME_TARGET))) { struct fuse_entry_out outarg; FUSE_ARGS(args); struct fuse_forget_link *forget; u64 attr_version; /* For negative dentries, always do a fresh lookup */ if (!inode) goto invalid; ret = -ECHILD; if (flags & LOOKUP_RCU) goto out; fm = get_fuse_mount(inode); forget = fuse_alloc_forget(); ret = -ENOMEM; if (!forget) goto out; attr_version = fuse_get_attr_version(fm->fc); parent = dget_parent(entry); fuse_lookup_init(fm->fc, &args, get_node_id(d_inode(parent)), &entry->d_name, &outarg); ret = fuse_simple_request(fm, &args); dput(parent); /* Zero nodeid is same as -ENOENT */ if (!ret && !outarg.nodeid) ret = -ENOENT; if (!ret) { fi = get_fuse_inode(inode); if (outarg.nodeid != get_node_id(inode) || (bool) IS_AUTOMOUNT(inode) != (bool) (outarg.attr.flags & FUSE_ATTR_SUBMOUNT)) { fuse_queue_forget(fm->fc, forget, outarg.nodeid, 1); goto invalid; } spin_lock(&fi->lock); fi->nlookup++; spin_unlock(&fi->lock); } kfree(forget); if (ret == -ENOMEM || ret == -EINTR) goto out; if (ret || fuse_invalid_attr(&outarg.attr) || fuse_stale_inode(inode, outarg.generation, &outarg.attr)) goto invalid; forget_all_cached_acls(inode); fuse_change_attributes(inode, &outarg.attr, NULL, ATTR_TIMEOUT(&outarg), attr_version); fuse_change_entry_timeout(entry, &outarg); } else if (inode) { fi = get_fuse_inode(inode); if (flags & LOOKUP_RCU) { if (test_bit(FUSE_I_INIT_RDPLUS, &fi->state)) return -ECHILD; } else if (test_and_clear_bit(FUSE_I_INIT_RDPLUS, &fi->state)) { parent = dget_parent(entry); fuse_advise_use_readdirplus(d_inode(parent)); dput(parent); } } ret = 1; out: return ret; invalid: ret = 0; goto out; } #if BITS_PER_LONG < 64 static int fuse_dentry_init(struct dentry *dentry) { dentry->d_fsdata = kzalloc(sizeof(union fuse_dentry), GFP_KERNEL_ACCOUNT | __GFP_RECLAIMABLE); return dentry->d_fsdata ? 0 : -ENOMEM; } static void fuse_dentry_release(struct dentry *dentry) { union fuse_dentry *fd = dentry->d_fsdata; kfree_rcu(fd, rcu); } #endif static int fuse_dentry_delete(const struct dentry *dentry) { return time_before64(fuse_dentry_time(dentry), get_jiffies_64()); } /* * Create a fuse_mount object with a new superblock (with path->dentry * as the root), and return that mount so it can be auto-mounted on * @path. */ static struct vfsmount *fuse_dentry_automount(struct path *path) { struct fs_context *fsc; struct vfsmount *mnt; struct fuse_inode *mp_fi = get_fuse_inode(d_inode(path->dentry)); fsc = fs_context_for_submount(path->mnt->mnt_sb->s_type, path->dentry); if (IS_ERR(fsc)) return ERR_CAST(fsc); /* Pass the FUSE inode of the mount for fuse_get_tree_submount() */ fsc->fs_private = mp_fi; /* Create the submount */ mnt = fc_mount(fsc); if (!IS_ERR(mnt)) mntget(mnt); put_fs_context(fsc); return mnt; } const struct dentry_operations fuse_dentry_operations = { .d_revalidate = fuse_dentry_revalidate, .d_delete = fuse_dentry_delete, #if BITS_PER_LONG < 64 .d_init = fuse_dentry_init, .d_release = fuse_dentry_release, #endif .d_automount = fuse_dentry_automount, }; const struct dentry_operations fuse_root_dentry_operations = { #if BITS_PER_LONG < 64 .d_init = fuse_dentry_init, .d_release = fuse_dentry_release, #endif }; int fuse_valid_type(int m) { return S_ISREG(m) || S_ISDIR(m) || S_ISLNK(m) || S_ISCHR(m) || S_ISBLK(m) || S_ISFIFO(m) || S_ISSOCK(m); } static bool fuse_valid_size(u64 size) { return size <= LLONG_MAX; } bool fuse_invalid_attr(struct fuse_attr *attr) { return !fuse_valid_type(attr->mode) || !fuse_valid_size(attr->size); } int fuse_lookup_name(struct super_block *sb, u64 nodeid, const struct qstr *name, struct fuse_entry_out *outarg, struct inode **inode) { struct fuse_mount *fm = get_fuse_mount_super(sb); FUSE_ARGS(args); struct fuse_forget_link *forget; u64 attr_version; int err; *inode = NULL; err = -ENAMETOOLONG; if (name->len > FUSE_NAME_MAX) goto out; forget = fuse_alloc_forget(); err = -ENOMEM; if (!forget) goto out; attr_version = fuse_get_attr_version(fm->fc); fuse_lookup_init(fm->fc, &args, nodeid, name, outarg); err = fuse_simple_request(fm, &args); /* Zero nodeid is same as -ENOENT, but with valid timeout */ if (err || !outarg->nodeid) goto out_put_forget; err = -EIO; if (fuse_invalid_attr(&outarg->attr)) goto out_put_forget; if (outarg->nodeid == FUSE_ROOT_ID && outarg->generation != 0) { pr_warn_once("root generation should be zero\n"); outarg->generation = 0; } *inode = fuse_iget(sb, outarg->nodeid, outarg->generation, &outarg->attr, ATTR_TIMEOUT(outarg), attr_version); err = -ENOMEM; if (!*inode) { fuse_queue_forget(fm->fc, forget, outarg->nodeid, 1); goto out; } err = 0; out_put_forget: kfree(forget); out: return err; } static struct dentry *fuse_lookup(struct inode *dir, struct dentry *entry, unsigned int flags) { int err; struct fuse_entry_out outarg; struct inode *inode; struct dentry *newent; bool outarg_valid = true; bool locked; if (fuse_is_bad(dir)) return ERR_PTR(-EIO); locked = fuse_lock_inode(dir); err = fuse_lookup_name(dir->i_sb, get_node_id(dir), &entry->d_name, &outarg, &inode); fuse_unlock_inode(dir, locked); if (err == -ENOENT) { outarg_valid = false; err = 0; } if (err) goto out_err; err = -EIO; if (inode && get_node_id(inode) == FUSE_ROOT_ID) goto out_iput; newent = d_splice_alias(inode, entry); err = PTR_ERR(newent); if (IS_ERR(newent)) goto out_err; entry = newent ? newent : entry; if (outarg_valid) fuse_change_entry_timeout(entry, &outarg); else fuse_invalidate_entry_cache(entry); if (inode) fuse_advise_use_readdirplus(dir); return newent; out_iput: iput(inode); out_err: return ERR_PTR(err); } static int get_security_context(struct dentry *entry, umode_t mode, struct fuse_in_arg *ext) { struct fuse_secctx *fctx; struct fuse_secctx_header *header; void *ctx = NULL, *ptr; u32 ctxlen, total_len = sizeof(*header); int err, nr_ctx = 0; const char *name; size_t namelen; err = security_dentry_init_security(entry, mode, &entry->d_name, &name, &ctx, &ctxlen); if (err) { if (err != -EOPNOTSUPP) goto out_err; /* No LSM is supporting this security hook. Ignore error */ ctxlen = 0; ctx = NULL; } if (ctxlen) { nr_ctx = 1; namelen = strlen(name) + 1; err = -EIO; if (WARN_ON(namelen > XATTR_NAME_MAX + 1 || ctxlen > S32_MAX)) goto out_err; total_len += FUSE_REC_ALIGN(sizeof(*fctx) + namelen + ctxlen); } err = -ENOMEM; header = ptr = kzalloc(total_len, GFP_KERNEL); if (!ptr) goto out_err; header->nr_secctx = nr_ctx; header->size = total_len; ptr += sizeof(*header); if (nr_ctx) { fctx = ptr; fctx->size = ctxlen; ptr += sizeof(*fctx); strcpy(ptr, name); ptr += namelen; memcpy(ptr, ctx, ctxlen); } ext->size = total_len; ext->value = header; err = 0; out_err: kfree(ctx); return err; } static void *extend_arg(struct fuse_in_arg *buf, u32 bytes) { void *p; u32 newlen = buf->size + bytes; p = krealloc(buf->value, newlen, GFP_KERNEL); if (!p) { kfree(buf->value); buf->size = 0; buf->value = NULL; return NULL; } memset(p + buf->size, 0, bytes); buf->value = p; buf->size = newlen; return p + newlen - bytes; } static u32 fuse_ext_size(size_t size) { return FUSE_REC_ALIGN(sizeof(struct fuse_ext_header) + size); } /* * This adds just a single supplementary group that matches the parent's group. */ static int get_create_supp_group(struct inode *dir, struct fuse_in_arg *ext) { struct fuse_conn *fc = get_fuse_conn(dir); struct fuse_ext_header *xh; struct fuse_supp_groups *sg; kgid_t kgid = dir->i_gid; gid_t parent_gid = from_kgid(fc->user_ns, kgid); u32 sg_len = fuse_ext_size(sizeof(*sg) + sizeof(sg->groups[0])); if (parent_gid == (gid_t) -1 || gid_eq(kgid, current_fsgid()) || !in_group_p(kgid)) return 0; xh = extend_arg(ext, sg_len); if (!xh) return -ENOMEM; xh->size = sg_len; xh->type = FUSE_EXT_GROUPS; sg = (struct fuse_supp_groups *) &xh[1]; sg->nr_groups = 1; sg->groups[0] = parent_gid; return 0; } static int get_create_ext(struct fuse_args *args, struct inode *dir, struct dentry *dentry, umode_t mode) { struct fuse_conn *fc = get_fuse_conn_super(dentry->d_sb); struct fuse_in_arg ext = { .size = 0, .value = NULL }; int err = 0; if (fc->init_security) err = get_security_context(dentry, mode, &ext); if (!err && fc->create_supp_group) err = get_create_supp_group(dir, &ext); if (!err && ext.size) { WARN_ON(args->in_numargs >= ARRAY_SIZE(args->in_args)); args->is_ext = true; args->ext_idx = args->in_numargs++; args->in_args[args->ext_idx] = ext; } else { kfree(ext.value); } return err; } static void free_ext_value(struct fuse_args *args) { if (args->is_ext) kfree(args->in_args[args->ext_idx].value); } /* * Atomic create+open operation * * If the filesystem doesn't support this, then fall back to separate * 'mknod' + 'open' requests. */ static int fuse_create_open(struct inode *dir, struct dentry *entry, struct file *file, unsigned int flags, umode_t mode, u32 opcode) { int err; struct inode *inode; struct fuse_mount *fm = get_fuse_mount(dir); FUSE_ARGS(args); struct fuse_forget_link *forget; struct fuse_create_in inarg; struct fuse_open_out *outopenp; struct fuse_entry_out outentry; struct fuse_inode *fi; struct fuse_file *ff; bool trunc = flags & O_TRUNC; /* Userspace expects S_IFREG in create mode */ BUG_ON((mode & S_IFMT) != S_IFREG); forget = fuse_alloc_forget(); err = -ENOMEM; if (!forget) goto out_err; err = -ENOMEM; ff = fuse_file_alloc(fm, true); if (!ff) goto out_put_forget_req; if (!fm->fc->dont_mask) mode &= ~current_umask(); flags &= ~O_NOCTTY; memset(&inarg, 0, sizeof(inarg)); memset(&outentry, 0, sizeof(outentry)); inarg.flags = flags; inarg.mode = mode; inarg.umask = current_umask(); if (fm->fc->handle_killpriv_v2 && trunc && !(flags & O_EXCL) && !capable(CAP_FSETID)) { inarg.open_flags |= FUSE_OPEN_KILL_SUIDGID; } args.opcode = opcode; args.nodeid = get_node_id(dir); args.in_numargs = 2; args.in_args[0].size = sizeof(inarg); args.in_args[0].value = &inarg; args.in_args[1].size = entry->d_name.len + 1; args.in_args[1].value = entry->d_name.name; args.out_numargs = 2; args.out_args[0].size = sizeof(outentry); args.out_args[0].value = &outentry; /* Store outarg for fuse_finish_open() */ outopenp = &ff->args->open_outarg; args.out_args[1].size = sizeof(*outopenp); args.out_args[1].value = outopenp; err = get_create_ext(&args, dir, entry, mode); if (err) goto out_free_ff; err = fuse_simple_request(fm, &args); free_ext_value(&args); if (err) goto out_free_ff; err = -EIO; if (!S_ISREG(outentry.attr.mode) || invalid_nodeid(outentry.nodeid) || fuse_invalid_attr(&outentry.attr)) goto out_free_ff; ff->fh = outopenp->fh; ff->nodeid = outentry.nodeid; ff->open_flags = outopenp->open_flags; inode = fuse_iget(dir->i_sb, outentry.nodeid, outentry.generation, &outentry.attr, ATTR_TIMEOUT(&outentry), 0); if (!inode) { flags &= ~(O_CREAT | O_EXCL | O_TRUNC); fuse_sync_release(NULL, ff, flags); fuse_queue_forget(fm->fc, forget, outentry.nodeid, 1); err = -ENOMEM; goto out_err; } kfree(forget); d_instantiate(entry, inode); fuse_change_entry_timeout(entry, &outentry); fuse_dir_changed(dir); err = generic_file_open(inode, file); if (!err) { file->private_data = ff; err = finish_open(file, entry, fuse_finish_open); } if (err) { fi = get_fuse_inode(inode); fuse_sync_release(fi, ff, flags); } else { if (fm->fc->atomic_o_trunc && trunc) truncate_pagecache(inode, 0); else if (!(ff->open_flags & FOPEN_KEEP_CACHE)) invalidate_inode_pages2(inode->i_mapping); } return err; out_free_ff: fuse_file_free(ff); out_put_forget_req: kfree(forget); out_err: return err; } static int fuse_mknod(struct mnt_idmap *, struct inode *, struct dentry *, umode_t, dev_t); static int fuse_atomic_open(struct inode *dir, struct dentry *entry, struct file *file, unsigned flags, umode_t mode) { int err; struct fuse_conn *fc = get_fuse_conn(dir); struct dentry *res = NULL; if (fuse_is_bad(dir)) return -EIO; if (d_in_lookup(entry)) { res = fuse_lookup(dir, entry, 0); if (IS_ERR(res)) return PTR_ERR(res); if (res) entry = res; } if (!(flags & O_CREAT) || d_really_is_positive(entry)) goto no_open; /* Only creates */ file->f_mode |= FMODE_CREATED; if (fc->no_create) goto mknod; err = fuse_create_open(dir, entry, file, flags, mode, FUSE_CREATE); if (err == -ENOSYS) { fc->no_create = 1; goto mknod; } else if (err == -EEXIST) fuse_invalidate_entry(entry); out_dput: dput(res); return err; mknod: err = fuse_mknod(&nop_mnt_idmap, dir, entry, mode, 0); if (err) goto out_dput; no_open: return finish_no_open(file, res); } /* * Code shared between mknod, mkdir, symlink and link */ static int create_new_entry(struct fuse_mount *fm, struct fuse_args *args, struct inode *dir, struct dentry *entry, umode_t mode) { struct fuse_entry_out outarg; struct inode *inode; struct dentry *d; int err; struct fuse_forget_link *forget; if (fuse_is_bad(dir)) return -EIO; forget = fuse_alloc_forget(); if (!forget) return -ENOMEM; memset(&outarg, 0, sizeof(outarg)); args->nodeid = get_node_id(dir); args->out_numargs = 1; args->out_args[0].size = sizeof(outarg); args->out_args[0].value = &outarg; if (args->opcode != FUSE_LINK) { err = get_create_ext(args, dir, entry, mode); if (err) goto out_put_forget_req; } err = fuse_simple_request(fm, args); free_ext_value(args); if (err) goto out_put_forget_req; err = -EIO; if (invalid_nodeid(outarg.nodeid) || fuse_invalid_attr(&outarg.attr)) goto out_put_forget_req; if ((outarg.attr.mode ^ mode) & S_IFMT) goto out_put_forget_req; inode = fuse_iget(dir->i_sb, outarg.nodeid, outarg.generation, &outarg.attr, ATTR_TIMEOUT(&outarg), 0); if (!inode) { fuse_queue_forget(fm->fc, forget, outarg.nodeid, 1); return -ENOMEM; } kfree(forget); d_drop(entry); d = d_splice_alias(inode, entry); if (IS_ERR(d)) return PTR_ERR(d); if (d) { fuse_change_entry_timeout(d, &outarg); dput(d); } else { fuse_change_entry_timeout(entry, &outarg); } fuse_dir_changed(dir); return 0; out_put_forget_req: if (err == -EEXIST) fuse_invalidate_entry(entry); kfree(forget); return err; } static int fuse_mknod(struct mnt_idmap *idmap, struct inode *dir, struct dentry *entry, umode_t mode, dev_t rdev) { struct fuse_mknod_in inarg; struct fuse_mount *fm = get_fuse_mount(dir); FUSE_ARGS(args); if (!fm->fc->dont_mask) mode &= ~current_umask(); memset(&inarg, 0, sizeof(inarg)); inarg.mode = mode; inarg.rdev = new_encode_dev(rdev); inarg.umask = current_umask(); args.opcode = FUSE_MKNOD; args.in_numargs = 2; args.in_args[0].size = sizeof(inarg); args.in_args[0].value = &inarg; args.in_args[1].size = entry->d_name.len + 1; args.in_args[1].value = entry->d_name.name; return create_new_entry(fm, &args, dir, entry, mode); } static int fuse_create(struct mnt_idmap *idmap, struct inode *dir, struct dentry *entry, umode_t mode, bool excl) { return fuse_mknod(&nop_mnt_idmap, dir, entry, mode, 0); } static int fuse_tmpfile(struct mnt_idmap *idmap, struct inode *dir, struct file *file, umode_t mode) { struct fuse_conn *fc = get_fuse_conn(dir); int err; if (fc->no_tmpfile) return -EOPNOTSUPP; err = fuse_create_open(dir, file->f_path.dentry, file, file->f_flags, mode, FUSE_TMPFILE); if (err == -ENOSYS) { fc->no_tmpfile = 1; err = -EOPNOTSUPP; } return err; } static int fuse_mkdir(struct mnt_idmap *idmap, struct inode *dir, struct dentry *entry, umode_t mode) { struct fuse_mkdir_in inarg; struct fuse_mount *fm = get_fuse_mount(dir); FUSE_ARGS(args); if (!fm->fc->dont_mask) mode &= ~current_umask(); memset(&inarg, 0, sizeof(inarg)); inarg.mode = mode; inarg.umask = current_umask(); args.opcode = FUSE_MKDIR; args.in_numargs = 2; args.in_args[0].size = sizeof(inarg); args.in_args[0].value = &inarg; args.in_args[1].size = entry->d_name.len + 1; args.in_args[1].value = entry->d_name.name; return create_new_entry(fm, &args, dir, entry, S_IFDIR); } static int fuse_symlink(struct mnt_idmap *idmap, struct inode *dir, struct dentry *entry, const char *link) { struct fuse_mount *fm = get_fuse_mount(dir); unsigned len = strlen(link) + 1; FUSE_ARGS(args); args.opcode = FUSE_SYMLINK; args.in_numargs = 2; args.in_args[0].size = entry->d_name.len + 1; args.in_args[0].value = entry->d_name.name; args.in_args[1].size = len; args.in_args[1].value = link; return create_new_entry(fm, &args, dir, entry, S_IFLNK); } void fuse_flush_time_update(struct inode *inode) { int err = sync_inode_metadata(inode, 1); mapping_set_error(inode->i_mapping, err); } static void fuse_update_ctime_in_cache(struct inode *inode) { if (!IS_NOCMTIME(inode)) { inode_set_ctime_current(inode); mark_inode_dirty_sync(inode); fuse_flush_time_update(inode); } } void fuse_update_ctime(struct inode *inode) { fuse_invalidate_attr_mask(inode, STATX_CTIME); fuse_update_ctime_in_cache(inode); } static void fuse_entry_unlinked(struct dentry *entry) { struct inode *inode = d_inode(entry); struct fuse_conn *fc = get_fuse_conn(inode); struct fuse_inode *fi = get_fuse_inode(inode); spin_lock(&fi->lock); fi->attr_version = atomic64_inc_return(&fc->attr_version); /* * If i_nlink == 0 then unlink doesn't make sense, yet this can * happen if userspace filesystem is careless. It would be * difficult to enforce correct nlink usage so just ignore this * condition here */ if (S_ISDIR(inode->i_mode)) clear_nlink(inode); else if (inode->i_nlink > 0) drop_nlink(inode); spin_unlock(&fi->lock); fuse_invalidate_entry_cache(entry); fuse_update_ctime(inode); } static int fuse_unlink(struct inode *dir, struct dentry *entry) { int err; struct fuse_mount *fm = get_fuse_mount(dir); FUSE_ARGS(args); if (fuse_is_bad(dir)) return -EIO; args.opcode = FUSE_UNLINK; args.nodeid = get_node_id(dir); args.in_numargs = 1; args.in_args[0].size = entry->d_name.len + 1; args.in_args[0].value = entry->d_name.name; err = fuse_simple_request(fm, &args); if (!err) { fuse_dir_changed(dir); fuse_entry_unlinked(entry); } else if (err == -EINTR || err == -ENOENT) fuse_invalidate_entry(entry); return err; } static int fuse_rmdir(struct inode *dir, struct dentry *entry) { int err; struct fuse_mount *fm = get_fuse_mount(dir); FUSE_ARGS(args); if (fuse_is_bad(dir)) return -EIO; args.opcode = FUSE_RMDIR; args.nodeid = get_node_id(dir); args.in_numargs = 1; args.in_args[0].size = entry->d_name.len + 1; args.in_args[0].value = entry->d_name.name; err = fuse_simple_request(fm, &args); if (!err) { fuse_dir_changed(dir); fuse_entry_unlinked(entry); } else if (err == -EINTR || err == -ENOENT) fuse_invalidate_entry(entry); return err; } static int fuse_rename_common(struct inode *olddir, struct dentry *oldent, struct inode *newdir, struct dentry *newent, unsigned int flags, int opcode, size_t argsize) { int err; struct fuse_rename2_in inarg; struct fuse_mount *fm = get_fuse_mount(olddir); FUSE_ARGS(args); memset(&inarg, 0, argsize); inarg.newdir = get_node_id(newdir); inarg.flags = flags; args.opcode = opcode; args.nodeid = get_node_id(olddir); args.in_numargs = 3; args.in_args[0].size = argsize; args.in_args[0].value = &inarg; args.in_args[1].size = oldent->d_name.len + 1; args.in_args[1].value = oldent->d_name.name; args.in_args[2].size = newent->d_name.len + 1; args.in_args[2].value = newent->d_name.name; err = fuse_simple_request(fm, &args); if (!err) { /* ctime changes */ fuse_update_ctime(d_inode(oldent)); if (flags & RENAME_EXCHANGE) fuse_update_ctime(d_inode(newent)); fuse_dir_changed(olddir); if (olddir != newdir) fuse_dir_changed(newdir); /* newent will end up negative */ if (!(flags & RENAME_EXCHANGE) && d_really_is_positive(newent)) fuse_entry_unlinked(newent); } else if (err == -EINTR || err == -ENOENT) { /* If request was interrupted, DEITY only knows if the rename actually took place. If the invalidation fails (e.g. some process has CWD under the renamed directory), then there can be inconsistency between the dcache and the real filesystem. Tough luck. */ fuse_invalidate_entry(oldent); if (d_really_is_positive(newent)) fuse_invalidate_entry(newent); } return err; } static int fuse_rename2(struct mnt_idmap *idmap, struct inode *olddir, struct dentry *oldent, struct inode *newdir, struct dentry *newent, unsigned int flags) { struct fuse_conn *fc = get_fuse_conn(olddir); int err; if (fuse_is_bad(olddir)) return -EIO; if (flags & ~(RENAME_NOREPLACE | RENAME_EXCHANGE | RENAME_WHITEOUT)) return -EINVAL; if (flags) { if (fc->no_rename2 || fc->minor < 23) return -EINVAL; err = fuse_rename_common(olddir, oldent, newdir, newent, flags, FUSE_RENAME2, sizeof(struct fuse_rename2_in)); if (err == -ENOSYS) { fc->no_rename2 = 1; err = -EINVAL; } } else { err = fuse_rename_common(olddir, oldent, newdir, newent, 0, FUSE_RENAME, sizeof(struct fuse_rename_in)); } return err; } static int fuse_link(struct dentry *entry, struct inode *newdir, struct dentry *newent) { int err; struct fuse_link_in inarg; struct inode *inode = d_inode(entry); struct fuse_mount *fm = get_fuse_mount(inode); FUSE_ARGS(args); memset(&inarg, 0, sizeof(inarg)); inarg.oldnodeid = get_node_id(inode); args.opcode = FUSE_LINK; args.in_numargs = 2; args.in_args[0].size = sizeof(inarg); args.in_args[0].value = &inarg; args.in_args[1].size = newent->d_name.len + 1; args.in_args[1].value = newent->d_name.name; err = create_new_entry(fm, &args, newdir, newent, inode->i_mode); if (!err) fuse_update_ctime_in_cache(inode); else if (err == -EINTR) fuse_invalidate_attr(inode); return err; } static void fuse_fillattr(struct inode *inode, struct fuse_attr *attr, struct kstat *stat) { unsigned int blkbits; struct fuse_conn *fc = get_fuse_conn(inode); stat->dev = inode->i_sb->s_dev; stat->ino = attr->ino; stat->mode = (inode->i_mode & S_IFMT) | (attr->mode & 07777); stat->nlink = attr->nlink; stat->uid = make_kuid(fc->user_ns, attr->uid); stat->gid = make_kgid(fc->user_ns, attr->gid); stat->rdev = inode->i_rdev; stat->atime.tv_sec = attr->atime; stat->atime.tv_nsec = attr->atimensec; stat->mtime.tv_sec = attr->mtime; stat->mtime.tv_nsec = attr->mtimensec; stat->ctime.tv_sec = attr->ctime; stat->ctime.tv_nsec = attr->ctimensec; stat->size = attr->size; stat->blocks = attr->blocks; if (attr->blksize != 0) blkbits = ilog2(attr->blksize); else blkbits = inode->i_sb->s_blocksize_bits; stat->blksize = 1 << blkbits; } static void fuse_statx_to_attr(struct fuse_statx *sx, struct fuse_attr *attr) { memset(attr, 0, sizeof(*attr)); attr->ino = sx->ino; attr->size = sx->size; attr->blocks = sx->blocks; attr->atime = sx->atime.tv_sec; attr->mtime = sx->mtime.tv_sec; attr->ctime = sx->ctime.tv_sec; attr->atimensec = sx->atime.tv_nsec; attr->mtimensec = sx->mtime.tv_nsec; attr->ctimensec = sx->ctime.tv_nsec; attr->mode = sx->mode; attr->nlink = sx->nlink; attr->uid = sx->uid; attr->gid = sx->gid; attr->rdev = new_encode_dev(MKDEV(sx->rdev_major, sx->rdev_minor)); attr->blksize = sx->blksize; } static int fuse_do_statx(struct inode *inode, struct file *file, struct kstat *stat) { int err; struct fuse_attr attr; struct fuse_statx *sx; struct fuse_statx_in inarg; struct fuse_statx_out outarg; struct fuse_mount *fm = get_fuse_mount(inode); u64 attr_version = fuse_get_attr_version(fm->fc); FUSE_ARGS(args); memset(&inarg, 0, sizeof(inarg)); memset(&outarg, 0, sizeof(outarg)); /* Directories have separate file-handle space */ if (file && S_ISREG(inode->i_mode)) { struct fuse_file *ff = file->private_data; inarg.getattr_flags |= FUSE_GETATTR_FH; inarg.fh = ff->fh; } /* For now leave sync hints as the default, request all stats. */ inarg.sx_flags = 0; inarg.sx_mask = STATX_BASIC_STATS | STATX_BTIME; args.opcode = FUSE_STATX; args.nodeid = get_node_id(inode); args.in_numargs = 1; args.in_args[0].size = sizeof(inarg); args.in_args[0].value = &inarg; args.out_numargs = 1; args.out_args[0].size = sizeof(outarg); args.out_args[0].value = &outarg; err = fuse_simple_request(fm, &args); if (err) return err; sx = &outarg.stat; if (((sx->mask & STATX_SIZE) && !fuse_valid_size(sx->size)) || ((sx->mask & STATX_TYPE) && (!fuse_valid_type(sx->mode) || inode_wrong_type(inode, sx->mode)))) { fuse_make_bad(inode); return -EIO; } fuse_statx_to_attr(&outarg.stat, &attr); if ((sx->mask & STATX_BASIC_STATS) == STATX_BASIC_STATS) { fuse_change_attributes(inode, &attr, &outarg.stat, ATTR_TIMEOUT(&outarg), attr_version); } if (stat) { stat->result_mask = sx->mask & (STATX_BASIC_STATS | STATX_BTIME); stat->btime.tv_sec = sx->btime.tv_sec; stat->btime.tv_nsec = min_t(u32, sx->btime.tv_nsec, NSEC_PER_SEC - 1); fuse_fillattr(inode, &attr, stat); stat->result_mask |= STATX_TYPE; } return 0; } static int fuse_do_getattr(struct inode *inode, struct kstat *stat, struct file *file) { int err; struct fuse_getattr_in inarg; struct fuse_attr_out outarg; struct fuse_mount *fm = get_fuse_mount(inode); FUSE_ARGS(args); u64 attr_version; attr_version = fuse_get_attr_version(fm->fc); memset(&inarg, 0, sizeof(inarg)); memset(&outarg, 0, sizeof(outarg)); /* Directories have separate file-handle space */ if (file && S_ISREG(inode->i_mode)) { struct fuse_file *ff = file->private_data; inarg.getattr_flags |= FUSE_GETATTR_FH; inarg.fh = ff->fh; } args.opcode = FUSE_GETATTR; args.nodeid = get_node_id(inode); args.in_numargs = 1; args.in_args[0].size = sizeof(inarg); args.in_args[0].value = &inarg; args.out_numargs = 1; args.out_args[0].size = sizeof(outarg); args.out_args[0].value = &outarg; err = fuse_simple_request(fm, &args); if (!err) { if (fuse_invalid_attr(&outarg.attr) || inode_wrong_type(inode, outarg.attr.mode)) { fuse_make_bad(inode); err = -EIO; } else { fuse_change_attributes(inode, &outarg.attr, NULL, ATTR_TIMEOUT(&outarg), attr_version); if (stat) fuse_fillattr(inode, &outarg.attr, stat); } } return err; } static int fuse_update_get_attr(struct inode *inode, struct file *file, struct kstat *stat, u32 request_mask, unsigned int flags) { struct fuse_inode *fi = get_fuse_inode(inode); struct fuse_conn *fc = get_fuse_conn(inode); int err = 0; bool sync; u32 inval_mask = READ_ONCE(fi->inval_mask); u32 cache_mask = fuse_get_cache_mask(inode); /* FUSE only supports basic stats and possibly btime */ request_mask &= STATX_BASIC_STATS | STATX_BTIME; retry: if (fc->no_statx) request_mask &= STATX_BASIC_STATS; if (!request_mask) sync = false; else if (flags & AT_STATX_FORCE_SYNC) sync = true; else if (flags & AT_STATX_DONT_SYNC) sync = false; else if (request_mask & inval_mask & ~cache_mask) sync = true; else sync = time_before64(fi->i_time, get_jiffies_64()); if (sync) { forget_all_cached_acls(inode); /* Try statx if BTIME is requested */ if (!fc->no_statx && (request_mask & ~STATX_BASIC_STATS)) { err = fuse_do_statx(inode, file, stat); if (err == -ENOSYS) { fc->no_statx = 1; err = 0; goto retry; } } else { err = fuse_do_getattr(inode, stat, file); } } else if (stat) { generic_fillattr(&nop_mnt_idmap, request_mask, inode, stat); stat->mode = fi->orig_i_mode; stat->ino = fi->orig_ino; if (test_bit(FUSE_I_BTIME, &fi->state)) { stat->btime = fi->i_btime; stat->result_mask |= STATX_BTIME; } } return err; } int fuse_update_attributes(struct inode *inode, struct file *file, u32 mask) { return fuse_update_get_attr(inode, file, NULL, mask, 0); } int fuse_reverse_inval_entry(struct fuse_conn *fc, u64 parent_nodeid, u64 child_nodeid, struct qstr *name, u32 flags) { int err = -ENOTDIR; struct inode *parent; struct dentry *dir; struct dentry *entry; parent = fuse_ilookup(fc, parent_nodeid, NULL); if (!parent) return -ENOENT; inode_lock_nested(parent, I_MUTEX_PARENT); if (!S_ISDIR(parent->i_mode)) goto unlock; err = -ENOENT; dir = d_find_alias(parent); if (!dir) goto unlock; name->hash = full_name_hash(dir, name->name, name->len); entry = d_lookup(dir, name); dput(dir); if (!entry) goto unlock; fuse_dir_changed(parent); if (!(flags & FUSE_EXPIRE_ONLY)) d_invalidate(entry); fuse_invalidate_entry_cache(entry); if (child_nodeid != 0 && d_really_is_positive(entry)) { inode_lock(d_inode(entry)); if (get_node_id(d_inode(entry)) != child_nodeid) { err = -ENOENT; goto badentry; } if (d_mountpoint(entry)) { err = -EBUSY; goto badentry; } if (d_is_dir(entry)) { shrink_dcache_parent(entry); if (!simple_empty(entry)) { err = -ENOTEMPTY; goto badentry; } d_inode(entry)->i_flags |= S_DEAD; } dont_mount(entry); clear_nlink(d_inode(entry)); err = 0; badentry: inode_unlock(d_inode(entry)); if (!err) d_delete(entry); } else { err = 0; } dput(entry); unlock: inode_unlock(parent); iput(parent); return err; } static inline bool fuse_permissible_uidgid(struct fuse_conn *fc) { const struct cred *cred = current_cred(); return (uid_eq(cred->euid, fc->user_id) && uid_eq(cred->suid, fc->user_id) && uid_eq(cred->uid, fc->user_id) && gid_eq(cred->egid, fc->group_id) && gid_eq(cred->sgid, fc->group_id) && gid_eq(cred->gid, fc->group_id)); } /* * Calling into a user-controlled filesystem gives the filesystem * daemon ptrace-like capabilities over the current process. This * means, that the filesystem daemon is able to record the exact * filesystem operations performed, and can also control the behavior * of the requester process in otherwise impossible ways. For example * it can delay the operation for arbitrary length of time allowing * DoS against the requester. * * For this reason only those processes can call into the filesystem, * for which the owner of the mount has ptrace privilege. This * excludes processes started by other users, suid or sgid processes. */ bool fuse_allow_current_process(struct fuse_conn *fc) { bool allow; if (fc->allow_other) allow = current_in_userns(fc->user_ns); else allow = fuse_permissible_uidgid(fc); if (!allow && allow_sys_admin_access && capable(CAP_SYS_ADMIN)) allow = true; return allow; } static int fuse_access(struct inode *inode, int mask) { struct fuse_mount *fm = get_fuse_mount(inode); FUSE_ARGS(args); struct fuse_access_in inarg; int err; BUG_ON(mask & MAY_NOT_BLOCK); if (fm->fc->no_access) return 0; memset(&inarg, 0, sizeof(inarg)); inarg.mask = mask & (MAY_READ | MAY_WRITE | MAY_EXEC); args.opcode = FUSE_ACCESS; args.nodeid = get_node_id(inode); args.in_numargs = 1; args.in_args[0].size = sizeof(inarg); args.in_args[0].value = &inarg; err = fuse_simple_request(fm, &args); if (err == -ENOSYS) { fm->fc->no_access = 1; err = 0; } return err; } static int fuse_perm_getattr(struct inode *inode, int mask) { if (mask & MAY_NOT_BLOCK) return -ECHILD; forget_all_cached_acls(inode); return fuse_do_getattr(inode, NULL, NULL); } /* * Check permission. The two basic access models of FUSE are: * * 1) Local access checking ('default_permissions' mount option) based * on file mode. This is the plain old disk filesystem permission * model. * * 2) "Remote" access checking, where server is responsible for * checking permission in each inode operation. An exception to this * is if ->permission() was invoked from sys_access() in which case an * access request is sent. Execute permission is still checked * locally based on file mode. */ static int fuse_permission(struct mnt_idmap *idmap, struct inode *inode, int mask) { struct fuse_conn *fc = get_fuse_conn(inode); bool refreshed = false; int err = 0; if (fuse_is_bad(inode)) return -EIO; if (!fuse_allow_current_process(fc)) return -EACCES; /* * If attributes are needed, refresh them before proceeding */ if (fc->default_permissions || ((mask & MAY_EXEC) && S_ISREG(inode->i_mode))) { struct fuse_inode *fi = get_fuse_inode(inode); u32 perm_mask = STATX_MODE | STATX_UID | STATX_GID; if (perm_mask & READ_ONCE(fi->inval_mask) || time_before64(fi->i_time, get_jiffies_64())) { refreshed = true; err = fuse_perm_getattr(inode, mask); if (err) return err; } } if (fc->default_permissions) { err = generic_permission(&nop_mnt_idmap, inode, mask); /* If permission is denied, try to refresh file attributes. This is also needed, because the root node will at first have no permissions */ if (err == -EACCES && !refreshed) { err = fuse_perm_getattr(inode, mask); if (!err) err = generic_permission(&nop_mnt_idmap, inode, mask); } /* Note: the opposite of the above test does not exist. So if permissions are revoked this won't be noticed immediately, only after the attribute timeout has expired */ } else if (mask & (MAY_ACCESS | MAY_CHDIR)) { err = fuse_access(inode, mask); } else if ((mask & MAY_EXEC) && S_ISREG(inode->i_mode)) { if (!(inode->i_mode & S_IXUGO)) { if (refreshed) return -EACCES; err = fuse_perm_getattr(inode, mask); if (!err && !(inode->i_mode & S_IXUGO)) return -EACCES; } } return err; } static int fuse_readlink_page(struct inode *inode, struct page *page) { struct fuse_mount *fm = get_fuse_mount(inode); struct fuse_page_desc desc = { .length = PAGE_SIZE - 1 }; struct fuse_args_pages ap = { .num_pages = 1, .pages = &page, .descs = &desc, }; char *link; ssize_t res; ap.args.opcode = FUSE_READLINK; ap.args.nodeid = get_node_id(inode); ap.args.out_pages = true; ap.args.out_argvar = true; ap.args.page_zeroing = true; ap.args.out_numargs = 1; ap.args.out_args[0].size = desc.length; res = fuse_simple_request(fm, &ap.args); fuse_invalidate_atime(inode); if (res < 0) return res; if (WARN_ON(res >= PAGE_SIZE)) return -EIO; link = page_address(page); link[res] = '\0'; return 0; } static const char *fuse_get_link(struct dentry *dentry, struct inode *inode, struct delayed_call *callback) { struct fuse_conn *fc = get_fuse_conn(inode); struct page *page; int err; err = -EIO; if (fuse_is_bad(inode)) goto out_err; if (fc->cache_symlinks) return page_get_link(dentry, inode, callback); err = -ECHILD; if (!dentry) goto out_err; page = alloc_page(GFP_KERNEL); err = -ENOMEM; if (!page) goto out_err; err = fuse_readlink_page(inode, page); if (err) { __free_page(page); goto out_err; } set_delayed_call(callback, page_put_link, page); return page_address(page); out_err: return ERR_PTR(err); } static int fuse_dir_open(struct inode *inode, struct file *file) { struct fuse_mount *fm = get_fuse_mount(inode); int err; if (fuse_is_bad(inode)) return -EIO; err = generic_file_open(inode, file); if (err) return err; err = fuse_do_open(fm, get_node_id(inode), file, true); if (!err) { struct fuse_file *ff = file->private_data; /* * Keep handling FOPEN_STREAM and FOPEN_NONSEEKABLE for * directories for backward compatibility, though it's unlikely * to be useful. */ if (ff->open_flags & (FOPEN_STREAM | FOPEN_NONSEEKABLE)) nonseekable_open(inode, file); } return err; } static int fuse_dir_release(struct inode *inode, struct file *file) { fuse_release_common(file, true); return 0; } static int fuse_dir_fsync(struct file *file, loff_t start, loff_t end, int datasync) { struct inode *inode = file->f_mapping->host; struct fuse_conn *fc = get_fuse_conn(inode); int err; if (fuse_is_bad(inode)) return -EIO; if (fc->no_fsyncdir) return 0; inode_lock(inode); err = fuse_fsync_common(file, start, end, datasync, FUSE_FSYNCDIR); if (err == -ENOSYS) { fc->no_fsyncdir = 1; err = 0; } inode_unlock(inode); return err; } static long fuse_dir_ioctl(struct file *file, unsigned int cmd, unsigned long arg) { struct fuse_conn *fc = get_fuse_conn(file->f_mapping->host); /* FUSE_IOCTL_DIR only supported for API version >= 7.18 */ if (fc->minor < 18) return -ENOTTY; return fuse_ioctl_common(file, cmd, arg, FUSE_IOCTL_DIR); } static long fuse_dir_compat_ioctl(struct file *file, unsigned int cmd, unsigned long arg) { struct fuse_conn *fc = get_fuse_conn(file->f_mapping->host); if (fc->minor < 18) return -ENOTTY; return fuse_ioctl_common(file, cmd, arg, FUSE_IOCTL_COMPAT | FUSE_IOCTL_DIR); } static bool update_mtime(unsigned ivalid, bool trust_local_mtime) { /* Always update if mtime is explicitly set */ if (ivalid & ATTR_MTIME_SET) return true; /* Or if kernel i_mtime is the official one */ if (trust_local_mtime) return true; /* If it's an open(O_TRUNC) or an ftruncate(), don't update */ if ((ivalid & ATTR_SIZE) && (ivalid & (ATTR_OPEN | ATTR_FILE))) return false; /* In all other cases update */ return true; } static void iattr_to_fattr(struct fuse_conn *fc, struct iattr *iattr, struct fuse_setattr_in *arg, bool trust_local_cmtime) { unsigned ivalid = iattr->ia_valid; if (ivalid & ATTR_MODE) arg->valid |= FATTR_MODE, arg->mode = iattr->ia_mode; if (ivalid & ATTR_UID) arg->valid |= FATTR_UID, arg->uid = from_kuid(fc->user_ns, iattr->ia_uid); if (ivalid & ATTR_GID) arg->valid |= FATTR_GID, arg->gid = from_kgid(fc->user_ns, iattr->ia_gid); if (ivalid & ATTR_SIZE) arg->valid |= FATTR_SIZE, arg->size = iattr->ia_size; if (ivalid & ATTR_ATIME) { arg->valid |= FATTR_ATIME; arg->atime = iattr->ia_atime.tv_sec; arg->atimensec = iattr->ia_atime.tv_nsec; if (!(ivalid & ATTR_ATIME_SET)) arg->valid |= FATTR_ATIME_NOW; } if ((ivalid & ATTR_MTIME) && update_mtime(ivalid, trust_local_cmtime)) { arg->valid |= FATTR_MTIME; arg->mtime = iattr->ia_mtime.tv_sec; arg->mtimensec = iattr->ia_mtime.tv_nsec; if (!(ivalid & ATTR_MTIME_SET) && !trust_local_cmtime) arg->valid |= FATTR_MTIME_NOW; } if ((ivalid & ATTR_CTIME) && trust_local_cmtime) { arg->valid |= FATTR_CTIME; arg->ctime = iattr->ia_ctime.tv_sec; arg->ctimensec = iattr->ia_ctime.tv_nsec; } } /* * Prevent concurrent writepages on inode * * This is done by adding a negative bias to the inode write counter * and waiting for all pending writes to finish. */ void fuse_set_nowrite(struct inode *inode) { struct fuse_inode *fi = get_fuse_inode(inode); BUG_ON(!inode_is_locked(inode)); spin_lock(&fi->lock); BUG_ON(fi->writectr < 0); fi->writectr += FUSE_NOWRITE; spin_unlock(&fi->lock); wait_event(fi->page_waitq, fi->writectr == FUSE_NOWRITE); } /* * Allow writepages on inode * * Remove the bias from the writecounter and send any queued * writepages. */ static void __fuse_release_nowrite(struct inode *inode) { struct fuse_inode *fi = get_fuse_inode(inode); BUG_ON(fi->writectr != FUSE_NOWRITE); fi->writectr = 0; fuse_flush_writepages(inode); } void fuse_release_nowrite(struct inode *inode) { struct fuse_inode *fi = get_fuse_inode(inode); spin_lock(&fi->lock); __fuse_release_nowrite(inode); spin_unlock(&fi->lock); } static void fuse_setattr_fill(struct fuse_conn *fc, struct fuse_args *args, struct inode *inode, struct fuse_setattr_in *inarg_p, struct fuse_attr_out *outarg_p) { args->opcode = FUSE_SETATTR; args->nodeid = get_node_id(inode); args->in_numargs = 1; args->in_args[0].size = sizeof(*inarg_p); args->in_args[0].value = inarg_p; args->out_numargs = 1; args->out_args[0].size = sizeof(*outarg_p); args->out_args[0].value = outarg_p; } /* * Flush inode->i_mtime to the server */ int fuse_flush_times(struct inode *inode, struct fuse_file *ff) { struct fuse_mount *fm = get_fuse_mount(inode); FUSE_ARGS(args); struct fuse_setattr_in inarg; struct fuse_attr_out outarg; memset(&inarg, 0, sizeof(inarg)); memset(&outarg, 0, sizeof(outarg)); inarg.valid = FATTR_MTIME; inarg.mtime = inode_get_mtime_sec(inode); inarg.mtimensec = inode_get_mtime_nsec(inode); if (fm->fc->minor >= 23) { inarg.valid |= FATTR_CTIME; inarg.ctime = inode_get_ctime_sec(inode); inarg.ctimensec = inode_get_ctime_nsec(inode); } if (ff) { inarg.valid |= FATTR_FH; inarg.fh = ff->fh; } fuse_setattr_fill(fm->fc, &args, inode, &inarg, &outarg); return fuse_simple_request(fm, &args); } /* * Set attributes, and at the same time refresh them. * * Truncation is slightly complicated, because the 'truncate' request * may fail, in which case we don't want to touch the mapping. * vmtruncate() doesn't allow for this case, so do the rlimit checking * and the actual truncation by hand. */ int fuse_do_setattr(struct dentry *dentry, struct iattr *attr, struct file *file) { struct inode *inode = d_inode(dentry); struct fuse_mount *fm = get_fuse_mount(inode); struct fuse_conn *fc = fm->fc; struct fuse_inode *fi = get_fuse_inode(inode); struct address_space *mapping = inode->i_mapping; FUSE_ARGS(args); struct fuse_setattr_in inarg; struct fuse_attr_out outarg; bool is_truncate = false; bool is_wb = fc->writeback_cache && S_ISREG(inode->i_mode); loff_t oldsize; int err; bool trust_local_cmtime = is_wb; bool fault_blocked = false; if (!fc->default_permissions) attr->ia_valid |= ATTR_FORCE; err = setattr_prepare(&nop_mnt_idmap, dentry, attr); if (err) return err; if (attr->ia_valid & ATTR_SIZE) { if (WARN_ON(!S_ISREG(inode->i_mode))) return -EIO; is_truncate = true; } if (FUSE_IS_DAX(inode) && is_truncate) { filemap_invalidate_lock(mapping); fault_blocked = true; err = fuse_dax_break_layouts(inode, 0, 0); if (err) { filemap_invalidate_unlock(mapping); return err; } } if (attr->ia_valid & ATTR_OPEN) { /* This is coming from open(..., ... | O_TRUNC); */ WARN_ON(!(attr->ia_valid & ATTR_SIZE)); WARN_ON(attr->ia_size != 0); if (fc->atomic_o_trunc) { /* * No need to send request to userspace, since actual * truncation has already been done by OPEN. But still * need to truncate page cache. */ i_size_write(inode, 0); truncate_pagecache(inode, 0); goto out; } file = NULL; } /* Flush dirty data/metadata before non-truncate SETATTR */ if (is_wb && attr->ia_valid & (ATTR_MODE | ATTR_UID | ATTR_GID | ATTR_MTIME_SET | ATTR_TIMES_SET)) { err = write_inode_now(inode, true); if (err) return err; fuse_set_nowrite(inode); fuse_release_nowrite(inode); } if (is_truncate) { fuse_set_nowrite(inode); set_bit(FUSE_I_SIZE_UNSTABLE, &fi->state); if (trust_local_cmtime && attr->ia_size != inode->i_size) attr->ia_valid |= ATTR_MTIME | ATTR_CTIME; } memset(&inarg, 0, sizeof(inarg)); memset(&outarg, 0, sizeof(outarg)); iattr_to_fattr(fc, attr, &inarg, trust_local_cmtime); if (file) { struct fuse_file *ff = file->private_data; inarg.valid |= FATTR_FH; inarg.fh = ff->fh; } /* Kill suid/sgid for non-directory chown unconditionally */ if (fc->handle_killpriv_v2 && !S_ISDIR(inode->i_mode) && attr->ia_valid & (ATTR_UID | ATTR_GID)) inarg.valid |= FATTR_KILL_SUIDGID; if (attr->ia_valid & ATTR_SIZE) { /* For mandatory locking in truncate */ inarg.valid |= FATTR_LOCKOWNER; inarg.lock_owner = fuse_lock_owner_id(fc, current->files); /* Kill suid/sgid for truncate only if no CAP_FSETID */ if (fc->handle_killpriv_v2 && !capable(CAP_FSETID)) inarg.valid |= FATTR_KILL_SUIDGID; } fuse_setattr_fill(fc, &args, inode, &inarg, &outarg); err = fuse_simple_request(fm, &args); if (err) { if (err == -EINTR) fuse_invalidate_attr(inode); goto error; } if (fuse_invalid_attr(&outarg.attr) || inode_wrong_type(inode, outarg.attr.mode)) { fuse_make_bad(inode); err = -EIO; goto error; } spin_lock(&fi->lock); /* the kernel maintains i_mtime locally */ if (trust_local_cmtime) { if (attr->ia_valid & ATTR_MTIME) inode_set_mtime_to_ts(inode, attr->ia_mtime); if (attr->ia_valid & ATTR_CTIME) inode_set_ctime_to_ts(inode, attr->ia_ctime); /* FIXME: clear I_DIRTY_SYNC? */ } fuse_change_attributes_common(inode, &outarg.attr, NULL, ATTR_TIMEOUT(&outarg), fuse_get_cache_mask(inode)); oldsize = inode->i_size; /* see the comment in fuse_change_attributes() */ if (!is_wb || is_truncate) i_size_write(inode, outarg.attr.size); if (is_truncate) { /* NOTE: this may release/reacquire fi->lock */ __fuse_release_nowrite(inode); } spin_unlock(&fi->lock); /* * Only call invalidate_inode_pages2() after removing * FUSE_NOWRITE, otherwise fuse_launder_folio() would deadlock. */ if ((is_truncate || !is_wb) && S_ISREG(inode->i_mode) && oldsize != outarg.attr.size) { truncate_pagecache(inode, outarg.attr.size); invalidate_inode_pages2(mapping); } clear_bit(FUSE_I_SIZE_UNSTABLE, &fi->state); out: if (fault_blocked) filemap_invalidate_unlock(mapping); return 0; error: if (is_truncate) fuse_release_nowrite(inode); clear_bit(FUSE_I_SIZE_UNSTABLE, &fi->state); if (fault_blocked) filemap_invalidate_unlock(mapping); return err; } static int fuse_setattr(struct mnt_idmap *idmap, struct dentry *entry, struct iattr *attr) { struct inode *inode = d_inode(entry); struct fuse_conn *fc = get_fuse_conn(inode); struct file *file = (attr->ia_valid & ATTR_FILE) ? attr->ia_file : NULL; int ret; if (fuse_is_bad(inode)) return -EIO; if (!fuse_allow_current_process(get_fuse_conn(inode))) return -EACCES; if (attr->ia_valid & (ATTR_KILL_SUID | ATTR_KILL_SGID)) { attr->ia_valid &= ~(ATTR_KILL_SUID | ATTR_KILL_SGID | ATTR_MODE); /* * The only sane way to reliably kill suid/sgid is to do it in * the userspace filesystem * * This should be done on write(), truncate() and chown(). */ if (!fc->handle_killpriv && !fc->handle_killpriv_v2) { /* * ia_mode calculation may have used stale i_mode. * Refresh and recalculate. */ ret = fuse_do_getattr(inode, NULL, file); if (ret) return ret; attr->ia_mode = inode->i_mode; if (inode->i_mode & S_ISUID) { attr->ia_valid |= ATTR_MODE; attr->ia_mode &= ~S_ISUID; } if ((inode->i_mode & (S_ISGID | S_IXGRP)) == (S_ISGID | S_IXGRP)) { attr->ia_valid |= ATTR_MODE; attr->ia_mode &= ~S_ISGID; } } } if (!attr->ia_valid) return 0; ret = fuse_do_setattr(entry, attr, file); if (!ret) { /* * If filesystem supports acls it may have updated acl xattrs in * the filesystem, so forget cached acls for the inode. */ if (fc->posix_acl) forget_all_cached_acls(inode); /* Directory mode changed, may need to revalidate access */ if (d_is_dir(entry) && (attr->ia_valid & ATTR_MODE)) fuse_invalidate_entry_cache(entry); } return ret; } static int fuse_getattr(struct mnt_idmap *idmap, const struct path *path, struct kstat *stat, u32 request_mask, unsigned int flags) { struct inode *inode = d_inode(path->dentry); struct fuse_conn *fc = get_fuse_conn(inode); if (fuse_is_bad(inode)) return -EIO; if (!fuse_allow_current_process(fc)) { if (!request_mask) { /* * If user explicitly requested *nothing* then don't * error out, but return st_dev only. */ stat->result_mask = 0; stat->dev = inode->i_sb->s_dev; return 0; } return -EACCES; } return fuse_update_get_attr(inode, NULL, stat, request_mask, flags); } static const struct inode_operations fuse_dir_inode_operations = { .lookup = fuse_lookup, .mkdir = fuse_mkdir, .symlink = fuse_symlink, .unlink = fuse_unlink, .rmdir = fuse_rmdir, .rename = fuse_rename2, .link = fuse_link, .setattr = fuse_setattr, .create = fuse_create, .atomic_open = fuse_atomic_open, .tmpfile = fuse_tmpfile, .mknod = fuse_mknod, .permission = fuse_permission, .getattr = fuse_getattr, .listxattr = fuse_listxattr, .get_inode_acl = fuse_get_inode_acl, .get_acl = fuse_get_acl, .set_acl = fuse_set_acl, .fileattr_get = fuse_fileattr_get, .fileattr_set = fuse_fileattr_set, }; static const struct file_operations fuse_dir_operations = { .llseek = generic_file_llseek, .read = generic_read_dir, .iterate_shared = fuse_readdir, .open = fuse_dir_open, .release = fuse_dir_release, .fsync = fuse_dir_fsync, .unlocked_ioctl = fuse_dir_ioctl, .compat_ioctl = fuse_dir_compat_ioctl, }; static const struct inode_operations fuse_common_inode_operations = { .setattr = fuse_setattr, .permission = fuse_permission, .getattr = fuse_getattr, .listxattr = fuse_listxattr, .get_inode_acl = fuse_get_inode_acl, .get_acl = fuse_get_acl, .set_acl = fuse_set_acl, .fileattr_get = fuse_fileattr_get, .fileattr_set = fuse_fileattr_set, }; static const struct inode_operations fuse_symlink_inode_operations = { .setattr = fuse_setattr, .get_link = fuse_get_link, .getattr = fuse_getattr, .listxattr = fuse_listxattr, }; void fuse_init_common(struct inode *inode) { inode->i_op = &fuse_common_inode_operations; } void fuse_init_dir(struct inode *inode) { struct fuse_inode *fi = get_fuse_inode(inode); inode->i_op = &fuse_dir_inode_operations; inode->i_fop = &fuse_dir_operations; spin_lock_init(&fi->rdc.lock); fi->rdc.cached = false; fi->rdc.size = 0; fi->rdc.pos = 0; fi->rdc.version = 0; } static int fuse_symlink_read_folio(struct file *null, struct folio *folio) { int err = fuse_readlink_page(folio->mapping->host, &folio->page); if (!err) folio_mark_uptodate(folio); folio_unlock(folio); return err; } static const struct address_space_operations fuse_symlink_aops = { .read_folio = fuse_symlink_read_folio, }; void fuse_init_symlink(struct inode *inode) { inode->i_op = &fuse_symlink_inode_operations; inode->i_data.a_ops = &fuse_symlink_aops; inode_nohighmem(inode); } |
| 12 12 10 5 7 1 39 4 4 1 39 1 6 6 6 5 5 5 2 2 2 3 3 2 1 4 4 2 3 1 1 2 4 3 3 2 2 2 1 1 1 3 2 1 1 2 2 2 1 1 7 6 5 5 5 4 4 1 6 6 5 1 3 2 12 12 14 9 22 11 3 2 1 1 31 16 31 30 7 22 20 13 13 8 7 189 1 1 2 7 6 2 5 4 1 4 3 22 4 3 2 2 2 5 2 2 10 1 5 5 5 5 5 5 5 20 20 7 20 13 7 5 18 6 3 3 12 13 17 17 17 16 17 10 17 4 13 318 318 137 34 189 148 52 3 2 50 1 1 1 2 11 3 1 1 1 1 1 1 1 1 1 1 1 1 2 1 305 1 1 4 4 5 5 4 4 3 6 11 10 5 5 10 7 6 5 5 5 5 5 21 20 1 1 1 1 2 2 49 77 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080 1081 1082 1083 1084 1085 1086 1087 1088 1089 1090 1091 1092 1093 1094 1095 1096 1097 1098 1099 1100 1101 1102 1103 1104 1105 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125 1126 1127 1128 1129 1130 1131 1132 1133 1134 1135 1136 1137 1138 1139 1140 1141 1142 1143 1144 1145 1146 1147 1148 1149 1150 1151 1152 1153 1154 1155 1156 1157 1158 1159 1160 1161 1162 1163 1164 1165 1166 1167 1168 1169 1170 1171 1172 1173 1174 1175 1176 1177 1178 1179 1180 1181 1182 1183 1184 1185 1186 1187 1188 1189 1190 1191 1192 1193 1194 1195 1196 1197 1198 1199 1200 1201 1202 1203 1204 1205 1206 1207 1208 1209 1210 1211 1212 1213 1214 1215 1216 1217 1218 1219 1220 1221 1222 1223 1224 1225 1226 1227 1228 1229 1230 1231 1232 1233 1234 1235 1236 1237 1238 1239 1240 1241 1242 1243 1244 1245 1246 1247 1248 1249 1250 1251 1252 1253 1254 1255 1256 1257 1258 1259 1260 1261 1262 1263 1264 1265 1266 1267 1268 1269 1270 1271 1272 1273 1274 1275 1276 1277 1278 1279 1280 1281 1282 1283 1284 1285 1286 1287 1288 1289 1290 1291 1292 1293 1294 1295 1296 1297 1298 1299 1300 1301 1302 1303 1304 1305 1306 1307 1308 1309 1310 1311 1312 1313 1314 1315 1316 1317 1318 1319 1320 1321 | // SPDX-License-Identifier: GPL-2.0 /* * Copyright (C) 1992 obz under the linux copyright * * Dynamic diacritical handling - aeb@cwi.nl - Dec 1993 * Dynamic keymap and string allocation - aeb@cwi.nl - May 1994 * Restrict VT switching via ioctl() - grif@cs.ucr.edu - Dec 1995 * Some code moved for less code duplication - Andi Kleen - Mar 1997 * Check put/get_user, cleanups - acme@conectiva.com.br - Jun 2001 */ #include <linux/types.h> #include <linux/errno.h> #include <linux/sched/signal.h> #include <linux/tty.h> #include <linux/timer.h> #include <linux/kernel.h> #include <linux/compat.h> #include <linux/module.h> #include <linux/kd.h> #include <linux/vt.h> #include <linux/string.h> #include <linux/slab.h> #include <linux/major.h> #include <linux/fs.h> #include <linux/console.h> #include <linux/consolemap.h> #include <linux/signal.h> #include <linux/suspend.h> #include <linux/timex.h> #include <asm/io.h> #include <linux/uaccess.h> #include <linux/nospec.h> #include <linux/kbd_kern.h> #include <linux/vt_kern.h> #include <linux/kbd_diacr.h> #include <linux/selection.h> bool vt_dont_switch; static inline bool vt_in_use(unsigned int i) { const struct vc_data *vc = vc_cons[i].d; /* * console_lock must be held to prevent the vc from being deallocated * while we're checking whether it's in-use. */ WARN_CONSOLE_UNLOCKED(); return vc && kref_read(&vc->port.kref) > 1; } static inline bool vt_busy(int i) { if (vt_in_use(i)) return true; if (i == fg_console) return true; if (vc_is_sel(vc_cons[i].d)) return true; return false; } /* * Console (vt and kd) routines, as defined by USL SVR4 manual, and by * experimentation and study of X386 SYSV handling. * * One point of difference: SYSV vt's are /dev/vtX, which X >= 0, and * /dev/console is a separate ttyp. Under Linux, /dev/tty0 is /dev/console, * and the vc start at /dev/ttyX, X >= 1. We maintain that here, so we will * always treat our set of vt as numbered 1..MAX_NR_CONSOLES (corresponding to * ttys 0..MAX_NR_CONSOLES-1). Explicitly naming VT 0 is illegal, but using * /dev/tty0 (fg_console) as a target is legal, since an implicit aliasing * to the current console is done by the main ioctl code. */ #ifdef CONFIG_X86 #include <asm/syscalls.h> #endif static void complete_change_console(struct vc_data *vc); /* * User space VT_EVENT handlers */ struct vt_event_wait { struct list_head list; struct vt_event event; int done; }; static LIST_HEAD(vt_events); static DEFINE_SPINLOCK(vt_event_lock); static DECLARE_WAIT_QUEUE_HEAD(vt_event_waitqueue); /** * vt_event_post * @event: the event that occurred * @old: old console * @new: new console * * Post an VT event to interested VT handlers */ void vt_event_post(unsigned int event, unsigned int old, unsigned int new) { struct list_head *pos, *head; unsigned long flags; int wake = 0; spin_lock_irqsave(&vt_event_lock, flags); head = &vt_events; list_for_each(pos, head) { struct vt_event_wait *ve = list_entry(pos, struct vt_event_wait, list); if (!(ve->event.event & event)) continue; ve->event.event = event; /* kernel view is consoles 0..n-1, user space view is console 1..n with 0 meaning current, so we must bias */ ve->event.oldev = old + 1; ve->event.newev = new + 1; wake = 1; ve->done = 1; } spin_unlock_irqrestore(&vt_event_lock, flags); if (wake) wake_up_interruptible(&vt_event_waitqueue); } static void __vt_event_queue(struct vt_event_wait *vw) { unsigned long flags; /* Prepare the event */ INIT_LIST_HEAD(&vw->list); vw->done = 0; /* Queue our event */ spin_lock_irqsave(&vt_event_lock, flags); list_add(&vw->list, &vt_events); spin_unlock_irqrestore(&vt_event_lock, flags); } static void __vt_event_wait(struct vt_event_wait *vw) { /* Wait for it to pass */ wait_event_interruptible(vt_event_waitqueue, vw->done); } static void __vt_event_dequeue(struct vt_event_wait *vw) { unsigned long flags; /* Dequeue it */ spin_lock_irqsave(&vt_event_lock, flags); list_del(&vw->list); spin_unlock_irqrestore(&vt_event_lock, flags); } /** * vt_event_wait - wait for an event * @vw: our event * * Waits for an event to occur which completes our vt_event_wait * structure. On return the structure has wv->done set to 1 for success * or 0 if some event such as a signal ended the wait. */ static void vt_event_wait(struct vt_event_wait *vw) { __vt_event_queue(vw); __vt_event_wait(vw); __vt_event_dequeue(vw); } /** * vt_event_wait_ioctl - event ioctl handler * @event: argument to ioctl (the event) * * Implement the VT_WAITEVENT ioctl using the VT event interface */ static int vt_event_wait_ioctl(struct vt_event __user *event) { struct vt_event_wait vw; if (copy_from_user(&vw.event, event, sizeof(struct vt_event))) return -EFAULT; /* Highest supported event for now */ if (vw.event.event & ~VT_MAX_EVENT) return -EINVAL; vt_event_wait(&vw); /* If it occurred report it */ if (vw.done) { if (copy_to_user(event, &vw.event, sizeof(struct vt_event))) return -EFAULT; return 0; } return -EINTR; } /** * vt_waitactive - active console wait * @n: new console * * Helper for event waits. Used to implement the legacy * event waiting ioctls in terms of events */ int vt_waitactive(int n) { struct vt_event_wait vw; do { vw.event.event = VT_EVENT_SWITCH; __vt_event_queue(&vw); if (n == fg_console + 1) { __vt_event_dequeue(&vw); break; } __vt_event_wait(&vw); __vt_event_dequeue(&vw); if (vw.done == 0) return -EINTR; } while (vw.event.newev != n); return 0; } /* * these are the valid i/o ports we're allowed to change. they map all the * video ports */ #define GPFIRST 0x3b4 #define GPLAST 0x3df #define GPNUM (GPLAST - GPFIRST + 1) /* * currently, setting the mode from KD_TEXT to KD_GRAPHICS doesn't do a whole * lot. i'm not sure if it should do any restoration of modes or what... * * XXX It should at least call into the driver, fbdev's definitely need to * restore their engine state. --BenH * * Called with the console lock held. */ static int vt_kdsetmode(struct vc_data *vc, unsigned long mode) { switch (mode) { case KD_GRAPHICS: break; case KD_TEXT0: case KD_TEXT1: mode = KD_TEXT; fallthrough; case KD_TEXT: break; default: return -EINVAL; } if (vc->vc_mode == mode) return 0; vc->vc_mode = mode; if (vc->vc_num != fg_console) return 0; /* explicitly blank/unblank the screen if switching modes */ if (mode == KD_TEXT) do_unblank_screen(1); else do_blank_screen(1); return 0; } static int vt_k_ioctl(struct tty_struct *tty, unsigned int cmd, unsigned long arg, bool perm) { struct vc_data *vc = tty->driver_data; void __user *up = (void __user *)arg; unsigned int console = vc->vc_num; int ret; switch (cmd) { case KIOCSOUND: if (!perm) return -EPERM; /* * The use of PIT_TICK_RATE is historic, it used to be * the platform-dependent CLOCK_TICK_RATE between 2.6.12 * and 2.6.36, which was a minor but unfortunate ABI * change. kd_mksound is locked by the input layer. */ if (arg) arg = PIT_TICK_RATE / arg; kd_mksound(arg, 0); break; case KDMKTONE: if (!perm) return -EPERM; { unsigned int ticks, count; /* * Generate the tone for the appropriate number of ticks. * If the time is zero, turn off sound ourselves. */ ticks = msecs_to_jiffies((arg >> 16) & 0xffff); count = ticks ? (arg & 0xffff) : 0; if (count) count = PIT_TICK_RATE / count; kd_mksound(count, ticks); break; } case KDGKBTYPE: /* * this is naïve. */ return put_user(KB_101, (char __user *)arg); /* * These cannot be implemented on any machine that implements * ioperm() in user level (such as Alpha PCs) or not at all. * * XXX: you should never use these, just call ioperm directly.. */ #ifdef CONFIG_X86 case KDADDIO: case KDDELIO: /* * KDADDIO and KDDELIO may be able to add ports beyond what * we reject here, but to be safe... * * These are locked internally via sys_ioperm */ if (arg < GPFIRST || arg > GPLAST) return -EINVAL; return ksys_ioperm(arg, 1, (cmd == KDADDIO)) ? -ENXIO : 0; case KDENABIO: case KDDISABIO: return ksys_ioperm(GPFIRST, GPNUM, (cmd == KDENABIO)) ? -ENXIO : 0; #endif /* Linux m68k/i386 interface for setting the keyboard delay/repeat rate */ case KDKBDREP: { struct kbd_repeat kbrep; if (!capable(CAP_SYS_TTY_CONFIG)) return -EPERM; if (copy_from_user(&kbrep, up, sizeof(struct kbd_repeat))) return -EFAULT; ret = kbd_rate(&kbrep); if (ret) return ret; if (copy_to_user(up, &kbrep, sizeof(struct kbd_repeat))) return -EFAULT; break; } case KDSETMODE: if (!perm) return -EPERM; console_lock(); ret = vt_kdsetmode(vc, arg); console_unlock(); return ret; case KDGETMODE: return put_user(vc->vc_mode, (int __user *)arg); case KDMAPDISP: case KDUNMAPDISP: /* * these work like a combination of mmap and KDENABIO. * this could be easily finished. */ return -EINVAL; case KDSKBMODE: if (!perm) return -EPERM; ret = vt_do_kdskbmode(console, arg); if (ret) return ret; tty_ldisc_flush(tty); break; case KDGKBMODE: return put_user(vt_do_kdgkbmode(console), (int __user *)arg); /* this could be folded into KDSKBMODE, but for compatibility reasons it is not so easy to fold KDGKBMETA into KDGKBMODE */ case KDSKBMETA: return vt_do_kdskbmeta(console, arg); case KDGKBMETA: /* FIXME: should review whether this is worth locking */ return put_user(vt_do_kdgkbmeta(console), (int __user *)arg); case KDGETKEYCODE: case KDSETKEYCODE: if(!capable(CAP_SYS_TTY_CONFIG)) perm = 0; return vt_do_kbkeycode_ioctl(cmd, up, perm); case KDGKBENT: case KDSKBENT: return vt_do_kdsk_ioctl(cmd, up, perm, console); case KDGKBSENT: case KDSKBSENT: return vt_do_kdgkb_ioctl(cmd, up, perm); /* Diacritical processing. Handled in keyboard.c as it has to operate on the keyboard locks and structures */ case KDGKBDIACR: case KDGKBDIACRUC: case KDSKBDIACR: case KDSKBDIACRUC: return vt_do_diacrit(cmd, up, perm); /* the ioctls below read/set the flags usually shown in the leds */ /* don't use them - they will go away without warning */ case KDGKBLED: case KDSKBLED: case KDGETLED: case KDSETLED: return vt_do_kdskled(console, cmd, arg, perm); /* * A process can indicate its willingness to accept signals * generated by pressing an appropriate key combination. * Thus, one can have a daemon that e.g. spawns a new console * upon a keypress and then changes to it. * See also the kbrequest field of inittab(5). */ case KDSIGACCEPT: if (!perm || !capable(CAP_KILL)) return -EPERM; if (!valid_signal(arg) || arg < 1 || arg == SIGKILL) return -EINVAL; spin_lock_irq(&vt_spawn_con.lock); put_pid(vt_spawn_con.pid); vt_spawn_con.pid = get_pid(task_pid(current)); vt_spawn_con.sig = arg; spin_unlock_irq(&vt_spawn_con.lock); break; case KDFONTOP: { struct console_font_op op; if (copy_from_user(&op, up, sizeof(op))) return -EFAULT; if (!perm && op.op != KD_FONT_OP_GET) return -EPERM; ret = con_font_op(vc, &op); if (ret) return ret; if (copy_to_user(up, &op, sizeof(op))) return -EFAULT; break; } default: return -ENOIOCTLCMD; } return 0; } static inline int do_unimap_ioctl(int cmd, struct unimapdesc __user *user_ud, bool perm, struct vc_data *vc) { struct unimapdesc tmp; if (copy_from_user(&tmp, user_ud, sizeof tmp)) return -EFAULT; switch (cmd) { case PIO_UNIMAP: if (!perm) return -EPERM; return con_set_unimap(vc, tmp.entry_ct, tmp.entries); case GIO_UNIMAP: if (!perm && fg_console != vc->vc_num) return -EPERM; return con_get_unimap(vc, tmp.entry_ct, &(user_ud->entry_ct), tmp.entries); } return 0; } static int vt_io_ioctl(struct vc_data *vc, unsigned int cmd, void __user *up, bool perm) { switch (cmd) { case PIO_CMAP: if (!perm) return -EPERM; return con_set_cmap(up); case GIO_CMAP: return con_get_cmap(up); case PIO_SCRNMAP: if (!perm) return -EPERM; return con_set_trans_old(up); case GIO_SCRNMAP: return con_get_trans_old(up); case PIO_UNISCRNMAP: if (!perm) return -EPERM; return con_set_trans_new(up); case GIO_UNISCRNMAP: return con_get_trans_new(up); case PIO_UNIMAPCLR: if (!perm) return -EPERM; con_clear_unimap(vc); break; case PIO_UNIMAP: case GIO_UNIMAP: return do_unimap_ioctl(cmd, up, perm, vc); default: return -ENOIOCTLCMD; } return 0; } static int vt_reldisp(struct vc_data *vc, unsigned int swtch) { int newvt, ret; if (vc->vt_mode.mode != VT_PROCESS) return -EINVAL; /* Switched-to response */ if (vc->vt_newvt < 0) { /* If it's just an ACK, ignore it */ return swtch == VT_ACKACQ ? 0 : -EINVAL; } /* Switching-from response */ if (swtch == 0) { /* Switch disallowed, so forget we were trying to do it. */ vc->vt_newvt = -1; return 0; } /* The current vt has been released, so complete the switch. */ newvt = vc->vt_newvt; vc->vt_newvt = -1; ret = vc_allocate(newvt); if (ret) return ret; /* * When we actually do the console switch, make sure we are atomic with * respect to other console switches.. */ complete_change_console(vc_cons[newvt].d); return 0; } static int vt_setactivate(struct vt_setactivate __user *sa) { struct vt_setactivate vsa; struct vc_data *nvc; int ret; if (copy_from_user(&vsa, sa, sizeof(vsa))) return -EFAULT; if (vsa.console == 0 || vsa.console > MAX_NR_CONSOLES) return -ENXIO; vsa.console--; vsa.console = array_index_nospec(vsa.console, MAX_NR_CONSOLES); console_lock(); ret = vc_allocate(vsa.console); if (ret) { console_unlock(); return ret; } /* * This is safe providing we don't drop the console sem between * vc_allocate and finishing referencing nvc. */ nvc = vc_cons[vsa.console].d; nvc->vt_mode = vsa.mode; nvc->vt_mode.frsig = 0; put_pid(nvc->vt_pid); nvc->vt_pid = get_pid(task_pid(current)); console_unlock(); /* Commence switch and lock */ /* Review set_console locks */ set_console(vsa.console); return 0; } /* deallocate a single console, if possible (leave 0) */ static int vt_disallocate(unsigned int vc_num) { struct vc_data *vc = NULL; int ret = 0; console_lock(); if (vt_busy(vc_num)) ret = -EBUSY; else if (vc_num) vc = vc_deallocate(vc_num); console_unlock(); if (vc && vc_num >= MIN_NR_CONSOLES) tty_port_put(&vc->port); return ret; } /* deallocate all unused consoles, but leave 0 */ static void vt_disallocate_all(void) { struct vc_data *vc[MAX_NR_CONSOLES]; int i; console_lock(); for (i = 1; i < MAX_NR_CONSOLES; i++) if (!vt_busy(i)) vc[i] = vc_deallocate(i); else vc[i] = NULL; console_unlock(); for (i = 1; i < MAX_NR_CONSOLES; i++) { if (vc[i] && i >= MIN_NR_CONSOLES) tty_port_put(&vc[i]->port); } } static int vt_resizex(struct vc_data *vc, struct vt_consize __user *cs) { struct vt_consize v; int i; if (copy_from_user(&v, cs, sizeof(struct vt_consize))) return -EFAULT; /* FIXME: Should check the copies properly */ if (!v.v_vlin) v.v_vlin = vc->vc_scan_lines; if (v.v_clin) { int rows = v.v_vlin / v.v_clin; if (v.v_rows != rows) { if (v.v_rows) /* Parameters don't add up */ return -EINVAL; v.v_rows = rows; } } if (v.v_vcol && v.v_ccol) { int cols = v.v_vcol / v.v_ccol; if (v.v_cols != cols) { if (v.v_cols) return -EINVAL; v.v_cols = cols; } } if (v.v_clin > 32) return -EINVAL; for (i = 0; i < MAX_NR_CONSOLES; i++) { struct vc_data *vcp; if (!vc_cons[i].d) continue; console_lock(); vcp = vc_cons[i].d; if (vcp) { int ret; int save_scan_lines = vcp->vc_scan_lines; int save_cell_height = vcp->vc_cell_height; if (v.v_vlin) vcp->vc_scan_lines = v.v_vlin; if (v.v_clin) vcp->vc_cell_height = v.v_clin; ret = __vc_resize(vcp, v.v_cols, v.v_rows, true); if (ret) { vcp->vc_scan_lines = save_scan_lines; vcp->vc_cell_height = save_cell_height; console_unlock(); return ret; } } console_unlock(); } return 0; } /* * We handle the console-specific ioctl's here. We allow the * capability to modify any console, not just the fg_console. */ int vt_ioctl(struct tty_struct *tty, unsigned int cmd, unsigned long arg) { struct vc_data *vc = tty->driver_data; void __user *up = (void __user *)arg; int i, perm; int ret; /* * To have permissions to do most of the vt ioctls, we either have * to be the owner of the tty, or have CAP_SYS_TTY_CONFIG. */ perm = 0; if (current->signal->tty == tty || capable(CAP_SYS_TTY_CONFIG)) perm = 1; ret = vt_k_ioctl(tty, cmd, arg, perm); if (ret != -ENOIOCTLCMD) return ret; ret = vt_io_ioctl(vc, cmd, up, perm); if (ret != -ENOIOCTLCMD) return ret; switch (cmd) { case TIOCLINUX: return tioclinux(tty, arg); case VT_SETMODE: { struct vt_mode tmp; if (!perm) return -EPERM; if (copy_from_user(&tmp, up, sizeof(struct vt_mode))) return -EFAULT; if (tmp.mode != VT_AUTO && tmp.mode != VT_PROCESS) return -EINVAL; console_lock(); vc->vt_mode = tmp; /* the frsig is ignored, so we set it to 0 */ vc->vt_mode.frsig = 0; put_pid(vc->vt_pid); vc->vt_pid = get_pid(task_pid(current)); /* no switch is required -- saw@shade.msu.ru */ vc->vt_newvt = -1; console_unlock(); break; } case VT_GETMODE: { struct vt_mode tmp; int rc; console_lock(); memcpy(&tmp, &vc->vt_mode, sizeof(struct vt_mode)); console_unlock(); rc = copy_to_user(up, &tmp, sizeof(struct vt_mode)); if (rc) return -EFAULT; break; } /* * Returns global vt state. Note that VT 0 is always open, since * it's an alias for the current VT, and people can't use it here. * We cannot return state for more than 16 VTs, since v_state is short. */ case VT_GETSTATE: { struct vt_stat __user *vtstat = up; unsigned short state, mask; if (put_user(fg_console + 1, &vtstat->v_active)) return -EFAULT; state = 1; /* /dev/tty0 is always open */ console_lock(); /* required by vt_in_use() */ for (i = 0, mask = 2; i < MAX_NR_CONSOLES && mask; ++i, mask <<= 1) if (vt_in_use(i)) state |= mask; console_unlock(); return put_user(state, &vtstat->v_state); } /* * Returns the first available (non-opened) console. */ case VT_OPENQRY: console_lock(); /* required by vt_in_use() */ for (i = 0; i < MAX_NR_CONSOLES; ++i) if (!vt_in_use(i)) break; console_unlock(); i = i < MAX_NR_CONSOLES ? (i+1) : -1; return put_user(i, (int __user *)arg); /* * ioctl(fd, VT_ACTIVATE, num) will cause us to switch to vt # num, * with num >= 1 (switches to vt 0, our console, are not allowed, just * to preserve sanity). */ case VT_ACTIVATE: if (!perm) return -EPERM; if (arg == 0 || arg > MAX_NR_CONSOLES) return -ENXIO; arg--; arg = array_index_nospec(arg, MAX_NR_CONSOLES); console_lock(); ret = vc_allocate(arg); console_unlock(); if (ret) return ret; set_console(arg); break; case VT_SETACTIVATE: if (!perm) return -EPERM; return vt_setactivate(up); /* * wait until the specified VT has been activated */ case VT_WAITACTIVE: if (!perm) return -EPERM; if (arg == 0 || arg > MAX_NR_CONSOLES) return -ENXIO; return vt_waitactive(arg); /* * If a vt is under process control, the kernel will not switch to it * immediately, but postpone the operation until the process calls this * ioctl, allowing the switch to complete. * * According to the X sources this is the behavior: * 0: pending switch-from not OK * 1: pending switch-from OK * 2: completed switch-to OK */ case VT_RELDISP: if (!perm) return -EPERM; console_lock(); ret = vt_reldisp(vc, arg); console_unlock(); return ret; /* * Disallocate memory associated to VT (but leave VT1) */ case VT_DISALLOCATE: if (arg > MAX_NR_CONSOLES) return -ENXIO; if (arg == 0) { vt_disallocate_all(); break; } arg = array_index_nospec(arg - 1, MAX_NR_CONSOLES); return vt_disallocate(arg); case VT_RESIZE: { struct vt_sizes __user *vtsizes = up; struct vc_data *vc; ushort ll,cc; if (!perm) return -EPERM; if (get_user(ll, &vtsizes->v_rows) || get_user(cc, &vtsizes->v_cols)) return -EFAULT; console_lock(); for (i = 0; i < MAX_NR_CONSOLES; i++) { vc = vc_cons[i].d; if (vc) { /* FIXME: review v tty lock */ __vc_resize(vc_cons[i].d, cc, ll, true); } } console_unlock(); break; } case VT_RESIZEX: if (!perm) return -EPERM; return vt_resizex(vc, up); case VT_LOCKSWITCH: if (!capable(CAP_SYS_TTY_CONFIG)) return -EPERM; vt_dont_switch = true; break; case VT_UNLOCKSWITCH: if (!capable(CAP_SYS_TTY_CONFIG)) return -EPERM; vt_dont_switch = false; break; case VT_GETHIFONTMASK: return put_user(vc->vc_hi_font_mask, (unsigned short __user *)arg); case VT_WAITEVENT: return vt_event_wait_ioctl((struct vt_event __user *)arg); default: return -ENOIOCTLCMD; } return 0; } void reset_vc(struct vc_data *vc) { vc->vc_mode = KD_TEXT; vt_reset_unicode(vc->vc_num); vc->vt_mode.mode = VT_AUTO; vc->vt_mode.waitv = 0; vc->vt_mode.relsig = 0; vc->vt_mode.acqsig = 0; vc->vt_mode.frsig = 0; put_pid(vc->vt_pid); vc->vt_pid = NULL; vc->vt_newvt = -1; reset_palette(vc); } void vc_SAK(struct work_struct *work) { struct vc *vc_con = container_of(work, struct vc, SAK_work); struct vc_data *vc; struct tty_struct *tty; console_lock(); vc = vc_con->d; if (vc) { /* FIXME: review tty ref counting */ tty = vc->port.tty; /* * SAK should also work in all raw modes and reset * them properly. */ if (tty) __do_SAK(tty); reset_vc(vc); } console_unlock(); } #ifdef CONFIG_COMPAT struct compat_console_font_op { compat_uint_t op; /* operation code KD_FONT_OP_* */ compat_uint_t flags; /* KD_FONT_FLAG_* */ compat_uint_t width, height; /* font size */ compat_uint_t charcount; compat_caddr_t data; /* font data with height fixed to 32 */ }; static inline int compat_kdfontop_ioctl(struct compat_console_font_op __user *fontop, int perm, struct console_font_op *op, struct vc_data *vc) { int i; if (copy_from_user(op, fontop, sizeof(struct compat_console_font_op))) return -EFAULT; if (!perm && op->op != KD_FONT_OP_GET) return -EPERM; op->data = compat_ptr(((struct compat_console_font_op *)op)->data); i = con_font_op(vc, op); if (i) return i; ((struct compat_console_font_op *)op)->data = (unsigned long)op->data; if (copy_to_user(fontop, op, sizeof(struct compat_console_font_op))) return -EFAULT; return 0; } struct compat_unimapdesc { unsigned short entry_ct; compat_caddr_t entries; }; static inline int compat_unimap_ioctl(unsigned int cmd, struct compat_unimapdesc __user *user_ud, int perm, struct vc_data *vc) { struct compat_unimapdesc tmp; struct unipair __user *tmp_entries; if (copy_from_user(&tmp, user_ud, sizeof tmp)) return -EFAULT; tmp_entries = compat_ptr(tmp.entries); switch (cmd) { case PIO_UNIMAP: if (!perm) return -EPERM; return con_set_unimap(vc, tmp.entry_ct, tmp_entries); case GIO_UNIMAP: if (!perm && fg_console != vc->vc_num) return -EPERM; return con_get_unimap(vc, tmp.entry_ct, &(user_ud->entry_ct), tmp_entries); } return 0; } long vt_compat_ioctl(struct tty_struct *tty, unsigned int cmd, unsigned long arg) { struct vc_data *vc = tty->driver_data; struct console_font_op op; /* used in multiple places here */ void __user *up = compat_ptr(arg); int perm; /* * To have permissions to do most of the vt ioctls, we either have * to be the owner of the tty, or have CAP_SYS_TTY_CONFIG. */ perm = 0; if (current->signal->tty == tty || capable(CAP_SYS_TTY_CONFIG)) perm = 1; switch (cmd) { /* * these need special handlers for incompatible data structures */ case KDFONTOP: return compat_kdfontop_ioctl(up, perm, &op, vc); case PIO_UNIMAP: case GIO_UNIMAP: return compat_unimap_ioctl(cmd, up, perm, vc); /* * all these treat 'arg' as an integer */ case KIOCSOUND: case KDMKTONE: #ifdef CONFIG_X86 case KDADDIO: case KDDELIO: #endif case KDSETMODE: case KDMAPDISP: case KDUNMAPDISP: case KDSKBMODE: case KDSKBMETA: case KDSKBLED: case KDSETLED: case KDSIGACCEPT: case VT_ACTIVATE: case VT_WAITACTIVE: case VT_RELDISP: case VT_DISALLOCATE: case VT_RESIZE: case VT_RESIZEX: return vt_ioctl(tty, cmd, arg); /* * the rest has a compatible data structure behind arg, * but we have to convert it to a proper 64 bit pointer. */ default: return vt_ioctl(tty, cmd, (unsigned long)up); } } #endif /* CONFIG_COMPAT */ /* * Performs the back end of a vt switch. Called under the console * semaphore. */ static void complete_change_console(struct vc_data *vc) { unsigned char old_vc_mode; int old = fg_console; last_console = fg_console; /* * If we're switching, we could be going from KD_GRAPHICS to * KD_TEXT mode or vice versa, which means we need to blank or * unblank the screen later. */ old_vc_mode = vc_cons[fg_console].d->vc_mode; switch_screen(vc); /* * This can't appear below a successful kill_pid(). If it did, * then the *blank_screen operation could occur while X, having * received acqsig, is waking up on another processor. This * condition can lead to overlapping accesses to the VGA range * and the framebuffer (causing system lockups). * * To account for this we duplicate this code below only if the * controlling process is gone and we've called reset_vc. */ if (old_vc_mode != vc->vc_mode) { if (vc->vc_mode == KD_TEXT) do_unblank_screen(1); else do_blank_screen(1); } /* * If this new console is under process control, send it a signal * telling it that it has acquired. Also check if it has died and * clean up (similar to logic employed in change_console()) */ if (vc->vt_mode.mode == VT_PROCESS) { /* * Send the signal as privileged - kill_pid() will * tell us if the process has gone or something else * is awry */ if (kill_pid(vc->vt_pid, vc->vt_mode.acqsig, 1) != 0) { /* * The controlling process has died, so we revert back to * normal operation. In this case, we'll also change back * to KD_TEXT mode. I'm not sure if this is strictly correct * but it saves the agony when the X server dies and the screen * remains blanked due to KD_GRAPHICS! It would be nice to do * this outside of VT_PROCESS but there is no single process * to account for and tracking tty count may be undesirable. */ reset_vc(vc); if (old_vc_mode != vc->vc_mode) { if (vc->vc_mode == KD_TEXT) do_unblank_screen(1); else do_blank_screen(1); } } } /* * Wake anyone waiting for their VT to activate */ vt_event_post(VT_EVENT_SWITCH, old, vc->vc_num); return; } /* * Performs the front-end of a vt switch */ void change_console(struct vc_data *new_vc) { struct vc_data *vc; if (!new_vc || new_vc->vc_num == fg_console || vt_dont_switch) return; /* * If this vt is in process mode, then we need to handshake with * that process before switching. Essentially, we store where that * vt wants to switch to and wait for it to tell us when it's done * (via VT_RELDISP ioctl). * * We also check to see if the controlling process still exists. * If it doesn't, we reset this vt to auto mode and continue. * This is a cheap way to track process control. The worst thing * that can happen is: we send a signal to a process, it dies, and * the switch gets "lost" waiting for a response; hopefully, the * user will try again, we'll detect the process is gone (unless * the user waits just the right amount of time :-) and revert the * vt to auto control. */ vc = vc_cons[fg_console].d; if (vc->vt_mode.mode == VT_PROCESS) { /* * Send the signal as privileged - kill_pid() will * tell us if the process has gone or something else * is awry. * * We need to set vt_newvt *before* sending the signal or we * have a race. */ vc->vt_newvt = new_vc->vc_num; if (kill_pid(vc->vt_pid, vc->vt_mode.relsig, 1) == 0) { /* * It worked. Mark the vt to switch to and * return. The process needs to send us a * VT_RELDISP ioctl to complete the switch. */ return; } /* * The controlling process has died, so we revert back to * normal operation. In this case, we'll also change back * to KD_TEXT mode. I'm not sure if this is strictly correct * but it saves the agony when the X server dies and the screen * remains blanked due to KD_GRAPHICS! It would be nice to do * this outside of VT_PROCESS but there is no single process * to account for and tracking tty count may be undesirable. */ reset_vc(vc); /* * Fall through to normal (VT_AUTO) handling of the switch... */ } /* * Ignore all switches in KD_GRAPHICS+VT_AUTO mode */ if (vc->vc_mode == KD_GRAPHICS) return; complete_change_console(new_vc); } /* Perform a kernel triggered VT switch for suspend/resume */ static int disable_vt_switch; int vt_move_to_console(unsigned int vt, int alloc) { int prev; console_lock(); /* Graphics mode - up to X */ if (disable_vt_switch) { console_unlock(); return 0; } prev = fg_console; if (alloc && vc_allocate(vt)) { /* we can't have a free VC for now. Too bad, * we don't want to mess the screen for now. */ console_unlock(); return -ENOSPC; } if (set_console(vt)) { /* * We're unable to switch to the SUSPEND_CONSOLE. * Let the calling function know so it can decide * what to do. */ console_unlock(); return -EIO; } console_unlock(); if (vt_waitactive(vt + 1)) { pr_debug("Suspend: Can't switch VCs."); return -EINTR; } return prev; } /* * Normally during a suspend, we allocate a new console and switch to it. * When we resume, we switch back to the original console. This switch * can be slow, so on systems where the framebuffer can handle restoration * of video registers anyways, there's little point in doing the console * switch. This function allows you to disable it by passing it '0'. */ void pm_set_vt_switch(int do_switch) { console_lock(); disable_vt_switch = !do_switch; console_unlock(); } EXPORT_SYMBOL(pm_set_vt_switch); |
| 2 2 387 2 564 2 173 56 1 1 2 239 129 58 4 2 565 1 143 129 99 45 547 124 1 2 416 416 432 348 34 1 54 54 364 356 8 4 781 250 833 1 90 713 132 206 209 16 715 385 2 1145 2 10 12 14 449 19 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080 1081 1082 1083 1084 1085 1086 1087 1088 1089 1090 1091 1092 1093 1094 1095 1096 1097 1098 1099 1100 1101 1102 1103 1104 1105 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125 1126 1127 1128 1129 1130 1131 1132 1133 1134 1135 1136 1137 1138 1139 1140 1141 1142 1143 1144 1145 1146 1147 1148 1149 1150 1151 1152 1153 1154 1155 1156 1157 1158 1159 1160 1161 1162 1163 1164 1165 1166 1167 1168 1169 1170 1171 1172 1173 1174 1175 1176 1177 1178 1179 1180 1181 1182 1183 1184 1185 1186 1187 1188 1189 1190 1191 1192 1193 1194 1195 1196 1197 1198 1199 1200 1201 1202 1203 1204 1205 1206 1207 1208 1209 1210 1211 1212 1213 1214 1215 1216 1217 1218 1219 1220 1221 1222 1223 1224 1225 1226 1227 1228 1229 1230 1231 1232 1233 1234 1235 1236 1237 1238 1239 1240 1241 1242 1243 1244 1245 1246 1247 1248 1249 1250 1251 1252 1253 1254 1255 1256 1257 1258 1259 1260 1261 1262 1263 1264 1265 1266 1267 1268 1269 1270 1271 1272 1273 1274 1275 1276 1277 1278 1279 1280 1281 1282 1283 1284 1285 1286 1287 1288 1289 1290 1291 1292 1293 1294 1295 1296 1297 1298 1299 1300 1301 1302 1303 1304 1305 1306 1307 1308 1309 1310 1311 1312 1313 1314 1315 1316 1317 1318 1319 1320 1321 1322 1323 1324 1325 1326 1327 1328 1329 1330 1331 1332 1333 1334 1335 1336 1337 1338 1339 1340 1341 1342 1343 1344 1345 1346 1347 1348 1349 1350 1351 1352 1353 1354 1355 1356 1357 1358 1359 1360 1361 1362 1363 1364 1365 1366 1367 1368 1369 1370 1371 1372 1373 1374 1375 1376 1377 1378 1379 1380 1381 1382 1383 1384 1385 1386 1387 1388 1389 1390 1391 1392 1393 1394 1395 1396 1397 1398 1399 1400 1401 1402 1403 1404 1405 1406 1407 1408 1409 1410 1411 1412 1413 1414 1415 1416 1417 1418 1419 1420 1421 1422 1423 1424 1425 1426 1427 1428 1429 1430 1431 1432 1433 1434 1435 1436 1437 1438 1439 1440 1441 1442 1443 1444 1445 1446 1447 1448 1449 1450 1451 1452 1453 1454 1455 1456 1457 1458 1459 1460 1461 1462 1463 1464 1465 1466 1467 1468 1469 1470 1471 1472 1473 1474 1475 1476 1477 1478 1479 1480 1481 1482 1483 1484 1485 1486 1487 1488 1489 1490 1491 1492 1493 1494 1495 1496 1497 1498 1499 1500 1501 1502 1503 1504 1505 1506 1507 1508 1509 1510 1511 1512 1513 1514 1515 1516 1517 1518 1519 1520 1521 1522 1523 1524 1525 1526 1527 1528 1529 1530 1531 1532 1533 1534 1535 1536 1537 1538 1539 1540 1541 1542 1543 1544 1545 1546 1547 1548 1549 1550 1551 1552 1553 1554 1555 1556 1557 1558 1559 1560 1561 1562 1563 1564 1565 1566 1567 1568 1569 1570 1571 1572 1573 1574 1575 1576 1577 1578 1579 1580 1581 1582 1583 1584 1585 1586 1587 1588 1589 1590 1591 1592 1593 1594 1595 1596 1597 1598 1599 1600 1601 1602 1603 1604 1605 1606 1607 1608 1609 1610 1611 1612 1613 1614 1615 1616 1617 1618 1619 1620 1621 1622 1623 1624 1625 1626 1627 1628 1629 1630 1631 1632 1633 1634 1635 1636 1637 1638 1639 1640 1641 1642 1643 1644 1645 1646 1647 1648 1649 1650 1651 1652 1653 1654 1655 1656 1657 1658 1659 1660 1661 1662 1663 1664 1665 1666 1667 1668 1669 1670 1671 1672 1673 1674 1675 1676 1677 1678 1679 1680 1681 1682 1683 1684 1685 1686 1687 1688 1689 1690 1691 1692 1693 1694 1695 1696 1697 1698 1699 1700 1701 1702 1703 1704 1705 1706 1707 1708 1709 1710 1711 1712 1713 1714 1715 1716 1717 1718 1719 1720 1721 1722 1723 1724 1725 1726 1727 1728 1729 1730 1731 1732 1733 1734 1735 1736 1737 1738 1739 1740 1741 1742 1743 1744 1745 1746 1747 1748 1749 1750 1751 1752 1753 1754 1755 1756 1757 1758 1759 1760 1761 1762 1763 1764 1765 1766 1767 1768 1769 1770 1771 1772 1773 1774 1775 1776 1777 1778 1779 1780 1781 1782 1783 1784 1785 1786 1787 1788 1789 1790 1791 1792 1793 1794 1795 1796 1797 1798 1799 1800 1801 1802 1803 1804 1805 1806 1807 1808 1809 1810 1811 1812 1813 1814 1815 1816 1817 1818 1819 1820 1821 1822 1823 1824 1825 1826 1827 1828 1829 1830 1831 1832 1833 1834 1835 1836 1837 1838 1839 1840 1841 1842 1843 1844 1845 1846 1847 1848 1849 1850 1851 1852 1853 1854 1855 1856 1857 1858 1859 1860 1861 1862 1863 1864 1865 1866 1867 1868 1869 1870 1871 1872 1873 1874 1875 1876 1877 1878 1879 1880 1881 1882 1883 1884 1885 1886 1887 1888 1889 1890 1891 1892 1893 1894 1895 1896 1897 1898 1899 1900 1901 1902 1903 1904 1905 1906 1907 1908 1909 1910 1911 1912 1913 1914 1915 1916 1917 1918 1919 1920 1921 1922 1923 1924 1925 1926 1927 1928 1929 1930 1931 1932 1933 1934 1935 1936 1937 1938 1939 1940 1941 1942 1943 1944 1945 1946 1947 1948 1949 1950 1951 1952 1953 1954 1955 1956 1957 1958 1959 1960 1961 1962 1963 1964 1965 1966 1967 1968 1969 1970 1971 1972 1973 1974 1975 1976 1977 1978 1979 1980 1981 1982 1983 1984 1985 1986 1987 1988 1989 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020 2021 2022 2023 2024 2025 2026 2027 2028 2029 2030 2031 2032 2033 2034 2035 2036 2037 2038 2039 2040 2041 2042 2043 2044 2045 2046 2047 2048 2049 2050 2051 2052 2053 2054 2055 2056 2057 2058 2059 2060 2061 2062 2063 2064 2065 2066 2067 2068 2069 2070 2071 2072 2073 2074 2075 2076 2077 2078 2079 2080 2081 2082 2083 2084 2085 2086 2087 2088 2089 2090 2091 2092 2093 2094 2095 2096 2097 2098 2099 2100 2101 2102 2103 2104 2105 2106 2107 2108 2109 2110 2111 2112 2113 2114 2115 2116 2117 2118 2119 2120 2121 2122 2123 2124 2125 2126 2127 2128 2129 2130 2131 2132 2133 2134 2135 2136 2137 2138 2139 2140 2141 2142 2143 2144 2145 2146 2147 2148 2149 2150 2151 2152 2153 2154 2155 2156 2157 2158 2159 2160 2161 2162 2163 2164 2165 2166 2167 2168 2169 2170 2171 2172 2173 2174 2175 2176 2177 2178 2179 2180 2181 2182 2183 2184 2185 2186 2187 2188 2189 2190 2191 2192 2193 2194 2195 2196 2197 2198 2199 2200 2201 2202 2203 2204 2205 2206 2207 2208 2209 2210 2211 2212 2213 2214 2215 2216 2217 2218 2219 2220 2221 2222 2223 2224 2225 2226 2227 2228 2229 2230 2231 2232 2233 2234 2235 2236 2237 2238 2239 2240 2241 2242 2243 2244 2245 2246 2247 2248 2249 2250 2251 2252 2253 2254 2255 2256 2257 2258 2259 2260 2261 2262 2263 2264 2265 2266 2267 2268 2269 2270 2271 2272 2273 2274 2275 2276 2277 2278 2279 2280 2281 2282 2283 2284 2285 2286 2287 2288 2289 2290 2291 2292 2293 2294 2295 2296 2297 2298 2299 2300 2301 2302 2303 2304 2305 2306 2307 2308 2309 2310 2311 2312 2313 2314 2315 2316 2317 2318 2319 2320 2321 2322 2323 2324 2325 2326 2327 2328 2329 2330 2331 2332 2333 2334 2335 2336 2337 2338 2339 2340 2341 2342 2343 2344 2345 2346 2347 2348 2349 2350 2351 2352 2353 2354 2355 2356 2357 2358 2359 2360 2361 2362 2363 2364 2365 2366 2367 2368 2369 2370 2371 2372 2373 2374 2375 2376 2377 2378 2379 2380 2381 2382 2383 2384 2385 2386 2387 2388 2389 2390 2391 2392 2393 2394 2395 2396 2397 2398 2399 2400 2401 2402 2403 2404 2405 2406 2407 2408 2409 2410 2411 2412 2413 2414 2415 2416 2417 2418 2419 2420 2421 2422 2423 2424 2425 2426 2427 2428 2429 2430 2431 2432 2433 2434 2435 2436 2437 2438 2439 2440 2441 2442 2443 2444 2445 2446 2447 2448 2449 2450 2451 2452 2453 2454 2455 2456 2457 2458 2459 2460 2461 2462 2463 2464 2465 2466 2467 2468 2469 2470 2471 2472 2473 2474 2475 2476 2477 2478 2479 2480 2481 2482 2483 2484 2485 2486 2487 2488 2489 2490 2491 2492 2493 2494 2495 2496 2497 2498 2499 2500 2501 2502 2503 2504 2505 2506 2507 2508 2509 2510 2511 2512 2513 2514 2515 2516 2517 2518 2519 2520 2521 2522 2523 2524 2525 2526 2527 2528 2529 2530 2531 2532 2533 2534 2535 2536 2537 2538 2539 2540 2541 2542 2543 2544 2545 2546 2547 2548 2549 2550 2551 2552 2553 2554 2555 2556 2557 2558 2559 2560 2561 2562 2563 2564 2565 2566 2567 2568 2569 2570 2571 2572 2573 2574 2575 2576 2577 2578 2579 2580 2581 2582 2583 2584 2585 2586 2587 2588 2589 2590 2591 2592 2593 2594 2595 2596 2597 2598 2599 2600 2601 2602 2603 2604 2605 2606 2607 2608 2609 2610 2611 2612 2613 2614 2615 2616 2617 2618 2619 2620 2621 2622 2623 2624 2625 2626 2627 2628 2629 2630 2631 2632 2633 2634 2635 2636 2637 2638 2639 2640 2641 2642 2643 2644 2645 2646 2647 2648 2649 2650 2651 2652 2653 2654 2655 2656 2657 2658 2659 2660 2661 2662 2663 2664 2665 2666 2667 2668 2669 2670 2671 2672 2673 2674 2675 2676 2677 2678 2679 2680 2681 2682 2683 2684 2685 2686 2687 2688 2689 2690 2691 2692 2693 2694 2695 2696 2697 2698 2699 2700 2701 2702 2703 2704 2705 2706 2707 2708 2709 2710 2711 2712 2713 2714 2715 2716 2717 2718 2719 2720 2721 2722 2723 2724 2725 2726 2727 2728 2729 2730 2731 2732 2733 2734 2735 2736 2737 2738 2739 2740 2741 2742 2743 2744 2745 2746 2747 2748 2749 2750 2751 2752 2753 2754 2755 2756 2757 2758 2759 2760 2761 2762 2763 2764 2765 2766 2767 2768 2769 2770 2771 2772 2773 2774 2775 2776 2777 2778 2779 2780 2781 2782 2783 2784 2785 2786 2787 2788 2789 2790 2791 2792 2793 2794 2795 2796 2797 2798 2799 2800 2801 2802 2803 2804 2805 2806 2807 2808 2809 2810 2811 2812 2813 2814 2815 2816 2817 2818 2819 2820 2821 2822 2823 2824 2825 2826 2827 2828 2829 2830 2831 2832 2833 2834 2835 2836 2837 2838 2839 2840 2841 2842 2843 2844 2845 2846 2847 2848 2849 2850 2851 2852 2853 2854 2855 2856 2857 2858 2859 2860 2861 2862 2863 2864 2865 2866 2867 2868 2869 2870 2871 2872 2873 2874 2875 2876 2877 2878 2879 2880 2881 2882 2883 2884 2885 2886 2887 2888 2889 2890 2891 2892 2893 2894 2895 2896 2897 2898 2899 2900 2901 2902 2903 2904 2905 2906 2907 2908 2909 2910 2911 2912 2913 2914 2915 2916 2917 2918 2919 2920 2921 2922 2923 2924 2925 2926 2927 2928 2929 2930 2931 2932 2933 2934 2935 2936 2937 2938 2939 2940 2941 2942 2943 2944 2945 2946 2947 2948 2949 2950 2951 2952 2953 2954 2955 2956 2957 2958 2959 2960 2961 2962 2963 2964 2965 2966 2967 2968 2969 2970 2971 2972 2973 2974 2975 2976 2977 2978 2979 2980 2981 2982 2983 2984 2985 2986 2987 | /* SPDX-License-Identifier: GPL-2.0 */ #undef TRACE_SYSTEM #define TRACE_SYSTEM ext4 #if !defined(_TRACE_EXT4_H) || defined(TRACE_HEADER_MULTI_READ) #define _TRACE_EXT4_H #include <linux/writeback.h> #include <linux/tracepoint.h> struct ext4_allocation_context; struct ext4_allocation_request; struct ext4_extent; struct ext4_prealloc_space; struct ext4_inode_info; struct mpage_da_data; struct ext4_map_blocks; struct extent_status; struct ext4_fsmap; struct partial_cluster; #define EXT4_I(inode) (container_of(inode, struct ext4_inode_info, vfs_inode)) #define show_mballoc_flags(flags) __print_flags(flags, "|", \ { EXT4_MB_HINT_MERGE, "HINT_MERGE" }, \ { EXT4_MB_HINT_RESERVED, "HINT_RESV" }, \ { EXT4_MB_HINT_METADATA, "HINT_MDATA" }, \ { EXT4_MB_HINT_FIRST, "HINT_FIRST" }, \ { EXT4_MB_HINT_BEST, "HINT_BEST" }, \ { EXT4_MB_HINT_DATA, "HINT_DATA" }, \ { EXT4_MB_HINT_NOPREALLOC, "HINT_NOPREALLOC" }, \ { EXT4_MB_HINT_GROUP_ALLOC, "HINT_GRP_ALLOC" }, \ { EXT4_MB_HINT_GOAL_ONLY, "HINT_GOAL_ONLY" }, \ { EXT4_MB_HINT_TRY_GOAL, "HINT_TRY_GOAL" }, \ { EXT4_MB_DELALLOC_RESERVED, "DELALLOC_RESV" }, \ { EXT4_MB_STREAM_ALLOC, "STREAM_ALLOC" }, \ { EXT4_MB_USE_ROOT_BLOCKS, "USE_ROOT_BLKS" }, \ { EXT4_MB_USE_RESERVED, "USE_RESV" }, \ { EXT4_MB_STRICT_CHECK, "STRICT_CHECK" }) #define show_map_flags(flags) __print_flags(flags, "|", \ { EXT4_GET_BLOCKS_CREATE, "CREATE" }, \ { EXT4_GET_BLOCKS_UNWRIT_EXT, "UNWRIT" }, \ { EXT4_GET_BLOCKS_DELALLOC_RESERVE, "DELALLOC" }, \ { EXT4_GET_BLOCKS_PRE_IO, "PRE_IO" }, \ { EXT4_GET_BLOCKS_CONVERT, "CONVERT" }, \ { EXT4_GET_BLOCKS_METADATA_NOFAIL, "METADATA_NOFAIL" }, \ { EXT4_GET_BLOCKS_NO_NORMALIZE, "NO_NORMALIZE" }, \ { EXT4_GET_BLOCKS_CONVERT_UNWRITTEN, "CONVERT_UNWRITTEN" }, \ { EXT4_GET_BLOCKS_ZERO, "ZERO" }, \ { EXT4_GET_BLOCKS_IO_SUBMIT, "IO_SUBMIT" }, \ { EXT4_EX_NOCACHE, "EX_NOCACHE" }) /* * __print_flags() requires that all enum values be wrapped in the * TRACE_DEFINE_ENUM macro so that the enum value can be encoded in the ftrace * ring buffer. */ TRACE_DEFINE_ENUM(BH_New); TRACE_DEFINE_ENUM(BH_Mapped); TRACE_DEFINE_ENUM(BH_Unwritten); TRACE_DEFINE_ENUM(BH_Boundary); #define show_mflags(flags) __print_flags(flags, "", \ { EXT4_MAP_NEW, "N" }, \ { EXT4_MAP_MAPPED, "M" }, \ { EXT4_MAP_UNWRITTEN, "U" }, \ { EXT4_MAP_BOUNDARY, "B" }) #define show_free_flags(flags) __print_flags(flags, "|", \ { EXT4_FREE_BLOCKS_METADATA, "METADATA" }, \ { EXT4_FREE_BLOCKS_FORGET, "FORGET" }, \ { EXT4_FREE_BLOCKS_VALIDATED, "VALIDATED" }, \ { EXT4_FREE_BLOCKS_NO_QUOT_UPDATE, "NO_QUOTA" }, \ { EXT4_FREE_BLOCKS_NOFREE_FIRST_CLUSTER,"1ST_CLUSTER" },\ { EXT4_FREE_BLOCKS_NOFREE_LAST_CLUSTER, "LAST_CLUSTER" }) TRACE_DEFINE_ENUM(ES_WRITTEN_B); TRACE_DEFINE_ENUM(ES_UNWRITTEN_B); TRACE_DEFINE_ENUM(ES_DELAYED_B); TRACE_DEFINE_ENUM(ES_HOLE_B); TRACE_DEFINE_ENUM(ES_REFERENCED_B); #define show_extent_status(status) __print_flags(status, "", \ { EXTENT_STATUS_WRITTEN, "W" }, \ { EXTENT_STATUS_UNWRITTEN, "U" }, \ { EXTENT_STATUS_DELAYED, "D" }, \ { EXTENT_STATUS_HOLE, "H" }, \ { EXTENT_STATUS_REFERENCED, "R" }) #define show_falloc_mode(mode) __print_flags(mode, "|", \ { FALLOC_FL_KEEP_SIZE, "KEEP_SIZE"}, \ { FALLOC_FL_PUNCH_HOLE, "PUNCH_HOLE"}, \ { FALLOC_FL_NO_HIDE_STALE, "NO_HIDE_STALE"}, \ { FALLOC_FL_COLLAPSE_RANGE, "COLLAPSE_RANGE"}, \ { FALLOC_FL_ZERO_RANGE, "ZERO_RANGE"}) TRACE_DEFINE_ENUM(EXT4_FC_REASON_XATTR); TRACE_DEFINE_ENUM(EXT4_FC_REASON_CROSS_RENAME); TRACE_DEFINE_ENUM(EXT4_FC_REASON_JOURNAL_FLAG_CHANGE); TRACE_DEFINE_ENUM(EXT4_FC_REASON_NOMEM); TRACE_DEFINE_ENUM(EXT4_FC_REASON_SWAP_BOOT); TRACE_DEFINE_ENUM(EXT4_FC_REASON_RESIZE); TRACE_DEFINE_ENUM(EXT4_FC_REASON_RENAME_DIR); TRACE_DEFINE_ENUM(EXT4_FC_REASON_FALLOC_RANGE); TRACE_DEFINE_ENUM(EXT4_FC_REASON_INODE_JOURNAL_DATA); TRACE_DEFINE_ENUM(EXT4_FC_REASON_ENCRYPTED_FILENAME); TRACE_DEFINE_ENUM(EXT4_FC_REASON_MAX); #define show_fc_reason(reason) \ __print_symbolic(reason, \ { EXT4_FC_REASON_XATTR, "XATTR"}, \ { EXT4_FC_REASON_CROSS_RENAME, "CROSS_RENAME"}, \ { EXT4_FC_REASON_JOURNAL_FLAG_CHANGE, "JOURNAL_FLAG_CHANGE"}, \ { EXT4_FC_REASON_NOMEM, "NO_MEM"}, \ { EXT4_FC_REASON_SWAP_BOOT, "SWAP_BOOT"}, \ { EXT4_FC_REASON_RESIZE, "RESIZE"}, \ { EXT4_FC_REASON_RENAME_DIR, "RENAME_DIR"}, \ { EXT4_FC_REASON_FALLOC_RANGE, "FALLOC_RANGE"}, \ { EXT4_FC_REASON_INODE_JOURNAL_DATA, "INODE_JOURNAL_DATA"}, \ { EXT4_FC_REASON_ENCRYPTED_FILENAME, "ENCRYPTED_FILENAME"}) TRACE_DEFINE_ENUM(CR_POWER2_ALIGNED); TRACE_DEFINE_ENUM(CR_GOAL_LEN_FAST); TRACE_DEFINE_ENUM(CR_BEST_AVAIL_LEN); TRACE_DEFINE_ENUM(CR_GOAL_LEN_SLOW); TRACE_DEFINE_ENUM(CR_ANY_FREE); #define show_criteria(cr) \ __print_symbolic(cr, \ { CR_POWER2_ALIGNED, "CR_POWER2_ALIGNED" }, \ { CR_GOAL_LEN_FAST, "CR_GOAL_LEN_FAST" }, \ { CR_BEST_AVAIL_LEN, "CR_BEST_AVAIL_LEN" }, \ { CR_GOAL_LEN_SLOW, "CR_GOAL_LEN_SLOW" }, \ { CR_ANY_FREE, "CR_ANY_FREE" }) TRACE_EVENT(ext4_other_inode_update_time, TP_PROTO(struct inode *inode, ino_t orig_ino), TP_ARGS(inode, orig_ino), TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) __field( ino_t, orig_ino ) __field( uid_t, uid ) __field( gid_t, gid ) __field( __u16, mode ) ), TP_fast_assign( __entry->orig_ino = orig_ino; __entry->dev = inode->i_sb->s_dev; __entry->ino = inode->i_ino; __entry->uid = i_uid_read(inode); __entry->gid = i_gid_read(inode); __entry->mode = inode->i_mode; ), TP_printk("dev %d,%d orig_ino %lu ino %lu mode 0%o uid %u gid %u", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long) __entry->orig_ino, (unsigned long) __entry->ino, __entry->mode, __entry->uid, __entry->gid) ); TRACE_EVENT(ext4_free_inode, TP_PROTO(struct inode *inode), TP_ARGS(inode), TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) __field( uid_t, uid ) __field( gid_t, gid ) __field( __u64, blocks ) __field( __u16, mode ) ), TP_fast_assign( __entry->dev = inode->i_sb->s_dev; __entry->ino = inode->i_ino; __entry->uid = i_uid_read(inode); __entry->gid = i_gid_read(inode); __entry->blocks = inode->i_blocks; __entry->mode = inode->i_mode; ), TP_printk("dev %d,%d ino %lu mode 0%o uid %u gid %u blocks %llu", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long) __entry->ino, __entry->mode, __entry->uid, __entry->gid, __entry->blocks) ); TRACE_EVENT(ext4_request_inode, TP_PROTO(struct inode *dir, int mode), TP_ARGS(dir, mode), TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, dir ) __field( __u16, mode ) ), TP_fast_assign( __entry->dev = dir->i_sb->s_dev; __entry->dir = dir->i_ino; __entry->mode = mode; ), TP_printk("dev %d,%d dir %lu mode 0%o", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long) __entry->dir, __entry->mode) ); TRACE_EVENT(ext4_allocate_inode, TP_PROTO(struct inode *inode, struct inode *dir, int mode), TP_ARGS(inode, dir, mode), TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) __field( ino_t, dir ) __field( __u16, mode ) ), TP_fast_assign( __entry->dev = inode->i_sb->s_dev; __entry->ino = inode->i_ino; __entry->dir = dir->i_ino; __entry->mode = mode; ), TP_printk("dev %d,%d ino %lu dir %lu mode 0%o", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long) __entry->ino, (unsigned long) __entry->dir, __entry->mode) ); TRACE_EVENT(ext4_evict_inode, TP_PROTO(struct inode *inode), TP_ARGS(inode), TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) __field( int, nlink ) ), TP_fast_assign( __entry->dev = inode->i_sb->s_dev; __entry->ino = inode->i_ino; __entry->nlink = inode->i_nlink; ), TP_printk("dev %d,%d ino %lu nlink %d", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long) __entry->ino, __entry->nlink) ); TRACE_EVENT(ext4_drop_inode, TP_PROTO(struct inode *inode, int drop), TP_ARGS(inode, drop), TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) __field( int, drop ) ), TP_fast_assign( __entry->dev = inode->i_sb->s_dev; __entry->ino = inode->i_ino; __entry->drop = drop; ), TP_printk("dev %d,%d ino %lu drop %d", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long) __entry->ino, __entry->drop) ); TRACE_EVENT(ext4_nfs_commit_metadata, TP_PROTO(struct inode *inode), TP_ARGS(inode), TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) ), TP_fast_assign( __entry->dev = inode->i_sb->s_dev; __entry->ino = inode->i_ino; ), TP_printk("dev %d,%d ino %lu", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long) __entry->ino) ); TRACE_EVENT(ext4_mark_inode_dirty, TP_PROTO(struct inode *inode, unsigned long IP), TP_ARGS(inode, IP), TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) __field(unsigned long, ip ) ), TP_fast_assign( __entry->dev = inode->i_sb->s_dev; __entry->ino = inode->i_ino; __entry->ip = IP; ), TP_printk("dev %d,%d ino %lu caller %pS", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long) __entry->ino, (void *)__entry->ip) ); TRACE_EVENT(ext4_begin_ordered_truncate, TP_PROTO(struct inode *inode, loff_t new_size), TP_ARGS(inode, new_size), TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) __field( loff_t, new_size ) ), TP_fast_assign( __entry->dev = inode->i_sb->s_dev; __entry->ino = inode->i_ino; __entry->new_size = new_size; ), TP_printk("dev %d,%d ino %lu new_size %lld", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long) __entry->ino, __entry->new_size) ); DECLARE_EVENT_CLASS(ext4__write_begin, TP_PROTO(struct inode *inode, loff_t pos, unsigned int len), TP_ARGS(inode, pos, len), TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) __field( loff_t, pos ) __field( unsigned int, len ) ), TP_fast_assign( __entry->dev = inode->i_sb->s_dev; __entry->ino = inode->i_ino; __entry->pos = pos; __entry->len = len; ), TP_printk("dev %d,%d ino %lu pos %lld len %u", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long) __entry->ino, __entry->pos, __entry->len) ); DEFINE_EVENT(ext4__write_begin, ext4_write_begin, TP_PROTO(struct inode *inode, loff_t pos, unsigned int len), TP_ARGS(inode, pos, len) ); DEFINE_EVENT(ext4__write_begin, ext4_da_write_begin, TP_PROTO(struct inode *inode, loff_t pos, unsigned int len), TP_ARGS(inode, pos, len) ); DECLARE_EVENT_CLASS(ext4__write_end, TP_PROTO(struct inode *inode, loff_t pos, unsigned int len, unsigned int copied), TP_ARGS(inode, pos, len, copied), TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) __field( loff_t, pos ) __field( unsigned int, len ) __field( unsigned int, copied ) ), TP_fast_assign( __entry->dev = inode->i_sb->s_dev; __entry->ino = inode->i_ino; __entry->pos = pos; __entry->len = len; __entry->copied = copied; ), TP_printk("dev %d,%d ino %lu pos %lld len %u copied %u", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long) __entry->ino, __entry->pos, __entry->len, __entry->copied) ); DEFINE_EVENT(ext4__write_end, ext4_write_end, TP_PROTO(struct inode *inode, loff_t pos, unsigned int len, unsigned int copied), TP_ARGS(inode, pos, len, copied) ); DEFINE_EVENT(ext4__write_end, ext4_journalled_write_end, TP_PROTO(struct inode *inode, loff_t pos, unsigned int len, unsigned int copied), TP_ARGS(inode, pos, len, copied) ); DEFINE_EVENT(ext4__write_end, ext4_da_write_end, TP_PROTO(struct inode *inode, loff_t pos, unsigned int len, unsigned int copied), TP_ARGS(inode, pos, len, copied) ); TRACE_EVENT(ext4_writepages, TP_PROTO(struct inode *inode, struct writeback_control *wbc), TP_ARGS(inode, wbc), TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) __field( long, nr_to_write ) __field( long, pages_skipped ) __field( loff_t, range_start ) __field( loff_t, range_end ) __field( pgoff_t, writeback_index ) __field( int, sync_mode ) __field( char, for_kupdate ) __field( char, range_cyclic ) ), TP_fast_assign( __entry->dev = inode->i_sb->s_dev; __entry->ino = inode->i_ino; __entry->nr_to_write = wbc->nr_to_write; __entry->pages_skipped = wbc->pages_skipped; __entry->range_start = wbc->range_start; __entry->range_end = wbc->range_end; __entry->writeback_index = inode->i_mapping->writeback_index; __entry->sync_mode = wbc->sync_mode; __entry->for_kupdate = wbc->for_kupdate; __entry->range_cyclic = wbc->range_cyclic; ), TP_printk("dev %d,%d ino %lu nr_to_write %ld pages_skipped %ld " "range_start %lld range_end %lld sync_mode %d " "for_kupdate %d range_cyclic %d writeback_index %lu", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long) __entry->ino, __entry->nr_to_write, __entry->pages_skipped, __entry->range_start, __entry->range_end, __entry->sync_mode, __entry->for_kupdate, __entry->range_cyclic, (unsigned long) __entry->writeback_index) ); TRACE_EVENT(ext4_da_write_pages, TP_PROTO(struct inode *inode, pgoff_t first_page, struct writeback_control *wbc), TP_ARGS(inode, first_page, wbc), TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) __field( pgoff_t, first_page ) __field( long, nr_to_write ) __field( int, sync_mode ) ), TP_fast_assign( __entry->dev = inode->i_sb->s_dev; __entry->ino = inode->i_ino; __entry->first_page = first_page; __entry->nr_to_write = wbc->nr_to_write; __entry->sync_mode = wbc->sync_mode; ), TP_printk("dev %d,%d ino %lu first_page %lu nr_to_write %ld " "sync_mode %d", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long) __entry->ino, __entry->first_page, __entry->nr_to_write, __entry->sync_mode) ); TRACE_EVENT(ext4_da_write_pages_extent, TP_PROTO(struct inode *inode, struct ext4_map_blocks *map), TP_ARGS(inode, map), TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) __field( __u64, lblk ) __field( __u32, len ) __field( __u32, flags ) ), TP_fast_assign( __entry->dev = inode->i_sb->s_dev; __entry->ino = inode->i_ino; __entry->lblk = map->m_lblk; __entry->len = map->m_len; __entry->flags = map->m_flags; ), TP_printk("dev %d,%d ino %lu lblk %llu len %u flags %s", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long) __entry->ino, __entry->lblk, __entry->len, show_mflags(__entry->flags)) ); TRACE_EVENT(ext4_writepages_result, TP_PROTO(struct inode *inode, struct writeback_control *wbc, int ret, int pages_written), TP_ARGS(inode, wbc, ret, pages_written), TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) __field( int, ret ) __field( int, pages_written ) __field( long, pages_skipped ) __field( pgoff_t, writeback_index ) __field( int, sync_mode ) ), TP_fast_assign( __entry->dev = inode->i_sb->s_dev; __entry->ino = inode->i_ino; __entry->ret = ret; __entry->pages_written = pages_written; __entry->pages_skipped = wbc->pages_skipped; __entry->writeback_index = inode->i_mapping->writeback_index; __entry->sync_mode = wbc->sync_mode; ), TP_printk("dev %d,%d ino %lu ret %d pages_written %d pages_skipped %ld " "sync_mode %d writeback_index %lu", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long) __entry->ino, __entry->ret, __entry->pages_written, __entry->pages_skipped, __entry->sync_mode, (unsigned long) __entry->writeback_index) ); DECLARE_EVENT_CLASS(ext4__folio_op, TP_PROTO(struct inode *inode, struct folio *folio), TP_ARGS(inode, folio), TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) __field( pgoff_t, index ) ), TP_fast_assign( __entry->dev = inode->i_sb->s_dev; __entry->ino = inode->i_ino; __entry->index = folio->index; ), TP_printk("dev %d,%d ino %lu folio_index %lu", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long) __entry->ino, (unsigned long) __entry->index) ); DEFINE_EVENT(ext4__folio_op, ext4_read_folio, TP_PROTO(struct inode *inode, struct folio *folio), TP_ARGS(inode, folio) ); DEFINE_EVENT(ext4__folio_op, ext4_release_folio, TP_PROTO(struct inode *inode, struct folio *folio), TP_ARGS(inode, folio) ); DECLARE_EVENT_CLASS(ext4_invalidate_folio_op, TP_PROTO(struct folio *folio, size_t offset, size_t length), TP_ARGS(folio, offset, length), TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) __field( pgoff_t, index ) __field( size_t, offset ) __field( size_t, length ) ), TP_fast_assign( __entry->dev = folio->mapping->host->i_sb->s_dev; __entry->ino = folio->mapping->host->i_ino; __entry->index = folio->index; __entry->offset = offset; __entry->length = length; ), TP_printk("dev %d,%d ino %lu folio_index %lu offset %zu length %zu", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long) __entry->ino, (unsigned long) __entry->index, __entry->offset, __entry->length) ); DEFINE_EVENT(ext4_invalidate_folio_op, ext4_invalidate_folio, TP_PROTO(struct folio *folio, size_t offset, size_t length), TP_ARGS(folio, offset, length) ); DEFINE_EVENT(ext4_invalidate_folio_op, ext4_journalled_invalidate_folio, TP_PROTO(struct folio *folio, size_t offset, size_t length), TP_ARGS(folio, offset, length) ); TRACE_EVENT(ext4_discard_blocks, TP_PROTO(struct super_block *sb, unsigned long long blk, unsigned long long count), TP_ARGS(sb, blk, count), TP_STRUCT__entry( __field( dev_t, dev ) __field( __u64, blk ) __field( __u64, count ) ), TP_fast_assign( __entry->dev = sb->s_dev; __entry->blk = blk; __entry->count = count; ), TP_printk("dev %d,%d blk %llu count %llu", MAJOR(__entry->dev), MINOR(__entry->dev), __entry->blk, __entry->count) ); DECLARE_EVENT_CLASS(ext4__mb_new_pa, TP_PROTO(struct ext4_allocation_context *ac, struct ext4_prealloc_space *pa), TP_ARGS(ac, pa), TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) __field( __u64, pa_pstart ) __field( __u64, pa_lstart ) __field( __u32, pa_len ) ), TP_fast_assign( __entry->dev = ac->ac_sb->s_dev; __entry->ino = ac->ac_inode->i_ino; __entry->pa_pstart = pa->pa_pstart; __entry->pa_lstart = pa->pa_lstart; __entry->pa_len = pa->pa_len; ), TP_printk("dev %d,%d ino %lu pstart %llu len %u lstart %llu", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long) __entry->ino, __entry->pa_pstart, __entry->pa_len, __entry->pa_lstart) ); DEFINE_EVENT(ext4__mb_new_pa, ext4_mb_new_inode_pa, TP_PROTO(struct ext4_allocation_context *ac, struct ext4_prealloc_space *pa), TP_ARGS(ac, pa) ); DEFINE_EVENT(ext4__mb_new_pa, ext4_mb_new_group_pa, TP_PROTO(struct ext4_allocation_context *ac, struct ext4_prealloc_space *pa), TP_ARGS(ac, pa) ); TRACE_EVENT(ext4_mb_release_inode_pa, TP_PROTO(struct ext4_prealloc_space *pa, unsigned long long block, unsigned int count), TP_ARGS(pa, block, count), TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) __field( __u64, block ) __field( __u32, count ) ), TP_fast_assign( __entry->dev = pa->pa_inode->i_sb->s_dev; __entry->ino = pa->pa_inode->i_ino; __entry->block = block; __entry->count = count; ), TP_printk("dev %d,%d ino %lu block %llu count %u", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long) __entry->ino, __entry->block, __entry->count) ); TRACE_EVENT(ext4_mb_release_group_pa, TP_PROTO(struct super_block *sb, struct ext4_prealloc_space *pa), TP_ARGS(sb, pa), TP_STRUCT__entry( __field( dev_t, dev ) __field( __u64, pa_pstart ) __field( __u32, pa_len ) ), TP_fast_assign( __entry->dev = sb->s_dev; __entry->pa_pstart = pa->pa_pstart; __entry->pa_len = pa->pa_len; ), TP_printk("dev %d,%d pstart %llu len %u", MAJOR(__entry->dev), MINOR(__entry->dev), __entry->pa_pstart, __entry->pa_len) ); TRACE_EVENT(ext4_discard_preallocations, TP_PROTO(struct inode *inode, unsigned int len), TP_ARGS(inode, len), TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) __field( unsigned int, len ) ), TP_fast_assign( __entry->dev = inode->i_sb->s_dev; __entry->ino = inode->i_ino; __entry->len = len; ), TP_printk("dev %d,%d ino %lu len: %u", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long) __entry->ino, __entry->len) ); TRACE_EVENT(ext4_mb_discard_preallocations, TP_PROTO(struct super_block *sb, int needed), TP_ARGS(sb, needed), TP_STRUCT__entry( __field( dev_t, dev ) __field( int, needed ) ), TP_fast_assign( __entry->dev = sb->s_dev; __entry->needed = needed; ), TP_printk("dev %d,%d needed %d", MAJOR(__entry->dev), MINOR(__entry->dev), __entry->needed) ); TRACE_EVENT(ext4_request_blocks, TP_PROTO(struct ext4_allocation_request *ar), TP_ARGS(ar), TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) __field( unsigned int, len ) __field( __u32, logical ) __field( __u32, lleft ) __field( __u32, lright ) __field( __u64, goal ) __field( __u64, pleft ) __field( __u64, pright ) __field( unsigned int, flags ) ), TP_fast_assign( __entry->dev = ar->inode->i_sb->s_dev; __entry->ino = ar->inode->i_ino; __entry->len = ar->len; __entry->logical = ar->logical; __entry->goal = ar->goal; __entry->lleft = ar->lleft; __entry->lright = ar->lright; __entry->pleft = ar->pleft; __entry->pright = ar->pright; __entry->flags = ar->flags; ), TP_printk("dev %d,%d ino %lu flags %s len %u lblk %u goal %llu " "lleft %u lright %u pleft %llu pright %llu ", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long) __entry->ino, show_mballoc_flags(__entry->flags), __entry->len, __entry->logical, __entry->goal, __entry->lleft, __entry->lright, __entry->pleft, __entry->pright) ); TRACE_EVENT(ext4_allocate_blocks, TP_PROTO(struct ext4_allocation_request *ar, unsigned long long block), TP_ARGS(ar, block), TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) __field( __u64, block ) __field( unsigned int, len ) __field( __u32, logical ) __field( __u32, lleft ) __field( __u32, lright ) __field( __u64, goal ) __field( __u64, pleft ) __field( __u64, pright ) __field( unsigned int, flags ) ), TP_fast_assign( __entry->dev = ar->inode->i_sb->s_dev; __entry->ino = ar->inode->i_ino; __entry->block = block; __entry->len = ar->len; __entry->logical = ar->logical; __entry->goal = ar->goal; __entry->lleft = ar->lleft; __entry->lright = ar->lright; __entry->pleft = ar->pleft; __entry->pright = ar->pright; __entry->flags = ar->flags; ), TP_printk("dev %d,%d ino %lu flags %s len %u block %llu lblk %u " "goal %llu lleft %u lright %u pleft %llu pright %llu", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long) __entry->ino, show_mballoc_flags(__entry->flags), __entry->len, __entry->block, __entry->logical, __entry->goal, __entry->lleft, __entry->lright, __entry->pleft, __entry->pright) ); TRACE_EVENT(ext4_free_blocks, TP_PROTO(struct inode *inode, __u64 block, unsigned long count, int flags), TP_ARGS(inode, block, count, flags), TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) __field( __u64, block ) __field( unsigned long, count ) __field( int, flags ) __field( __u16, mode ) ), TP_fast_assign( __entry->dev = inode->i_sb->s_dev; __entry->ino = inode->i_ino; __entry->block = block; __entry->count = count; __entry->flags = flags; __entry->mode = inode->i_mode; ), TP_printk("dev %d,%d ino %lu mode 0%o block %llu count %lu flags %s", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long) __entry->ino, __entry->mode, __entry->block, __entry->count, show_free_flags(__entry->flags)) ); TRACE_EVENT(ext4_sync_file_enter, TP_PROTO(struct file *file, int datasync), TP_ARGS(file, datasync), TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) __field( ino_t, parent ) __field( int, datasync ) ), TP_fast_assign( struct dentry *dentry = file->f_path.dentry; __entry->dev = dentry->d_sb->s_dev; __entry->ino = d_inode(dentry)->i_ino; __entry->datasync = datasync; __entry->parent = d_inode(dentry->d_parent)->i_ino; ), TP_printk("dev %d,%d ino %lu parent %lu datasync %d ", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long) __entry->ino, (unsigned long) __entry->parent, __entry->datasync) ); TRACE_EVENT(ext4_sync_file_exit, TP_PROTO(struct inode *inode, int ret), TP_ARGS(inode, ret), TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) __field( int, ret ) ), TP_fast_assign( __entry->dev = inode->i_sb->s_dev; __entry->ino = inode->i_ino; __entry->ret = ret; ), TP_printk("dev %d,%d ino %lu ret %d", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long) __entry->ino, __entry->ret) ); TRACE_EVENT(ext4_sync_fs, TP_PROTO(struct super_block *sb, int wait), TP_ARGS(sb, wait), TP_STRUCT__entry( __field( dev_t, dev ) __field( int, wait ) ), TP_fast_assign( __entry->dev = sb->s_dev; __entry->wait = wait; ), TP_printk("dev %d,%d wait %d", MAJOR(__entry->dev), MINOR(__entry->dev), __entry->wait) ); TRACE_EVENT(ext4_alloc_da_blocks, TP_PROTO(struct inode *inode), TP_ARGS(inode), TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) __field( unsigned int, data_blocks ) ), TP_fast_assign( __entry->dev = inode->i_sb->s_dev; __entry->ino = inode->i_ino; __entry->data_blocks = EXT4_I(inode)->i_reserved_data_blocks; ), TP_printk("dev %d,%d ino %lu reserved_data_blocks %u", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long) __entry->ino, __entry->data_blocks) ); TRACE_EVENT(ext4_mballoc_alloc, TP_PROTO(struct ext4_allocation_context *ac), TP_ARGS(ac), TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) __field( __u32, orig_logical ) __field( int, orig_start ) __field( __u32, orig_group ) __field( int, orig_len ) __field( __u32, goal_logical ) __field( int, goal_start ) __field( __u32, goal_group ) __field( int, goal_len ) __field( __u32, result_logical ) __field( int, result_start ) __field( __u32, result_group ) __field( int, result_len ) __field( __u16, found ) __field( __u16, groups ) __field( __u16, buddy ) __field( __u16, flags ) __field( __u16, tail ) __field( __u8, cr ) ), TP_fast_assign( __entry->dev = ac->ac_inode->i_sb->s_dev; __entry->ino = ac->ac_inode->i_ino; __entry->orig_logical = ac->ac_o_ex.fe_logical; __entry->orig_start = ac->ac_o_ex.fe_start; __entry->orig_group = ac->ac_o_ex.fe_group; __entry->orig_len = ac->ac_o_ex.fe_len; __entry->goal_logical = ac->ac_g_ex.fe_logical; __entry->goal_start = ac->ac_g_ex.fe_start; __entry->goal_group = ac->ac_g_ex.fe_group; __entry->goal_len = ac->ac_g_ex.fe_len; __entry->result_logical = ac->ac_f_ex.fe_logical; __entry->result_start = ac->ac_f_ex.fe_start; __entry->result_group = ac->ac_f_ex.fe_group; __entry->result_len = ac->ac_f_ex.fe_len; __entry->found = ac->ac_found; __entry->flags = ac->ac_flags; __entry->groups = ac->ac_groups_scanned; __entry->buddy = ac->ac_buddy; __entry->tail = ac->ac_tail; __entry->cr = ac->ac_criteria; ), TP_printk("dev %d,%d inode %lu orig %u/%d/%u@%u goal %u/%d/%u@%u " "result %u/%d/%u@%u blks %u grps %u cr %s flags %s " "tail %u broken %u", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long) __entry->ino, __entry->orig_group, __entry->orig_start, __entry->orig_len, __entry->orig_logical, __entry->goal_group, __entry->goal_start, __entry->goal_len, __entry->goal_logical, __entry->result_group, __entry->result_start, __entry->result_len, __entry->result_logical, __entry->found, __entry->groups, show_criteria(__entry->cr), show_mballoc_flags(__entry->flags), __entry->tail, __entry->buddy ? 1 << __entry->buddy : 0) ); TRACE_EVENT(ext4_mballoc_prealloc, TP_PROTO(struct ext4_allocation_context *ac), TP_ARGS(ac), TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) __field( __u32, orig_logical ) __field( int, orig_start ) __field( __u32, orig_group ) __field( int, orig_len ) __field( __u32, result_logical ) __field( int, result_start ) __field( __u32, result_group ) __field( int, result_len ) ), TP_fast_assign( __entry->dev = ac->ac_inode->i_sb->s_dev; __entry->ino = ac->ac_inode->i_ino; __entry->orig_logical = ac->ac_o_ex.fe_logical; __entry->orig_start = ac->ac_o_ex.fe_start; __entry->orig_group = ac->ac_o_ex.fe_group; __entry->orig_len = ac->ac_o_ex.fe_len; __entry->result_logical = ac->ac_b_ex.fe_logical; __entry->result_start = ac->ac_b_ex.fe_start; __entry->result_group = ac->ac_b_ex.fe_group; __entry->result_len = ac->ac_b_ex.fe_len; ), TP_printk("dev %d,%d inode %lu orig %u/%d/%u@%u result %u/%d/%u@%u", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long) __entry->ino, __entry->orig_group, __entry->orig_start, __entry->orig_len, __entry->orig_logical, __entry->result_group, __entry->result_start, __entry->result_len, __entry->result_logical) ); DECLARE_EVENT_CLASS(ext4__mballoc, TP_PROTO(struct super_block *sb, struct inode *inode, ext4_group_t group, ext4_grpblk_t start, ext4_grpblk_t len), TP_ARGS(sb, inode, group, start, len), TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) __field( int, result_start ) __field( __u32, result_group ) __field( int, result_len ) ), TP_fast_assign( __entry->dev = sb->s_dev; __entry->ino = inode ? inode->i_ino : 0; __entry->result_start = start; __entry->result_group = group; __entry->result_len = len; ), TP_printk("dev %d,%d inode %lu extent %u/%d/%d ", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long) __entry->ino, __entry->result_group, __entry->result_start, __entry->result_len) ); DEFINE_EVENT(ext4__mballoc, ext4_mballoc_discard, TP_PROTO(struct super_block *sb, struct inode *inode, ext4_group_t group, ext4_grpblk_t start, ext4_grpblk_t len), TP_ARGS(sb, inode, group, start, len) ); DEFINE_EVENT(ext4__mballoc, ext4_mballoc_free, TP_PROTO(struct super_block *sb, struct inode *inode, ext4_group_t group, ext4_grpblk_t start, ext4_grpblk_t len), TP_ARGS(sb, inode, group, start, len) ); TRACE_EVENT(ext4_forget, TP_PROTO(struct inode *inode, int is_metadata, __u64 block), TP_ARGS(inode, is_metadata, block), TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) __field( __u64, block ) __field( int, is_metadata ) __field( __u16, mode ) ), TP_fast_assign( __entry->dev = inode->i_sb->s_dev; __entry->ino = inode->i_ino; __entry->block = block; __entry->is_metadata = is_metadata; __entry->mode = inode->i_mode; ), TP_printk("dev %d,%d ino %lu mode 0%o is_metadata %d block %llu", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long) __entry->ino, __entry->mode, __entry->is_metadata, __entry->block) ); TRACE_EVENT(ext4_da_update_reserve_space, TP_PROTO(struct inode *inode, int used_blocks, int quota_claim), TP_ARGS(inode, used_blocks, quota_claim), TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) __field( __u64, i_blocks ) __field( int, used_blocks ) __field( int, reserved_data_blocks ) __field( int, quota_claim ) __field( __u16, mode ) ), TP_fast_assign( __entry->dev = inode->i_sb->s_dev; __entry->ino = inode->i_ino; __entry->i_blocks = inode->i_blocks; __entry->used_blocks = used_blocks; __entry->reserved_data_blocks = EXT4_I(inode)->i_reserved_data_blocks; __entry->quota_claim = quota_claim; __entry->mode = inode->i_mode; ), TP_printk("dev %d,%d ino %lu mode 0%o i_blocks %llu used_blocks %d " "reserved_data_blocks %d quota_claim %d", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long) __entry->ino, __entry->mode, __entry->i_blocks, __entry->used_blocks, __entry->reserved_data_blocks, __entry->quota_claim) ); TRACE_EVENT(ext4_da_reserve_space, TP_PROTO(struct inode *inode, int nr_resv), TP_ARGS(inode, nr_resv), TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) __field( __u64, i_blocks ) __field( int, reserve_blocks ) __field( int, reserved_data_blocks ) __field( __u16, mode ) ), TP_fast_assign( __entry->dev = inode->i_sb->s_dev; __entry->ino = inode->i_ino; __entry->i_blocks = inode->i_blocks; __entry->reserve_blocks = nr_resv; __entry->reserved_data_blocks = EXT4_I(inode)->i_reserved_data_blocks; __entry->mode = inode->i_mode; ), TP_printk("dev %d,%d ino %lu mode 0%o i_blocks %llu reserve_blocks %d" "reserved_data_blocks %d", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long) __entry->ino, __entry->mode, __entry->i_blocks, __entry->reserve_blocks, __entry->reserved_data_blocks) ); TRACE_EVENT(ext4_da_release_space, TP_PROTO(struct inode *inode, int freed_blocks), TP_ARGS(inode, freed_blocks), TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) __field( __u64, i_blocks ) __field( int, freed_blocks ) __field( int, reserved_data_blocks ) __field( __u16, mode ) ), TP_fast_assign( __entry->dev = inode->i_sb->s_dev; __entry->ino = inode->i_ino; __entry->i_blocks = inode->i_blocks; __entry->freed_blocks = freed_blocks; __entry->reserved_data_blocks = EXT4_I(inode)->i_reserved_data_blocks; __entry->mode = inode->i_mode; ), TP_printk("dev %d,%d ino %lu mode 0%o i_blocks %llu freed_blocks %d " "reserved_data_blocks %d", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long) __entry->ino, __entry->mode, __entry->i_blocks, __entry->freed_blocks, __entry->reserved_data_blocks) ); DECLARE_EVENT_CLASS(ext4__bitmap_load, TP_PROTO(struct super_block *sb, unsigned long group), TP_ARGS(sb, group), TP_STRUCT__entry( __field( dev_t, dev ) __field( __u32, group ) ), TP_fast_assign( __entry->dev = sb->s_dev; __entry->group = group; ), TP_printk("dev %d,%d group %u", MAJOR(__entry->dev), MINOR(__entry->dev), __entry->group) ); DEFINE_EVENT(ext4__bitmap_load, ext4_mb_bitmap_load, TP_PROTO(struct super_block *sb, unsigned long group), TP_ARGS(sb, group) ); DEFINE_EVENT(ext4__bitmap_load, ext4_mb_buddy_bitmap_load, TP_PROTO(struct super_block *sb, unsigned long group), TP_ARGS(sb, group) ); DEFINE_EVENT(ext4__bitmap_load, ext4_load_inode_bitmap, TP_PROTO(struct super_block *sb, unsigned long group), TP_ARGS(sb, group) ); TRACE_EVENT(ext4_read_block_bitmap_load, TP_PROTO(struct super_block *sb, unsigned long group, bool prefetch), TP_ARGS(sb, group, prefetch), TP_STRUCT__entry( __field( dev_t, dev ) __field( __u32, group ) __field( bool, prefetch ) ), TP_fast_assign( __entry->dev = sb->s_dev; __entry->group = group; __entry->prefetch = prefetch; ), TP_printk("dev %d,%d group %u prefetch %d", MAJOR(__entry->dev), MINOR(__entry->dev), __entry->group, __entry->prefetch) ); DECLARE_EVENT_CLASS(ext4__fallocate_mode, TP_PROTO(struct inode *inode, loff_t offset, loff_t len, int mode), TP_ARGS(inode, offset, len, mode), TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) __field( loff_t, offset ) __field( loff_t, len ) __field( int, mode ) ), TP_fast_assign( __entry->dev = inode->i_sb->s_dev; __entry->ino = inode->i_ino; __entry->offset = offset; __entry->len = len; __entry->mode = mode; ), TP_printk("dev %d,%d ino %lu offset %lld len %lld mode %s", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long) __entry->ino, __entry->offset, __entry->len, show_falloc_mode(__entry->mode)) ); DEFINE_EVENT(ext4__fallocate_mode, ext4_fallocate_enter, TP_PROTO(struct inode *inode, loff_t offset, loff_t len, int mode), TP_ARGS(inode, offset, len, mode) ); DEFINE_EVENT(ext4__fallocate_mode, ext4_punch_hole, TP_PROTO(struct inode *inode, loff_t offset, loff_t len, int mode), TP_ARGS(inode, offset, len, mode) ); DEFINE_EVENT(ext4__fallocate_mode, ext4_zero_range, TP_PROTO(struct inode *inode, loff_t offset, loff_t len, int mode), TP_ARGS(inode, offset, len, mode) ); TRACE_EVENT(ext4_fallocate_exit, TP_PROTO(struct inode *inode, loff_t offset, unsigned int max_blocks, int ret), TP_ARGS(inode, offset, max_blocks, ret), TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) __field( loff_t, pos ) __field( unsigned int, blocks ) __field( int, ret ) ), TP_fast_assign( __entry->dev = inode->i_sb->s_dev; __entry->ino = inode->i_ino; __entry->pos = offset; __entry->blocks = max_blocks; __entry->ret = ret; ), TP_printk("dev %d,%d ino %lu pos %lld blocks %u ret %d", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long) __entry->ino, __entry->pos, __entry->blocks, __entry->ret) ); TRACE_EVENT(ext4_unlink_enter, TP_PROTO(struct inode *parent, struct dentry *dentry), TP_ARGS(parent, dentry), TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) __field( ino_t, parent ) __field( loff_t, size ) ), TP_fast_assign( __entry->dev = dentry->d_sb->s_dev; __entry->ino = d_inode(dentry)->i_ino; __entry->parent = parent->i_ino; __entry->size = d_inode(dentry)->i_size; ), TP_printk("dev %d,%d ino %lu size %lld parent %lu", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long) __entry->ino, __entry->size, (unsigned long) __entry->parent) ); TRACE_EVENT(ext4_unlink_exit, TP_PROTO(struct dentry *dentry, int ret), TP_ARGS(dentry, ret), TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) __field( int, ret ) ), TP_fast_assign( __entry->dev = dentry->d_sb->s_dev; __entry->ino = d_inode(dentry)->i_ino; __entry->ret = ret; ), TP_printk("dev %d,%d ino %lu ret %d", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long) __entry->ino, __entry->ret) ); DECLARE_EVENT_CLASS(ext4__truncate, TP_PROTO(struct inode *inode), TP_ARGS(inode), TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) __field( __u64, blocks ) ), TP_fast_assign( __entry->dev = inode->i_sb->s_dev; __entry->ino = inode->i_ino; __entry->blocks = inode->i_blocks; ), TP_printk("dev %d,%d ino %lu blocks %llu", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long) __entry->ino, __entry->blocks) ); DEFINE_EVENT(ext4__truncate, ext4_truncate_enter, TP_PROTO(struct inode *inode), TP_ARGS(inode) ); DEFINE_EVENT(ext4__truncate, ext4_truncate_exit, TP_PROTO(struct inode *inode), TP_ARGS(inode) ); /* 'ux' is the unwritten extent. */ TRACE_EVENT(ext4_ext_convert_to_initialized_enter, TP_PROTO(struct inode *inode, struct ext4_map_blocks *map, struct ext4_extent *ux), TP_ARGS(inode, map, ux), TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) __field( ext4_lblk_t, m_lblk ) __field( unsigned, m_len ) __field( ext4_lblk_t, u_lblk ) __field( unsigned, u_len ) __field( ext4_fsblk_t, u_pblk ) ), TP_fast_assign( __entry->dev = inode->i_sb->s_dev; __entry->ino = inode->i_ino; __entry->m_lblk = map->m_lblk; __entry->m_len = map->m_len; __entry->u_lblk = le32_to_cpu(ux->ee_block); __entry->u_len = ext4_ext_get_actual_len(ux); __entry->u_pblk = ext4_ext_pblock(ux); ), TP_printk("dev %d,%d ino %lu m_lblk %u m_len %u u_lblk %u u_len %u " "u_pblk %llu", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long) __entry->ino, __entry->m_lblk, __entry->m_len, __entry->u_lblk, __entry->u_len, __entry->u_pblk) ); /* * 'ux' is the unwritten extent. * 'ix' is the initialized extent to which blocks are transferred. */ TRACE_EVENT(ext4_ext_convert_to_initialized_fastpath, TP_PROTO(struct inode *inode, struct ext4_map_blocks *map, struct ext4_extent *ux, struct ext4_extent *ix), TP_ARGS(inode, map, ux, ix), TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) __field( ext4_lblk_t, m_lblk ) __field( unsigned, m_len ) __field( ext4_lblk_t, u_lblk ) __field( unsigned, u_len ) __field( ext4_fsblk_t, u_pblk ) __field( ext4_lblk_t, i_lblk ) __field( unsigned, i_len ) __field( ext4_fsblk_t, i_pblk ) ), TP_fast_assign( __entry->dev = inode->i_sb->s_dev; __entry->ino = inode->i_ino; __entry->m_lblk = map->m_lblk; __entry->m_len = map->m_len; __entry->u_lblk = le32_to_cpu(ux->ee_block); __entry->u_len = ext4_ext_get_actual_len(ux); __entry->u_pblk = ext4_ext_pblock(ux); __entry->i_lblk = le32_to_cpu(ix->ee_block); __entry->i_len = ext4_ext_get_actual_len(ix); __entry->i_pblk = ext4_ext_pblock(ix); ), TP_printk("dev %d,%d ino %lu m_lblk %u m_len %u " "u_lblk %u u_len %u u_pblk %llu " "i_lblk %u i_len %u i_pblk %llu ", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long) __entry->ino, __entry->m_lblk, __entry->m_len, __entry->u_lblk, __entry->u_len, __entry->u_pblk, __entry->i_lblk, __entry->i_len, __entry->i_pblk) ); DECLARE_EVENT_CLASS(ext4__map_blocks_enter, TP_PROTO(struct inode *inode, ext4_lblk_t lblk, unsigned int len, unsigned int flags), TP_ARGS(inode, lblk, len, flags), TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) __field( ext4_lblk_t, lblk ) __field( unsigned int, len ) __field( unsigned int, flags ) ), TP_fast_assign( __entry->dev = inode->i_sb->s_dev; __entry->ino = inode->i_ino; __entry->lblk = lblk; __entry->len = len; __entry->flags = flags; ), TP_printk("dev %d,%d ino %lu lblk %u len %u flags %s", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long) __entry->ino, __entry->lblk, __entry->len, show_map_flags(__entry->flags)) ); DEFINE_EVENT(ext4__map_blocks_enter, ext4_ext_map_blocks_enter, TP_PROTO(struct inode *inode, ext4_lblk_t lblk, unsigned len, unsigned flags), TP_ARGS(inode, lblk, len, flags) ); DEFINE_EVENT(ext4__map_blocks_enter, ext4_ind_map_blocks_enter, TP_PROTO(struct inode *inode, ext4_lblk_t lblk, unsigned len, unsigned flags), TP_ARGS(inode, lblk, len, flags) ); DECLARE_EVENT_CLASS(ext4__map_blocks_exit, TP_PROTO(struct inode *inode, unsigned flags, struct ext4_map_blocks *map, int ret), TP_ARGS(inode, flags, map, ret), TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) __field( unsigned int, flags ) __field( ext4_fsblk_t, pblk ) __field( ext4_lblk_t, lblk ) __field( unsigned int, len ) __field( unsigned int, mflags ) __field( int, ret ) ), TP_fast_assign( __entry->dev = inode->i_sb->s_dev; __entry->ino = inode->i_ino; __entry->flags = flags; __entry->pblk = map->m_pblk; __entry->lblk = map->m_lblk; __entry->len = map->m_len; __entry->mflags = map->m_flags; __entry->ret = ret; ), TP_printk("dev %d,%d ino %lu flags %s lblk %u pblk %llu len %u " "mflags %s ret %d", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long) __entry->ino, show_map_flags(__entry->flags), __entry->lblk, __entry->pblk, __entry->len, show_mflags(__entry->mflags), __entry->ret) ); DEFINE_EVENT(ext4__map_blocks_exit, ext4_ext_map_blocks_exit, TP_PROTO(struct inode *inode, unsigned flags, struct ext4_map_blocks *map, int ret), TP_ARGS(inode, flags, map, ret) ); DEFINE_EVENT(ext4__map_blocks_exit, ext4_ind_map_blocks_exit, TP_PROTO(struct inode *inode, unsigned flags, struct ext4_map_blocks *map, int ret), TP_ARGS(inode, flags, map, ret) ); TRACE_EVENT(ext4_ext_load_extent, TP_PROTO(struct inode *inode, ext4_lblk_t lblk, ext4_fsblk_t pblk), TP_ARGS(inode, lblk, pblk), TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) __field( ext4_fsblk_t, pblk ) __field( ext4_lblk_t, lblk ) ), TP_fast_assign( __entry->dev = inode->i_sb->s_dev; __entry->ino = inode->i_ino; __entry->pblk = pblk; __entry->lblk = lblk; ), TP_printk("dev %d,%d ino %lu lblk %u pblk %llu", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long) __entry->ino, __entry->lblk, __entry->pblk) ); TRACE_EVENT(ext4_load_inode, TP_PROTO(struct super_block *sb, unsigned long ino), TP_ARGS(sb, ino), TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) ), TP_fast_assign( __entry->dev = sb->s_dev; __entry->ino = ino; ), TP_printk("dev %d,%d ino %ld", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long) __entry->ino) ); TRACE_EVENT(ext4_journal_start_sb, TP_PROTO(struct super_block *sb, int blocks, int rsv_blocks, int revoke_creds, int type, unsigned long IP), TP_ARGS(sb, blocks, rsv_blocks, revoke_creds, type, IP), TP_STRUCT__entry( __field( dev_t, dev ) __field( unsigned long, ip ) __field( int, blocks ) __field( int, rsv_blocks ) __field( int, revoke_creds ) __field( int, type ) ), TP_fast_assign( __entry->dev = sb->s_dev; __entry->ip = IP; __entry->blocks = blocks; __entry->rsv_blocks = rsv_blocks; __entry->revoke_creds = revoke_creds; __entry->type = type; ), TP_printk("dev %d,%d blocks %d, rsv_blocks %d, revoke_creds %d," " type %d, caller %pS", MAJOR(__entry->dev), MINOR(__entry->dev), __entry->blocks, __entry->rsv_blocks, __entry->revoke_creds, __entry->type, (void *)__entry->ip) ); TRACE_EVENT(ext4_journal_start_inode, TP_PROTO(struct inode *inode, int blocks, int rsv_blocks, int revoke_creds, int type, unsigned long IP), TP_ARGS(inode, blocks, rsv_blocks, revoke_creds, type, IP), TP_STRUCT__entry( __field( unsigned long, ino ) __field( dev_t, dev ) __field( unsigned long, ip ) __field( int, blocks ) __field( int, rsv_blocks ) __field( int, revoke_creds ) __field( int, type ) ), TP_fast_assign( __entry->dev = inode->i_sb->s_dev; __entry->ip = IP; __entry->blocks = blocks; __entry->rsv_blocks = rsv_blocks; __entry->revoke_creds = revoke_creds; __entry->type = type; __entry->ino = inode->i_ino; ), TP_printk("dev %d,%d blocks %d, rsv_blocks %d, revoke_creds %d," " type %d, ino %lu, caller %pS", MAJOR(__entry->dev), MINOR(__entry->dev), __entry->blocks, __entry->rsv_blocks, __entry->revoke_creds, __entry->type, __entry->ino, (void *)__entry->ip) ); TRACE_EVENT(ext4_journal_start_reserved, TP_PROTO(struct super_block *sb, int blocks, unsigned long IP), TP_ARGS(sb, blocks, IP), TP_STRUCT__entry( __field( dev_t, dev ) __field(unsigned long, ip ) __field( int, blocks ) ), TP_fast_assign( __entry->dev = sb->s_dev; __entry->ip = IP; __entry->blocks = blocks; ), TP_printk("dev %d,%d blocks, %d caller %pS", MAJOR(__entry->dev), MINOR(__entry->dev), __entry->blocks, (void *)__entry->ip) ); DECLARE_EVENT_CLASS(ext4__trim, TP_PROTO(struct super_block *sb, ext4_group_t group, ext4_grpblk_t start, ext4_grpblk_t len), TP_ARGS(sb, group, start, len), TP_STRUCT__entry( __field( int, dev_major ) __field( int, dev_minor ) __field( __u32, group ) __field( int, start ) __field( int, len ) ), TP_fast_assign( __entry->dev_major = MAJOR(sb->s_dev); __entry->dev_minor = MINOR(sb->s_dev); __entry->group = group; __entry->start = start; __entry->len = len; ), TP_printk("dev %d,%d group %u, start %d, len %d", __entry->dev_major, __entry->dev_minor, __entry->group, __entry->start, __entry->len) ); DEFINE_EVENT(ext4__trim, ext4_trim_extent, TP_PROTO(struct super_block *sb, ext4_group_t group, ext4_grpblk_t start, ext4_grpblk_t len), TP_ARGS(sb, group, start, len) ); DEFINE_EVENT(ext4__trim, ext4_trim_all_free, TP_PROTO(struct super_block *sb, ext4_group_t group, ext4_grpblk_t start, ext4_grpblk_t len), TP_ARGS(sb, group, start, len) ); TRACE_EVENT(ext4_ext_handle_unwritten_extents, TP_PROTO(struct inode *inode, struct ext4_map_blocks *map, int flags, unsigned int allocated, ext4_fsblk_t newblock), TP_ARGS(inode, map, flags, allocated, newblock), TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) __field( int, flags ) __field( ext4_lblk_t, lblk ) __field( ext4_fsblk_t, pblk ) __field( unsigned int, len ) __field( unsigned int, allocated ) __field( ext4_fsblk_t, newblk ) ), TP_fast_assign( __entry->dev = inode->i_sb->s_dev; __entry->ino = inode->i_ino; __entry->flags = flags; __entry->lblk = map->m_lblk; __entry->pblk = map->m_pblk; __entry->len = map->m_len; __entry->allocated = allocated; __entry->newblk = newblock; ), TP_printk("dev %d,%d ino %lu m_lblk %u m_pblk %llu m_len %u flags %s " "allocated %d newblock %llu", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long) __entry->ino, (unsigned) __entry->lblk, (unsigned long long) __entry->pblk, __entry->len, show_map_flags(__entry->flags), (unsigned int) __entry->allocated, (unsigned long long) __entry->newblk) ); TRACE_EVENT(ext4_get_implied_cluster_alloc_exit, TP_PROTO(struct super_block *sb, struct ext4_map_blocks *map, int ret), TP_ARGS(sb, map, ret), TP_STRUCT__entry( __field( dev_t, dev ) __field( unsigned int, flags ) __field( ext4_lblk_t, lblk ) __field( ext4_fsblk_t, pblk ) __field( unsigned int, len ) __field( int, ret ) ), TP_fast_assign( __entry->dev = sb->s_dev; __entry->flags = map->m_flags; __entry->lblk = map->m_lblk; __entry->pblk = map->m_pblk; __entry->len = map->m_len; __entry->ret = ret; ), TP_printk("dev %d,%d m_lblk %u m_pblk %llu m_len %u m_flags %s ret %d", MAJOR(__entry->dev), MINOR(__entry->dev), __entry->lblk, (unsigned long long) __entry->pblk, __entry->len, show_mflags(__entry->flags), __entry->ret) ); TRACE_EVENT(ext4_ext_show_extent, TP_PROTO(struct inode *inode, ext4_lblk_t lblk, ext4_fsblk_t pblk, unsigned short len), TP_ARGS(inode, lblk, pblk, len), TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) __field( ext4_fsblk_t, pblk ) __field( ext4_lblk_t, lblk ) __field( unsigned short, len ) ), TP_fast_assign( __entry->dev = inode->i_sb->s_dev; __entry->ino = inode->i_ino; __entry->pblk = pblk; __entry->lblk = lblk; __entry->len = len; ), TP_printk("dev %d,%d ino %lu lblk %u pblk %llu len %u", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long) __entry->ino, (unsigned) __entry->lblk, (unsigned long long) __entry->pblk, (unsigned short) __entry->len) ); TRACE_EVENT(ext4_remove_blocks, TP_PROTO(struct inode *inode, struct ext4_extent *ex, ext4_lblk_t from, ext4_fsblk_t to, struct partial_cluster *pc), TP_ARGS(inode, ex, from, to, pc), TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) __field( ext4_lblk_t, from ) __field( ext4_lblk_t, to ) __field( ext4_fsblk_t, ee_pblk ) __field( ext4_lblk_t, ee_lblk ) __field( unsigned short, ee_len ) __field( ext4_fsblk_t, pc_pclu ) __field( ext4_lblk_t, pc_lblk ) __field( int, pc_state) ), TP_fast_assign( __entry->dev = inode->i_sb->s_dev; __entry->ino = inode->i_ino; __entry->from = from; __entry->to = to; __entry->ee_pblk = ext4_ext_pblock(ex); __entry->ee_lblk = le32_to_cpu(ex->ee_block); __entry->ee_len = ext4_ext_get_actual_len(ex); __entry->pc_pclu = pc->pclu; __entry->pc_lblk = pc->lblk; __entry->pc_state = pc->state; ), TP_printk("dev %d,%d ino %lu extent [%u(%llu), %u]" "from %u to %u partial [pclu %lld lblk %u state %d]", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long) __entry->ino, (unsigned) __entry->ee_lblk, (unsigned long long) __entry->ee_pblk, (unsigned short) __entry->ee_len, (unsigned) __entry->from, (unsigned) __entry->to, (long long) __entry->pc_pclu, (unsigned int) __entry->pc_lblk, (int) __entry->pc_state) ); TRACE_EVENT(ext4_ext_rm_leaf, TP_PROTO(struct inode *inode, ext4_lblk_t start, struct ext4_extent *ex, struct partial_cluster *pc), TP_ARGS(inode, start, ex, pc), TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) __field( ext4_lblk_t, start ) __field( ext4_lblk_t, ee_lblk ) __field( ext4_fsblk_t, ee_pblk ) __field( short, ee_len ) __field( ext4_fsblk_t, pc_pclu ) __field( ext4_lblk_t, pc_lblk ) __field( int, pc_state) ), TP_fast_assign( __entry->dev = inode->i_sb->s_dev; __entry->ino = inode->i_ino; __entry->start = start; __entry->ee_lblk = le32_to_cpu(ex->ee_block); __entry->ee_pblk = ext4_ext_pblock(ex); __entry->ee_len = ext4_ext_get_actual_len(ex); __entry->pc_pclu = pc->pclu; __entry->pc_lblk = pc->lblk; __entry->pc_state = pc->state; ), TP_printk("dev %d,%d ino %lu start_lblk %u last_extent [%u(%llu), %u]" "partial [pclu %lld lblk %u state %d]", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long) __entry->ino, (unsigned) __entry->start, (unsigned) __entry->ee_lblk, (unsigned long long) __entry->ee_pblk, (unsigned short) __entry->ee_len, (long long) __entry->pc_pclu, (unsigned int) __entry->pc_lblk, (int) __entry->pc_state) ); TRACE_EVENT(ext4_ext_rm_idx, TP_PROTO(struct inode *inode, ext4_fsblk_t pblk), TP_ARGS(inode, pblk), TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) __field( ext4_fsblk_t, pblk ) ), TP_fast_assign( __entry->dev = inode->i_sb->s_dev; __entry->ino = inode->i_ino; __entry->pblk = pblk; ), TP_printk("dev %d,%d ino %lu index_pblk %llu", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long) __entry->ino, (unsigned long long) __entry->pblk) ); TRACE_EVENT(ext4_ext_remove_space, TP_PROTO(struct inode *inode, ext4_lblk_t start, ext4_lblk_t end, int depth), TP_ARGS(inode, start, end, depth), TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) __field( ext4_lblk_t, start ) __field( ext4_lblk_t, end ) __field( int, depth ) ), TP_fast_assign( __entry->dev = inode->i_sb->s_dev; __entry->ino = inode->i_ino; __entry->start = start; __entry->end = end; __entry->depth = depth; ), TP_printk("dev %d,%d ino %lu since %u end %u depth %d", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long) __entry->ino, (unsigned) __entry->start, (unsigned) __entry->end, __entry->depth) ); TRACE_EVENT(ext4_ext_remove_space_done, TP_PROTO(struct inode *inode, ext4_lblk_t start, ext4_lblk_t end, int depth, struct partial_cluster *pc, __le16 eh_entries), TP_ARGS(inode, start, end, depth, pc, eh_entries), TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) __field( ext4_lblk_t, start ) __field( ext4_lblk_t, end ) __field( int, depth ) __field( ext4_fsblk_t, pc_pclu ) __field( ext4_lblk_t, pc_lblk ) __field( int, pc_state ) __field( unsigned short, eh_entries ) ), TP_fast_assign( __entry->dev = inode->i_sb->s_dev; __entry->ino = inode->i_ino; __entry->start = start; __entry->end = end; __entry->depth = depth; __entry->pc_pclu = pc->pclu; __entry->pc_lblk = pc->lblk; __entry->pc_state = pc->state; __entry->eh_entries = le16_to_cpu(eh_entries); ), TP_printk("dev %d,%d ino %lu since %u end %u depth %d " "partial [pclu %lld lblk %u state %d] " "remaining_entries %u", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long) __entry->ino, (unsigned) __entry->start, (unsigned) __entry->end, __entry->depth, (long long) __entry->pc_pclu, (unsigned int) __entry->pc_lblk, (int) __entry->pc_state, (unsigned short) __entry->eh_entries) ); DECLARE_EVENT_CLASS(ext4__es_extent, TP_PROTO(struct inode *inode, struct extent_status *es), TP_ARGS(inode, es), TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) __field( ext4_lblk_t, lblk ) __field( ext4_lblk_t, len ) __field( ext4_fsblk_t, pblk ) __field( char, status ) ), TP_fast_assign( __entry->dev = inode->i_sb->s_dev; __entry->ino = inode->i_ino; __entry->lblk = es->es_lblk; __entry->len = es->es_len; __entry->pblk = ext4_es_show_pblock(es); __entry->status = ext4_es_status(es); ), TP_printk("dev %d,%d ino %lu es [%u/%u) mapped %llu status %s", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long) __entry->ino, __entry->lblk, __entry->len, __entry->pblk, show_extent_status(__entry->status)) ); DEFINE_EVENT(ext4__es_extent, ext4_es_insert_extent, TP_PROTO(struct inode *inode, struct extent_status *es), TP_ARGS(inode, es) ); DEFINE_EVENT(ext4__es_extent, ext4_es_cache_extent, TP_PROTO(struct inode *inode, struct extent_status *es), TP_ARGS(inode, es) ); TRACE_EVENT(ext4_es_remove_extent, TP_PROTO(struct inode *inode, ext4_lblk_t lblk, ext4_lblk_t len), TP_ARGS(inode, lblk, len), TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) __field( loff_t, lblk ) __field( loff_t, len ) ), TP_fast_assign( __entry->dev = inode->i_sb->s_dev; __entry->ino = inode->i_ino; __entry->lblk = lblk; __entry->len = len; ), TP_printk("dev %d,%d ino %lu es [%lld/%lld)", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long) __entry->ino, __entry->lblk, __entry->len) ); TRACE_EVENT(ext4_es_find_extent_range_enter, TP_PROTO(struct inode *inode, ext4_lblk_t lblk), TP_ARGS(inode, lblk), TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) __field( ext4_lblk_t, lblk ) ), TP_fast_assign( __entry->dev = inode->i_sb->s_dev; __entry->ino = inode->i_ino; __entry->lblk = lblk; ), TP_printk("dev %d,%d ino %lu lblk %u", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long) __entry->ino, __entry->lblk) ); TRACE_EVENT(ext4_es_find_extent_range_exit, TP_PROTO(struct inode *inode, struct extent_status *es), TP_ARGS(inode, es), TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) __field( ext4_lblk_t, lblk ) __field( ext4_lblk_t, len ) __field( ext4_fsblk_t, pblk ) __field( char, status ) ), TP_fast_assign( __entry->dev = inode->i_sb->s_dev; __entry->ino = inode->i_ino; __entry->lblk = es->es_lblk; __entry->len = es->es_len; __entry->pblk = ext4_es_show_pblock(es); __entry->status = ext4_es_status(es); ), TP_printk("dev %d,%d ino %lu es [%u/%u) mapped %llu status %s", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long) __entry->ino, __entry->lblk, __entry->len, __entry->pblk, show_extent_status(__entry->status)) ); TRACE_EVENT(ext4_es_lookup_extent_enter, TP_PROTO(struct inode *inode, ext4_lblk_t lblk), TP_ARGS(inode, lblk), TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) __field( ext4_lblk_t, lblk ) ), TP_fast_assign( __entry->dev = inode->i_sb->s_dev; __entry->ino = inode->i_ino; __entry->lblk = lblk; ), TP_printk("dev %d,%d ino %lu lblk %u", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long) __entry->ino, __entry->lblk) ); TRACE_EVENT(ext4_es_lookup_extent_exit, TP_PROTO(struct inode *inode, struct extent_status *es, int found), TP_ARGS(inode, es, found), TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) __field( ext4_lblk_t, lblk ) __field( ext4_lblk_t, len ) __field( ext4_fsblk_t, pblk ) __field( char, status ) __field( int, found ) ), TP_fast_assign( __entry->dev = inode->i_sb->s_dev; __entry->ino = inode->i_ino; __entry->lblk = es->es_lblk; __entry->len = es->es_len; __entry->pblk = ext4_es_show_pblock(es); __entry->status = ext4_es_status(es); __entry->found = found; ), TP_printk("dev %d,%d ino %lu found %d [%u/%u) %llu %s", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long) __entry->ino, __entry->found, __entry->lblk, __entry->len, __entry->found ? __entry->pblk : 0, show_extent_status(__entry->found ? __entry->status : 0)) ); DECLARE_EVENT_CLASS(ext4__es_shrink_enter, TP_PROTO(struct super_block *sb, int nr_to_scan, int cache_cnt), TP_ARGS(sb, nr_to_scan, cache_cnt), TP_STRUCT__entry( __field( dev_t, dev ) __field( int, nr_to_scan ) __field( int, cache_cnt ) ), TP_fast_assign( __entry->dev = sb->s_dev; __entry->nr_to_scan = nr_to_scan; __entry->cache_cnt = cache_cnt; ), TP_printk("dev %d,%d nr_to_scan %d cache_cnt %d", MAJOR(__entry->dev), MINOR(__entry->dev), __entry->nr_to_scan, __entry->cache_cnt) ); DEFINE_EVENT(ext4__es_shrink_enter, ext4_es_shrink_count, TP_PROTO(struct super_block *sb, int nr_to_scan, int cache_cnt), TP_ARGS(sb, nr_to_scan, cache_cnt) ); DEFINE_EVENT(ext4__es_shrink_enter, ext4_es_shrink_scan_enter, TP_PROTO(struct super_block *sb, int nr_to_scan, int cache_cnt), TP_ARGS(sb, nr_to_scan, cache_cnt) ); TRACE_EVENT(ext4_es_shrink_scan_exit, TP_PROTO(struct super_block *sb, int nr_shrunk, int cache_cnt), TP_ARGS(sb, nr_shrunk, cache_cnt), TP_STRUCT__entry( __field( dev_t, dev ) __field( int, nr_shrunk ) __field( int, cache_cnt ) ), TP_fast_assign( __entry->dev = sb->s_dev; __entry->nr_shrunk = nr_shrunk; __entry->cache_cnt = cache_cnt; ), TP_printk("dev %d,%d nr_shrunk %d cache_cnt %d", MAJOR(__entry->dev), MINOR(__entry->dev), __entry->nr_shrunk, __entry->cache_cnt) ); TRACE_EVENT(ext4_collapse_range, TP_PROTO(struct inode *inode, loff_t offset, loff_t len), TP_ARGS(inode, offset, len), TP_STRUCT__entry( __field(dev_t, dev) __field(ino_t, ino) __field(loff_t, offset) __field(loff_t, len) ), TP_fast_assign( __entry->dev = inode->i_sb->s_dev; __entry->ino = inode->i_ino; __entry->offset = offset; __entry->len = len; ), TP_printk("dev %d,%d ino %lu offset %lld len %lld", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long) __entry->ino, __entry->offset, __entry->len) ); TRACE_EVENT(ext4_insert_range, TP_PROTO(struct inode *inode, loff_t offset, loff_t len), TP_ARGS(inode, offset, len), TP_STRUCT__entry( __field(dev_t, dev) __field(ino_t, ino) __field(loff_t, offset) __field(loff_t, len) ), TP_fast_assign( __entry->dev = inode->i_sb->s_dev; __entry->ino = inode->i_ino; __entry->offset = offset; __entry->len = len; ), TP_printk("dev %d,%d ino %lu offset %lld len %lld", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long) __entry->ino, __entry->offset, __entry->len) ); TRACE_EVENT(ext4_es_shrink, TP_PROTO(struct super_block *sb, int nr_shrunk, u64 scan_time, int nr_skipped, int retried), TP_ARGS(sb, nr_shrunk, scan_time, nr_skipped, retried), TP_STRUCT__entry( __field( dev_t, dev ) __field( int, nr_shrunk ) __field( unsigned long long, scan_time ) __field( int, nr_skipped ) __field( int, retried ) ), TP_fast_assign( __entry->dev = sb->s_dev; __entry->nr_shrunk = nr_shrunk; __entry->scan_time = div_u64(scan_time, 1000); __entry->nr_skipped = nr_skipped; __entry->retried = retried; ), TP_printk("dev %d,%d nr_shrunk %d, scan_time %llu " "nr_skipped %d retried %d", MAJOR(__entry->dev), MINOR(__entry->dev), __entry->nr_shrunk, __entry->scan_time, __entry->nr_skipped, __entry->retried) ); TRACE_EVENT(ext4_es_insert_delayed_extent, TP_PROTO(struct inode *inode, struct extent_status *es, bool lclu_allocated, bool end_allocated), TP_ARGS(inode, es, lclu_allocated, end_allocated), TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) __field( ext4_lblk_t, lblk ) __field( ext4_lblk_t, len ) __field( ext4_fsblk_t, pblk ) __field( char, status ) __field( bool, lclu_allocated ) __field( bool, end_allocated ) ), TP_fast_assign( __entry->dev = inode->i_sb->s_dev; __entry->ino = inode->i_ino; __entry->lblk = es->es_lblk; __entry->len = es->es_len; __entry->pblk = ext4_es_show_pblock(es); __entry->status = ext4_es_status(es); __entry->lclu_allocated = lclu_allocated; __entry->end_allocated = end_allocated; ), TP_printk("dev %d,%d ino %lu es [%u/%u) mapped %llu status %s " "allocated %d %d", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long) __entry->ino, __entry->lblk, __entry->len, __entry->pblk, show_extent_status(__entry->status), __entry->lclu_allocated, __entry->end_allocated) ); /* fsmap traces */ DECLARE_EVENT_CLASS(ext4_fsmap_class, TP_PROTO(struct super_block *sb, u32 keydev, u32 agno, u64 bno, u64 len, u64 owner), TP_ARGS(sb, keydev, agno, bno, len, owner), TP_STRUCT__entry( __field(dev_t, dev) __field(dev_t, keydev) __field(u32, agno) __field(u64, bno) __field(u64, len) __field(u64, owner) ), TP_fast_assign( __entry->dev = sb->s_bdev->bd_dev; __entry->keydev = new_decode_dev(keydev); __entry->agno = agno; __entry->bno = bno; __entry->len = len; __entry->owner = owner; ), TP_printk("dev %d:%d keydev %d:%d agno %u bno %llu len %llu owner %lld\n", MAJOR(__entry->dev), MINOR(__entry->dev), MAJOR(__entry->keydev), MINOR(__entry->keydev), __entry->agno, __entry->bno, __entry->len, __entry->owner) ) #define DEFINE_FSMAP_EVENT(name) \ DEFINE_EVENT(ext4_fsmap_class, name, \ TP_PROTO(struct super_block *sb, u32 keydev, u32 agno, u64 bno, u64 len, \ u64 owner), \ TP_ARGS(sb, keydev, agno, bno, len, owner)) DEFINE_FSMAP_EVENT(ext4_fsmap_low_key); DEFINE_FSMAP_EVENT(ext4_fsmap_high_key); DEFINE_FSMAP_EVENT(ext4_fsmap_mapping); DECLARE_EVENT_CLASS(ext4_getfsmap_class, TP_PROTO(struct super_block *sb, struct ext4_fsmap *fsmap), TP_ARGS(sb, fsmap), TP_STRUCT__entry( __field(dev_t, dev) __field(dev_t, keydev) __field(u64, block) __field(u64, len) __field(u64, owner) __field(u64, flags) ), TP_fast_assign( __entry->dev = sb->s_bdev->bd_dev; __entry->keydev = new_decode_dev(fsmap->fmr_device); __entry->block = fsmap->fmr_physical; __entry->len = fsmap->fmr_length; __entry->owner = fsmap->fmr_owner; __entry->flags = fsmap->fmr_flags; ), TP_printk("dev %d:%d keydev %d:%d block %llu len %llu owner %lld flags 0x%llx\n", MAJOR(__entry->dev), MINOR(__entry->dev), MAJOR(__entry->keydev), MINOR(__entry->keydev), __entry->block, __entry->len, __entry->owner, __entry->flags) ) #define DEFINE_GETFSMAP_EVENT(name) \ DEFINE_EVENT(ext4_getfsmap_class, name, \ TP_PROTO(struct super_block *sb, struct ext4_fsmap *fsmap), \ TP_ARGS(sb, fsmap)) DEFINE_GETFSMAP_EVENT(ext4_getfsmap_low_key); DEFINE_GETFSMAP_EVENT(ext4_getfsmap_high_key); DEFINE_GETFSMAP_EVENT(ext4_getfsmap_mapping); TRACE_EVENT(ext4_shutdown, TP_PROTO(struct super_block *sb, unsigned long flags), TP_ARGS(sb, flags), TP_STRUCT__entry( __field( dev_t, dev ) __field( unsigned, flags ) ), TP_fast_assign( __entry->dev = sb->s_dev; __entry->flags = flags; ), TP_printk("dev %d,%d flags %u", MAJOR(__entry->dev), MINOR(__entry->dev), __entry->flags) ); TRACE_EVENT(ext4_error, TP_PROTO(struct super_block *sb, const char *function, unsigned int line), TP_ARGS(sb, function, line), TP_STRUCT__entry( __field( dev_t, dev ) __field( const char *, function ) __field( unsigned, line ) ), TP_fast_assign( __entry->dev = sb->s_dev; __entry->function = function; __entry->line = line; ), TP_printk("dev %d,%d function %s line %u", MAJOR(__entry->dev), MINOR(__entry->dev), __entry->function, __entry->line) ); TRACE_EVENT(ext4_prefetch_bitmaps, TP_PROTO(struct super_block *sb, ext4_group_t group, ext4_group_t next, unsigned int prefetch_ios), TP_ARGS(sb, group, next, prefetch_ios), TP_STRUCT__entry( __field( dev_t, dev ) __field( __u32, group ) __field( __u32, next ) __field( __u32, ios ) ), TP_fast_assign( __entry->dev = sb->s_dev; __entry->group = group; __entry->next = next; __entry->ios = prefetch_ios; ), TP_printk("dev %d,%d group %u next %u ios %u", MAJOR(__entry->dev), MINOR(__entry->dev), __entry->group, __entry->next, __entry->ios) ); TRACE_EVENT(ext4_lazy_itable_init, TP_PROTO(struct super_block *sb, ext4_group_t group), TP_ARGS(sb, group), TP_STRUCT__entry( __field( dev_t, dev ) __field( __u32, group ) ), TP_fast_assign( __entry->dev = sb->s_dev; __entry->group = group; ), TP_printk("dev %d,%d group %u", MAJOR(__entry->dev), MINOR(__entry->dev), __entry->group) ); TRACE_EVENT(ext4_fc_replay_scan, TP_PROTO(struct super_block *sb, int error, int off), TP_ARGS(sb, error, off), TP_STRUCT__entry( __field(dev_t, dev) __field(int, error) __field(int, off) ), TP_fast_assign( __entry->dev = sb->s_dev; __entry->error = error; __entry->off = off; ), TP_printk("dev %d,%d error %d, off %d", MAJOR(__entry->dev), MINOR(__entry->dev), __entry->error, __entry->off) ); TRACE_EVENT(ext4_fc_replay, TP_PROTO(struct super_block *sb, int tag, int ino, int priv1, int priv2), TP_ARGS(sb, tag, ino, priv1, priv2), TP_STRUCT__entry( __field(dev_t, dev) __field(int, tag) __field(int, ino) __field(int, priv1) __field(int, priv2) ), TP_fast_assign( __entry->dev = sb->s_dev; __entry->tag = tag; __entry->ino = ino; __entry->priv1 = priv1; __entry->priv2 = priv2; ), TP_printk("dev %d,%d: tag %d, ino %d, data1 %d, data2 %d", MAJOR(__entry->dev), MINOR(__entry->dev), __entry->tag, __entry->ino, __entry->priv1, __entry->priv2) ); TRACE_EVENT(ext4_fc_commit_start, TP_PROTO(struct super_block *sb, tid_t commit_tid), TP_ARGS(sb, commit_tid), TP_STRUCT__entry( __field(dev_t, dev) __field(tid_t, tid) ), TP_fast_assign( __entry->dev = sb->s_dev; __entry->tid = commit_tid; ), TP_printk("dev %d,%d tid %u", MAJOR(__entry->dev), MINOR(__entry->dev), __entry->tid) ); TRACE_EVENT(ext4_fc_commit_stop, TP_PROTO(struct super_block *sb, int nblks, int reason, tid_t commit_tid), TP_ARGS(sb, nblks, reason, commit_tid), TP_STRUCT__entry( __field(dev_t, dev) __field(int, nblks) __field(int, reason) __field(int, num_fc) __field(int, num_fc_ineligible) __field(int, nblks_agg) __field(tid_t, tid) ), TP_fast_assign( __entry->dev = sb->s_dev; __entry->nblks = nblks; __entry->reason = reason; __entry->num_fc = EXT4_SB(sb)->s_fc_stats.fc_num_commits; __entry->num_fc_ineligible = EXT4_SB(sb)->s_fc_stats.fc_ineligible_commits; __entry->nblks_agg = EXT4_SB(sb)->s_fc_stats.fc_numblks; __entry->tid = commit_tid; ), TP_printk("dev %d,%d nblks %d, reason %d, fc = %d, ineligible = %d, agg_nblks %d, tid %u", MAJOR(__entry->dev), MINOR(__entry->dev), __entry->nblks, __entry->reason, __entry->num_fc, __entry->num_fc_ineligible, __entry->nblks_agg, __entry->tid) ); #define FC_REASON_NAME_STAT(reason) \ show_fc_reason(reason), \ __entry->fc_ineligible_rc[reason] TRACE_EVENT(ext4_fc_stats, TP_PROTO(struct super_block *sb), TP_ARGS(sb), TP_STRUCT__entry( __field(dev_t, dev) __array(unsigned int, fc_ineligible_rc, EXT4_FC_REASON_MAX) __field(unsigned long, fc_commits) __field(unsigned long, fc_ineligible_commits) __field(unsigned long, fc_numblks) ), TP_fast_assign( int i; __entry->dev = sb->s_dev; for (i = 0; i < EXT4_FC_REASON_MAX; i++) { __entry->fc_ineligible_rc[i] = EXT4_SB(sb)->s_fc_stats.fc_ineligible_reason_count[i]; } __entry->fc_commits = EXT4_SB(sb)->s_fc_stats.fc_num_commits; __entry->fc_ineligible_commits = EXT4_SB(sb)->s_fc_stats.fc_ineligible_commits; __entry->fc_numblks = EXT4_SB(sb)->s_fc_stats.fc_numblks; ), TP_printk("dev %d,%d fc ineligible reasons:\n" "%s:%u, %s:%u, %s:%u, %s:%u, %s:%u, %s:%u, %s:%u, %s:%u, %s:%u, %s:%u" "num_commits:%lu, ineligible: %lu, numblks: %lu", MAJOR(__entry->dev), MINOR(__entry->dev), FC_REASON_NAME_STAT(EXT4_FC_REASON_XATTR), FC_REASON_NAME_STAT(EXT4_FC_REASON_CROSS_RENAME), FC_REASON_NAME_STAT(EXT4_FC_REASON_JOURNAL_FLAG_CHANGE), FC_REASON_NAME_STAT(EXT4_FC_REASON_NOMEM), FC_REASON_NAME_STAT(EXT4_FC_REASON_SWAP_BOOT), FC_REASON_NAME_STAT(EXT4_FC_REASON_RESIZE), FC_REASON_NAME_STAT(EXT4_FC_REASON_RENAME_DIR), FC_REASON_NAME_STAT(EXT4_FC_REASON_FALLOC_RANGE), FC_REASON_NAME_STAT(EXT4_FC_REASON_INODE_JOURNAL_DATA), FC_REASON_NAME_STAT(EXT4_FC_REASON_ENCRYPTED_FILENAME), __entry->fc_commits, __entry->fc_ineligible_commits, __entry->fc_numblks) ); DECLARE_EVENT_CLASS(ext4_fc_track_dentry, TP_PROTO(handle_t *handle, struct inode *inode, struct dentry *dentry, int ret), TP_ARGS(handle, inode, dentry, ret), TP_STRUCT__entry( __field(dev_t, dev) __field(tid_t, t_tid) __field(ino_t, i_ino) __field(tid_t, i_sync_tid) __field(int, error) ), TP_fast_assign( struct ext4_inode_info *ei = EXT4_I(inode); __entry->dev = inode->i_sb->s_dev; __entry->t_tid = handle->h_transaction->t_tid; __entry->i_ino = inode->i_ino; __entry->i_sync_tid = ei->i_sync_tid; __entry->error = ret; ), TP_printk("dev %d,%d, t_tid %u, ino %lu, i_sync_tid %u, error %d", MAJOR(__entry->dev), MINOR(__entry->dev), __entry->t_tid, __entry->i_ino, __entry->i_sync_tid, __entry->error ) ); #define DEFINE_EVENT_CLASS_DENTRY(__type) \ DEFINE_EVENT(ext4_fc_track_dentry, ext4_fc_track_##__type, \ TP_PROTO(handle_t *handle, struct inode *inode, \ struct dentry *dentry, int ret), \ TP_ARGS(handle, inode, dentry, ret) \ ) DEFINE_EVENT_CLASS_DENTRY(create); DEFINE_EVENT_CLASS_DENTRY(link); DEFINE_EVENT_CLASS_DENTRY(unlink); TRACE_EVENT(ext4_fc_track_inode, TP_PROTO(handle_t *handle, struct inode *inode, int ret), TP_ARGS(handle, inode, ret), TP_STRUCT__entry( __field(dev_t, dev) __field(tid_t, t_tid) __field(ino_t, i_ino) __field(tid_t, i_sync_tid) __field(int, error) ), TP_fast_assign( struct ext4_inode_info *ei = EXT4_I(inode); __entry->dev = inode->i_sb->s_dev; __entry->t_tid = handle->h_transaction->t_tid; __entry->i_ino = inode->i_ino; __entry->i_sync_tid = ei->i_sync_tid; __entry->error = ret; ), TP_printk("dev %d:%d, t_tid %u, inode %lu, i_sync_tid %u, error %d", MAJOR(__entry->dev), MINOR(__entry->dev), __entry->t_tid, __entry->i_ino, __entry->i_sync_tid, __entry->error) ); TRACE_EVENT(ext4_fc_track_range, TP_PROTO(handle_t *handle, struct inode *inode, long start, long end, int ret), TP_ARGS(handle, inode, start, end, ret), TP_STRUCT__entry( __field(dev_t, dev) __field(tid_t, t_tid) __field(ino_t, i_ino) __field(tid_t, i_sync_tid) __field(long, start) __field(long, end) __field(int, error) ), TP_fast_assign( struct ext4_inode_info *ei = EXT4_I(inode); __entry->dev = inode->i_sb->s_dev; __entry->t_tid = handle->h_transaction->t_tid; __entry->i_ino = inode->i_ino; __entry->i_sync_tid = ei->i_sync_tid; __entry->start = start; __entry->end = end; __entry->error = ret; ), TP_printk("dev %d:%d, t_tid %u, inode %lu, i_sync_tid %u, error %d, start %ld, end %ld", MAJOR(__entry->dev), MINOR(__entry->dev), __entry->t_tid, __entry->i_ino, __entry->i_sync_tid, __entry->error, __entry->start, __entry->end) ); TRACE_EVENT(ext4_fc_cleanup, TP_PROTO(journal_t *journal, int full, tid_t tid), TP_ARGS(journal, full, tid), TP_STRUCT__entry( __field(dev_t, dev) __field(int, j_fc_off) __field(int, full) __field(tid_t, tid) ), TP_fast_assign( struct super_block *sb = journal->j_private; __entry->dev = sb->s_dev; __entry->j_fc_off = journal->j_fc_off; __entry->full = full; __entry->tid = tid; ), TP_printk("dev %d,%d, j_fc_off %d, full %d, tid %u", MAJOR(__entry->dev), MINOR(__entry->dev), __entry->j_fc_off, __entry->full, __entry->tid) ); TRACE_EVENT(ext4_update_sb, TP_PROTO(struct super_block *sb, ext4_fsblk_t fsblk, unsigned int flags), TP_ARGS(sb, fsblk, flags), TP_STRUCT__entry( __field(dev_t, dev) __field(ext4_fsblk_t, fsblk) __field(unsigned int, flags) ), TP_fast_assign( __entry->dev = sb->s_dev; __entry->fsblk = fsblk; __entry->flags = flags; ), TP_printk("dev %d,%d fsblk %llu flags %u", MAJOR(__entry->dev), MINOR(__entry->dev), __entry->fsblk, __entry->flags) ); #endif /* _TRACE_EXT4_H */ /* This part must be outside protection */ #include <trace/define_trace.h> |
| 13953 13953 13962 14220 948 946 951 944 568 20 572 574 574 352 1490 634 635 4 429 605 58 636 7 6 7 7 7 7 7 7 4 7 261 172 15466 15683 2108 951 1172 636 576 1491 7 7 15779 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 | /* SPDX-License-Identifier: GPL-2.0-or-later */ /* I/O iterator iteration building functions. * * Copyright (C) 2023 Red Hat, Inc. All Rights Reserved. * Written by David Howells (dhowells@redhat.com) */ #ifndef _LINUX_IOV_ITER_H #define _LINUX_IOV_ITER_H #include <linux/uio.h> #include <linux/bvec.h> typedef size_t (*iov_step_f)(void *iter_base, size_t progress, size_t len, void *priv, void *priv2); typedef size_t (*iov_ustep_f)(void __user *iter_base, size_t progress, size_t len, void *priv, void *priv2); /* * Handle ITER_UBUF. */ static __always_inline size_t iterate_ubuf(struct iov_iter *iter, size_t len, void *priv, void *priv2, iov_ustep_f step) { void __user *base = iter->ubuf; size_t progress = 0, remain; remain = step(base + iter->iov_offset, 0, len, priv, priv2); progress = len - remain; iter->iov_offset += progress; iter->count -= progress; return progress; } /* * Handle ITER_IOVEC. */ static __always_inline size_t iterate_iovec(struct iov_iter *iter, size_t len, void *priv, void *priv2, iov_ustep_f step) { const struct iovec *p = iter->__iov; size_t progress = 0, skip = iter->iov_offset; do { size_t remain, consumed; size_t part = min(len, p->iov_len - skip); if (likely(part)) { remain = step(p->iov_base + skip, progress, part, priv, priv2); consumed = part - remain; progress += consumed; skip += consumed; len -= consumed; if (skip < p->iov_len) break; } p++; skip = 0; } while (len); iter->nr_segs -= p - iter->__iov; iter->__iov = p; iter->iov_offset = skip; iter->count -= progress; return progress; } /* * Handle ITER_KVEC. */ static __always_inline size_t iterate_kvec(struct iov_iter *iter, size_t len, void *priv, void *priv2, iov_step_f step) { const struct kvec *p = iter->kvec; size_t progress = 0, skip = iter->iov_offset; do { size_t remain, consumed; size_t part = min(len, p->iov_len - skip); if (likely(part)) { remain = step(p->iov_base + skip, progress, part, priv, priv2); consumed = part - remain; progress += consumed; skip += consumed; len -= consumed; if (skip < p->iov_len) break; } p++; skip = 0; } while (len); iter->nr_segs -= p - iter->kvec; iter->kvec = p; iter->iov_offset = skip; iter->count -= progress; return progress; } /* * Handle ITER_BVEC. */ static __always_inline size_t iterate_bvec(struct iov_iter *iter, size_t len, void *priv, void *priv2, iov_step_f step) { const struct bio_vec *p = iter->bvec; size_t progress = 0, skip = iter->iov_offset; do { size_t remain, consumed; size_t offset = p->bv_offset + skip, part; void *kaddr = kmap_local_page(p->bv_page + offset / PAGE_SIZE); part = min3(len, (size_t)(p->bv_len - skip), (size_t)(PAGE_SIZE - offset % PAGE_SIZE)); remain = step(kaddr + offset % PAGE_SIZE, progress, part, priv, priv2); kunmap_local(kaddr); consumed = part - remain; len -= consumed; progress += consumed; skip += consumed; if (skip >= p->bv_len) { skip = 0; p++; } if (remain) break; } while (len); iter->nr_segs -= p - iter->bvec; iter->bvec = p; iter->iov_offset = skip; iter->count -= progress; return progress; } /* * Handle ITER_XARRAY. */ static __always_inline size_t iterate_xarray(struct iov_iter *iter, size_t len, void *priv, void *priv2, iov_step_f step) { struct folio *folio; size_t progress = 0; loff_t start = iter->xarray_start + iter->iov_offset; pgoff_t index = start / PAGE_SIZE; XA_STATE(xas, iter->xarray, index); rcu_read_lock(); xas_for_each(&xas, folio, ULONG_MAX) { size_t remain, consumed, offset, part, flen; if (xas_retry(&xas, folio)) continue; if (WARN_ON(xa_is_value(folio))) break; if (WARN_ON(folio_test_hugetlb(folio))) break; offset = offset_in_folio(folio, start + progress); flen = min(folio_size(folio) - offset, len); while (flen) { void *base = kmap_local_folio(folio, offset); part = min_t(size_t, flen, PAGE_SIZE - offset_in_page(offset)); remain = step(base, progress, part, priv, priv2); kunmap_local(base); consumed = part - remain; progress += consumed; len -= consumed; if (remain || len == 0) goto out; flen -= consumed; offset += consumed; } } out: rcu_read_unlock(); iter->iov_offset += progress; iter->count -= progress; return progress; } /* * Handle ITER_DISCARD. */ static __always_inline size_t iterate_discard(struct iov_iter *iter, size_t len, void *priv, void *priv2, iov_step_f step) { size_t progress = len; iter->count -= progress; return progress; } /** * iterate_and_advance2 - Iterate over an iterator * @iter: The iterator to iterate over. * @len: The amount to iterate over. * @priv: Data for the step functions. * @priv2: More data for the step functions. * @ustep: Function for UBUF/IOVEC iterators; given __user addresses. * @step: Function for other iterators; given kernel addresses. * * Iterate over the next part of an iterator, up to the specified length. The * buffer is presented in segments, which for kernel iteration are broken up by * physical pages and mapped, with the mapped address being presented. * * Two step functions, @step and @ustep, must be provided, one for handling * mapped kernel addresses and the other is given user addresses which have the * potential to fault since no pinning is performed. * * The step functions are passed the address and length of the segment, @priv, * @priv2 and the amount of data so far iterated over (which can, for example, * be added to @priv to point to the right part of a second buffer). The step * functions should return the amount of the segment they didn't process (ie. 0 * indicates complete processsing). * * This function returns the amount of data processed (ie. 0 means nothing was * processed and the value of @len means processes to completion). */ static __always_inline size_t iterate_and_advance2(struct iov_iter *iter, size_t len, void *priv, void *priv2, iov_ustep_f ustep, iov_step_f step) { if (unlikely(iter->count < len)) len = iter->count; if (unlikely(!len)) return 0; if (likely(iter_is_ubuf(iter))) return iterate_ubuf(iter, len, priv, priv2, ustep); if (likely(iter_is_iovec(iter))) return iterate_iovec(iter, len, priv, priv2, ustep); if (iov_iter_is_bvec(iter)) return iterate_bvec(iter, len, priv, priv2, step); if (iov_iter_is_kvec(iter)) return iterate_kvec(iter, len, priv, priv2, step); if (iov_iter_is_xarray(iter)) return iterate_xarray(iter, len, priv, priv2, step); return iterate_discard(iter, len, priv, priv2, step); } /** * iterate_and_advance - Iterate over an iterator * @iter: The iterator to iterate over. * @len: The amount to iterate over. * @priv: Data for the step functions. * @ustep: Function for UBUF/IOVEC iterators; given __user addresses. * @step: Function for other iterators; given kernel addresses. * * As iterate_and_advance2(), but priv2 is always NULL. */ static __always_inline size_t iterate_and_advance(struct iov_iter *iter, size_t len, void *priv, iov_ustep_f ustep, iov_step_f step) { return iterate_and_advance2(iter, len, priv, NULL, ustep, step); } #endif /* _LINUX_IOV_ITER_H */ |
| 3 3 3 3 3 12 4 12 21 10 10 13 13 21 19 15 15 3 15 15 15 14 12 14 3 3 2 15 15 15 15 14 15 2 28 15 1 27 27 32 32 32 30 30 30 28 29 29 29 29 28 16 29 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 | // SPDX-License-Identifier: GPL-2.0 /* * linux/mm/mincore.c * * Copyright (C) 1994-2006 Linus Torvalds */ /* * The mincore() system call. */ #include <linux/pagemap.h> #include <linux/gfp.h> #include <linux/pagewalk.h> #include <linux/mman.h> #include <linux/syscalls.h> #include <linux/swap.h> #include <linux/swapops.h> #include <linux/shmem_fs.h> #include <linux/hugetlb.h> #include <linux/pgtable.h> #include <linux/uaccess.h> #include "swap.h" static int mincore_hugetlb(pte_t *pte, unsigned long hmask, unsigned long addr, unsigned long end, struct mm_walk *walk) { #ifdef CONFIG_HUGETLB_PAGE unsigned char present; unsigned char *vec = walk->private; /* * Hugepages under user process are always in RAM and never * swapped out, but theoretically it needs to be checked. */ present = pte && !huge_pte_none_mostly(huge_ptep_get(walk->mm, addr, pte)); for (; addr != end; vec++, addr += PAGE_SIZE) *vec = present; walk->private = vec; #else BUG(); #endif return 0; } /* * Later we can get more picky about what "in core" means precisely. * For now, simply check to see if the page is in the page cache, * and is up to date; i.e. that no page-in operation would be required * at this time if an application were to map and access this page. */ static unsigned char mincore_page(struct address_space *mapping, pgoff_t index) { unsigned char present = 0; struct folio *folio; /* * When tmpfs swaps out a page from a file, any process mapping that * file will not get a swp_entry_t in its pte, but rather it is like * any other file mapping (ie. marked !present and faulted in with * tmpfs's .fault). So swapped out tmpfs mappings are tested here. */ folio = filemap_get_incore_folio(mapping, index); if (!IS_ERR(folio)) { present = folio_test_uptodate(folio); folio_put(folio); } return present; } static int __mincore_unmapped_range(unsigned long addr, unsigned long end, struct vm_area_struct *vma, unsigned char *vec) { unsigned long nr = (end - addr) >> PAGE_SHIFT; int i; if (vma->vm_file) { pgoff_t pgoff; pgoff = linear_page_index(vma, addr); for (i = 0; i < nr; i++, pgoff++) vec[i] = mincore_page(vma->vm_file->f_mapping, pgoff); } else { for (i = 0; i < nr; i++) vec[i] = 0; } return nr; } static int mincore_unmapped_range(unsigned long addr, unsigned long end, __always_unused int depth, struct mm_walk *walk) { walk->private += __mincore_unmapped_range(addr, end, walk->vma, walk->private); return 0; } static int mincore_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end, struct mm_walk *walk) { spinlock_t *ptl; struct vm_area_struct *vma = walk->vma; pte_t *ptep; unsigned char *vec = walk->private; int nr = (end - addr) >> PAGE_SHIFT; ptl = pmd_trans_huge_lock(pmd, vma); if (ptl) { memset(vec, 1, nr); spin_unlock(ptl); goto out; } ptep = pte_offset_map_lock(walk->mm, pmd, addr, &ptl); if (!ptep) { walk->action = ACTION_AGAIN; return 0; } for (; addr != end; ptep++, addr += PAGE_SIZE) { pte_t pte = ptep_get(ptep); /* We need to do cache lookup too for pte markers */ if (pte_none_mostly(pte)) __mincore_unmapped_range(addr, addr + PAGE_SIZE, vma, vec); else if (pte_present(pte)) *vec = 1; else { /* pte is a swap entry */ swp_entry_t entry = pte_to_swp_entry(pte); if (non_swap_entry(entry)) { /* * migration or hwpoison entries are always * uptodate */ *vec = 1; } else { #ifdef CONFIG_SWAP *vec = mincore_page(swap_address_space(entry), swap_cache_index(entry)); #else WARN_ON(1); *vec = 1; #endif } } vec++; } pte_unmap_unlock(ptep - 1, ptl); out: walk->private += nr; cond_resched(); return 0; } static inline bool can_do_mincore(struct vm_area_struct *vma) { if (vma_is_anonymous(vma)) return true; if (!vma->vm_file) return false; /* * Reveal pagecache information only for non-anonymous mappings that * correspond to the files the calling process could (if tried) open * for writing; otherwise we'd be including shared non-exclusive * mappings, which opens a side channel. */ return inode_owner_or_capable(&nop_mnt_idmap, file_inode(vma->vm_file)) || file_permission(vma->vm_file, MAY_WRITE) == 0; } static const struct mm_walk_ops mincore_walk_ops = { .pmd_entry = mincore_pte_range, .pte_hole = mincore_unmapped_range, .hugetlb_entry = mincore_hugetlb, .walk_lock = PGWALK_RDLOCK, }; /* * Do a chunk of "sys_mincore()". We've already checked * all the arguments, we hold the mmap semaphore: we should * just return the amount of info we're asked for. */ static long do_mincore(unsigned long addr, unsigned long pages, unsigned char *vec) { struct vm_area_struct *vma; unsigned long end; int err; vma = vma_lookup(current->mm, addr); if (!vma) return -ENOMEM; end = min(vma->vm_end, addr + (pages << PAGE_SHIFT)); if (!can_do_mincore(vma)) { unsigned long pages = DIV_ROUND_UP(end - addr, PAGE_SIZE); memset(vec, 1, pages); return pages; } err = walk_page_range(vma->vm_mm, addr, end, &mincore_walk_ops, vec); if (err < 0) return err; return (end - addr) >> PAGE_SHIFT; } /* * The mincore(2) system call. * * mincore() returns the memory residency status of the pages in the * current process's address space specified by [addr, addr + len). * The status is returned in a vector of bytes. The least significant * bit of each byte is 1 if the referenced page is in memory, otherwise * it is zero. * * Because the status of a page can change after mincore() checks it * but before it returns to the application, the returned vector may * contain stale information. Only locked pages are guaranteed to * remain in memory. * * return values: * zero - success * -EFAULT - vec points to an illegal address * -EINVAL - addr is not a multiple of PAGE_SIZE * -ENOMEM - Addresses in the range [addr, addr + len] are * invalid for the address space of this process, or * specify one or more pages which are not currently * mapped * -EAGAIN - A kernel resource was temporarily unavailable. */ SYSCALL_DEFINE3(mincore, unsigned long, start, size_t, len, unsigned char __user *, vec) { long retval; unsigned long pages; unsigned char *tmp; start = untagged_addr(start); /* Check the start address: needs to be page-aligned.. */ if (start & ~PAGE_MASK) return -EINVAL; /* ..and we need to be passed a valid user-space range */ if (!access_ok((void __user *) start, len)) return -ENOMEM; /* This also avoids any overflows on PAGE_ALIGN */ pages = len >> PAGE_SHIFT; pages += (offset_in_page(len)) != 0; if (!access_ok(vec, pages)) return -EFAULT; tmp = (void *) __get_free_page(GFP_USER); if (!tmp) return -EAGAIN; retval = 0; while (pages) { /* * Do at most PAGE_SIZE entries per iteration, due to * the temporary buffer size. */ mmap_read_lock(current->mm); retval = do_mincore(start, min(pages, PAGE_SIZE), tmp); mmap_read_unlock(current->mm); if (retval <= 0) break; if (copy_to_user(vec, tmp, retval)) { retval = -EFAULT; break; } pages -= retval; vec += retval; start += retval << PAGE_SHIFT; retval = 0; } free_page((unsigned long) tmp); return retval; } |
| 73 73 71 1 35 74 2 73 70 69 70 69 1 71 68 68 94 91 94 87 94 6 42 21 22 33 74 25 74 67 7 67 2 67 18 12 14 18 4 1 1 1 1 1 1 1 22 169 169 169 322 240 238 323 31 31 58 51 58 41 10 8 10 286 7 7 7 6 285 15 15 131 132 119 114 1 86 86 1 239 91 91 91 91 88 87 171 237 15 171 234 4 1 234 98 202 31 236 5 5 240 241 237 239 5 234 238 239 11 1 32 21 11 32 143 11 135 16 16 119 1 118 43 1 75 2 118 72 241 3 334 335 336 64 7 7 5 2 3 7 12 10 12 11 11 12 98 63 58 98 58 58 58 58 44 58 47 58 43 43 43 72 10 1 10 1 10 1 10 10 10 7 7 7 1 2 3 1 10 2 10 2 10 3 10 3 10 1 10 3 2 3 1 7 3 7 1 7 1 7 3 10 2 10 1 1 10 1 9 9 10 1 9 2 10 1 10 1 10 236 236 235 236 203 33 236 138 42 8 13 5 8 20 10 11 16 9 12 11 21 15 16 36 33 3 2 8 20 14 10 22 24 72 72 99 2 95 4 96 6 72 61 21 1 49 117 165 165 90 80 10 8 7 5 5 4 2 1 1 7 4 191 4 1 187 6 3 181 4 1 177 3 1 174 2 2 2 2 10 10 5 198 198 10 198 184 198 53 198 9 198 1 198 26 12 17 191 172 172 102 102 102 172 51 51 1 50 49 8 42 5 42 50 171 50 171 1 1 170 28 170 120 50 170 170 170 1 1 169 41 10 169 169 169 169 169 1 167 140 2 166 166 166 166 166 141 141 14 33 5 33 174 26 74 26 87 26 87 26 13 26 26 26 241 241 241 241 241 241 217 185 217 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080 1081 1082 1083 1084 1085 1086 1087 1088 1089 1090 1091 1092 1093 1094 1095 1096 1097 1098 1099 1100 1101 1102 1103 1104 1105 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125 1126 1127 1128 1129 1130 1131 1132 1133 1134 1135 1136 1137 1138 1139 1140 1141 1142 1143 1144 1145 1146 1147 1148 1149 1150 1151 1152 1153 1154 1155 1156 1157 1158 1159 1160 1161 1162 1163 1164 1165 1166 1167 1168 1169 1170 1171 1172 1173 1174 1175 1176 1177 1178 1179 1180 1181 1182 1183 1184 1185 1186 1187 1188 1189 1190 1191 1192 1193 1194 1195 1196 1197 1198 1199 1200 1201 1202 1203 1204 1205 1206 1207 1208 1209 1210 1211 1212 1213 1214 1215 1216 1217 1218 1219 1220 1221 1222 1223 1224 1225 1226 1227 1228 1229 1230 1231 1232 1233 1234 1235 1236 1237 1238 1239 1240 1241 1242 1243 1244 1245 1246 1247 1248 1249 1250 1251 1252 1253 1254 1255 1256 1257 1258 1259 1260 1261 1262 1263 1264 1265 1266 1267 1268 1269 1270 1271 1272 1273 1274 1275 1276 1277 1278 1279 1280 1281 1282 1283 1284 1285 1286 1287 1288 1289 1290 1291 1292 1293 1294 1295 1296 1297 1298 1299 1300 1301 1302 1303 1304 1305 1306 1307 1308 1309 1310 1311 1312 1313 1314 1315 1316 1317 1318 1319 1320 1321 1322 1323 1324 1325 1326 1327 1328 1329 1330 1331 1332 1333 1334 1335 1336 1337 1338 1339 1340 1341 1342 1343 1344 1345 1346 1347 1348 1349 1350 1351 1352 1353 1354 1355 1356 1357 1358 1359 1360 1361 1362 1363 1364 1365 1366 1367 1368 1369 1370 1371 1372 1373 1374 1375 1376 1377 1378 1379 1380 1381 1382 1383 1384 1385 1386 1387 1388 1389 1390 1391 1392 1393 1394 1395 1396 1397 1398 1399 1400 1401 1402 1403 1404 1405 1406 1407 1408 1409 1410 1411 1412 1413 1414 1415 1416 1417 1418 1419 1420 1421 1422 1423 1424 1425 1426 1427 1428 1429 1430 1431 1432 1433 1434 1435 1436 1437 1438 1439 1440 1441 1442 1443 1444 1445 1446 1447 1448 1449 1450 1451 1452 1453 1454 1455 1456 1457 1458 1459 1460 1461 1462 1463 1464 1465 1466 1467 1468 1469 1470 1471 1472 1473 1474 1475 1476 1477 1478 1479 1480 1481 1482 1483 1484 1485 1486 1487 1488 1489 1490 1491 1492 1493 1494 1495 1496 1497 1498 1499 1500 1501 1502 1503 1504 1505 1506 1507 1508 1509 1510 1511 1512 1513 1514 1515 1516 1517 1518 1519 1520 1521 1522 1523 1524 1525 1526 1527 1528 1529 1530 1531 1532 1533 1534 1535 1536 1537 1538 1539 1540 1541 1542 1543 1544 1545 1546 1547 1548 1549 1550 1551 1552 1553 1554 1555 1556 1557 1558 1559 1560 1561 1562 1563 1564 1565 1566 1567 1568 1569 1570 1571 1572 1573 1574 1575 1576 1577 1578 1579 1580 1581 1582 1583 1584 1585 1586 1587 1588 1589 1590 1591 1592 1593 1594 1595 1596 1597 1598 1599 1600 1601 1602 1603 1604 1605 1606 1607 1608 1609 1610 1611 1612 1613 1614 1615 1616 1617 1618 1619 1620 1621 1622 1623 1624 1625 1626 1627 1628 1629 1630 1631 1632 1633 1634 1635 1636 1637 1638 1639 1640 1641 1642 1643 1644 1645 1646 1647 1648 1649 1650 1651 1652 1653 1654 1655 1656 1657 1658 1659 1660 1661 1662 1663 1664 1665 1666 1667 1668 1669 1670 1671 1672 1673 1674 1675 1676 1677 1678 1679 1680 1681 1682 1683 1684 1685 1686 1687 1688 1689 1690 1691 1692 1693 1694 1695 1696 1697 1698 1699 1700 1701 1702 1703 1704 1705 1706 1707 1708 1709 1710 1711 1712 1713 1714 1715 1716 1717 1718 1719 1720 1721 1722 1723 1724 1725 1726 1727 1728 1729 1730 1731 1732 1733 1734 1735 1736 1737 1738 1739 1740 1741 1742 1743 1744 1745 1746 1747 1748 1749 1750 1751 1752 1753 1754 1755 1756 1757 1758 1759 1760 1761 1762 1763 1764 1765 1766 1767 1768 1769 1770 1771 1772 1773 1774 1775 1776 1777 1778 1779 1780 1781 1782 1783 1784 1785 1786 1787 1788 1789 1790 1791 1792 1793 1794 1795 1796 1797 1798 1799 1800 1801 1802 1803 1804 1805 1806 1807 1808 1809 1810 1811 1812 1813 1814 1815 1816 1817 1818 1819 1820 1821 1822 1823 1824 1825 1826 1827 1828 1829 1830 1831 1832 1833 1834 1835 1836 1837 1838 1839 1840 1841 1842 1843 1844 1845 1846 1847 1848 1849 1850 1851 1852 1853 1854 1855 1856 1857 1858 1859 1860 1861 1862 1863 1864 1865 1866 1867 1868 1869 1870 1871 1872 1873 1874 1875 1876 1877 1878 1879 1880 1881 1882 1883 1884 1885 1886 1887 1888 1889 1890 1891 1892 1893 1894 1895 1896 1897 1898 1899 1900 1901 1902 1903 1904 1905 1906 1907 1908 1909 1910 1911 1912 1913 1914 1915 1916 1917 1918 1919 1920 1921 1922 1923 1924 1925 1926 1927 1928 1929 1930 1931 1932 1933 1934 1935 1936 1937 1938 1939 1940 1941 1942 1943 1944 1945 1946 1947 1948 1949 1950 1951 1952 1953 1954 1955 1956 1957 1958 1959 1960 1961 1962 1963 1964 1965 1966 1967 1968 1969 1970 1971 1972 1973 1974 1975 1976 1977 1978 | // SPDX-License-Identifier: GPL-2.0-only /* * linux/fs/fat/inode.c * * Written 1992,1993 by Werner Almesberger * VFAT extensions by Gordon Chaffee, merged with msdos fs by Henrik Storner * Rewritten for the constant inumbers support by Al Viro * * Fixes: * * Max Cohan: Fixed invalid FSINFO offset when info_sector is 0 */ #include <linux/module.h> #include <linux/pagemap.h> #include <linux/mpage.h> #include <linux/vfs.h> #include <linux/seq_file.h> #include <linux/uio.h> #include <linux/blkdev.h> #include <linux/backing-dev.h> #include <asm/unaligned.h> #include <linux/random.h> #include <linux/iversion.h> #include "fat.h" #ifndef CONFIG_FAT_DEFAULT_IOCHARSET /* if user don't select VFAT, this is undefined. */ #define CONFIG_FAT_DEFAULT_IOCHARSET "" #endif #define KB_IN_SECTORS 2 /* DOS dates from 1980/1/1 through 2107/12/31 */ #define FAT_DATE_MIN (0<<9 | 1<<5 | 1) #define FAT_DATE_MAX (127<<9 | 12<<5 | 31) #define FAT_TIME_MAX (23<<11 | 59<<5 | 29) /* * A deserialized copy of the on-disk structure laid out in struct * fat_boot_sector. */ struct fat_bios_param_block { u16 fat_sector_size; u8 fat_sec_per_clus; u16 fat_reserved; u8 fat_fats; u16 fat_dir_entries; u16 fat_sectors; u16 fat_fat_length; u32 fat_total_sect; u8 fat16_state; u32 fat16_vol_id; u32 fat32_length; u32 fat32_root_cluster; u16 fat32_info_sector; u8 fat32_state; u32 fat32_vol_id; }; static int fat_default_codepage = CONFIG_FAT_DEFAULT_CODEPAGE; static char fat_default_iocharset[] = CONFIG_FAT_DEFAULT_IOCHARSET; static struct fat_floppy_defaults { unsigned nr_sectors; unsigned sec_per_clus; unsigned dir_entries; unsigned media; unsigned fat_length; } floppy_defaults[] = { { .nr_sectors = 160 * KB_IN_SECTORS, .sec_per_clus = 1, .dir_entries = 64, .media = 0xFE, .fat_length = 1, }, { .nr_sectors = 180 * KB_IN_SECTORS, .sec_per_clus = 1, .dir_entries = 64, .media = 0xFC, .fat_length = 2, }, { .nr_sectors = 320 * KB_IN_SECTORS, .sec_per_clus = 2, .dir_entries = 112, .media = 0xFF, .fat_length = 1, }, { .nr_sectors = 360 * KB_IN_SECTORS, .sec_per_clus = 2, .dir_entries = 112, .media = 0xFD, .fat_length = 2, }, }; int fat_add_cluster(struct inode *inode) { int err, cluster; err = fat_alloc_clusters(inode, &cluster, 1); if (err) return err; /* FIXME: this cluster should be added after data of this * cluster is writed */ err = fat_chain_add(inode, cluster, 1); if (err) fat_free_clusters(inode, cluster); return err; } static inline int __fat_get_block(struct inode *inode, sector_t iblock, unsigned long *max_blocks, struct buffer_head *bh_result, int create) { struct super_block *sb = inode->i_sb; struct msdos_sb_info *sbi = MSDOS_SB(sb); unsigned long mapped_blocks; sector_t phys, last_block; int err, offset; err = fat_bmap(inode, iblock, &phys, &mapped_blocks, create, false); if (err) return err; if (phys) { map_bh(bh_result, sb, phys); *max_blocks = min(mapped_blocks, *max_blocks); return 0; } if (!create) return 0; if (iblock != MSDOS_I(inode)->mmu_private >> sb->s_blocksize_bits) { fat_fs_error(sb, "corrupted file size (i_pos %lld, %lld)", MSDOS_I(inode)->i_pos, MSDOS_I(inode)->mmu_private); return -EIO; } last_block = inode->i_blocks >> (sb->s_blocksize_bits - 9); offset = (unsigned long)iblock & (sbi->sec_per_clus - 1); /* * allocate a cluster according to the following. * 1) no more available blocks * 2) not part of fallocate region */ if (!offset && !(iblock < last_block)) { /* TODO: multiple cluster allocation would be desirable. */ err = fat_add_cluster(inode); if (err) return err; } /* available blocks on this cluster */ mapped_blocks = sbi->sec_per_clus - offset; *max_blocks = min(mapped_blocks, *max_blocks); MSDOS_I(inode)->mmu_private += *max_blocks << sb->s_blocksize_bits; err = fat_bmap(inode, iblock, &phys, &mapped_blocks, create, false); if (err) return err; if (!phys) { fat_fs_error(sb, "invalid FAT chain (i_pos %lld, last_block %llu)", MSDOS_I(inode)->i_pos, (unsigned long long)last_block); return -EIO; } BUG_ON(*max_blocks != mapped_blocks); set_buffer_new(bh_result); map_bh(bh_result, sb, phys); return 0; } static int fat_get_block(struct inode *inode, sector_t iblock, struct buffer_head *bh_result, int create) { struct super_block *sb = inode->i_sb; unsigned long max_blocks = bh_result->b_size >> inode->i_blkbits; int err; err = __fat_get_block(inode, iblock, &max_blocks, bh_result, create); if (err) return err; bh_result->b_size = max_blocks << sb->s_blocksize_bits; return 0; } static int fat_writepages(struct address_space *mapping, struct writeback_control *wbc) { return mpage_writepages(mapping, wbc, fat_get_block); } static int fat_read_folio(struct file *file, struct folio *folio) { return mpage_read_folio(folio, fat_get_block); } static void fat_readahead(struct readahead_control *rac) { mpage_readahead(rac, fat_get_block); } static void fat_write_failed(struct address_space *mapping, loff_t to) { struct inode *inode = mapping->host; if (to > inode->i_size) { truncate_pagecache(inode, inode->i_size); fat_truncate_blocks(inode, inode->i_size); } } static int fat_write_begin(struct file *file, struct address_space *mapping, loff_t pos, unsigned len, struct page **pagep, void **fsdata) { int err; *pagep = NULL; err = cont_write_begin(file, mapping, pos, len, pagep, fsdata, fat_get_block, &MSDOS_I(mapping->host)->mmu_private); if (err < 0) fat_write_failed(mapping, pos + len); return err; } static int fat_write_end(struct file *file, struct address_space *mapping, loff_t pos, unsigned len, unsigned copied, struct page *pagep, void *fsdata) { struct inode *inode = mapping->host; int err; err = generic_write_end(file, mapping, pos, len, copied, pagep, fsdata); if (err < len) fat_write_failed(mapping, pos + len); if (!(err < 0) && !(MSDOS_I(inode)->i_attrs & ATTR_ARCH)) { fat_truncate_time(inode, NULL, S_CTIME|S_MTIME); MSDOS_I(inode)->i_attrs |= ATTR_ARCH; mark_inode_dirty(inode); } return err; } static ssize_t fat_direct_IO(struct kiocb *iocb, struct iov_iter *iter) { struct file *file = iocb->ki_filp; struct address_space *mapping = file->f_mapping; struct inode *inode = mapping->host; size_t count = iov_iter_count(iter); loff_t offset = iocb->ki_pos; ssize_t ret; if (iov_iter_rw(iter) == WRITE) { /* * FIXME: blockdev_direct_IO() doesn't use ->write_begin(), * so we need to update the ->mmu_private to block boundary. * * But we must fill the remaining area or hole by nul for * updating ->mmu_private. * * Return 0, and fallback to normal buffered write. */ loff_t size = offset + count; if (MSDOS_I(inode)->mmu_private < size) return 0; } /* * FAT need to use the DIO_LOCKING for avoiding the race * condition of fat_get_block() and ->truncate(). */ ret = blockdev_direct_IO(iocb, inode, iter, fat_get_block); if (ret < 0 && iov_iter_rw(iter) == WRITE) fat_write_failed(mapping, offset + count); return ret; } static int fat_get_block_bmap(struct inode *inode, sector_t iblock, struct buffer_head *bh_result, int create) { struct super_block *sb = inode->i_sb; unsigned long max_blocks = bh_result->b_size >> inode->i_blkbits; int err; sector_t bmap; unsigned long mapped_blocks; BUG_ON(create != 0); err = fat_bmap(inode, iblock, &bmap, &mapped_blocks, create, true); if (err) return err; if (bmap) { map_bh(bh_result, sb, bmap); max_blocks = min(mapped_blocks, max_blocks); } bh_result->b_size = max_blocks << sb->s_blocksize_bits; return 0; } static sector_t _fat_bmap(struct address_space *mapping, sector_t block) { sector_t blocknr; /* fat_get_cluster() assumes the requested blocknr isn't truncated. */ down_read(&MSDOS_I(mapping->host)->truncate_lock); blocknr = generic_block_bmap(mapping, block, fat_get_block_bmap); up_read(&MSDOS_I(mapping->host)->truncate_lock); return blocknr; } /* * fat_block_truncate_page() zeroes out a mapping from file offset `from' * up to the end of the block which corresponds to `from'. * This is required during truncate to physically zeroout the tail end * of that block so it doesn't yield old data if the file is later grown. * Also, avoid causing failure from fsx for cases of "data past EOF" */ int fat_block_truncate_page(struct inode *inode, loff_t from) { return block_truncate_page(inode->i_mapping, from, fat_get_block); } static const struct address_space_operations fat_aops = { .dirty_folio = block_dirty_folio, .invalidate_folio = block_invalidate_folio, .read_folio = fat_read_folio, .readahead = fat_readahead, .writepages = fat_writepages, .write_begin = fat_write_begin, .write_end = fat_write_end, .direct_IO = fat_direct_IO, .bmap = _fat_bmap, .migrate_folio = buffer_migrate_folio, }; /* * New FAT inode stuff. We do the following: * a) i_ino is constant and has nothing with on-disk location. * b) FAT manages its own cache of directory entries. * c) *This* cache is indexed by on-disk location. * d) inode has an associated directory entry, all right, but * it may be unhashed. * e) currently entries are stored within struct inode. That should * change. * f) we deal with races in the following way: * 1. readdir() and lookup() do FAT-dir-cache lookup. * 2. rename() unhashes the F-d-c entry and rehashes it in * a new place. * 3. unlink() and rmdir() unhash F-d-c entry. * 4. fat_write_inode() checks whether the thing is unhashed. * If it is we silently return. If it isn't we do bread(), * check if the location is still valid and retry if it * isn't. Otherwise we do changes. * 5. Spinlock is used to protect hash/unhash/location check/lookup * 6. fat_evict_inode() unhashes the F-d-c entry. * 7. lookup() and readdir() do igrab() if they find a F-d-c entry * and consider negative result as cache miss. */ static void fat_hash_init(struct super_block *sb) { struct msdos_sb_info *sbi = MSDOS_SB(sb); int i; spin_lock_init(&sbi->inode_hash_lock); for (i = 0; i < FAT_HASH_SIZE; i++) INIT_HLIST_HEAD(&sbi->inode_hashtable[i]); } static inline unsigned long fat_hash(loff_t i_pos) { return hash_32(i_pos, FAT_HASH_BITS); } static void dir_hash_init(struct super_block *sb) { struct msdos_sb_info *sbi = MSDOS_SB(sb); int i; spin_lock_init(&sbi->dir_hash_lock); for (i = 0; i < FAT_HASH_SIZE; i++) INIT_HLIST_HEAD(&sbi->dir_hashtable[i]); } void fat_attach(struct inode *inode, loff_t i_pos) { struct msdos_sb_info *sbi = MSDOS_SB(inode->i_sb); if (inode->i_ino != MSDOS_ROOT_INO) { struct hlist_head *head = sbi->inode_hashtable + fat_hash(i_pos); spin_lock(&sbi->inode_hash_lock); MSDOS_I(inode)->i_pos = i_pos; hlist_add_head(&MSDOS_I(inode)->i_fat_hash, head); spin_unlock(&sbi->inode_hash_lock); } /* If NFS support is enabled, cache the mapping of start cluster * to directory inode. This is used during reconnection of * dentries to the filesystem root. */ if (S_ISDIR(inode->i_mode) && sbi->options.nfs) { struct hlist_head *d_head = sbi->dir_hashtable; d_head += fat_dir_hash(MSDOS_I(inode)->i_logstart); spin_lock(&sbi->dir_hash_lock); hlist_add_head(&MSDOS_I(inode)->i_dir_hash, d_head); spin_unlock(&sbi->dir_hash_lock); } } EXPORT_SYMBOL_GPL(fat_attach); void fat_detach(struct inode *inode) { struct msdos_sb_info *sbi = MSDOS_SB(inode->i_sb); spin_lock(&sbi->inode_hash_lock); MSDOS_I(inode)->i_pos = 0; hlist_del_init(&MSDOS_I(inode)->i_fat_hash); spin_unlock(&sbi->inode_hash_lock); if (S_ISDIR(inode->i_mode) && sbi->options.nfs) { spin_lock(&sbi->dir_hash_lock); hlist_del_init(&MSDOS_I(inode)->i_dir_hash); spin_unlock(&sbi->dir_hash_lock); } } EXPORT_SYMBOL_GPL(fat_detach); struct inode *fat_iget(struct super_block *sb, loff_t i_pos) { struct msdos_sb_info *sbi = MSDOS_SB(sb); struct hlist_head *head = sbi->inode_hashtable + fat_hash(i_pos); struct msdos_inode_info *i; struct inode *inode = NULL; spin_lock(&sbi->inode_hash_lock); hlist_for_each_entry(i, head, i_fat_hash) { BUG_ON(i->vfs_inode.i_sb != sb); if (i->i_pos != i_pos) continue; inode = igrab(&i->vfs_inode); if (inode) break; } spin_unlock(&sbi->inode_hash_lock); return inode; } static int is_exec(unsigned char *extension) { unsigned char exe_extensions[] = "EXECOMBAT", *walk; for (walk = exe_extensions; *walk; walk += 3) if (!strncmp(extension, walk, 3)) return 1; return 0; } static int fat_calc_dir_size(struct inode *inode) { struct msdos_sb_info *sbi = MSDOS_SB(inode->i_sb); int ret, fclus, dclus; inode->i_size = 0; if (MSDOS_I(inode)->i_start == 0) return 0; ret = fat_get_cluster(inode, FAT_ENT_EOF, &fclus, &dclus); if (ret < 0) return ret; inode->i_size = (fclus + 1) << sbi->cluster_bits; return 0; } static int fat_validate_dir(struct inode *dir) { struct super_block *sb = dir->i_sb; if (dir->i_nlink < 2) { /* Directory should have "."/".." entries at least. */ fat_fs_error(sb, "corrupted directory (invalid entries)"); return -EIO; } if (MSDOS_I(dir)->i_start == 0 || MSDOS_I(dir)->i_start == MSDOS_SB(sb)->root_cluster) { /* Directory should point valid cluster. */ fat_fs_error(sb, "corrupted directory (invalid i_start)"); return -EIO; } return 0; } /* doesn't deal with root inode */ int fat_fill_inode(struct inode *inode, struct msdos_dir_entry *de) { struct msdos_sb_info *sbi = MSDOS_SB(inode->i_sb); struct timespec64 mtime; int error; MSDOS_I(inode)->i_pos = 0; inode->i_uid = sbi->options.fs_uid; inode->i_gid = sbi->options.fs_gid; inode_inc_iversion(inode); inode->i_generation = get_random_u32(); if ((de->attr & ATTR_DIR) && !IS_FREE(de->name)) { inode->i_generation &= ~1; inode->i_mode = fat_make_mode(sbi, de->attr, S_IRWXUGO); inode->i_op = sbi->dir_ops; inode->i_fop = &fat_dir_operations; MSDOS_I(inode)->i_start = fat_get_start(sbi, de); MSDOS_I(inode)->i_logstart = MSDOS_I(inode)->i_start; error = fat_calc_dir_size(inode); if (error < 0) return error; MSDOS_I(inode)->mmu_private = inode->i_size; set_nlink(inode, fat_subdirs(inode)); error = fat_validate_dir(inode); if (error < 0) return error; } else { /* not a directory */ inode->i_generation |= 1; inode->i_mode = fat_make_mode(sbi, de->attr, ((sbi->options.showexec && !is_exec(de->name + 8)) ? S_IRUGO|S_IWUGO : S_IRWXUGO)); MSDOS_I(inode)->i_start = fat_get_start(sbi, de); MSDOS_I(inode)->i_logstart = MSDOS_I(inode)->i_start; inode->i_size = le32_to_cpu(de->size); inode->i_op = &fat_file_inode_operations; inode->i_fop = &fat_file_operations; inode->i_mapping->a_ops = &fat_aops; MSDOS_I(inode)->mmu_private = inode->i_size; } if (de->attr & ATTR_SYS) { if (sbi->options.sys_immutable) inode->i_flags |= S_IMMUTABLE; } fat_save_attrs(inode, de->attr); inode->i_blocks = ((inode->i_size + (sbi->cluster_size - 1)) & ~((loff_t)sbi->cluster_size - 1)) >> 9; fat_time_fat2unix(sbi, &mtime, de->time, de->date, 0); inode_set_mtime_to_ts(inode, mtime); inode_set_ctime_to_ts(inode, mtime); if (sbi->options.isvfat) { struct timespec64 atime; fat_time_fat2unix(sbi, &atime, 0, de->adate, 0); inode_set_atime_to_ts(inode, atime); fat_time_fat2unix(sbi, &MSDOS_I(inode)->i_crtime, de->ctime, de->cdate, de->ctime_cs); } else inode_set_atime_to_ts(inode, fat_truncate_atime(sbi, &mtime)); return 0; } static inline void fat_lock_build_inode(struct msdos_sb_info *sbi) { if (sbi->options.nfs == FAT_NFS_NOSTALE_RO) mutex_lock(&sbi->nfs_build_inode_lock); } static inline void fat_unlock_build_inode(struct msdos_sb_info *sbi) { if (sbi->options.nfs == FAT_NFS_NOSTALE_RO) mutex_unlock(&sbi->nfs_build_inode_lock); } struct inode *fat_build_inode(struct super_block *sb, struct msdos_dir_entry *de, loff_t i_pos) { struct inode *inode; int err; fat_lock_build_inode(MSDOS_SB(sb)); inode = fat_iget(sb, i_pos); if (inode) goto out; inode = new_inode(sb); if (!inode) { inode = ERR_PTR(-ENOMEM); goto out; } inode->i_ino = iunique(sb, MSDOS_ROOT_INO); inode_set_iversion(inode, 1); err = fat_fill_inode(inode, de); if (err) { iput(inode); inode = ERR_PTR(err); goto out; } fat_attach(inode, i_pos); insert_inode_hash(inode); out: fat_unlock_build_inode(MSDOS_SB(sb)); return inode; } EXPORT_SYMBOL_GPL(fat_build_inode); static int __fat_write_inode(struct inode *inode, int wait); static void fat_free_eofblocks(struct inode *inode) { /* Release unwritten fallocated blocks on inode eviction. */ if ((inode->i_blocks << 9) > round_up(MSDOS_I(inode)->mmu_private, MSDOS_SB(inode->i_sb)->cluster_size)) { int err; fat_truncate_blocks(inode, MSDOS_I(inode)->mmu_private); /* Fallocate results in updating the i_start/iogstart * for the zero byte file. So, make it return to * original state during evict and commit it to avoid * any corruption on the next access to the cluster * chain for the file. */ err = __fat_write_inode(inode, inode_needs_sync(inode)); if (err) { fat_msg(inode->i_sb, KERN_WARNING, "Failed to " "update on disk inode for unused " "fallocated blocks, inode could be " "corrupted. Please run fsck"); } } } static void fat_evict_inode(struct inode *inode) { truncate_inode_pages_final(&inode->i_data); if (!inode->i_nlink) { inode->i_size = 0; fat_truncate_blocks(inode, 0); } else fat_free_eofblocks(inode); invalidate_inode_buffers(inode); clear_inode(inode); fat_cache_inval_inode(inode); fat_detach(inode); } static void fat_set_state(struct super_block *sb, unsigned int set, unsigned int force) { struct buffer_head *bh; struct fat_boot_sector *b; struct msdos_sb_info *sbi = MSDOS_SB(sb); /* do not change any thing if mounted read only */ if (sb_rdonly(sb) && !force) return; /* do not change state if fs was dirty */ if (sbi->dirty) { /* warn only on set (mount). */ if (set) fat_msg(sb, KERN_WARNING, "Volume was not properly " "unmounted. Some data may be corrupt. " "Please run fsck."); return; } bh = sb_bread(sb, 0); if (bh == NULL) { fat_msg(sb, KERN_ERR, "unable to read boot sector " "to mark fs as dirty"); return; } b = (struct fat_boot_sector *) bh->b_data; if (is_fat32(sbi)) { if (set) b->fat32.state |= FAT_STATE_DIRTY; else b->fat32.state &= ~FAT_STATE_DIRTY; } else /* fat 16 and 12 */ { if (set) b->fat16.state |= FAT_STATE_DIRTY; else b->fat16.state &= ~FAT_STATE_DIRTY; } mark_buffer_dirty(bh); sync_dirty_buffer(bh); brelse(bh); } static void fat_reset_iocharset(struct fat_mount_options *opts) { if (opts->iocharset != fat_default_iocharset) { /* Note: opts->iocharset can be NULL here */ kfree(opts->iocharset); opts->iocharset = fat_default_iocharset; } } static void delayed_free(struct rcu_head *p) { struct msdos_sb_info *sbi = container_of(p, struct msdos_sb_info, rcu); unload_nls(sbi->nls_disk); unload_nls(sbi->nls_io); fat_reset_iocharset(&sbi->options); kfree(sbi); } static void fat_put_super(struct super_block *sb) { struct msdos_sb_info *sbi = MSDOS_SB(sb); fat_set_state(sb, 0, 0); iput(sbi->fsinfo_inode); iput(sbi->fat_inode); call_rcu(&sbi->rcu, delayed_free); } static struct kmem_cache *fat_inode_cachep; static struct inode *fat_alloc_inode(struct super_block *sb) { struct msdos_inode_info *ei; ei = alloc_inode_sb(sb, fat_inode_cachep, GFP_NOFS); if (!ei) return NULL; init_rwsem(&ei->truncate_lock); /* Zeroing to allow iput() even if partial initialized inode. */ ei->mmu_private = 0; ei->i_start = 0; ei->i_logstart = 0; ei->i_attrs = 0; ei->i_pos = 0; ei->i_crtime.tv_sec = 0; ei->i_crtime.tv_nsec = 0; return &ei->vfs_inode; } static void fat_free_inode(struct inode *inode) { kmem_cache_free(fat_inode_cachep, MSDOS_I(inode)); } static void init_once(void *foo) { struct msdos_inode_info *ei = (struct msdos_inode_info *)foo; spin_lock_init(&ei->cache_lru_lock); ei->nr_caches = 0; ei->cache_valid_id = FAT_CACHE_VALID + 1; INIT_LIST_HEAD(&ei->cache_lru); INIT_HLIST_NODE(&ei->i_fat_hash); INIT_HLIST_NODE(&ei->i_dir_hash); inode_init_once(&ei->vfs_inode); } static int __init fat_init_inodecache(void) { fat_inode_cachep = kmem_cache_create("fat_inode_cache", sizeof(struct msdos_inode_info), 0, (SLAB_RECLAIM_ACCOUNT| SLAB_ACCOUNT), init_once); if (fat_inode_cachep == NULL) return -ENOMEM; return 0; } static void __exit fat_destroy_inodecache(void) { /* * Make sure all delayed rcu free inodes are flushed before we * destroy cache. */ rcu_barrier(); kmem_cache_destroy(fat_inode_cachep); } int fat_reconfigure(struct fs_context *fc) { bool new_rdonly; struct super_block *sb = fc->root->d_sb; struct msdos_sb_info *sbi = MSDOS_SB(sb); fc->sb_flags |= SB_NODIRATIME | (sbi->options.isvfat ? 0 : SB_NOATIME); sync_filesystem(sb); /* make sure we update state on remount. */ new_rdonly = fc->sb_flags & SB_RDONLY; if (new_rdonly != sb_rdonly(sb)) { if (new_rdonly) fat_set_state(sb, 0, 0); else fat_set_state(sb, 1, 1); } return 0; } EXPORT_SYMBOL_GPL(fat_reconfigure); static int fat_statfs(struct dentry *dentry, struct kstatfs *buf) { struct super_block *sb = dentry->d_sb; struct msdos_sb_info *sbi = MSDOS_SB(sb); u64 id = huge_encode_dev(sb->s_bdev->bd_dev); /* If the count of free cluster is still unknown, counts it here. */ if (sbi->free_clusters == -1 || !sbi->free_clus_valid) { int err = fat_count_free_clusters(dentry->d_sb); if (err) return err; } buf->f_type = dentry->d_sb->s_magic; buf->f_bsize = sbi->cluster_size; buf->f_blocks = sbi->max_cluster - FAT_START_ENT; buf->f_bfree = sbi->free_clusters; buf->f_bavail = sbi->free_clusters; buf->f_fsid = u64_to_fsid(id); buf->f_namelen = (sbi->options.isvfat ? FAT_LFN_LEN : 12) * NLS_MAX_CHARSET_SIZE; return 0; } static int __fat_write_inode(struct inode *inode, int wait) { struct super_block *sb = inode->i_sb; struct msdos_sb_info *sbi = MSDOS_SB(sb); struct buffer_head *bh; struct msdos_dir_entry *raw_entry; struct timespec64 mtime; loff_t i_pos; sector_t blocknr; int err, offset; if (inode->i_ino == MSDOS_ROOT_INO) return 0; retry: i_pos = fat_i_pos_read(sbi, inode); if (!i_pos) return 0; fat_get_blknr_offset(sbi, i_pos, &blocknr, &offset); bh = sb_bread(sb, blocknr); if (!bh) { fat_msg(sb, KERN_ERR, "unable to read inode block " "for updating (i_pos %lld)", i_pos); return -EIO; } spin_lock(&sbi->inode_hash_lock); if (i_pos != MSDOS_I(inode)->i_pos) { spin_unlock(&sbi->inode_hash_lock); brelse(bh); goto retry; } raw_entry = &((struct msdos_dir_entry *) (bh->b_data))[offset]; if (S_ISDIR(inode->i_mode)) raw_entry->size = 0; else raw_entry->size = cpu_to_le32(inode->i_size); raw_entry->attr = fat_make_attrs(inode); fat_set_start(raw_entry, MSDOS_I(inode)->i_logstart); mtime = inode_get_mtime(inode); fat_time_unix2fat(sbi, &mtime, &raw_entry->time, &raw_entry->date, NULL); if (sbi->options.isvfat) { struct timespec64 ts = inode_get_atime(inode); __le16 atime; fat_time_unix2fat(sbi, &ts, &atime, &raw_entry->adate, NULL); fat_time_unix2fat(sbi, &MSDOS_I(inode)->i_crtime, &raw_entry->ctime, &raw_entry->cdate, &raw_entry->ctime_cs); } spin_unlock(&sbi->inode_hash_lock); mark_buffer_dirty(bh); err = 0; if (wait) err = sync_dirty_buffer(bh); brelse(bh); return err; } static int fat_write_inode(struct inode *inode, struct writeback_control *wbc) { int err; if (inode->i_ino == MSDOS_FSINFO_INO) { struct super_block *sb = inode->i_sb; mutex_lock(&MSDOS_SB(sb)->s_lock); err = fat_clusters_flush(sb); mutex_unlock(&MSDOS_SB(sb)->s_lock); } else err = __fat_write_inode(inode, wbc->sync_mode == WB_SYNC_ALL); return err; } int fat_sync_inode(struct inode *inode) { return __fat_write_inode(inode, 1); } EXPORT_SYMBOL_GPL(fat_sync_inode); static int fat_show_options(struct seq_file *m, struct dentry *root); static const struct super_operations fat_sops = { .alloc_inode = fat_alloc_inode, .free_inode = fat_free_inode, .write_inode = fat_write_inode, .evict_inode = fat_evict_inode, .put_super = fat_put_super, .statfs = fat_statfs, .show_options = fat_show_options, }; static int fat_show_options(struct seq_file *m, struct dentry *root) { struct msdos_sb_info *sbi = MSDOS_SB(root->d_sb); struct fat_mount_options *opts = &sbi->options; int isvfat = opts->isvfat; if (!uid_eq(opts->fs_uid, GLOBAL_ROOT_UID)) seq_printf(m, ",uid=%u", from_kuid_munged(&init_user_ns, opts->fs_uid)); |