| 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 | /* SPDX-License-Identifier: GPL-2.0 */ #ifndef _LINUX_PROJID_H #define _LINUX_PROJID_H /* * A set of types for the internal kernel types representing project ids. * * The types defined in this header allow distinguishing which project ids in * the kernel are values used by userspace and which project id values are * the internal kernel values. With the addition of user namespaces the values * can be different. Using the type system makes it possible for the compiler * to detect when we overlook these differences. * */ #include <linux/types.h> struct user_namespace; extern struct user_namespace init_user_ns; typedef __kernel_uid32_t projid_t; typedef struct { projid_t val; } kprojid_t; static inline projid_t __kprojid_val(kprojid_t projid) { return projid.val; } #define KPROJIDT_INIT(value) (kprojid_t){ value } #define INVALID_PROJID KPROJIDT_INIT(-1) #define OVERFLOW_PROJID 65534 static inline bool projid_eq(kprojid_t left, kprojid_t right) { return __kprojid_val(left) == __kprojid_val(right); } static inline bool projid_lt(kprojid_t left, kprojid_t right) { return __kprojid_val(left) < __kprojid_val(right); } static inline bool projid_valid(kprojid_t projid) { return !projid_eq(projid, INVALID_PROJID); } #ifdef CONFIG_USER_NS extern kprojid_t make_kprojid(struct user_namespace *from, projid_t projid); extern projid_t from_kprojid(struct user_namespace *to, kprojid_t projid); extern projid_t from_kprojid_munged(struct user_namespace *to, kprojid_t projid); static inline bool kprojid_has_mapping(struct user_namespace *ns, kprojid_t projid) { return from_kprojid(ns, projid) != (projid_t)-1; } #else static inline kprojid_t make_kprojid(struct user_namespace *from, projid_t projid) { return KPROJIDT_INIT(projid); } static inline projid_t from_kprojid(struct user_namespace *to, kprojid_t kprojid) { return __kprojid_val(kprojid); } static inline projid_t from_kprojid_munged(struct user_namespace *to, kprojid_t kprojid) { projid_t projid = from_kprojid(to, kprojid); if (projid == (projid_t)-1) projid = OVERFLOW_PROJID; return projid; } static inline bool kprojid_has_mapping(struct user_namespace *ns, kprojid_t projid) { return true; } #endif /* CONFIG_USER_NS */ #endif /* _LINUX_PROJID_H */ |
| 2 2 2 2 52 48 1 1 1 1 3 4 6 3 4 5 2 2 5 2 2 1 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 | // SPDX-License-Identifier: GPL-2.0+ /* * Support for dynamic clock devices * * Copyright (C) 2010 OMICRON electronics GmbH */ #include <linux/device.h> #include <linux/export.h> #include <linux/file.h> #include <linux/posix-clock.h> #include <linux/slab.h> #include <linux/syscalls.h> #include <linux/uaccess.h> #include "posix-timers.h" /* * Returns NULL if the posix_clock instance attached to 'fp' is old and stale. */ static struct posix_clock *get_posix_clock(struct file *fp) { struct posix_clock_context *pccontext = fp->private_data; struct posix_clock *clk = pccontext->clk; down_read(&clk->rwsem); if (!clk->zombie) return clk; up_read(&clk->rwsem); return NULL; } static void put_posix_clock(struct posix_clock *clk) { up_read(&clk->rwsem); } static ssize_t posix_clock_read(struct file *fp, char __user *buf, size_t count, loff_t *ppos) { struct posix_clock_context *pccontext = fp->private_data; struct posix_clock *clk = get_posix_clock(fp); int err = -EINVAL; if (!clk) return -ENODEV; if (clk->ops.read) err = clk->ops.read(pccontext, fp->f_flags, buf, count); put_posix_clock(clk); return err; } static __poll_t posix_clock_poll(struct file *fp, poll_table *wait) { struct posix_clock_context *pccontext = fp->private_data; struct posix_clock *clk = get_posix_clock(fp); __poll_t result = 0; if (!clk) return EPOLLERR; if (clk->ops.poll) result = clk->ops.poll(pccontext, fp, wait); put_posix_clock(clk); return result; } static long posix_clock_ioctl(struct file *fp, unsigned int cmd, unsigned long arg) { struct posix_clock_context *pccontext = fp->private_data; struct posix_clock *clk = get_posix_clock(fp); int err = -ENOTTY; if (!clk) return -ENODEV; if (clk->ops.ioctl) err = clk->ops.ioctl(pccontext, cmd, arg); put_posix_clock(clk); return err; } static int posix_clock_open(struct inode *inode, struct file *fp) { int err; struct posix_clock *clk = container_of(inode->i_cdev, struct posix_clock, cdev); struct posix_clock_context *pccontext; down_read(&clk->rwsem); if (clk->zombie) { err = -ENODEV; goto out; } pccontext = kzalloc(sizeof(*pccontext), GFP_KERNEL); if (!pccontext) { err = -ENOMEM; goto out; } pccontext->clk = clk; pccontext->fp = fp; if (clk->ops.open) { err = clk->ops.open(pccontext, fp->f_mode); if (err) { kfree(pccontext); goto out; } } fp->private_data = pccontext; get_device(clk->dev); err = 0; out: up_read(&clk->rwsem); return err; } static int posix_clock_release(struct inode *inode, struct file *fp) { struct posix_clock_context *pccontext = fp->private_data; struct posix_clock *clk; int err = 0; if (!pccontext) return -ENODEV; clk = pccontext->clk; if (clk->ops.release) err = clk->ops.release(pccontext); put_device(clk->dev); kfree(pccontext); fp->private_data = NULL; return err; } static const struct file_operations posix_clock_file_operations = { .owner = THIS_MODULE, .read = posix_clock_read, .poll = posix_clock_poll, .unlocked_ioctl = posix_clock_ioctl, .compat_ioctl = posix_clock_ioctl, .open = posix_clock_open, .release = posix_clock_release, }; int posix_clock_register(struct posix_clock *clk, struct device *dev) { int err; init_rwsem(&clk->rwsem); cdev_init(&clk->cdev, &posix_clock_file_operations); err = cdev_device_add(&clk->cdev, dev); if (err) { pr_err("%s unable to add device %d:%d\n", dev_name(dev), MAJOR(dev->devt), MINOR(dev->devt)); return err; } clk->cdev.owner = clk->ops.owner; clk->dev = dev; return 0; } EXPORT_SYMBOL_GPL(posix_clock_register); void posix_clock_unregister(struct posix_clock *clk) { cdev_device_del(&clk->cdev, clk->dev); down_write(&clk->rwsem); clk->zombie = true; up_write(&clk->rwsem); put_device(clk->dev); } EXPORT_SYMBOL_GPL(posix_clock_unregister); struct posix_clock_desc { struct file *fp; struct posix_clock *clk; }; static int get_clock_desc(const clockid_t id, struct posix_clock_desc *cd) { struct file *fp = fget(clockid_to_fd(id)); int err = -EINVAL; if (!fp) return err; if (fp->f_op->open != posix_clock_open || !fp->private_data) goto out; cd->fp = fp; cd->clk = get_posix_clock(fp); err = cd->clk ? 0 : -ENODEV; out: if (err) fput(fp); return err; } static void put_clock_desc(struct posix_clock_desc *cd) { put_posix_clock(cd->clk); fput(cd->fp); } static int pc_clock_adjtime(clockid_t id, struct __kernel_timex *tx) { struct posix_clock_desc cd; int err; err = get_clock_desc(id, &cd); if (err) return err; if (tx->modes && (cd.fp->f_mode & FMODE_WRITE) == 0) { err = -EACCES; goto out; } if (cd.clk->ops.clock_adjtime) err = cd.clk->ops.clock_adjtime(cd.clk, tx); else err = -EOPNOTSUPP; out: put_clock_desc(&cd); return err; } static int pc_clock_gettime(clockid_t id, struct timespec64 *ts) { struct posix_clock_desc cd; int err; err = get_clock_desc(id, &cd); if (err) return err; if (cd.clk->ops.clock_gettime) err = cd.clk->ops.clock_gettime(cd.clk, ts); else err = -EOPNOTSUPP; put_clock_desc(&cd); return err; } static int pc_clock_getres(clockid_t id, struct timespec64 *ts) { struct posix_clock_desc cd; int err; err = get_clock_desc(id, &cd); if (err) return err; if (cd.clk->ops.clock_getres) err = cd.clk->ops.clock_getres(cd.clk, ts); else err = -EOPNOTSUPP; put_clock_desc(&cd); return err; } static int pc_clock_settime(clockid_t id, const struct timespec64 *ts) { struct posix_clock_desc cd; int err; if (!timespec64_valid_strict(ts)) return -EINVAL; err = get_clock_desc(id, &cd); if (err) return err; if ((cd.fp->f_mode & FMODE_WRITE) == 0) { err = -EACCES; goto out; } if (cd.clk->ops.clock_settime) err = cd.clk->ops.clock_settime(cd.clk, ts); else err = -EOPNOTSUPP; out: put_clock_desc(&cd); return err; } const struct k_clock clock_posix_dynamic = { .clock_getres = pc_clock_getres, .clock_set = pc_clock_settime, .clock_get_timespec = pc_clock_gettime, .clock_adj = pc_clock_adjtime, }; |
| 484 487 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 | /* SPDX-License-Identifier: GPL-2.0 */ #ifndef _TRACE_SYSCALL_H #define _TRACE_SYSCALL_H #include <linux/tracepoint.h> #include <linux/unistd.h> #include <linux/trace_events.h> #include <linux/thread_info.h> #include <asm/ptrace.h> /* * A syscall entry in the ftrace syscalls array. * * @name: name of the syscall * @syscall_nr: number of the syscall * @nb_args: number of parameters it takes * @types: list of types as strings * @args: list of args as strings (args[i] matches types[i]) * @enter_fields: list of fields for syscall_enter trace event * @enter_event: associated syscall_enter trace event * @exit_event: associated syscall_exit trace event */ struct syscall_metadata { const char *name; int syscall_nr; int nb_args; const char **types; const char **args; struct list_head enter_fields; struct trace_event_call *enter_event; struct trace_event_call *exit_event; }; #if defined(CONFIG_TRACEPOINTS) && defined(CONFIG_HAVE_SYSCALL_TRACEPOINTS) static inline void syscall_tracepoint_update(struct task_struct *p) { if (test_syscall_work(SYSCALL_TRACEPOINT)) set_task_syscall_work(p, SYSCALL_TRACEPOINT); else clear_task_syscall_work(p, SYSCALL_TRACEPOINT); } #else static inline void syscall_tracepoint_update(struct task_struct *p) { } #endif #endif /* _TRACE_SYSCALL_H */ |
| 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 | /* * Copyright © 2017 Red Hat * * Permission is hereby granted, free of charge, to any person obtaining a * copy of this software and associated documentation files (the "Software"), * to deal in the Software without restriction, including without limitation * the rights to use, copy, modify, merge, publish, distribute, sublicense, * and/or sell copies of the Software, and to permit persons to whom the * Software is furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice (including the next * paragraph) shall be included in all copies or substantial portions of the * Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS * IN THE SOFTWARE. * * Authors: * */ #ifndef __DRM_SYNCOBJ_H__ #define __DRM_SYNCOBJ_H__ #include <linux/dma-fence.h> #include <linux/dma-fence-chain.h> struct drm_file; /** * struct drm_syncobj - sync object. * * This structure defines a generic sync object which wraps a &dma_fence. */ struct drm_syncobj { /** * @refcount: Reference count of this object. */ struct kref refcount; /** * @fence: * NULL or a pointer to the fence bound to this object. * * This field should not be used directly. Use drm_syncobj_fence_get() * and drm_syncobj_replace_fence() instead. */ struct dma_fence __rcu *fence; /** * @cb_list: List of callbacks to call when the &fence gets replaced. */ struct list_head cb_list; /** * @ev_fd_list: List of registered eventfd. */ struct list_head ev_fd_list; /** * @lock: Protects &cb_list and &ev_fd_list, and write-locks &fence. */ spinlock_t lock; /** * @file: A file backing for this syncobj. */ struct file *file; }; void drm_syncobj_free(struct kref *kref); /** * drm_syncobj_get - acquire a syncobj reference * @obj: sync object * * This acquires an additional reference to @obj. It is illegal to call this * without already holding a reference. No locks required. */ static inline void drm_syncobj_get(struct drm_syncobj *obj) { kref_get(&obj->refcount); } /** * drm_syncobj_put - release a reference to a sync object. * @obj: sync object. */ static inline void drm_syncobj_put(struct drm_syncobj *obj) { kref_put(&obj->refcount, drm_syncobj_free); } /** * drm_syncobj_fence_get - get a reference to a fence in a sync object * @syncobj: sync object. * * This acquires additional reference to &drm_syncobj.fence contained in @obj, * if not NULL. It is illegal to call this without already holding a reference. * No locks required. * * Returns: * Either the fence of @obj or NULL if there's none. */ static inline struct dma_fence * drm_syncobj_fence_get(struct drm_syncobj *syncobj) { struct dma_fence *fence; rcu_read_lock(); fence = dma_fence_get_rcu_safe(&syncobj->fence); rcu_read_unlock(); return fence; } struct drm_syncobj *drm_syncobj_find(struct drm_file *file_private, u32 handle); void drm_syncobj_add_point(struct drm_syncobj *syncobj, struct dma_fence_chain *chain, struct dma_fence *fence, uint64_t point); void drm_syncobj_replace_fence(struct drm_syncobj *syncobj, struct dma_fence *fence); int drm_syncobj_find_fence(struct drm_file *file_private, u32 handle, u64 point, u64 flags, struct dma_fence **fence); void drm_syncobj_free(struct kref *kref); int drm_syncobj_create(struct drm_syncobj **out_syncobj, uint32_t flags, struct dma_fence *fence); int drm_syncobj_get_handle(struct drm_file *file_private, struct drm_syncobj *syncobj, u32 *handle); int drm_syncobj_get_fd(struct drm_syncobj *syncobj, int *p_fd); #endif |
| 7 7 2 5 5 5 5 5 5 5 5 5 1 4 1 1 1 1 2 2 1 3 3 3 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 | // SPDX-License-Identifier: GPL-2.0-only /* * vivid-meta-cap.c - meta capture support functions. */ #include <linux/errno.h> #include <linux/kernel.h> #include <linux/videodev2.h> #include <media/v4l2-common.h> #include <linux/usb/video.h> #include "vivid-core.h" #include "vivid-kthread-cap.h" #include "vivid-meta-cap.h" static int meta_cap_queue_setup(struct vb2_queue *vq, unsigned int *nbuffers, unsigned int *nplanes, unsigned int sizes[], struct device *alloc_devs[]) { struct vivid_dev *dev = vb2_get_drv_priv(vq); unsigned int size = sizeof(struct vivid_uvc_meta_buf); if (!vivid_is_webcam(dev)) return -EINVAL; if (*nplanes) { if (sizes[0] < size) return -EINVAL; } else { sizes[0] = size; } *nplanes = 1; return 0; } static int meta_cap_buf_prepare(struct vb2_buffer *vb) { struct vivid_dev *dev = vb2_get_drv_priv(vb->vb2_queue); unsigned int size = sizeof(struct vivid_uvc_meta_buf); dprintk(dev, 1, "%s\n", __func__); if (dev->buf_prepare_error) { /* * Error injection: test what happens if buf_prepare() returns * an error. */ dev->buf_prepare_error = false; return -EINVAL; } if (vb2_plane_size(vb, 0) < size) { dprintk(dev, 1, "%s data will not fit into plane (%lu < %u)\n", __func__, vb2_plane_size(vb, 0), size); return -EINVAL; } vb2_set_plane_payload(vb, 0, size); return 0; } static void meta_cap_buf_queue(struct vb2_buffer *vb) { struct vb2_v4l2_buffer *vbuf = to_vb2_v4l2_buffer(vb); struct vivid_dev *dev = vb2_get_drv_priv(vb->vb2_queue); struct vivid_buffer *buf = container_of(vbuf, struct vivid_buffer, vb); dprintk(dev, 1, "%s\n", __func__); spin_lock(&dev->slock); list_add_tail(&buf->list, &dev->meta_cap_active); spin_unlock(&dev->slock); } static int meta_cap_start_streaming(struct vb2_queue *vq, unsigned int count) { struct vivid_dev *dev = vb2_get_drv_priv(vq); int err; dprintk(dev, 1, "%s\n", __func__); dev->meta_cap_seq_count = 0; if (dev->start_streaming_error) { dev->start_streaming_error = false; err = -EINVAL; } else { err = vivid_start_generating_vid_cap(dev, &dev->meta_cap_streaming); } if (err) { struct vivid_buffer *buf, *tmp; list_for_each_entry_safe(buf, tmp, &dev->meta_cap_active, list) { list_del(&buf->list); vb2_buffer_done(&buf->vb.vb2_buf, VB2_BUF_STATE_QUEUED); } } return err; } /* abort streaming and wait for last buffer */ static void meta_cap_stop_streaming(struct vb2_queue *vq) { struct vivid_dev *dev = vb2_get_drv_priv(vq); dprintk(dev, 1, "%s\n", __func__); vivid_stop_generating_vid_cap(dev, &dev->meta_cap_streaming); } static void meta_cap_buf_request_complete(struct vb2_buffer *vb) { struct vivid_dev *dev = vb2_get_drv_priv(vb->vb2_queue); v4l2_ctrl_request_complete(vb->req_obj.req, &dev->ctrl_hdl_meta_cap); } const struct vb2_ops vivid_meta_cap_qops = { .queue_setup = meta_cap_queue_setup, .buf_prepare = meta_cap_buf_prepare, .buf_queue = meta_cap_buf_queue, .start_streaming = meta_cap_start_streaming, .stop_streaming = meta_cap_stop_streaming, .buf_request_complete = meta_cap_buf_request_complete, }; int vidioc_enum_fmt_meta_cap(struct file *file, void *priv, struct v4l2_fmtdesc *f) { struct vivid_dev *dev = video_drvdata(file); if (!vivid_is_webcam(dev)) return -EINVAL; if (f->index > 0) return -EINVAL; f->type = V4L2_BUF_TYPE_META_CAPTURE; f->pixelformat = V4L2_META_FMT_UVC; return 0; } int vidioc_g_fmt_meta_cap(struct file *file, void *priv, struct v4l2_format *f) { struct vivid_dev *dev = video_drvdata(file); struct v4l2_meta_format *meta = &f->fmt.meta; if (!vivid_is_webcam(dev) || !dev->has_meta_cap) return -EINVAL; meta->dataformat = V4L2_META_FMT_UVC; meta->buffersize = sizeof(struct vivid_uvc_meta_buf); return 0; } void vivid_meta_cap_fillbuff(struct vivid_dev *dev, struct vivid_buffer *buf, u64 soe) { struct vivid_uvc_meta_buf *meta = vb2_plane_vaddr(&buf->vb.vb2_buf, 0); int buf_off = 0; buf->vb.sequence = dev->meta_cap_seq_count; if (dev->field_cap == V4L2_FIELD_ALTERNATE) buf->vb.sequence /= 2; memset(meta, 1, vb2_plane_size(&buf->vb.vb2_buf, 0)); meta->ns = ktime_get_ns(); meta->sof = buf->vb.sequence * 30; meta->length = sizeof(*meta) - offsetof(struct vivid_uvc_meta_buf, length); meta->flags = UVC_STREAM_EOH | UVC_STREAM_EOF; if ((buf->vb.sequence % 2) == 0) meta->flags |= UVC_STREAM_FID; dprintk(dev, 2, "%s ns:%llu sof:%4d len:%u flags: 0x%02x", __func__, meta->ns, meta->sof, meta->length, meta->flags); if (dev->meta_pts) { meta->flags |= UVC_STREAM_PTS; meta->buf[0] = div_u64(soe, VIVID_META_CLOCK_UNIT); buf_off = 4; dprintk(dev, 2, " pts: %u\n", *(__u32 *)(meta->buf)); } if (dev->meta_scr) { meta->flags |= UVC_STREAM_SCR; meta->buf[buf_off] = div_u64((soe + dev->cap_frame_eof_offset), VIVID_META_CLOCK_UNIT); meta->buf[buf_off + 4] = (buf->vb.sequence * 30) % 1000; dprintk(dev, 2, " stc: %u, sof counter: %u\n", *(__u32 *)(meta->buf + buf_off), *(__u16 *)(meta->buf + buf_off + 4)); } dprintk(dev, 2, "\n"); } |
| 2889 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 | /* SPDX-License-Identifier: GPL-2.0+ */ /* * RCU-based infrastructure for lightweight reader-writer locking * * Copyright (c) 2015, Red Hat, Inc. * * Author: Oleg Nesterov <oleg@redhat.com> */ #ifndef _LINUX_RCU_SYNC_H_ #define _LINUX_RCU_SYNC_H_ #include <linux/wait.h> #include <linux/rcupdate.h> /* Structure to mediate between updaters and fastpath-using readers. */ struct rcu_sync { int gp_state; int gp_count; wait_queue_head_t gp_wait; struct rcu_head cb_head; }; /** * rcu_sync_is_idle() - Are readers permitted to use their fastpaths? * @rsp: Pointer to rcu_sync structure to use for synchronization * * Returns true if readers are permitted to use their fastpaths. Must be * invoked within some flavor of RCU read-side critical section. */ static inline bool rcu_sync_is_idle(struct rcu_sync *rsp) { RCU_LOCKDEP_WARN(!rcu_read_lock_any_held(), "suspicious rcu_sync_is_idle() usage"); return !READ_ONCE(rsp->gp_state); /* GP_IDLE */ } extern void rcu_sync_init(struct rcu_sync *); extern void rcu_sync_enter(struct rcu_sync *); extern void rcu_sync_exit(struct rcu_sync *); extern void rcu_sync_dtor(struct rcu_sync *); #define __RCU_SYNC_INITIALIZER(name) { \ .gp_state = 0, \ .gp_count = 0, \ .gp_wait = __WAIT_QUEUE_HEAD_INITIALIZER(name.gp_wait), \ } #define DEFINE_RCU_SYNC(name) \ struct rcu_sync name = __RCU_SYNC_INITIALIZER(name) #endif /* _LINUX_RCU_SYNC_H_ */ |
| 1 8 6 6 6 6 6 16 16 16 23 23 23 15 8 8 8 6 6 6 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 | // SPDX-License-Identifier: GPL-2.0 /* Multipath TCP * * Copyright (c) 2022, SUSE. */ #define pr_fmt(fmt) "MPTCP: " fmt #include <linux/kernel.h> #include <linux/module.h> #include <linux/list.h> #include <linux/rculist.h> #include <linux/spinlock.h> #include "protocol.h" static DEFINE_SPINLOCK(mptcp_sched_list_lock); static LIST_HEAD(mptcp_sched_list); static int mptcp_sched_default_get_send(struct mptcp_sock *msk) { struct sock *ssk; ssk = mptcp_subflow_get_send(msk); if (!ssk) return -EINVAL; mptcp_subflow_set_scheduled(mptcp_subflow_ctx(ssk), true); return 0; } static int mptcp_sched_default_get_retrans(struct mptcp_sock *msk) { struct sock *ssk; ssk = mptcp_subflow_get_retrans(msk); if (!ssk) return -EINVAL; mptcp_subflow_set_scheduled(mptcp_subflow_ctx(ssk), true); return 0; } static struct mptcp_sched_ops mptcp_sched_default = { .get_send = mptcp_sched_default_get_send, .get_retrans = mptcp_sched_default_get_retrans, .name = "default", .owner = THIS_MODULE, }; /* Must be called with rcu read lock held */ struct mptcp_sched_ops *mptcp_sched_find(const char *name) { struct mptcp_sched_ops *sched, *ret = NULL; list_for_each_entry_rcu(sched, &mptcp_sched_list, list) { if (!strcmp(sched->name, name)) { ret = sched; break; } } return ret; } /* Build string with list of available scheduler values. * Similar to tcp_get_available_congestion_control() */ void mptcp_get_available_schedulers(char *buf, size_t maxlen) { struct mptcp_sched_ops *sched; size_t offs = 0; rcu_read_lock(); list_for_each_entry_rcu(sched, &mptcp_sched_list, list) { offs += snprintf(buf + offs, maxlen - offs, "%s%s", offs == 0 ? "" : " ", sched->name); if (WARN_ON_ONCE(offs >= maxlen)) break; } rcu_read_unlock(); } int mptcp_validate_scheduler(struct mptcp_sched_ops *sched) { if (!sched->get_send) { pr_err("%s does not implement required ops\n", sched->name); return -EINVAL; } return 0; } int mptcp_register_scheduler(struct mptcp_sched_ops *sched) { int ret; ret = mptcp_validate_scheduler(sched); if (ret) return ret; spin_lock(&mptcp_sched_list_lock); if (mptcp_sched_find(sched->name)) { spin_unlock(&mptcp_sched_list_lock); return -EEXIST; } list_add_tail_rcu(&sched->list, &mptcp_sched_list); spin_unlock(&mptcp_sched_list_lock); pr_debug("%s registered\n", sched->name); return 0; } void mptcp_unregister_scheduler(struct mptcp_sched_ops *sched) { if (sched == &mptcp_sched_default) return; spin_lock(&mptcp_sched_list_lock); list_del_rcu(&sched->list); spin_unlock(&mptcp_sched_list_lock); } void mptcp_sched_init(void) { mptcp_register_scheduler(&mptcp_sched_default); } int mptcp_init_sched(struct mptcp_sock *msk, struct mptcp_sched_ops *sched) { if (!sched) sched = &mptcp_sched_default; if (!bpf_try_module_get(sched, sched->owner)) return -EBUSY; msk->sched = sched; if (msk->sched->init) msk->sched->init(msk); pr_debug("sched=%s\n", msk->sched->name); return 0; } void mptcp_release_sched(struct mptcp_sock *msk) { struct mptcp_sched_ops *sched = msk->sched; if (!sched) return; msk->sched = NULL; if (sched->release) sched->release(msk); bpf_module_put(sched, sched->owner); } void mptcp_subflow_set_scheduled(struct mptcp_subflow_context *subflow, bool scheduled) { WRITE_ONCE(subflow->scheduled, scheduled); } int mptcp_sched_get_send(struct mptcp_sock *msk) { struct mptcp_subflow_context *subflow; msk_owned_by_me(msk); /* the following check is moved out of mptcp_subflow_get_send */ if (__mptcp_check_fallback(msk)) { if (msk->first && __tcp_can_send(msk->first) && sk_stream_memory_free(msk->first)) { mptcp_subflow_set_scheduled(mptcp_subflow_ctx(msk->first), true); return 0; } return -EINVAL; } mptcp_for_each_subflow(msk, subflow) { if (READ_ONCE(subflow->scheduled)) return 0; } if (msk->sched == &mptcp_sched_default || !msk->sched) return mptcp_sched_default_get_send(msk); return msk->sched->get_send(msk); } int mptcp_sched_get_retrans(struct mptcp_sock *msk) { struct mptcp_subflow_context *subflow; msk_owned_by_me(msk); /* the following check is moved out of mptcp_subflow_get_retrans */ if (__mptcp_check_fallback(msk)) return -EINVAL; mptcp_for_each_subflow(msk, subflow) { if (READ_ONCE(subflow->scheduled)) return 0; } if (msk->sched == &mptcp_sched_default || !msk->sched) return mptcp_sched_default_get_retrans(msk); if (msk->sched->get_retrans) return msk->sched->get_retrans(msk); return msk->sched->get_send(msk); } |
| 143 143 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 | /* SPDX-License-Identifier: GPL-2.0 */ #ifndef _ASM_X86_SYNC_CORE_H #define _ASM_X86_SYNC_CORE_H #include <linux/preempt.h> #include <asm/processor.h> #include <asm/cpufeature.h> #include <asm/special_insns.h> #ifdef CONFIG_X86_32 static __always_inline void iret_to_self(void) { asm volatile ( "pushfl\n\t" "pushl %%cs\n\t" "pushl $1f\n\t" "iret\n\t" "1:" : ASM_CALL_CONSTRAINT : : "memory"); } #else static __always_inline void iret_to_self(void) { unsigned int tmp; asm volatile ( "mov %%ss, %0\n\t" "pushq %q0\n\t" "pushq %%rsp\n\t" "addq $8, (%%rsp)\n\t" "pushfq\n\t" "mov %%cs, %0\n\t" "pushq %q0\n\t" "pushq $1f\n\t" "iretq\n\t" "1:" : "=&r" (tmp), ASM_CALL_CONSTRAINT : : "cc", "memory"); } #endif /* CONFIG_X86_32 */ /* * This function forces the icache and prefetched instruction stream to * catch up with reality in two very specific cases: * * a) Text was modified using one virtual address and is about to be executed * from the same physical page at a different virtual address. * * b) Text was modified on a different CPU, may subsequently be * executed on this CPU, and you want to make sure the new version * gets executed. This generally means you're calling this in an IPI. * * If you're calling this for a different reason, you're probably doing * it wrong. * * Like all of Linux's memory ordering operations, this is a * compiler barrier as well. */ static __always_inline void sync_core(void) { /* * The SERIALIZE instruction is the most straightforward way to * do this, but it is not universally available. */ if (static_cpu_has(X86_FEATURE_SERIALIZE)) { serialize(); return; } /* * For all other processors, there are quite a few ways to do this. * IRET-to-self is nice because it works on every CPU, at any CPL * (so it's compatible with paravirtualization), and it never exits * to a hypervisor. The only downsides are that it's a bit slow * (it seems to be a bit more than 2x slower than the fastest * options) and that it unmasks NMIs. The "push %cs" is needed, * because in paravirtual environments __KERNEL_CS may not be a * valid CS value when we do IRET directly. * * In case NMI unmasking or performance ever becomes a problem, * the next best option appears to be MOV-to-CR2 and an * unconditional jump. That sequence also works on all CPUs, * but it will fault at CPL3 (i.e. Xen PV). * * CPUID is the conventional way, but it's nasty: it doesn't * exist on some 486-like CPUs, and it usually exits to a * hypervisor. */ iret_to_self(); } /* * Ensure that a core serializing instruction is issued before returning * to user-mode. x86 implements return to user-space through sysexit, * sysrel, and sysretq, which are not core serializing. */ static inline void sync_core_before_usermode(void) { /* With PTI, we unconditionally serialize before running user code. */ if (static_cpu_has(X86_FEATURE_PTI)) return; /* * Even if we're in an interrupt, we might reschedule before returning, * in which case we could switch to a different thread in the same mm * and return using SYSRET or SYSEXIT. Instead of trying to keep * track of our need to sync the core, just sync right away. */ sync_core(); } #endif /* _ASM_X86_SYNC_CORE_H */ |
| 24 25 6 6 6 6 1 6 6 1 5 4 1 18 19 1 6 1 23 17 5 5 7 2 9 16 16 11 6 1 14 7 17 17 17 17 17 17 16 17 17 16 17 8 8 17 17 17 17 17 17 17 17 17 17 17 16 17 1 1 5 5 5 5 26 1 21 5 5 2 3 19 21 21 16 5 5 2 6 3 4 3 8 8 5 1 7 2 5 8 7 26 1 27 3 1 3 14 14 6 8 1 6 1 2 5 5 3 3 1 1 2 5 4 1 6 6 1 5 1 6 1 1 5 6 6 1 5 4 5 4 6 5 1 5 5 6 5 6 11 12 12 17 17 3 1 1 14 2 13 1 1 1 13 14 1 1 12 13 12 1 11 12 5 6 6 1 5 1 5 6 11 12 12 3 1 1 1 2 2 2 2 2 10 3 5 2 1 2 2 2 2 7 7 7 3 4 7 2 7 7 1 1 1 1 1 1 1 1 15 1 16 3 15 9 11 1 20 3 15 2 20 2 18 2 3 15 5 10 15 20 20 19 18 1 1 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080 1081 1082 1083 1084 1085 1086 1087 1088 1089 1090 1091 1092 1093 1094 1095 1096 1097 1098 1099 1100 1101 1102 1103 1104 1105 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125 1126 1127 1128 1129 1130 1131 1132 1133 1134 1135 1136 1137 1138 1139 1140 1141 1142 1143 1144 1145 1146 1147 1148 1149 1150 1151 1152 1153 1154 1155 1156 1157 1158 1159 1160 1161 1162 1163 1164 1165 1166 1167 1168 1169 1170 1171 1172 1173 1174 1175 1176 1177 1178 1179 1180 1181 1182 1183 1184 1185 1186 1187 1188 1189 1190 1191 1192 1193 1194 1195 1196 1197 1198 1199 1200 1201 1202 1203 1204 1205 1206 1207 1208 1209 1210 1211 1212 1213 1214 1215 1216 1217 1218 1219 1220 1221 1222 1223 1224 1225 1226 1227 1228 1229 1230 1231 1232 1233 1234 1235 1236 1237 1238 1239 1240 1241 1242 1243 1244 1245 1246 1247 1248 1249 1250 1251 1252 1253 1254 1255 1256 1257 1258 1259 1260 1261 1262 1263 1264 1265 1266 1267 1268 1269 1270 1271 1272 1273 1274 1275 1276 1277 1278 1279 1280 1281 1282 1283 1284 1285 1286 1287 1288 1289 1290 1291 1292 1293 1294 1295 1296 1297 1298 1299 1300 1301 1302 1303 1304 1305 1306 1307 1308 1309 1310 1311 1312 1313 1314 1315 1316 1317 1318 1319 1320 1321 1322 1323 1324 1325 1326 1327 1328 1329 1330 1331 1332 1333 1334 1335 1336 1337 1338 1339 1340 1341 1342 1343 1344 1345 1346 1347 1348 1349 1350 1351 1352 1353 1354 1355 1356 1357 1358 1359 1360 1361 1362 1363 1364 1365 1366 1367 1368 1369 1370 1371 1372 1373 1374 1375 1376 1377 1378 1379 1380 1381 1382 1383 1384 1385 1386 1387 1388 1389 1390 1391 1392 1393 1394 1395 1396 1397 1398 1399 1400 1401 1402 1403 1404 1405 1406 1407 1408 1409 1410 1411 1412 1413 1414 1415 1416 1417 1418 1419 1420 1421 1422 1423 1424 1425 1426 1427 1428 1429 1430 1431 1432 1433 1434 1435 1436 1437 1438 1439 1440 1441 1442 1443 1444 1445 1446 1447 1448 1449 1450 1451 1452 1453 1454 1455 1456 1457 1458 1459 1460 1461 1462 1463 1464 1465 1466 1467 1468 1469 1470 1471 1472 1473 1474 1475 1476 1477 1478 1479 1480 1481 1482 1483 1484 1485 1486 1487 1488 1489 1490 1491 1492 1493 1494 1495 1496 1497 1498 1499 1500 1501 1502 1503 1504 1505 1506 1507 1508 1509 1510 1511 1512 1513 1514 1515 1516 1517 1518 1519 1520 1521 1522 1523 1524 1525 1526 1527 1528 1529 1530 1531 1532 1533 1534 1535 1536 1537 1538 1539 1540 1541 1542 1543 1544 1545 1546 1547 1548 1549 1550 1551 1552 1553 1554 1555 1556 1557 1558 1559 1560 1561 1562 1563 1564 1565 1566 1567 1568 1569 1570 1571 1572 1573 1574 1575 1576 1577 1578 1579 1580 1581 1582 1583 1584 1585 1586 1587 1588 1589 1590 1591 1592 1593 1594 1595 1596 1597 1598 1599 1600 1601 1602 1603 1604 1605 1606 1607 1608 1609 1610 1611 1612 1613 1614 1615 1616 1617 1618 1619 1620 1621 1622 1623 1624 1625 1626 1627 1628 1629 1630 1631 1632 1633 1634 1635 1636 1637 1638 1639 1640 1641 1642 1643 1644 1645 1646 1647 1648 1649 1650 1651 1652 1653 1654 1655 1656 1657 1658 1659 1660 1661 1662 1663 1664 1665 1666 1667 1668 1669 1670 1671 1672 1673 1674 1675 1676 1677 1678 1679 1680 1681 1682 1683 1684 1685 1686 1687 1688 1689 1690 1691 1692 1693 1694 1695 1696 1697 1698 1699 1700 1701 1702 1703 1704 1705 1706 1707 1708 1709 1710 1711 1712 1713 1714 1715 1716 1717 1718 1719 1720 1721 1722 1723 1724 1725 1726 1727 1728 1729 1730 1731 1732 1733 1734 1735 1736 1737 1738 1739 1740 1741 1742 1743 1744 1745 1746 1747 1748 1749 1750 1751 1752 1753 1754 1755 1756 1757 1758 1759 1760 1761 1762 1763 1764 1765 1766 1767 1768 1769 1770 1771 1772 1773 1774 1775 1776 1777 1778 1779 1780 1781 1782 1783 1784 1785 1786 1787 1788 1789 1790 1791 1792 1793 1794 1795 1796 1797 1798 1799 1800 1801 1802 1803 1804 1805 1806 1807 1808 1809 1810 1811 1812 1813 1814 1815 1816 1817 1818 1819 1820 1821 1822 1823 1824 1825 1826 1827 1828 1829 1830 1831 1832 1833 1834 1835 1836 1837 1838 1839 1840 1841 1842 1843 1844 1845 1846 1847 1848 1849 1850 1851 1852 1853 1854 1855 1856 1857 1858 1859 1860 1861 1862 1863 1864 1865 1866 1867 1868 1869 1870 1871 1872 1873 1874 1875 1876 1877 1878 1879 1880 1881 1882 1883 1884 1885 1886 1887 1888 1889 1890 1891 1892 1893 1894 1895 1896 1897 1898 1899 1900 1901 1902 1903 1904 1905 1906 1907 1908 1909 1910 1911 1912 1913 1914 1915 1916 1917 1918 1919 1920 1921 1922 1923 1924 1925 1926 1927 1928 1929 1930 1931 1932 1933 1934 1935 1936 1937 1938 1939 1940 1941 1942 1943 1944 1945 1946 1947 1948 1949 1950 1951 1952 1953 1954 1955 1956 1957 1958 1959 1960 1961 1962 1963 1964 1965 1966 1967 1968 1969 1970 1971 1972 1973 1974 1975 1976 1977 1978 1979 1980 1981 1982 1983 1984 1985 1986 1987 1988 1989 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020 2021 2022 2023 2024 2025 2026 2027 2028 2029 2030 2031 2032 2033 2034 2035 2036 2037 2038 2039 2040 2041 2042 2043 2044 2045 2046 2047 2048 2049 2050 2051 2052 2053 2054 2055 2056 2057 2058 2059 2060 2061 2062 2063 2064 2065 2066 2067 2068 2069 2070 2071 2072 2073 2074 2075 2076 2077 2078 2079 2080 2081 2082 2083 2084 2085 2086 2087 2088 2089 2090 2091 2092 2093 2094 2095 2096 2097 2098 2099 2100 2101 2102 2103 2104 2105 2106 2107 2108 2109 2110 2111 2112 2113 2114 2115 2116 2117 2118 2119 2120 2121 2122 2123 2124 2125 2126 2127 2128 2129 2130 2131 2132 2133 2134 2135 2136 2137 2138 2139 2140 2141 2142 2143 2144 2145 2146 2147 2148 2149 2150 2151 2152 2153 2154 2155 2156 2157 2158 2159 2160 2161 2162 2163 2164 2165 2166 2167 2168 2169 2170 2171 2172 2173 2174 2175 2176 2177 2178 2179 2180 2181 2182 2183 2184 2185 2186 2187 2188 2189 2190 2191 2192 2193 2194 2195 2196 2197 2198 2199 2200 2201 2202 2203 2204 2205 2206 2207 2208 2209 2210 2211 2212 2213 2214 2215 2216 2217 2218 2219 2220 2221 2222 2223 2224 2225 2226 2227 2228 2229 2230 2231 2232 2233 2234 2235 2236 2237 2238 2239 2240 2241 2242 2243 2244 2245 2246 2247 2248 2249 2250 2251 2252 2253 2254 2255 2256 2257 2258 2259 2260 2261 2262 2263 2264 2265 2266 2267 2268 2269 2270 2271 2272 2273 2274 2275 2276 2277 2278 2279 2280 2281 2282 2283 2284 2285 2286 2287 2288 2289 2290 2291 2292 2293 2294 2295 2296 2297 2298 2299 2300 2301 2302 2303 2304 2305 2306 2307 2308 2309 2310 2311 2312 2313 2314 2315 2316 2317 2318 2319 2320 2321 2322 2323 2324 2325 2326 2327 2328 2329 2330 2331 2332 2333 2334 2335 2336 2337 2338 2339 2340 2341 2342 2343 2344 2345 2346 2347 2348 2349 2350 2351 2352 2353 2354 2355 2356 2357 2358 2359 2360 2361 2362 2363 2364 2365 2366 2367 2368 2369 2370 2371 2372 2373 2374 2375 2376 2377 2378 2379 2380 2381 2382 2383 2384 2385 2386 2387 2388 2389 2390 2391 2392 2393 2394 2395 2396 2397 2398 2399 2400 2401 2402 2403 2404 2405 2406 2407 2408 2409 2410 2411 2412 2413 2414 2415 2416 2417 2418 2419 2420 2421 2422 2423 2424 2425 2426 2427 2428 2429 2430 2431 2432 2433 2434 2435 2436 2437 2438 2439 2440 2441 2442 2443 2444 2445 2446 2447 2448 2449 2450 2451 2452 2453 2454 2455 2456 2457 2458 2459 2460 2461 2462 2463 2464 2465 2466 2467 2468 2469 2470 2471 2472 2473 2474 2475 2476 2477 2478 2479 2480 2481 2482 2483 2484 2485 2486 2487 2488 2489 2490 2491 2492 2493 2494 2495 2496 2497 2498 2499 2500 2501 2502 2503 2504 2505 2506 2507 2508 2509 2510 2511 2512 2513 2514 2515 2516 2517 2518 2519 2520 2521 2522 2523 2524 2525 2526 2527 2528 2529 2530 2531 2532 2533 2534 2535 2536 2537 2538 2539 2540 2541 2542 2543 2544 2545 2546 2547 2548 2549 2550 2551 2552 2553 2554 2555 2556 2557 2558 2559 2560 2561 2562 2563 2564 2565 2566 2567 2568 2569 2570 2571 2572 2573 2574 2575 2576 2577 2578 2579 2580 2581 2582 2583 2584 2585 2586 2587 2588 2589 2590 2591 2592 2593 2594 2595 2596 2597 2598 2599 2600 2601 2602 2603 2604 2605 2606 2607 2608 2609 2610 2611 2612 2613 2614 2615 2616 2617 2618 2619 2620 2621 2622 2623 2624 2625 2626 2627 2628 2629 2630 2631 2632 2633 2634 2635 2636 2637 2638 2639 2640 2641 2642 2643 2644 2645 2646 2647 2648 2649 2650 2651 2652 2653 2654 2655 2656 2657 2658 2659 2660 2661 2662 2663 2664 2665 2666 2667 2668 2669 2670 2671 2672 2673 2674 2675 2676 2677 2678 2679 2680 2681 2682 2683 2684 2685 2686 2687 2688 2689 2690 2691 2692 2693 2694 2695 2696 2697 2698 2699 2700 2701 2702 2703 2704 2705 2706 2707 2708 2709 2710 2711 2712 2713 2714 2715 2716 2717 2718 2719 2720 2721 2722 2723 2724 2725 2726 2727 2728 2729 2730 2731 2732 2733 2734 2735 2736 2737 2738 2739 2740 2741 2742 2743 2744 2745 2746 2747 2748 2749 2750 2751 2752 2753 2754 2755 2756 2757 2758 2759 2760 2761 2762 2763 2764 2765 2766 2767 2768 2769 2770 2771 2772 2773 2774 2775 2776 2777 2778 2779 2780 2781 2782 2783 2784 2785 2786 2787 2788 2789 2790 2791 2792 2793 2794 2795 2796 2797 2798 2799 2800 2801 2802 2803 2804 2805 2806 2807 2808 2809 2810 2811 2812 2813 2814 2815 2816 2817 2818 2819 2820 2821 2822 2823 2824 2825 2826 2827 2828 2829 2830 2831 2832 2833 2834 2835 2836 2837 2838 2839 2840 2841 2842 2843 2844 2845 2846 2847 2848 2849 2850 2851 2852 2853 2854 2855 2856 2857 2858 2859 2860 2861 2862 2863 2864 2865 2866 2867 2868 2869 2870 2871 2872 2873 | /* * Copyright (c) 2016-2017, Mellanox Technologies. All rights reserved. * Copyright (c) 2016-2017, Dave Watson <davejwatson@fb.com>. All rights reserved. * Copyright (c) 2016-2017, Lance Chao <lancerchao@fb.com>. All rights reserved. * Copyright (c) 2016, Fridolin Pokorny <fridolin.pokorny@gmail.com>. All rights reserved. * Copyright (c) 2016, Nikos Mavrogiannopoulos <nmav@gnutls.org>. All rights reserved. * Copyright (c) 2018, Covalent IO, Inc. http://covalent.io * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #include <linux/bug.h> #include <linux/sched/signal.h> #include <linux/module.h> #include <linux/kernel.h> #include <linux/splice.h> #include <crypto/aead.h> #include <net/strparser.h> #include <net/tls.h> #include <trace/events/sock.h> #include "tls.h" struct tls_decrypt_arg { struct_group(inargs, bool zc; bool async; bool async_done; u8 tail; ); struct sk_buff *skb; }; struct tls_decrypt_ctx { struct sock *sk; u8 iv[TLS_MAX_IV_SIZE]; u8 aad[TLS_MAX_AAD_SIZE]; u8 tail; bool free_sgout; struct scatterlist sg[]; }; noinline void tls_err_abort(struct sock *sk, int err) { WARN_ON_ONCE(err >= 0); /* sk->sk_err should contain a positive error code. */ WRITE_ONCE(sk->sk_err, -err); /* Paired with smp_rmb() in tcp_poll() */ smp_wmb(); sk_error_report(sk); } static int __skb_nsg(struct sk_buff *skb, int offset, int len, unsigned int recursion_level) { int start = skb_headlen(skb); int i, chunk = start - offset; struct sk_buff *frag_iter; int elt = 0; if (unlikely(recursion_level >= 24)) return -EMSGSIZE; if (chunk > 0) { if (chunk > len) chunk = len; elt++; len -= chunk; if (len == 0) return elt; offset += chunk; } for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) { int end; WARN_ON(start > offset + len); end = start + skb_frag_size(&skb_shinfo(skb)->frags[i]); chunk = end - offset; if (chunk > 0) { if (chunk > len) chunk = len; elt++; len -= chunk; if (len == 0) return elt; offset += chunk; } start = end; } if (unlikely(skb_has_frag_list(skb))) { skb_walk_frags(skb, frag_iter) { int end, ret; WARN_ON(start > offset + len); end = start + frag_iter->len; chunk = end - offset; if (chunk > 0) { if (chunk > len) chunk = len; ret = __skb_nsg(frag_iter, offset - start, chunk, recursion_level + 1); if (unlikely(ret < 0)) return ret; elt += ret; len -= chunk; if (len == 0) return elt; offset += chunk; } start = end; } } BUG_ON(len); return elt; } /* Return the number of scatterlist elements required to completely map the * skb, or -EMSGSIZE if the recursion depth is exceeded. */ static int skb_nsg(struct sk_buff *skb, int offset, int len) { return __skb_nsg(skb, offset, len, 0); } static int tls_padding_length(struct tls_prot_info *prot, struct sk_buff *skb, struct tls_decrypt_arg *darg) { struct strp_msg *rxm = strp_msg(skb); struct tls_msg *tlm = tls_msg(skb); int sub = 0; /* Determine zero-padding length */ if (prot->version == TLS_1_3_VERSION) { int offset = rxm->full_len - TLS_TAG_SIZE - 1; char content_type = darg->zc ? darg->tail : 0; int err; while (content_type == 0) { if (offset < prot->prepend_size) return -EBADMSG; err = skb_copy_bits(skb, rxm->offset + offset, &content_type, 1); if (err) return err; if (content_type) break; sub++; offset--; } tlm->control = content_type; } return sub; } static void tls_decrypt_done(void *data, int err) { struct aead_request *aead_req = data; struct crypto_aead *aead = crypto_aead_reqtfm(aead_req); struct scatterlist *sgout = aead_req->dst; struct tls_sw_context_rx *ctx; struct tls_decrypt_ctx *dctx; struct tls_context *tls_ctx; struct scatterlist *sg; unsigned int pages; struct sock *sk; int aead_size; /* If requests get too backlogged crypto API returns -EBUSY and calls * ->complete(-EINPROGRESS) immediately followed by ->complete(0) * to make waiting for backlog to flush with crypto_wait_req() easier. * First wait converts -EBUSY -> -EINPROGRESS, and the second one * -EINPROGRESS -> 0. * We have a single struct crypto_async_request per direction, this * scheme doesn't help us, so just ignore the first ->complete(). */ if (err == -EINPROGRESS) return; aead_size = sizeof(*aead_req) + crypto_aead_reqsize(aead); aead_size = ALIGN(aead_size, __alignof__(*dctx)); dctx = (void *)((u8 *)aead_req + aead_size); sk = dctx->sk; tls_ctx = tls_get_ctx(sk); ctx = tls_sw_ctx_rx(tls_ctx); /* Propagate if there was an err */ if (err) { if (err == -EBADMSG) TLS_INC_STATS(sock_net(sk), LINUX_MIB_TLSDECRYPTERROR); ctx->async_wait.err = err; tls_err_abort(sk, err); } /* Free the destination pages if skb was not decrypted inplace */ if (dctx->free_sgout) { /* Skip the first S/G entry as it points to AAD */ for_each_sg(sg_next(sgout), sg, UINT_MAX, pages) { if (!sg) break; put_page(sg_page(sg)); } } kfree(aead_req); if (atomic_dec_and_test(&ctx->decrypt_pending)) complete(&ctx->async_wait.completion); } static int tls_decrypt_async_wait(struct tls_sw_context_rx *ctx) { if (!atomic_dec_and_test(&ctx->decrypt_pending)) crypto_wait_req(-EINPROGRESS, &ctx->async_wait); atomic_inc(&ctx->decrypt_pending); return ctx->async_wait.err; } static int tls_do_decryption(struct sock *sk, struct scatterlist *sgin, struct scatterlist *sgout, char *iv_recv, size_t data_len, struct aead_request *aead_req, struct tls_decrypt_arg *darg) { struct tls_context *tls_ctx = tls_get_ctx(sk); struct tls_prot_info *prot = &tls_ctx->prot_info; struct tls_sw_context_rx *ctx = tls_sw_ctx_rx(tls_ctx); int ret; aead_request_set_tfm(aead_req, ctx->aead_recv); aead_request_set_ad(aead_req, prot->aad_size); aead_request_set_crypt(aead_req, sgin, sgout, data_len + prot->tag_size, (u8 *)iv_recv); if (darg->async) { aead_request_set_callback(aead_req, CRYPTO_TFM_REQ_MAY_BACKLOG, tls_decrypt_done, aead_req); DEBUG_NET_WARN_ON_ONCE(atomic_read(&ctx->decrypt_pending) < 1); atomic_inc(&ctx->decrypt_pending); } else { DECLARE_CRYPTO_WAIT(wait); aead_request_set_callback(aead_req, CRYPTO_TFM_REQ_MAY_BACKLOG, crypto_req_done, &wait); ret = crypto_aead_decrypt(aead_req); if (ret == -EINPROGRESS || ret == -EBUSY) ret = crypto_wait_req(ret, &wait); return ret; } ret = crypto_aead_decrypt(aead_req); if (ret == -EINPROGRESS) return 0; if (ret == -EBUSY) { ret = tls_decrypt_async_wait(ctx); darg->async_done = true; /* all completions have run, we're not doing async anymore */ darg->async = false; return ret; } atomic_dec(&ctx->decrypt_pending); darg->async = false; return ret; } static void tls_trim_both_msgs(struct sock *sk, int target_size) { struct tls_context *tls_ctx = tls_get_ctx(sk); struct tls_prot_info *prot = &tls_ctx->prot_info; struct tls_sw_context_tx *ctx = tls_sw_ctx_tx(tls_ctx); struct tls_rec *rec = ctx->open_rec; sk_msg_trim(sk, &rec->msg_plaintext, target_size); if (target_size > 0) target_size += prot->overhead_size; sk_msg_trim(sk, &rec->msg_encrypted, target_size); } static int tls_alloc_encrypted_msg(struct sock *sk, int len) { struct tls_context *tls_ctx = tls_get_ctx(sk); struct tls_sw_context_tx *ctx = tls_sw_ctx_tx(tls_ctx); struct tls_rec *rec = ctx->open_rec; struct sk_msg *msg_en = &rec->msg_encrypted; return sk_msg_alloc(sk, msg_en, len, 0); } static int tls_clone_plaintext_msg(struct sock *sk, int required) { struct tls_context *tls_ctx = tls_get_ctx(sk); struct tls_prot_info *prot = &tls_ctx->prot_info; struct tls_sw_context_tx *ctx = tls_sw_ctx_tx(tls_ctx); struct tls_rec *rec = ctx->open_rec; struct sk_msg *msg_pl = &rec->msg_plaintext; struct sk_msg *msg_en = &rec->msg_encrypted; int skip, len; /* We add page references worth len bytes from encrypted sg * at the end of plaintext sg. It is guaranteed that msg_en * has enough required room (ensured by caller). */ len = required - msg_pl->sg.size; /* Skip initial bytes in msg_en's data to be able to use * same offset of both plain and encrypted data. */ skip = prot->prepend_size + msg_pl->sg.size; return sk_msg_clone(sk, msg_pl, msg_en, skip, len); } static struct tls_rec *tls_get_rec(struct sock *sk) { struct tls_context *tls_ctx = tls_get_ctx(sk); struct tls_prot_info *prot = &tls_ctx->prot_info; struct tls_sw_context_tx *ctx = tls_sw_ctx_tx(tls_ctx); struct sk_msg *msg_pl, *msg_en; struct tls_rec *rec; int mem_size; mem_size = sizeof(struct tls_rec) + crypto_aead_reqsize(ctx->aead_send); rec = kzalloc(mem_size, sk->sk_allocation); if (!rec) return NULL; msg_pl = &rec->msg_plaintext; msg_en = &rec->msg_encrypted; sk_msg_init(msg_pl); sk_msg_init(msg_en); sg_init_table(rec->sg_aead_in, 2); sg_set_buf(&rec->sg_aead_in[0], rec->aad_space, prot->aad_size); sg_unmark_end(&rec->sg_aead_in[1]); sg_init_table(rec->sg_aead_out, 2); sg_set_buf(&rec->sg_aead_out[0], rec->aad_space, prot->aad_size); sg_unmark_end(&rec->sg_aead_out[1]); rec->sk = sk; return rec; } static void tls_free_rec(struct sock *sk, struct tls_rec *rec) { sk_msg_free(sk, &rec->msg_encrypted); sk_msg_free(sk, &rec->msg_plaintext); kfree(rec); } static void tls_free_open_rec(struct sock *sk) { struct tls_context *tls_ctx = tls_get_ctx(sk); struct tls_sw_context_tx *ctx = tls_sw_ctx_tx(tls_ctx); struct tls_rec *rec = ctx->open_rec; if (rec) { tls_free_rec(sk, rec); ctx->open_rec = NULL; } } int tls_tx_records(struct sock *sk, int flags) { struct tls_context *tls_ctx = tls_get_ctx(sk); struct tls_sw_context_tx *ctx = tls_sw_ctx_tx(tls_ctx); struct tls_rec *rec, *tmp; struct sk_msg *msg_en; int tx_flags, rc = 0; if (tls_is_partially_sent_record(tls_ctx)) { rec = list_first_entry(&ctx->tx_list, struct tls_rec, list); if (flags == -1) tx_flags = rec->tx_flags; else tx_flags = flags; rc = tls_push_partial_record(sk, tls_ctx, tx_flags); if (rc) goto tx_err; /* Full record has been transmitted. * Remove the head of tx_list */ list_del(&rec->list); sk_msg_free(sk, &rec->msg_plaintext); kfree(rec); } /* Tx all ready records */ list_for_each_entry_safe(rec, tmp, &ctx->tx_list, list) { if (READ_ONCE(rec->tx_ready)) { if (flags == -1) tx_flags = rec->tx_flags; else tx_flags = flags; msg_en = &rec->msg_encrypted; rc = tls_push_sg(sk, tls_ctx, &msg_en->sg.data[msg_en->sg.curr], 0, tx_flags); if (rc) goto tx_err; list_del(&rec->list); sk_msg_free(sk, &rec->msg_plaintext); kfree(rec); } else { break; } } tx_err: if (rc < 0 && rc != -EAGAIN) tls_err_abort(sk, rc); return rc; } static void tls_encrypt_done(void *data, int err) { struct tls_sw_context_tx *ctx; struct tls_context *tls_ctx; struct tls_prot_info *prot; struct tls_rec *rec = data; struct scatterlist *sge; struct sk_msg *msg_en; struct sock *sk; if (err == -EINPROGRESS) /* see the comment in tls_decrypt_done() */ return; msg_en = &rec->msg_encrypted; sk = rec->sk; tls_ctx = tls_get_ctx(sk); prot = &tls_ctx->prot_info; ctx = tls_sw_ctx_tx(tls_ctx); sge = sk_msg_elem(msg_en, msg_en->sg.curr); sge->offset -= prot->prepend_size; sge->length += prot->prepend_size; /* Check if error is previously set on socket */ if (err || sk->sk_err) { rec = NULL; /* If err is already set on socket, return the same code */ if (sk->sk_err) { ctx->async_wait.err = -sk->sk_err; } else { ctx->async_wait.err = err; tls_err_abort(sk, err); } } if (rec) { struct tls_rec *first_rec; /* Mark the record as ready for transmission */ smp_store_mb(rec->tx_ready, true); /* If received record is at head of tx_list, schedule tx */ first_rec = list_first_entry(&ctx->tx_list, struct tls_rec, list); if (rec == first_rec) { /* Schedule the transmission */ if (!test_and_set_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask)) schedule_delayed_work(&ctx->tx_work.work, 1); } } if (atomic_dec_and_test(&ctx->encrypt_pending)) complete(&ctx->async_wait.completion); } static int tls_encrypt_async_wait(struct tls_sw_context_tx *ctx) { if (!atomic_dec_and_test(&ctx->encrypt_pending)) crypto_wait_req(-EINPROGRESS, &ctx->async_wait); atomic_inc(&ctx->encrypt_pending); return ctx->async_wait.err; } static int tls_do_encryption(struct sock *sk, struct tls_context *tls_ctx, struct tls_sw_context_tx *ctx, struct aead_request *aead_req, size_t data_len, u32 start) { struct tls_prot_info *prot = &tls_ctx->prot_info; struct tls_rec *rec = ctx->open_rec; struct sk_msg *msg_en = &rec->msg_encrypted; struct scatterlist *sge = sk_msg_elem(msg_en, start); int rc, iv_offset = 0; /* For CCM based ciphers, first byte of IV is a constant */ switch (prot->cipher_type) { case TLS_CIPHER_AES_CCM_128: rec->iv_data[0] = TLS_AES_CCM_IV_B0_BYTE; iv_offset = 1; break; case TLS_CIPHER_SM4_CCM: rec->iv_data[0] = TLS_SM4_CCM_IV_B0_BYTE; iv_offset = 1; break; } memcpy(&rec->iv_data[iv_offset], tls_ctx->tx.iv, prot->iv_size + prot->salt_size); tls_xor_iv_with_seq(prot, rec->iv_data + iv_offset, tls_ctx->tx.rec_seq); sge->offset += prot->prepend_size; sge->length -= prot->prepend_size; msg_en->sg.curr = start; aead_request_set_tfm(aead_req, ctx->aead_send); aead_request_set_ad(aead_req, prot->aad_size); aead_request_set_crypt(aead_req, rec->sg_aead_in, rec->sg_aead_out, data_len, rec->iv_data); aead_request_set_callback(aead_req, CRYPTO_TFM_REQ_MAY_BACKLOG, tls_encrypt_done, rec); /* Add the record in tx_list */ list_add_tail((struct list_head *)&rec->list, &ctx->tx_list); DEBUG_NET_WARN_ON_ONCE(atomic_read(&ctx->encrypt_pending) < 1); atomic_inc(&ctx->encrypt_pending); rc = crypto_aead_encrypt(aead_req); if (rc == -EBUSY) { rc = tls_encrypt_async_wait(ctx); rc = rc ?: -EINPROGRESS; } if (!rc || rc != -EINPROGRESS) { atomic_dec(&ctx->encrypt_pending); sge->offset -= prot->prepend_size; sge->length += prot->prepend_size; } if (!rc) { WRITE_ONCE(rec->tx_ready, true); } else if (rc != -EINPROGRESS) { list_del(&rec->list); return rc; } /* Unhook the record from context if encryption is not failure */ ctx->open_rec = NULL; tls_advance_record_sn(sk, prot, &tls_ctx->tx); return rc; } static int tls_split_open_record(struct sock *sk, struct tls_rec *from, struct tls_rec **to, struct sk_msg *msg_opl, struct sk_msg *msg_oen, u32 split_point, u32 tx_overhead_size, u32 *orig_end) { u32 i, j, bytes = 0, apply = msg_opl->apply_bytes; struct scatterlist *sge, *osge, *nsge; u32 orig_size = msg_opl->sg.size; struct scatterlist tmp = { }; struct sk_msg *msg_npl; struct tls_rec *new; int ret; new = tls_get_rec(sk); if (!new) return -ENOMEM; ret = sk_msg_alloc(sk, &new->msg_encrypted, msg_opl->sg.size + tx_overhead_size, 0); if (ret < 0) { tls_free_rec(sk, new); return ret; } *orig_end = msg_opl->sg.end; i = msg_opl->sg.start; sge = sk_msg_elem(msg_opl, i); while (apply && sge->length) { if (sge->length > apply) { u32 len = sge->length - apply; get_page(sg_page(sge)); sg_set_page(&tmp, sg_page(sge), len, sge->offset + apply); sge->length = apply; bytes += apply; apply = 0; } else { apply -= sge->length; bytes += sge->length; } sk_msg_iter_var_next(i); if (i == msg_opl->sg.end) break; sge = sk_msg_elem(msg_opl, i); } msg_opl->sg.end = i; msg_opl->sg.curr = i; msg_opl->sg.copybreak = 0; msg_opl->apply_bytes = 0; msg_opl->sg.size = bytes; msg_npl = &new->msg_plaintext; msg_npl->apply_bytes = apply; msg_npl->sg.size = orig_size - bytes; j = msg_npl->sg.start; nsge = sk_msg_elem(msg_npl, j); if (tmp.length) { memcpy(nsge, &tmp, sizeof(*nsge)); sk_msg_iter_var_next(j); nsge = sk_msg_elem(msg_npl, j); } osge = sk_msg_elem(msg_opl, i); while (osge->length) { memcpy(nsge, osge, sizeof(*nsge)); sg_unmark_end(nsge); sk_msg_iter_var_next(i); sk_msg_iter_var_next(j); if (i == *orig_end) break; osge = sk_msg_elem(msg_opl, i); nsge = sk_msg_elem(msg_npl, j); } msg_npl->sg.end = j; msg_npl->sg.curr = j; msg_npl->sg.copybreak = 0; *to = new; return 0; } static void tls_merge_open_record(struct sock *sk, struct tls_rec *to, struct tls_rec *from, u32 orig_end) { struct sk_msg *msg_npl = &from->msg_plaintext; struct sk_msg *msg_opl = &to->msg_plaintext; struct scatterlist *osge, *nsge; u32 i, j; i = msg_opl->sg.end; sk_msg_iter_var_prev(i); j = msg_npl->sg.start; osge = sk_msg_elem(msg_opl, i); nsge = sk_msg_elem(msg_npl, j); if (sg_page(osge) == sg_page(nsge) && osge->offset + osge->length == nsge->offset) { osge->length += nsge->length; put_page(sg_page(nsge)); } msg_opl->sg.end = orig_end; msg_opl->sg.curr = orig_end; msg_opl->sg.copybreak = 0; msg_opl->apply_bytes = msg_opl->sg.size + msg_npl->sg.size; msg_opl->sg.size += msg_npl->sg.size; sk_msg_free(sk, &to->msg_encrypted); sk_msg_xfer_full(&to->msg_encrypted, &from->msg_encrypted); kfree(from); } static int tls_push_record(struct sock *sk, int flags, unsigned char record_type) { struct tls_context *tls_ctx = tls_get_ctx(sk); struct tls_prot_info *prot = &tls_ctx->prot_info; struct tls_sw_context_tx *ctx = tls_sw_ctx_tx(tls_ctx); struct tls_rec *rec = ctx->open_rec, *tmp = NULL; u32 i, split_point, orig_end; struct sk_msg *msg_pl, *msg_en; struct aead_request *req; bool split; int rc; if (!rec) return 0; msg_pl = &rec->msg_plaintext; msg_en = &rec->msg_encrypted; split_point = msg_pl->apply_bytes; split = split_point && split_point < msg_pl->sg.size; if (unlikely((!split && msg_pl->sg.size + prot->overhead_size > msg_en->sg.size) || (split && split_point + prot->overhead_size > msg_en->sg.size))) { split = true; split_point = msg_en->sg.size; } if (split) { rc = tls_split_open_record(sk, rec, &tmp, msg_pl, msg_en, split_point, prot->overhead_size, &orig_end); if (rc < 0) return rc; /* This can happen if above tls_split_open_record allocates * a single large encryption buffer instead of two smaller * ones. In this case adjust pointers and continue without * split. */ if (!msg_pl->sg.size) { tls_merge_open_record(sk, rec, tmp, orig_end); msg_pl = &rec->msg_plaintext; msg_en = &rec->msg_encrypted; split = false; } sk_msg_trim(sk, msg_en, msg_pl->sg.size + prot->overhead_size); } rec->tx_flags = flags; req = &rec->aead_req; i = msg_pl->sg.end; sk_msg_iter_var_prev(i); rec->content_type = record_type; if (prot->version == TLS_1_3_VERSION) { /* Add content type to end of message. No padding added */ sg_set_buf(&rec->sg_content_type, &rec->content_type, 1); sg_mark_end(&rec->sg_content_type); sg_chain(msg_pl->sg.data, msg_pl->sg.end + 1, &rec->sg_content_type); } else { sg_mark_end(sk_msg_elem(msg_pl, i)); } if (msg_pl->sg.end < msg_pl->sg.start) { sg_chain(&msg_pl->sg.data[msg_pl->sg.start], MAX_SKB_FRAGS - msg_pl->sg.start + 1, msg_pl->sg.data); } i = msg_pl->sg.start; sg_chain(rec->sg_aead_in, 2, &msg_pl->sg.data[i]); i = msg_en->sg.end; sk_msg_iter_var_prev(i); sg_mark_end(sk_msg_elem(msg_en, i)); i = msg_en->sg.start; sg_chain(rec->sg_aead_out, 2, &msg_en->sg.data[i]); tls_make_aad(rec->aad_space, msg_pl->sg.size + prot->tail_size, tls_ctx->tx.rec_seq, record_type, prot); tls_fill_prepend(tls_ctx, page_address(sg_page(&msg_en->sg.data[i])) + msg_en->sg.data[i].offset, msg_pl->sg.size + prot->tail_size, record_type); tls_ctx->pending_open_record_frags = false; rc = tls_do_encryption(sk, tls_ctx, ctx, req, msg_pl->sg.size + prot->tail_size, i); if (rc < 0) { if (rc != -EINPROGRESS) { tls_err_abort(sk, -EBADMSG); if (split) { tls_ctx->pending_open_record_frags = true; tls_merge_open_record(sk, rec, tmp, orig_end); } } ctx->async_capable = 1; return rc; } else if (split) { msg_pl = &tmp->msg_plaintext; msg_en = &tmp->msg_encrypted; sk_msg_trim(sk, msg_en, msg_pl->sg.size + prot->overhead_size); tls_ctx->pending_open_record_frags = true; ctx->open_rec = tmp; } return tls_tx_records(sk, flags); } static int bpf_exec_tx_verdict(struct sk_msg *msg, struct sock *sk, bool full_record, u8 record_type, ssize_t *copied, int flags) { struct tls_context *tls_ctx = tls_get_ctx(sk); struct tls_sw_context_tx *ctx = tls_sw_ctx_tx(tls_ctx); struct sk_msg msg_redir = { }; struct sk_psock *psock; struct sock *sk_redir; struct tls_rec *rec; bool enospc, policy, redir_ingress; int err = 0, send; u32 delta = 0; policy = !(flags & MSG_SENDPAGE_NOPOLICY); psock = sk_psock_get(sk); if (!psock || !policy) { err = tls_push_record(sk, flags, record_type); if (err && err != -EINPROGRESS && sk->sk_err == EBADMSG) { *copied -= sk_msg_free(sk, msg); tls_free_open_rec(sk); err = -sk->sk_err; } if (psock) sk_psock_put(sk, psock); return err; } more_data: enospc = sk_msg_full(msg); if (psock->eval == __SK_NONE) { delta = msg->sg.size; psock->eval = sk_psock_msg_verdict(sk, psock, msg); delta -= msg->sg.size; } if (msg->cork_bytes && msg->cork_bytes > msg->sg.size && !enospc && !full_record) { err = -ENOSPC; goto out_err; } msg->cork_bytes = 0; send = msg->sg.size; if (msg->apply_bytes && msg->apply_bytes < send) send = msg->apply_bytes; switch (psock->eval) { case __SK_PASS: err = tls_push_record(sk, flags, record_type); if (err && err != -EINPROGRESS && sk->sk_err == EBADMSG) { *copied -= sk_msg_free(sk, msg); tls_free_open_rec(sk); err = -sk->sk_err; goto out_err; } break; case __SK_REDIRECT: redir_ingress = psock->redir_ingress; sk_redir = psock->sk_redir; memcpy(&msg_redir, msg, sizeof(*msg)); if (msg->apply_bytes < send) msg->apply_bytes = 0; else msg->apply_bytes -= send; sk_msg_return_zero(sk, msg, send); msg->sg.size -= send; release_sock(sk); err = tcp_bpf_sendmsg_redir(sk_redir, redir_ingress, &msg_redir, send, flags); lock_sock(sk); if (err < 0) { /* Regardless of whether the data represented by * msg_redir is sent successfully, we have already * uncharged it via sk_msg_return_zero(). The * msg->sg.size represents the remaining unprocessed * data, which needs to be uncharged here. */ sk_mem_uncharge(sk, msg->sg.size); *copied -= sk_msg_free_nocharge(sk, &msg_redir); msg->sg.size = 0; } if (msg->sg.size == 0) tls_free_open_rec(sk); break; case __SK_DROP: default: sk_msg_free_partial(sk, msg, send); if (msg->apply_bytes < send) msg->apply_bytes = 0; else msg->apply_bytes -= send; if (msg->sg.size == 0) tls_free_open_rec(sk); *copied -= (send + delta); err = -EACCES; } if (likely(!err)) { bool reset_eval = !ctx->open_rec; rec = ctx->open_rec; if (rec) { msg = &rec->msg_plaintext; if (!msg->apply_bytes) reset_eval = true; } if (reset_eval) { psock->eval = __SK_NONE; if (psock->sk_redir) { sock_put(psock->sk_redir); psock->sk_redir = NULL; } } if (rec) goto more_data; } out_err: sk_psock_put(sk, psock); return err; } static int tls_sw_push_pending_record(struct sock *sk, int flags) { struct tls_context *tls_ctx = tls_get_ctx(sk); struct tls_sw_context_tx *ctx = tls_sw_ctx_tx(tls_ctx); struct tls_rec *rec = ctx->open_rec; struct sk_msg *msg_pl; size_t copied; if (!rec) return 0; msg_pl = &rec->msg_plaintext; copied = msg_pl->sg.size; if (!copied) return 0; return bpf_exec_tx_verdict(msg_pl, sk, true, TLS_RECORD_TYPE_DATA, &copied, flags); } static int tls_sw_sendmsg_splice(struct sock *sk, struct msghdr *msg, struct sk_msg *msg_pl, size_t try_to_copy, ssize_t *copied) { struct page *page = NULL, **pages = &page; do { ssize_t part; size_t off; part = iov_iter_extract_pages(&msg->msg_iter, &pages, try_to_copy, 1, 0, &off); if (part <= 0) return part ?: -EIO; if (WARN_ON_ONCE(!sendpage_ok(page))) { iov_iter_revert(&msg->msg_iter, part); return -EIO; } sk_msg_page_add(msg_pl, page, part, off); msg_pl->sg.copybreak = 0; msg_pl->sg.curr = msg_pl->sg.end; sk_mem_charge(sk, part); *copied += part; try_to_copy -= part; } while (try_to_copy && !sk_msg_full(msg_pl)); return 0; } static int tls_sw_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size) { long timeo = sock_sndtimeo(sk, msg->msg_flags & MSG_DONTWAIT); struct tls_context *tls_ctx = tls_get_ctx(sk); struct tls_prot_info *prot = &tls_ctx->prot_info; struct tls_sw_context_tx *ctx = tls_sw_ctx_tx(tls_ctx); bool async_capable = ctx->async_capable; unsigned char record_type = TLS_RECORD_TYPE_DATA; bool is_kvec = iov_iter_is_kvec(&msg->msg_iter); bool eor = !(msg->msg_flags & MSG_MORE); size_t try_to_copy; ssize_t copied = 0; struct sk_msg *msg_pl, *msg_en; struct tls_rec *rec; int required_size; int num_async = 0; bool full_record; int record_room; int num_zc = 0; int orig_size; int ret = 0; if (!eor && (msg->msg_flags & MSG_EOR)) return -EINVAL; if (unlikely(msg->msg_controllen)) { ret = tls_process_cmsg(sk, msg, &record_type); if (ret) { if (ret == -EINPROGRESS) num_async++; else if (ret != -EAGAIN) goto send_end; } } while (msg_data_left(msg)) { if (sk->sk_err) { ret = -sk->sk_err; goto send_end; } if (ctx->open_rec) rec = ctx->open_rec; else rec = ctx->open_rec = tls_get_rec(sk); if (!rec) { ret = -ENOMEM; goto send_end; } msg_pl = &rec->msg_plaintext; msg_en = &rec->msg_encrypted; orig_size = msg_pl->sg.size; full_record = false; try_to_copy = msg_data_left(msg); record_room = TLS_MAX_PAYLOAD_SIZE - msg_pl->sg.size; if (try_to_copy >= record_room) { try_to_copy = record_room; full_record = true; } required_size = msg_pl->sg.size + try_to_copy + prot->overhead_size; if (!sk_stream_memory_free(sk)) goto wait_for_sndbuf; alloc_encrypted: ret = tls_alloc_encrypted_msg(sk, required_size); if (ret) { if (ret != -ENOSPC) goto wait_for_memory; /* Adjust try_to_copy according to the amount that was * actually allocated. The difference is due * to max sg elements limit */ try_to_copy -= required_size - msg_en->sg.size; full_record = true; } if (try_to_copy && (msg->msg_flags & MSG_SPLICE_PAGES)) { ret = tls_sw_sendmsg_splice(sk, msg, msg_pl, try_to_copy, &copied); if (ret < 0) goto send_end; tls_ctx->pending_open_record_frags = true; if (sk_msg_full(msg_pl)) full_record = true; if (full_record || eor) goto copied; continue; } if (!is_kvec && (full_record || eor) && !async_capable) { u32 first = msg_pl->sg.end; ret = sk_msg_zerocopy_from_iter(sk, &msg->msg_iter, msg_pl, try_to_copy); if (ret) goto fallback_to_reg_send; num_zc++; copied += try_to_copy; sk_msg_sg_copy_set(msg_pl, first); ret = bpf_exec_tx_verdict(msg_pl, sk, full_record, record_type, &copied, msg->msg_flags); if (ret) { if (ret == -EINPROGRESS) num_async++; else if (ret == -ENOMEM) goto wait_for_memory; else if (ctx->open_rec && ret == -ENOSPC) { if (msg_pl->cork_bytes) { ret = 0; goto send_end; } goto rollback_iter; } else if (ret != -EAGAIN) goto send_end; } continue; rollback_iter: copied -= try_to_copy; sk_msg_sg_copy_clear(msg_pl, first); iov_iter_revert(&msg->msg_iter, msg_pl->sg.size - orig_size); fallback_to_reg_send: sk_msg_trim(sk, msg_pl, orig_size); } required_size = msg_pl->sg.size + try_to_copy; ret = tls_clone_plaintext_msg(sk, required_size); if (ret) { if (ret != -ENOSPC) goto send_end; /* Adjust try_to_copy according to the amount that was * actually allocated. The difference is due * to max sg elements limit */ try_to_copy -= required_size - msg_pl->sg.size; full_record = true; sk_msg_trim(sk, msg_en, msg_pl->sg.size + prot->overhead_size); } if (try_to_copy) { ret = sk_msg_memcopy_from_iter(sk, &msg->msg_iter, msg_pl, try_to_copy); if (ret < 0) goto trim_sgl; } /* Open records defined only if successfully copied, otherwise * we would trim the sg but not reset the open record frags. */ tls_ctx->pending_open_record_frags = true; copied += try_to_copy; copied: if (full_record || eor) { ret = bpf_exec_tx_verdict(msg_pl, sk, full_record, record_type, &copied, msg->msg_flags); if (ret) { if (ret == -EINPROGRESS) num_async++; else if (ret == -ENOMEM) goto wait_for_memory; else if (ret != -EAGAIN) { if (ret == -ENOSPC) ret = 0; goto send_end; } } } continue; wait_for_sndbuf: set_bit(SOCK_NOSPACE, &sk->sk_socket->flags); wait_for_memory: ret = sk_stream_wait_memory(sk, &timeo); if (ret) { trim_sgl: if (ctx->open_rec) tls_trim_both_msgs(sk, orig_size); goto send_end; } if (ctx->open_rec && msg_en->sg.size < required_size) goto alloc_encrypted; } if (!num_async) { goto send_end; } else if (num_zc || eor) { int err; /* Wait for pending encryptions to get completed */ err = tls_encrypt_async_wait(ctx); if (err) { ret = err; copied = 0; } } /* Transmit if any encryptions have completed */ if (test_and_clear_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask)) { cancel_delayed_work(&ctx->tx_work.work); tls_tx_records(sk, msg->msg_flags); } send_end: ret = sk_stream_error(sk, msg->msg_flags, ret); return copied > 0 ? copied : ret; } int tls_sw_sendmsg(struct sock *sk, struct msghdr *msg, size_t size) { struct tls_context *tls_ctx = tls_get_ctx(sk); int ret; if (msg->msg_flags & ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL | MSG_CMSG_COMPAT | MSG_SPLICE_PAGES | MSG_EOR | MSG_SENDPAGE_NOPOLICY)) return -EOPNOTSUPP; ret = mutex_lock_interruptible(&tls_ctx->tx_lock); if (ret) return ret; lock_sock(sk); ret = tls_sw_sendmsg_locked(sk, msg, size); release_sock(sk); mutex_unlock(&tls_ctx->tx_lock); return ret; } /* * Handle unexpected EOF during splice without SPLICE_F_MORE set. */ void tls_sw_splice_eof(struct socket *sock) { struct sock *sk = sock->sk; struct tls_context *tls_ctx = tls_get_ctx(sk); struct tls_sw_context_tx *ctx = tls_sw_ctx_tx(tls_ctx); struct tls_rec *rec; struct sk_msg *msg_pl; ssize_t copied = 0; bool retrying = false; int ret = 0; if (!ctx->open_rec) return; mutex_lock(&tls_ctx->tx_lock); lock_sock(sk); retry: /* same checks as in tls_sw_push_pending_record() */ rec = ctx->open_rec; if (!rec) goto unlock; msg_pl = &rec->msg_plaintext; if (msg_pl->sg.size == 0) goto unlock; /* Check the BPF advisor and perform transmission. */ ret = bpf_exec_tx_verdict(msg_pl, sk, false, TLS_RECORD_TYPE_DATA, &copied, 0); switch (ret) { case 0: case -EAGAIN: if (retrying) goto unlock; retrying = true; goto retry; case -EINPROGRESS: break; default: goto unlock; } /* Wait for pending encryptions to get completed */ if (tls_encrypt_async_wait(ctx)) goto unlock; /* Transmit if any encryptions have completed */ if (test_and_clear_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask)) { cancel_delayed_work(&ctx->tx_work.work); tls_tx_records(sk, 0); } unlock: release_sock(sk); mutex_unlock(&tls_ctx->tx_lock); } static int tls_rx_rec_wait(struct sock *sk, struct sk_psock *psock, bool nonblock, bool released) { struct tls_context *tls_ctx = tls_get_ctx(sk); struct tls_sw_context_rx *ctx = tls_sw_ctx_rx(tls_ctx); DEFINE_WAIT_FUNC(wait, woken_wake_function); int ret = 0; long timeo; /* a rekey is pending, let userspace deal with it */ if (unlikely(ctx->key_update_pending)) return -EKEYEXPIRED; timeo = sock_rcvtimeo(sk, nonblock); while (!tls_strp_msg_ready(ctx)) { if (!sk_psock_queue_empty(psock)) return 0; if (sk->sk_err) return sock_error(sk); if (ret < 0) return ret; if (!skb_queue_empty(&sk->sk_receive_queue)) { tls_strp_check_rcv(&ctx->strp); if (tls_strp_msg_ready(ctx)) break; } if (sk->sk_shutdown & RCV_SHUTDOWN) return 0; if (sock_flag(sk, SOCK_DONE)) return 0; if (!timeo) return -EAGAIN; released = true; add_wait_queue(sk_sleep(sk), &wait); sk_set_bit(SOCKWQ_ASYNC_WAITDATA, sk); ret = sk_wait_event(sk, &timeo, tls_strp_msg_ready(ctx) || !sk_psock_queue_empty(psock), &wait); sk_clear_bit(SOCKWQ_ASYNC_WAITDATA, sk); remove_wait_queue(sk_sleep(sk), &wait); /* Handle signals */ if (signal_pending(current)) return sock_intr_errno(timeo); } tls_strp_msg_load(&ctx->strp, released); return 1; } static int tls_setup_from_iter(struct iov_iter *from, int length, int *pages_used, struct scatterlist *to, int to_max_pages) { int rc = 0, i = 0, num_elem = *pages_used, maxpages; struct page *pages[MAX_SKB_FRAGS]; unsigned int size = 0; ssize_t copied, use; size_t offset; while (length > 0) { i = 0; maxpages = to_max_pages - num_elem; if (maxpages == 0) { rc = -EFAULT; goto out; } copied = iov_iter_get_pages2(from, pages, length, maxpages, &offset); if (copied <= 0) { rc = -EFAULT; goto out; } length -= copied; size += copied; while (copied) { use = min_t(int, copied, PAGE_SIZE - offset); sg_set_page(&to[num_elem], pages[i], use, offset); sg_unmark_end(&to[num_elem]); /* We do not uncharge memory from this API */ offset = 0; copied -= use; i++; num_elem++; } } /* Mark the end in the last sg entry if newly added */ if (num_elem > *pages_used) sg_mark_end(&to[num_elem - 1]); out: if (rc) iov_iter_revert(from, size); *pages_used = num_elem; return rc; } static struct sk_buff * tls_alloc_clrtxt_skb(struct sock *sk, struct sk_buff *skb, unsigned int full_len) { struct strp_msg *clr_rxm; struct sk_buff *clr_skb; int err; clr_skb = alloc_skb_with_frags(0, full_len, TLS_PAGE_ORDER, &err, sk->sk_allocation); if (!clr_skb) return NULL; skb_copy_header(clr_skb, skb); clr_skb->len = full_len; clr_skb->data_len = full_len; clr_rxm = strp_msg(clr_skb); clr_rxm->offset = 0; return clr_skb; } /* Decrypt handlers * * tls_decrypt_sw() and tls_decrypt_device() are decrypt handlers. * They must transform the darg in/out argument are as follows: * | Input | Output * ------------------------------------------------------------------- * zc | Zero-copy decrypt allowed | Zero-copy performed * async | Async decrypt allowed | Async crypto used / in progress * skb | * | Output skb * * If ZC decryption was performed darg.skb will point to the input skb. */ /* This function decrypts the input skb into either out_iov or in out_sg * or in skb buffers itself. The input parameter 'darg->zc' indicates if * zero-copy mode needs to be tried or not. With zero-copy mode, either * out_iov or out_sg must be non-NULL. In case both out_iov and out_sg are * NULL, then the decryption happens inside skb buffers itself, i.e. * zero-copy gets disabled and 'darg->zc' is updated. */ static int tls_decrypt_sg(struct sock *sk, struct iov_iter *out_iov, struct scatterlist *out_sg, struct tls_decrypt_arg *darg) { struct tls_context *tls_ctx = tls_get_ctx(sk); struct tls_sw_context_rx *ctx = tls_sw_ctx_rx(tls_ctx); struct tls_prot_info *prot = &tls_ctx->prot_info; int n_sgin, n_sgout, aead_size, err, pages = 0; struct sk_buff *skb = tls_strp_msg(ctx); const struct strp_msg *rxm = strp_msg(skb); const struct tls_msg *tlm = tls_msg(skb); struct aead_request *aead_req; struct scatterlist *sgin = NULL; struct scatterlist *sgout = NULL; const int data_len = rxm->full_len - prot->overhead_size; int tail_pages = !!prot->tail_size; struct tls_decrypt_ctx *dctx; struct sk_buff *clear_skb; int iv_offset = 0; u8 *mem; n_sgin = skb_nsg(skb, rxm->offset + prot->prepend_size, rxm->full_len - prot->prepend_size); if (n_sgin < 1) return n_sgin ?: -EBADMSG; if (darg->zc && (out_iov || out_sg)) { clear_skb = NULL; if (out_iov) n_sgout = 1 + tail_pages + iov_iter_npages_cap(out_iov, INT_MAX, data_len); else n_sgout = sg_nents(out_sg); } else { darg->zc = false; clear_skb = tls_alloc_clrtxt_skb(sk, skb, rxm->full_len); if (!clear_skb) return -ENOMEM; n_sgout = 1 + skb_shinfo(clear_skb)->nr_frags; } /* Increment to accommodate AAD */ n_sgin = n_sgin + 1; /* Allocate a single block of memory which contains * aead_req || tls_decrypt_ctx. * Both structs are variable length. */ aead_size = sizeof(*aead_req) + crypto_aead_reqsize(ctx->aead_recv); aead_size = ALIGN(aead_size, __alignof__(*dctx)); mem = kmalloc(aead_size + struct_size(dctx, sg, size_add(n_sgin, n_sgout)), sk->sk_allocation); if (!mem) { err = -ENOMEM; goto exit_free_skb; } /* Segment the allocated memory */ aead_req = (struct aead_request *)mem; dctx = (struct tls_decrypt_ctx *)(mem + aead_size); dctx->sk = sk; sgin = &dctx->sg[0]; sgout = &dctx->sg[n_sgin]; /* For CCM based ciphers, first byte of nonce+iv is a constant */ switch (prot->cipher_type) { case TLS_CIPHER_AES_CCM_128: dctx->iv[0] = TLS_AES_CCM_IV_B0_BYTE; iv_offset = 1; break; case TLS_CIPHER_SM4_CCM: dctx->iv[0] = TLS_SM4_CCM_IV_B0_BYTE; iv_offset = 1; break; } /* Prepare IV */ if (prot->version == TLS_1_3_VERSION || prot->cipher_type == TLS_CIPHER_CHACHA20_POLY1305) { memcpy(&dctx->iv[iv_offset], tls_ctx->rx.iv, prot->iv_size + prot->salt_size); } else { err = skb_copy_bits(skb, rxm->offset + TLS_HEADER_SIZE, &dctx->iv[iv_offset] + prot->salt_size, prot->iv_size); if (err < 0) goto exit_free; memcpy(&dctx->iv[iv_offset], tls_ctx->rx.iv, prot->salt_size); } tls_xor_iv_with_seq(prot, &dctx->iv[iv_offset], tls_ctx->rx.rec_seq); /* Prepare AAD */ tls_make_aad(dctx->aad, rxm->full_len - prot->overhead_size + prot->tail_size, tls_ctx->rx.rec_seq, tlm->control, prot); /* Prepare sgin */ sg_init_table(sgin, n_sgin); sg_set_buf(&sgin[0], dctx->aad, prot->aad_size); err = skb_to_sgvec(skb, &sgin[1], rxm->offset + prot->prepend_size, rxm->full_len - prot->prepend_size); if (err < 0) goto exit_free; if (clear_skb) { sg_init_table(sgout, n_sgout); sg_set_buf(&sgout[0], dctx->aad, prot->aad_size); err = skb_to_sgvec(clear_skb, &sgout[1], prot->prepend_size, data_len + prot->tail_size); if (err < 0) goto exit_free; } else if (out_iov) { sg_init_table(sgout, n_sgout); sg_set_buf(&sgout[0], dctx->aad, prot->aad_size); err = tls_setup_from_iter(out_iov, data_len, &pages, &sgout[1], (n_sgout - 1 - tail_pages)); if (err < 0) goto exit_free_pages; if (prot->tail_size) { sg_unmark_end(&sgout[pages]); sg_set_buf(&sgout[pages + 1], &dctx->tail, prot->tail_size); sg_mark_end(&sgout[pages + 1]); } } else if (out_sg) { memcpy(sgout, out_sg, n_sgout * sizeof(*sgout)); } dctx->free_sgout = !!pages; /* Prepare and submit AEAD request */ err = tls_do_decryption(sk, sgin, sgout, dctx->iv, data_len + prot->tail_size, aead_req, darg); if (err) { if (darg->async_done) goto exit_free_skb; goto exit_free_pages; } darg->skb = clear_skb ?: tls_strp_msg(ctx); clear_skb = NULL; if (unlikely(darg->async)) { err = tls_strp_msg_hold(&ctx->strp, &ctx->async_hold); if (err) __skb_queue_tail(&ctx->async_hold, darg->skb); return err; } if (unlikely(darg->async_done)) return 0; if (prot->tail_size) darg->tail = dctx->tail; exit_free_pages: /* Release the pages in case iov was mapped to pages */ for (; pages > 0; pages--) put_page(sg_page(&sgout[pages])); exit_free: kfree(mem); exit_free_skb: consume_skb(clear_skb); return err; } static int tls_decrypt_sw(struct sock *sk, struct tls_context *tls_ctx, struct msghdr *msg, struct tls_decrypt_arg *darg) { struct tls_sw_context_rx *ctx = tls_sw_ctx_rx(tls_ctx); struct tls_prot_info *prot = &tls_ctx->prot_info; struct strp_msg *rxm; int pad, err; err = tls_decrypt_sg(sk, &msg->msg_iter, NULL, darg); if (err < 0) { if (err == -EBADMSG) TLS_INC_STATS(sock_net(sk), LINUX_MIB_TLSDECRYPTERROR); return err; } /* keep going even for ->async, the code below is TLS 1.3 */ /* If opportunistic TLS 1.3 ZC failed retry without ZC */ if (unlikely(darg->zc && prot->version == TLS_1_3_VERSION && darg->tail != TLS_RECORD_TYPE_DATA)) { darg->zc = false; if (!darg->tail) TLS_INC_STATS(sock_net(sk), LINUX_MIB_TLSRXNOPADVIOL); TLS_INC_STATS(sock_net(sk), LINUX_MIB_TLSDECRYPTRETRY); return tls_decrypt_sw(sk, tls_ctx, msg, darg); } pad = tls_padding_length(prot, darg->skb, darg); if (pad < 0) { if (darg->skb != tls_strp_msg(ctx)) consume_skb(darg->skb); return pad; } rxm = strp_msg(darg->skb); rxm->full_len -= pad; return 0; } static int tls_decrypt_device(struct sock *sk, struct msghdr *msg, struct tls_context *tls_ctx, struct tls_decrypt_arg *darg) { struct tls_sw_context_rx *ctx = tls_sw_ctx_rx(tls_ctx); struct tls_prot_info *prot = &tls_ctx->prot_info; struct strp_msg *rxm; int pad, err; if (tls_ctx->rx_conf != TLS_HW) return 0; err = tls_device_decrypted(sk, tls_ctx); if (err <= 0) return err; pad = tls_padding_length(prot, tls_strp_msg(ctx), darg); if (pad < 0) return pad; darg->async = false; darg->skb = tls_strp_msg(ctx); /* ->zc downgrade check, in case TLS 1.3 gets here */ darg->zc &= !(prot->version == TLS_1_3_VERSION && tls_msg(darg->skb)->control != TLS_RECORD_TYPE_DATA); rxm = strp_msg(darg->skb); rxm->full_len -= pad; if (!darg->zc) { /* Non-ZC case needs a real skb */ darg->skb = tls_strp_msg_detach(ctx); if (!darg->skb) return -ENOMEM; } else { unsigned int off, len; /* In ZC case nobody cares about the output skb. * Just copy the data here. Note the skb is not fully trimmed. */ off = rxm->offset + prot->prepend_size; len = rxm->full_len - prot->overhead_size; err = skb_copy_datagram_msg(darg->skb, off, msg, len); if (err) return err; } return 1; } static int tls_check_pending_rekey(struct sock *sk, struct tls_context *ctx, struct sk_buff *skb) { const struct strp_msg *rxm = strp_msg(skb); const struct tls_msg *tlm = tls_msg(skb); char hs_type; int err; if (likely(tlm->control != TLS_RECORD_TYPE_HANDSHAKE)) return 0; if (rxm->full_len < 1) return 0; err = skb_copy_bits(skb, rxm->offset, &hs_type, 1); if (err < 0) { DEBUG_NET_WARN_ON_ONCE(1); return err; } if (hs_type == TLS_HANDSHAKE_KEYUPDATE) { struct tls_sw_context_rx *rx_ctx = ctx->priv_ctx_rx; WRITE_ONCE(rx_ctx->key_update_pending, true); TLS_INC_STATS(sock_net(sk), LINUX_MIB_TLSRXREKEYRECEIVED); } return 0; } static int tls_rx_one_record(struct sock *sk, struct msghdr *msg, struct tls_decrypt_arg *darg) { struct tls_context *tls_ctx = tls_get_ctx(sk); struct tls_prot_info *prot = &tls_ctx->prot_info; struct strp_msg *rxm; int err; err = tls_decrypt_device(sk, msg, tls_ctx, darg); if (!err) err = tls_decrypt_sw(sk, tls_ctx, msg, darg); if (err < 0) return err; rxm = strp_msg(darg->skb); rxm->offset += prot->prepend_size; rxm->full_len -= prot->overhead_size; tls_advance_record_sn(sk, prot, &tls_ctx->rx); return tls_check_pending_rekey(sk, tls_ctx, darg->skb); } int decrypt_skb(struct sock *sk, struct scatterlist *sgout) { struct tls_decrypt_arg darg = { .zc = true, }; return tls_decrypt_sg(sk, NULL, sgout, &darg); } static int tls_record_content_type(struct msghdr *msg, struct tls_msg *tlm, u8 *control) { int err; if (!*control) { *control = tlm->control; if (!*control) return -EBADMSG; err = put_cmsg(msg, SOL_TLS, TLS_GET_RECORD_TYPE, sizeof(*control), control); if (*control != TLS_RECORD_TYPE_DATA) { if (err || msg->msg_flags & MSG_CTRUNC) return -EIO; } } else if (*control != tlm->control) { return 0; } return 1; } static void tls_rx_rec_done(struct tls_sw_context_rx *ctx) { tls_strp_msg_done(&ctx->strp); } /* This function traverses the rx_list in tls receive context to copies the * decrypted records into the buffer provided by caller zero copy is not * true. Further, the records are removed from the rx_list if it is not a peek * case and the record has been consumed completely. */ static int process_rx_list(struct tls_sw_context_rx *ctx, struct msghdr *msg, u8 *control, size_t skip, size_t len, bool is_peek, bool *more) { struct sk_buff *skb = skb_peek(&ctx->rx_list); struct tls_msg *tlm; ssize_t copied = 0; int err; while (skip && skb) { struct strp_msg *rxm = strp_msg(skb); tlm = tls_msg(skb); err = tls_record_content_type(msg, tlm, control); if (err <= 0) goto more; if (skip < rxm->full_len) break; skip = skip - rxm->full_len; skb = skb_peek_next(skb, &ctx->rx_list); } while (len && skb) { struct sk_buff *next_skb; struct strp_msg *rxm = strp_msg(skb); int chunk = min_t(unsigned int, rxm->full_len - skip, len); tlm = tls_msg(skb); err = tls_record_content_type(msg, tlm, control); if (err <= 0) goto more; err = skb_copy_datagram_msg(skb, rxm->offset + skip, msg, chunk); if (err < 0) goto more; len = len - chunk; copied = copied + chunk; /* Consume the data from record if it is non-peek case*/ if (!is_peek) { rxm->offset = rxm->offset + chunk; rxm->full_len = rxm->full_len - chunk; /* Return if there is unconsumed data in the record */ if (rxm->full_len - skip) break; } /* The remaining skip-bytes must lie in 1st record in rx_list. * So from the 2nd record, 'skip' should be 0. */ skip = 0; if (msg) msg->msg_flags |= MSG_EOR; next_skb = skb_peek_next(skb, &ctx->rx_list); if (!is_peek) { __skb_unlink(skb, &ctx->rx_list); consume_skb(skb); } skb = next_skb; } err = 0; out: return copied ? : err; more: if (more) *more = true; goto out; } static bool tls_read_flush_backlog(struct sock *sk, struct tls_prot_info *prot, size_t len_left, size_t decrypted, ssize_t done, size_t *flushed_at) { size_t max_rec; if (len_left <= decrypted) return false; max_rec = prot->overhead_size - prot->tail_size + TLS_MAX_PAYLOAD_SIZE; if (done - *flushed_at < SZ_128K && tcp_inq(sk) > max_rec) return false; *flushed_at = done; return sk_flush_backlog(sk); } static int tls_rx_reader_acquire(struct sock *sk, struct tls_sw_context_rx *ctx, bool nonblock) { long timeo; int ret; timeo = sock_rcvtimeo(sk, nonblock); while (unlikely(ctx->reader_present)) { DEFINE_WAIT_FUNC(wait, woken_wake_function); ctx->reader_contended = 1; add_wait_queue(&ctx->wq, &wait); ret = sk_wait_event(sk, &timeo, !READ_ONCE(ctx->reader_present), &wait); remove_wait_queue(&ctx->wq, &wait); if (timeo <= 0) return -EAGAIN; if (signal_pending(current)) return sock_intr_errno(timeo); if (ret < 0) return ret; } WRITE_ONCE(ctx->reader_present, 1); return 0; } static int tls_rx_reader_lock(struct sock *sk, struct tls_sw_context_rx *ctx, bool nonblock) { int err; lock_sock(sk); err = tls_rx_reader_acquire(sk, ctx, nonblock); if (err) release_sock(sk); return err; } static void tls_rx_reader_release(struct sock *sk, struct tls_sw_context_rx *ctx) { if (unlikely(ctx->reader_contended)) { if (wq_has_sleeper(&ctx->wq)) wake_up(&ctx->wq); else ctx->reader_contended = 0; WARN_ON_ONCE(!ctx->reader_present); } WRITE_ONCE(ctx->reader_present, 0); } static void tls_rx_reader_unlock(struct sock *sk, struct tls_sw_context_rx *ctx) { tls_rx_reader_release(sk, ctx); release_sock(sk); } int tls_sw_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, int flags, int *addr_len) { struct tls_context *tls_ctx = tls_get_ctx(sk); struct tls_sw_context_rx *ctx = tls_sw_ctx_rx(tls_ctx); struct tls_prot_info *prot = &tls_ctx->prot_info; ssize_t decrypted = 0, async_copy_bytes = 0; struct sk_psock *psock; unsigned char control = 0; size_t flushed_at = 0; struct strp_msg *rxm; struct tls_msg *tlm; ssize_t copied = 0; ssize_t peeked = 0; bool async = false; int target, err; bool is_kvec = iov_iter_is_kvec(&msg->msg_iter); bool is_peek = flags & MSG_PEEK; bool rx_more = false; bool released = true; bool bpf_strp_enabled; bool zc_capable; if (unlikely(flags & MSG_ERRQUEUE)) return sock_recv_errqueue(sk, msg, len, SOL_IP, IP_RECVERR); err = tls_rx_reader_lock(sk, ctx, flags & MSG_DONTWAIT); if (err < 0) return err; psock = sk_psock_get(sk); bpf_strp_enabled = sk_psock_strp_enabled(psock); /* If crypto failed the connection is broken */ err = ctx->async_wait.err; if (err) goto end; /* Process pending decrypted records. It must be non-zero-copy */ err = process_rx_list(ctx, msg, &control, 0, len, is_peek, &rx_more); if (err < 0) goto end; copied = err; if (len <= copied || (copied && control != TLS_RECORD_TYPE_DATA) || rx_more) goto end; target = sock_rcvlowat(sk, flags & MSG_WAITALL, len); len = len - copied; zc_capable = !bpf_strp_enabled && !is_kvec && !is_peek && ctx->zc_capable; decrypted = 0; while (len && (decrypted + copied < target || tls_strp_msg_ready(ctx))) { struct tls_decrypt_arg darg; int to_decrypt, chunk; err = tls_rx_rec_wait(sk, psock, flags & MSG_DONTWAIT, released); if (err <= 0) { if (psock) { chunk = sk_msg_recvmsg(sk, psock, msg, len, flags); if (chunk > 0) { decrypted += chunk; len -= chunk; continue; } } goto recv_end; } memset(&darg.inargs, 0, sizeof(darg.inargs)); rxm = strp_msg(tls_strp_msg(ctx)); tlm = tls_msg(tls_strp_msg(ctx)); to_decrypt = rxm->full_len - prot->overhead_size; if (zc_capable && to_decrypt <= len && tlm->control == TLS_RECORD_TYPE_DATA) darg.zc = true; /* Do not use async mode if record is non-data */ if (tlm->control == TLS_RECORD_TYPE_DATA && !bpf_strp_enabled) darg.async = ctx->async_capable; else darg.async = false; err = tls_rx_one_record(sk, msg, &darg); if (err < 0) { tls_err_abort(sk, -EBADMSG); goto recv_end; } async |= darg.async; /* If the type of records being processed is not known yet, * set it to record type just dequeued. If it is already known, * but does not match the record type just dequeued, go to end. * We always get record type here since for tls1.2, record type * is known just after record is dequeued from stream parser. * For tls1.3, we disable async. */ err = tls_record_content_type(msg, tls_msg(darg.skb), &control); if (err <= 0) { DEBUG_NET_WARN_ON_ONCE(darg.zc); tls_rx_rec_done(ctx); put_on_rx_list_err: __skb_queue_tail(&ctx->rx_list, darg.skb); goto recv_end; } /* periodically flush backlog, and feed strparser */ released = tls_read_flush_backlog(sk, prot, len, to_decrypt, decrypted + copied, &flushed_at); /* TLS 1.3 may have updated the length by more than overhead */ rxm = strp_msg(darg.skb); chunk = rxm->full_len; tls_rx_rec_done(ctx); if (!darg.zc) { bool partially_consumed = chunk > len; struct sk_buff *skb = darg.skb; DEBUG_NET_WARN_ON_ONCE(darg.skb == ctx->strp.anchor); if (async) { /* TLS 1.2-only, to_decrypt must be text len */ chunk = min_t(int, to_decrypt, len); async_copy_bytes += chunk; put_on_rx_list: decrypted += chunk; len -= chunk; __skb_queue_tail(&ctx->rx_list, skb); if (unlikely(control != TLS_RECORD_TYPE_DATA)) break; continue; } if (bpf_strp_enabled) { released = true; err = sk_psock_tls_strp_read(psock, skb); if (err != __SK_PASS) { rxm->offset = rxm->offset + rxm->full_len; rxm->full_len = 0; if (err == __SK_DROP) consume_skb(skb); continue; } } if (partially_consumed) chunk = len; err = skb_copy_datagram_msg(skb, rxm->offset, msg, chunk); if (err < 0) goto put_on_rx_list_err; if (is_peek) { peeked += chunk; goto put_on_rx_list; } if (partially_consumed) { rxm->offset += chunk; rxm->full_len -= chunk; goto put_on_rx_list; } consume_skb(skb); } decrypted += chunk; len -= chunk; /* Return full control message to userspace before trying * to parse another message type */ msg->msg_flags |= MSG_EOR; if (control != TLS_RECORD_TYPE_DATA) break; } recv_end: if (async) { int ret; /* Wait for all previously submitted records to be decrypted */ ret = tls_decrypt_async_wait(ctx); __skb_queue_purge(&ctx->async_hold); if (ret) { if (err >= 0 || err == -EINPROGRESS) err = ret; goto end; } /* Drain records from the rx_list & copy if required */ if (is_peek) err = process_rx_list(ctx, msg, &control, copied + peeked, decrypted - peeked, is_peek, NULL); else err = process_rx_list(ctx, msg, &control, 0, async_copy_bytes, is_peek, NULL); /* we could have copied less than we wanted, and possibly nothing */ decrypted += max(err, 0) - async_copy_bytes; } copied += decrypted; end: tls_rx_reader_unlock(sk, ctx); if (psock) sk_psock_put(sk, psock); return copied ? : err; } ssize_t tls_sw_splice_read(struct socket *sock, loff_t *ppos, struct pipe_inode_info *pipe, size_t len, unsigned int flags) { struct tls_context *tls_ctx = tls_get_ctx(sock->sk); struct tls_sw_context_rx *ctx = tls_sw_ctx_rx(tls_ctx); struct strp_msg *rxm = NULL; struct sock *sk = sock->sk; struct tls_msg *tlm; struct sk_buff *skb; ssize_t copied = 0; int chunk; int err; err = tls_rx_reader_lock(sk, ctx, flags & SPLICE_F_NONBLOCK); if (err < 0) return err; if (!skb_queue_empty(&ctx->rx_list)) { skb = __skb_dequeue(&ctx->rx_list); } else { struct tls_decrypt_arg darg; err = tls_rx_rec_wait(sk, NULL, flags & SPLICE_F_NONBLOCK, true); if (err <= 0) goto splice_read_end; memset(&darg.inargs, 0, sizeof(darg.inargs)); err = tls_rx_one_record(sk, NULL, &darg); if (err < 0) { tls_err_abort(sk, -EBADMSG); goto splice_read_end; } tls_rx_rec_done(ctx); skb = darg.skb; } rxm = strp_msg(skb); tlm = tls_msg(skb); /* splice does not support reading control messages */ if (tlm->control != TLS_RECORD_TYPE_DATA) { err = -EINVAL; goto splice_requeue; } chunk = min_t(unsigned int, rxm->full_len, len); copied = skb_splice_bits(skb, sk, rxm->offset, pipe, chunk, flags); if (copied < 0) goto splice_requeue; if (chunk < rxm->full_len) { rxm->offset += len; rxm->full_len -= len; goto splice_requeue; } consume_skb(skb); splice_read_end: tls_rx_reader_unlock(sk, ctx); return copied ? : err; splice_requeue: __skb_queue_head(&ctx->rx_list, skb); goto splice_read_end; } int tls_sw_read_sock(struct sock *sk, read_descriptor_t *desc, sk_read_actor_t read_actor) { struct tls_context *tls_ctx = tls_get_ctx(sk); struct tls_sw_context_rx *ctx = tls_sw_ctx_rx(tls_ctx); struct tls_prot_info *prot = &tls_ctx->prot_info; struct strp_msg *rxm = NULL; struct sk_buff *skb = NULL; struct sk_psock *psock; size_t flushed_at = 0; bool released = true; struct tls_msg *tlm; ssize_t copied = 0; ssize_t decrypted; int err, used; psock = sk_psock_get(sk); if (psock) { sk_psock_put(sk, psock); return -EINVAL; } err = tls_rx_reader_acquire(sk, ctx, true); if (err < 0) return err; /* If crypto failed the connection is broken */ err = ctx->async_wait.err; if (err) goto read_sock_end; decrypted = 0; do { if (!skb_queue_empty(&ctx->rx_list)) { skb = __skb_dequeue(&ctx->rx_list); rxm = strp_msg(skb); tlm = tls_msg(skb); } else { struct tls_decrypt_arg darg; err = tls_rx_rec_wait(sk, NULL, true, released); if (err <= 0) goto read_sock_end; memset(&darg.inargs, 0, sizeof(darg.inargs)); err = tls_rx_one_record(sk, NULL, &darg); if (err < 0) { tls_err_abort(sk, -EBADMSG); goto read_sock_end; } released = tls_read_flush_backlog(sk, prot, INT_MAX, 0, decrypted, &flushed_at); skb = darg.skb; rxm = strp_msg(skb); tlm = tls_msg(skb); decrypted += rxm->full_len; tls_rx_rec_done(ctx); } /* read_sock does not support reading control messages */ if (tlm->control != TLS_RECORD_TYPE_DATA) { err = -EINVAL; goto read_sock_requeue; } used = read_actor(desc, skb, rxm->offset, rxm->full_len); if (used <= 0) { if (!copied) err = used; goto read_sock_requeue; } copied += used; if (used < rxm->full_len) { rxm->offset += used; rxm->full_len -= used; if (!desc->count) goto read_sock_requeue; } else { consume_skb(skb); if (!desc->count) skb = NULL; } } while (skb); read_sock_end: tls_rx_reader_release(sk, ctx); return copied ? : err; read_sock_requeue: __skb_queue_head(&ctx->rx_list, skb); goto read_sock_end; } bool tls_sw_sock_is_readable(struct sock *sk) { struct tls_context *tls_ctx = tls_get_ctx(sk); struct tls_sw_context_rx *ctx = tls_sw_ctx_rx(tls_ctx); bool ingress_empty = true; struct sk_psock *psock; rcu_read_lock(); psock = sk_psock(sk); if (psock) ingress_empty = list_empty(&psock->ingress_msg); rcu_read_unlock(); return !ingress_empty || tls_strp_msg_ready(ctx) || !skb_queue_empty(&ctx->rx_list); } int tls_rx_msg_size(struct tls_strparser *strp, struct sk_buff *skb) { struct tls_context *tls_ctx = tls_get_ctx(strp->sk); struct tls_prot_info *prot = &tls_ctx->prot_info; char header[TLS_HEADER_SIZE + TLS_MAX_IV_SIZE]; size_t cipher_overhead; size_t data_len = 0; int ret; /* Verify that we have a full TLS header, or wait for more data */ if (strp->stm.offset + prot->prepend_size > skb->len) return 0; /* Sanity-check size of on-stack buffer. */ if (WARN_ON(prot->prepend_size > sizeof(header))) { ret = -EINVAL; goto read_failure; } /* Linearize header to local buffer */ ret = skb_copy_bits(skb, strp->stm.offset, header, prot->prepend_size); if (ret < 0) goto read_failure; strp->mark = header[0]; data_len = ((header[4] & 0xFF) | (header[3] << 8)); cipher_overhead = prot->tag_size; if (prot->version != TLS_1_3_VERSION && prot->cipher_type != TLS_CIPHER_CHACHA20_POLY1305) cipher_overhead += prot->iv_size; if (data_len > TLS_MAX_PAYLOAD_SIZE + cipher_overhead + prot->tail_size) { ret = -EMSGSIZE; goto read_failure; } if (data_len < cipher_overhead) { ret = -EBADMSG; goto read_failure; } /* Note that both TLS1.3 and TLS1.2 use TLS_1_2 version here */ if (header[1] != TLS_1_2_VERSION_MINOR || header[2] != TLS_1_2_VERSION_MAJOR) { ret = -EINVAL; goto read_failure; } tls_device_rx_resync_new_rec(strp->sk, data_len + TLS_HEADER_SIZE, TCP_SKB_CB(skb)->seq + strp->stm.offset); return data_len + TLS_HEADER_SIZE; read_failure: tls_err_abort(strp->sk, ret); return ret; } void tls_rx_msg_ready(struct tls_strparser *strp) { struct tls_sw_context_rx *ctx; ctx = container_of(strp, struct tls_sw_context_rx, strp); ctx->saved_data_ready(strp->sk); } static void tls_data_ready(struct sock *sk) { struct tls_context *tls_ctx = tls_get_ctx(sk); struct tls_sw_context_rx *ctx = tls_sw_ctx_rx(tls_ctx); struct sk_psock *psock; gfp_t alloc_save; trace_sk_data_ready(sk); alloc_save = sk->sk_allocation; sk->sk_allocation = GFP_ATOMIC; tls_strp_data_ready(&ctx->strp); sk->sk_allocation = alloc_save; psock = sk_psock_get(sk); if (psock) { if (!list_empty(&psock->ingress_msg)) ctx->saved_data_ready(sk); sk_psock_put(sk, psock); } } void tls_sw_cancel_work_tx(struct tls_context *tls_ctx) { struct tls_sw_context_tx *ctx = tls_sw_ctx_tx(tls_ctx); set_bit(BIT_TX_CLOSING, &ctx->tx_bitmask); set_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask); cancel_delayed_work_sync(&ctx->tx_work.work); } void tls_sw_release_resources_tx(struct sock *sk) { struct tls_context *tls_ctx = tls_get_ctx(sk); struct tls_sw_context_tx *ctx = tls_sw_ctx_tx(tls_ctx); struct tls_rec *rec, *tmp; /* Wait for any pending async encryptions to complete */ tls_encrypt_async_wait(ctx); tls_tx_records(sk, -1); /* Free up un-sent records in tx_list. First, free * the partially sent record if any at head of tx_list. */ if (tls_ctx->partially_sent_record) { tls_free_partial_record(sk, tls_ctx); rec = list_first_entry(&ctx->tx_list, struct tls_rec, list); list_del(&rec->list); sk_msg_free(sk, &rec->msg_plaintext); kfree(rec); } list_for_each_entry_safe(rec, tmp, &ctx->tx_list, list) { list_del(&rec->list); sk_msg_free(sk, &rec->msg_encrypted); sk_msg_free(sk, &rec->msg_plaintext); kfree(rec); } crypto_free_aead(ctx->aead_send); tls_free_open_rec(sk); } void tls_sw_free_ctx_tx(struct tls_context *tls_ctx) { struct tls_sw_context_tx *ctx = tls_sw_ctx_tx(tls_ctx); kfree(ctx); } void tls_sw_release_resources_rx(struct sock *sk) { struct tls_context *tls_ctx = tls_get_ctx(sk); struct tls_sw_context_rx *ctx = tls_sw_ctx_rx(tls_ctx); if (ctx->aead_recv) { __skb_queue_purge(&ctx->rx_list); crypto_free_aead(ctx->aead_recv); tls_strp_stop(&ctx->strp); /* If tls_sw_strparser_arm() was not called (cleanup paths) * we still want to tls_strp_stop(), but sk->sk_data_ready was * never swapped. */ if (ctx->saved_data_ready) { write_lock_bh(&sk->sk_callback_lock); sk->sk_data_ready = ctx->saved_data_ready; write_unlock_bh(&sk->sk_callback_lock); } } } void tls_sw_strparser_done(struct tls_context *tls_ctx) { struct tls_sw_context_rx *ctx = tls_sw_ctx_rx(tls_ctx); tls_strp_done(&ctx->strp); } void tls_sw_free_ctx_rx(struct tls_context *tls_ctx) { struct tls_sw_context_rx *ctx = tls_sw_ctx_rx(tls_ctx); kfree(ctx); } void tls_sw_free_resources_rx(struct sock *sk) { struct tls_context *tls_ctx = tls_get_ctx(sk); tls_sw_release_resources_rx(sk); tls_sw_free_ctx_rx(tls_ctx); } /* The work handler to transmitt the encrypted records in tx_list */ static void tx_work_handler(struct work_struct *work) { struct delayed_work *delayed_work = to_delayed_work(work); struct tx_work *tx_work = container_of(delayed_work, struct tx_work, work); struct sock *sk = tx_work->sk; struct tls_context *tls_ctx = tls_get_ctx(sk); struct tls_sw_context_tx *ctx; if (unlikely(!tls_ctx)) return; ctx = tls_sw_ctx_tx(tls_ctx); if (test_bit(BIT_TX_CLOSING, &ctx->tx_bitmask)) return; if (!test_and_clear_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask)) return; if (mutex_trylock(&tls_ctx->tx_lock)) { lock_sock(sk); tls_tx_records(sk, -1); release_sock(sk); mutex_unlock(&tls_ctx->tx_lock); } else if (!test_and_set_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask)) { /* Someone is holding the tx_lock, they will likely run Tx * and cancel the work on their way out of the lock section. * Schedule a long delay just in case. */ schedule_delayed_work(&ctx->tx_work.work, msecs_to_jiffies(10)); } } static bool tls_is_tx_ready(struct tls_sw_context_tx *ctx) { struct tls_rec *rec; rec = list_first_entry_or_null(&ctx->tx_list, struct tls_rec, list); if (!rec) return false; return READ_ONCE(rec->tx_ready); } void tls_sw_write_space(struct sock *sk, struct tls_context *ctx) { struct tls_sw_context_tx *tx_ctx = tls_sw_ctx_tx(ctx); /* Schedule the transmission if tx list is ready */ if (tls_is_tx_ready(tx_ctx) && !test_and_set_bit(BIT_TX_SCHEDULED, &tx_ctx->tx_bitmask)) schedule_delayed_work(&tx_ctx->tx_work.work, 0); } void tls_sw_strparser_arm(struct sock *sk, struct tls_context *tls_ctx) { struct tls_sw_context_rx *rx_ctx = tls_sw_ctx_rx(tls_ctx); write_lock_bh(&sk->sk_callback_lock); rx_ctx->saved_data_ready = sk->sk_data_ready; sk->sk_data_ready = tls_data_ready; write_unlock_bh(&sk->sk_callback_lock); } void tls_update_rx_zc_capable(struct tls_context *tls_ctx) { struct tls_sw_context_rx *rx_ctx = tls_sw_ctx_rx(tls_ctx); rx_ctx->zc_capable = tls_ctx->rx_no_pad || tls_ctx->prot_info.version != TLS_1_3_VERSION; } static struct tls_sw_context_tx *init_ctx_tx(struct tls_context *ctx, struct sock *sk) { struct tls_sw_context_tx *sw_ctx_tx; if (!ctx->priv_ctx_tx) { sw_ctx_tx = kzalloc(sizeof(*sw_ctx_tx), GFP_KERNEL); if (!sw_ctx_tx) return NULL; } else { sw_ctx_tx = ctx->priv_ctx_tx; } crypto_init_wait(&sw_ctx_tx->async_wait); atomic_set(&sw_ctx_tx->encrypt_pending, 1); INIT_LIST_HEAD(&sw_ctx_tx->tx_list); INIT_DELAYED_WORK(&sw_ctx_tx->tx_work.work, tx_work_handler); sw_ctx_tx->tx_work.sk = sk; return sw_ctx_tx; } static struct tls_sw_context_rx *init_ctx_rx(struct tls_context *ctx) { struct tls_sw_context_rx *sw_ctx_rx; if (!ctx->priv_ctx_rx) { sw_ctx_rx = kzalloc(sizeof(*sw_ctx_rx), GFP_KERNEL); if (!sw_ctx_rx) return NULL; } else { sw_ctx_rx = ctx->priv_ctx_rx; } crypto_init_wait(&sw_ctx_rx->async_wait); atomic_set(&sw_ctx_rx->decrypt_pending, 1); init_waitqueue_head(&sw_ctx_rx->wq); skb_queue_head_init(&sw_ctx_rx->rx_list); skb_queue_head_init(&sw_ctx_rx->async_hold); return sw_ctx_rx; } int init_prot_info(struct tls_prot_info *prot, const struct tls_crypto_info *crypto_info, const struct tls_cipher_desc *cipher_desc) { u16 nonce_size = cipher_desc->nonce; if (crypto_info->version == TLS_1_3_VERSION) { nonce_size = 0; prot->aad_size = TLS_HEADER_SIZE; prot->tail_size = 1; } else { prot->aad_size = TLS_AAD_SPACE_SIZE; prot->tail_size = 0; } /* Sanity-check the sizes for stack allocations. */ if (nonce_size > TLS_MAX_IV_SIZE || prot->aad_size > TLS_MAX_AAD_SIZE) return -EINVAL; prot->version = crypto_info->version; prot->cipher_type = crypto_info->cipher_type; prot->prepend_size = TLS_HEADER_SIZE + nonce_size; prot->tag_size = cipher_desc->tag; prot->overhead_size = prot->prepend_size + prot->tag_size + prot->tail_size; prot->iv_size = cipher_desc->iv; prot->salt_size = cipher_desc->salt; prot->rec_seq_size = cipher_desc->rec_seq; return 0; } static void tls_finish_key_update(struct sock *sk, struct tls_context *tls_ctx) { struct tls_sw_context_rx *ctx = tls_ctx->priv_ctx_rx; WRITE_ONCE(ctx->key_update_pending, false); /* wake-up pre-existing poll() */ ctx->saved_data_ready(sk); } int tls_set_sw_offload(struct sock *sk, int tx, struct tls_crypto_info *new_crypto_info) { struct tls_crypto_info *crypto_info, *src_crypto_info; struct tls_sw_context_tx *sw_ctx_tx = NULL; struct tls_sw_context_rx *sw_ctx_rx = NULL; const struct tls_cipher_desc *cipher_desc; char *iv, *rec_seq, *key, *salt; struct cipher_context *cctx; struct tls_prot_info *prot; struct crypto_aead **aead; struct tls_context *ctx; struct crypto_tfm *tfm; int rc = 0; ctx = tls_get_ctx(sk); prot = &ctx->prot_info; /* new_crypto_info != NULL means rekey */ if (!new_crypto_info) { if (tx) { ctx->priv_ctx_tx = init_ctx_tx(ctx, sk); if (!ctx->priv_ctx_tx) return -ENOMEM; } else { ctx->priv_ctx_rx = init_ctx_rx(ctx); if (!ctx->priv_ctx_rx) return -ENOMEM; } } if (tx) { sw_ctx_tx = ctx->priv_ctx_tx; crypto_info = &ctx->crypto_send.info; cctx = &ctx->tx; aead = &sw_ctx_tx->aead_send; } else { sw_ctx_rx = ctx->priv_ctx_rx; crypto_info = &ctx->crypto_recv.info; cctx = &ctx->rx; aead = &sw_ctx_rx->aead_recv; } src_crypto_info = new_crypto_info ?: crypto_info; cipher_desc = get_cipher_desc(src_crypto_info->cipher_type); if (!cipher_desc) { rc = -EINVAL; goto free_priv; } rc = init_prot_info(prot, src_crypto_info, cipher_desc); if (rc) goto free_priv; iv = crypto_info_iv(src_crypto_info, cipher_desc); key = crypto_info_key(src_crypto_info, cipher_desc); salt = crypto_info_salt(src_crypto_info, cipher_desc); rec_seq = crypto_info_rec_seq(src_crypto_info, cipher_desc); if (!*aead) { *aead = crypto_alloc_aead(cipher_desc->cipher_name, 0, 0); if (IS_ERR(*aead)) { rc = PTR_ERR(*aead); *aead = NULL; goto free_priv; } } ctx->push_pending_record = tls_sw_push_pending_record; /* setkey is the last operation that could fail during a * rekey. if it succeeds, we can start modifying the * context. */ rc = crypto_aead_setkey(*aead, key, cipher_desc->key); if (rc) { if (new_crypto_info) goto out; else goto free_aead; } if (!new_crypto_info) { rc = crypto_aead_setauthsize(*aead, prot->tag_size); if (rc) goto free_aead; } if (!tx && !new_crypto_info) { tfm = crypto_aead_tfm(sw_ctx_rx->aead_recv); tls_update_rx_zc_capable(ctx); sw_ctx_rx->async_capable = src_crypto_info->version != TLS_1_3_VERSION && !!(tfm->__crt_alg->cra_flags & CRYPTO_ALG_ASYNC); rc = tls_strp_init(&sw_ctx_rx->strp, sk); if (rc) goto free_aead; } memcpy(cctx->iv, salt, cipher_desc->salt); memcpy(cctx->iv + cipher_desc->salt, iv, cipher_desc->iv); memcpy(cctx->rec_seq, rec_seq, cipher_desc->rec_seq); if (new_crypto_info) { unsafe_memcpy(crypto_info, new_crypto_info, cipher_desc->crypto_info, /* size was checked in do_tls_setsockopt_conf */); memzero_explicit(new_crypto_info, cipher_desc->crypto_info); if (!tx) tls_finish_key_update(sk, ctx); } goto out; free_aead: crypto_free_aead(*aead); *aead = NULL; free_priv: if (!new_crypto_info) { if (tx) { kfree(ctx->priv_ctx_tx); ctx->priv_ctx_tx = NULL; } else { kfree(ctx->priv_ctx_rx); ctx->priv_ctx_rx = NULL; } } out: return rc; } |
| 1 1 1 1 1 1 1 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 | // SPDX-License-Identifier: GPL-2.0-or-later /* * Roccat driver for Linux * * Copyright (c) 2010 Stefan Achatz <erazor_de@users.sourceforge.net> */ /* */ /* * Module roccat is a char device used to report special events of roccat * hardware to userland. These events include requests for on-screen-display of * profile or dpi settings or requests for execution of macro sequences that are * not stored in device. The information in these events depends on hid device * implementation and contains data that is not available in a single hid event * or else hidraw could have been used. * It is inspired by hidraw, but uses only one circular buffer for all readers. */ #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt #include <linux/cdev.h> #include <linux/poll.h> #include <linux/sched/signal.h> #include <linux/hid-roccat.h> #include <linux/module.h> #define ROCCAT_FIRST_MINOR 0 #define ROCCAT_MAX_DEVICES 8 /* should be a power of 2 for performance reason */ #define ROCCAT_CBUF_SIZE 16 struct roccat_report { uint8_t *value; }; struct roccat_device { unsigned int minor; int report_size; int open; int exist; wait_queue_head_t wait; struct device *dev; struct hid_device *hid; struct list_head readers; /* protects modifications of readers list */ struct mutex readers_lock; /* * circular_buffer has one writer and multiple readers with their own * read pointers */ struct roccat_report cbuf[ROCCAT_CBUF_SIZE]; int cbuf_end; struct mutex cbuf_lock; }; struct roccat_reader { struct list_head node; struct roccat_device *device; int cbuf_start; }; static int roccat_major; static struct cdev roccat_cdev; static struct roccat_device *devices[ROCCAT_MAX_DEVICES]; /* protects modifications of devices array */ static DEFINE_MUTEX(devices_lock); static ssize_t roccat_read(struct file *file, char __user *buffer, size_t count, loff_t *ppos) { struct roccat_reader *reader = file->private_data; struct roccat_device *device = reader->device; struct roccat_report *report; ssize_t retval = 0, len; DECLARE_WAITQUEUE(wait, current); mutex_lock(&device->cbuf_lock); /* no data? */ if (reader->cbuf_start == device->cbuf_end) { add_wait_queue(&device->wait, &wait); set_current_state(TASK_INTERRUPTIBLE); /* wait for data */ while (reader->cbuf_start == device->cbuf_end) { if (file->f_flags & O_NONBLOCK) { retval = -EAGAIN; break; } if (signal_pending(current)) { retval = -ERESTARTSYS; break; } if (!device->exist) { retval = -EIO; break; } mutex_unlock(&device->cbuf_lock); schedule(); mutex_lock(&device->cbuf_lock); set_current_state(TASK_INTERRUPTIBLE); } set_current_state(TASK_RUNNING); remove_wait_queue(&device->wait, &wait); } /* here we either have data or a reason to return if retval is set */ if (retval) goto exit_unlock; report = &device->cbuf[reader->cbuf_start]; /* * If report is larger than requested amount of data, rest of report * is lost! */ len = device->report_size > count ? count : device->report_size; if (copy_to_user(buffer, report->value, len)) { retval = -EFAULT; goto exit_unlock; } retval += len; reader->cbuf_start = (reader->cbuf_start + 1) % ROCCAT_CBUF_SIZE; exit_unlock: mutex_unlock(&device->cbuf_lock); return retval; } static __poll_t roccat_poll(struct file *file, poll_table *wait) { struct roccat_reader *reader = file->private_data; poll_wait(file, &reader->device->wait, wait); if (reader->cbuf_start != reader->device->cbuf_end) return EPOLLIN | EPOLLRDNORM; if (!reader->device->exist) return EPOLLERR | EPOLLHUP; return 0; } static int roccat_open(struct inode *inode, struct file *file) { unsigned int minor = iminor(inode); struct roccat_reader *reader; struct roccat_device *device; int error = 0; reader = kzalloc(sizeof(struct roccat_reader), GFP_KERNEL); if (!reader) return -ENOMEM; mutex_lock(&devices_lock); device = devices[minor]; if (!device) { pr_emerg("roccat device with minor %d doesn't exist\n", minor); error = -ENODEV; goto exit_err_devices; } mutex_lock(&device->readers_lock); if (!device->open++) { /* power on device on adding first reader */ error = hid_hw_power(device->hid, PM_HINT_FULLON); if (error < 0) { --device->open; goto exit_err_readers; } error = hid_hw_open(device->hid); if (error < 0) { hid_hw_power(device->hid, PM_HINT_NORMAL); --device->open; goto exit_err_readers; } } reader->device = device; /* new reader doesn't get old events */ reader->cbuf_start = device->cbuf_end; list_add_tail(&reader->node, &device->readers); file->private_data = reader; exit_err_readers: mutex_unlock(&device->readers_lock); exit_err_devices: mutex_unlock(&devices_lock); if (error) kfree(reader); return error; } static int roccat_release(struct inode *inode, struct file *file) { unsigned int minor = iminor(inode); struct roccat_reader *reader = file->private_data; struct roccat_device *device; mutex_lock(&devices_lock); device = devices[minor]; if (!device) { mutex_unlock(&devices_lock); pr_emerg("roccat device with minor %d doesn't exist\n", minor); return -ENODEV; } mutex_lock(&device->readers_lock); list_del(&reader->node); mutex_unlock(&device->readers_lock); kfree(reader); if (!--device->open) { /* removing last reader */ if (device->exist) { hid_hw_power(device->hid, PM_HINT_NORMAL); hid_hw_close(device->hid); } else { kfree(device); } } mutex_unlock(&devices_lock); return 0; } /* * roccat_report_event() - output data to readers * @minor: minor device number returned by roccat_connect() * @data: pointer to data * * Return value is zero on success, a negative error code on failure. * * This is called from interrupt handler. */ int roccat_report_event(int minor, u8 const *data) { struct roccat_device *device; struct roccat_reader *reader; struct roccat_report *report; uint8_t *new_value; device = devices[minor]; new_value = kmemdup(data, device->report_size, GFP_ATOMIC); if (!new_value) return -ENOMEM; mutex_lock(&device->cbuf_lock); report = &device->cbuf[device->cbuf_end]; /* passing NULL is safe */ kfree(report->value); report->value = new_value; device->cbuf_end = (device->cbuf_end + 1) % ROCCAT_CBUF_SIZE; list_for_each_entry(reader, &device->readers, node) { /* * As we already inserted one element, the buffer can't be * empty. If start and end are equal, buffer is full and we * increase start, so that slow reader misses one event, but * gets the newer ones in the right order. */ if (reader->cbuf_start == device->cbuf_end) reader->cbuf_start = (reader->cbuf_start + 1) % ROCCAT_CBUF_SIZE; } mutex_unlock(&device->cbuf_lock); wake_up_interruptible(&device->wait); return 0; } EXPORT_SYMBOL_GPL(roccat_report_event); /* * roccat_connect() - create a char device for special event output * @class: the class thats used to create the device. Meant to hold device * specific sysfs attributes. * @hid: the hid device the char device should be connected to. * @report_size: size of reports * * Return value is minor device number in Range [0, ROCCAT_MAX_DEVICES] on * success, a negative error code on failure. */ int roccat_connect(const struct class *klass, struct hid_device *hid, int report_size) { unsigned int minor; struct roccat_device *device; int temp; device = kzalloc(sizeof(struct roccat_device), GFP_KERNEL); if (!device) return -ENOMEM; mutex_lock(&devices_lock); for (minor = 0; minor < ROCCAT_MAX_DEVICES; ++minor) { if (devices[minor]) continue; break; } if (minor < ROCCAT_MAX_DEVICES) { devices[minor] = device; } else { mutex_unlock(&devices_lock); kfree(device); return -EINVAL; } device->dev = device_create(klass, &hid->dev, MKDEV(roccat_major, minor), NULL, "%s%s%d", "roccat", hid->driver->name, minor); if (IS_ERR(device->dev)) { devices[minor] = NULL; mutex_unlock(&devices_lock); temp = PTR_ERR(device->dev); kfree(device); return temp; } mutex_unlock(&devices_lock); init_waitqueue_head(&device->wait); INIT_LIST_HEAD(&device->readers); mutex_init(&device->readers_lock); mutex_init(&device->cbuf_lock); device->minor = minor; device->hid = hid; device->exist = 1; device->cbuf_end = 0; device->report_size = report_size; return minor; } EXPORT_SYMBOL_GPL(roccat_connect); /* roccat_disconnect() - remove char device from hid device * @minor: the minor device number returned by roccat_connect() */ void roccat_disconnect(int minor) { struct roccat_device *device; mutex_lock(&devices_lock); device = devices[minor]; mutex_unlock(&devices_lock); device->exist = 0; /* TODO exist maybe not needed */ device_destroy(device->dev->class, MKDEV(roccat_major, minor)); mutex_lock(&devices_lock); devices[minor] = NULL; mutex_unlock(&devices_lock); if (device->open) { hid_hw_close(device->hid); wake_up_interruptible(&device->wait); } else { kfree(device); } } EXPORT_SYMBOL_GPL(roccat_disconnect); static long roccat_ioctl(struct file *file, unsigned int cmd, unsigned long arg) { struct inode *inode = file_inode(file); struct roccat_device *device; unsigned int minor = iminor(inode); long retval = 0; mutex_lock(&devices_lock); device = devices[minor]; if (!device) { retval = -ENODEV; goto out; } switch (cmd) { case ROCCATIOCGREPSIZE: if (put_user(device->report_size, (int __user *)arg)) retval = -EFAULT; break; default: retval = -ENOTTY; } out: mutex_unlock(&devices_lock); return retval; } static const struct file_operations roccat_ops = { .owner = THIS_MODULE, .read = roccat_read, .poll = roccat_poll, .open = roccat_open, .release = roccat_release, .llseek = noop_llseek, .unlocked_ioctl = roccat_ioctl, }; static int __init roccat_init(void) { int retval; dev_t dev_id; retval = alloc_chrdev_region(&dev_id, ROCCAT_FIRST_MINOR, ROCCAT_MAX_DEVICES, "roccat"); if (retval < 0) { pr_warn("can't get major number\n"); goto error; } roccat_major = MAJOR(dev_id); cdev_init(&roccat_cdev, &roccat_ops); retval = cdev_add(&roccat_cdev, dev_id, ROCCAT_MAX_DEVICES); if (retval < 0) { pr_warn("cannot add cdev\n"); goto cleanup_alloc_chrdev_region; } return 0; cleanup_alloc_chrdev_region: unregister_chrdev_region(dev_id, ROCCAT_MAX_DEVICES); error: return retval; } static void __exit roccat_exit(void) { dev_t dev_id = MKDEV(roccat_major, 0); cdev_del(&roccat_cdev); unregister_chrdev_region(dev_id, ROCCAT_MAX_DEVICES); } module_init(roccat_init); module_exit(roccat_exit); MODULE_AUTHOR("Stefan Achatz"); MODULE_DESCRIPTION("USB Roccat char device"); MODULE_LICENSE("GPL v2"); |
| 165 151 144 144 151 144 142 150 5 3 151 143 143 144 144 143 142 109 169 151 15 156 143 147 141 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 | /* SPDX-License-Identifier: GPL-2.0-or-later */ #ifndef __SOUND_PCM_PARAMS_H #define __SOUND_PCM_PARAMS_H /* * PCM params helpers * Copyright (c) by Abramo Bagnara <abramo@alsa-project.org> */ #include <sound/pcm.h> int snd_pcm_hw_param_first(struct snd_pcm_substream *pcm, struct snd_pcm_hw_params *params, snd_pcm_hw_param_t var, int *dir); int snd_pcm_hw_param_last(struct snd_pcm_substream *pcm, struct snd_pcm_hw_params *params, snd_pcm_hw_param_t var, int *dir); int snd_pcm_hw_param_value(const struct snd_pcm_hw_params *params, snd_pcm_hw_param_t var, int *dir); #define SNDRV_MASK_BITS 64 /* we use so far 64bits only */ #define SNDRV_MASK_SIZE (SNDRV_MASK_BITS / 32) #define MASK_OFS(i) ((i) >> 5) #define MASK_BIT(i) (1U << ((i) & 31)) static inline void snd_mask_none(struct snd_mask *mask) { memset(mask, 0, sizeof(*mask)); } static inline void snd_mask_any(struct snd_mask *mask) { memset(mask, 0xff, SNDRV_MASK_SIZE * sizeof(u_int32_t)); } static inline int snd_mask_empty(const struct snd_mask *mask) { int i; for (i = 0; i < SNDRV_MASK_SIZE; i++) if (mask->bits[i]) return 0; return 1; } static inline unsigned int snd_mask_min(const struct snd_mask *mask) { int i; for (i = 0; i < SNDRV_MASK_SIZE; i++) { if (mask->bits[i]) return __ffs(mask->bits[i]) + (i << 5); } return 0; } static inline unsigned int snd_mask_max(const struct snd_mask *mask) { int i; for (i = SNDRV_MASK_SIZE - 1; i >= 0; i--) { if (mask->bits[i]) return __fls(mask->bits[i]) + (i << 5); } return 0; } static inline void snd_mask_set(struct snd_mask *mask, unsigned int val) { mask->bits[MASK_OFS(val)] |= MASK_BIT(val); } /* Most of drivers need only this one */ static inline void snd_mask_set_format(struct snd_mask *mask, snd_pcm_format_t format) { snd_mask_set(mask, (__force unsigned int)format); } static inline void snd_mask_reset(struct snd_mask *mask, unsigned int val) { mask->bits[MASK_OFS(val)] &= ~MASK_BIT(val); } static inline void snd_mask_set_range(struct snd_mask *mask, unsigned int from, unsigned int to) { unsigned int i; for (i = from; i <= to; i++) mask->bits[MASK_OFS(i)] |= MASK_BIT(i); } static inline void snd_mask_reset_range(struct snd_mask *mask, unsigned int from, unsigned int to) { unsigned int i; for (i = from; i <= to; i++) mask->bits[MASK_OFS(i)] &= ~MASK_BIT(i); } static inline void snd_mask_leave(struct snd_mask *mask, unsigned int val) { unsigned int v; v = mask->bits[MASK_OFS(val)] & MASK_BIT(val); snd_mask_none(mask); mask->bits[MASK_OFS(val)] = v; } static inline void snd_mask_intersect(struct snd_mask *mask, const struct snd_mask *v) { int i; for (i = 0; i < SNDRV_MASK_SIZE; i++) mask->bits[i] &= v->bits[i]; } static inline int snd_mask_eq(const struct snd_mask *mask, const struct snd_mask *v) { return ! memcmp(mask, v, SNDRV_MASK_SIZE * sizeof(u_int32_t)); } static inline void snd_mask_copy(struct snd_mask *mask, const struct snd_mask *v) { *mask = *v; } static inline int snd_mask_test(const struct snd_mask *mask, unsigned int val) { return mask->bits[MASK_OFS(val)] & MASK_BIT(val); } /* Most of drivers need only this one */ static inline int snd_mask_test_format(const struct snd_mask *mask, snd_pcm_format_t format) { return snd_mask_test(mask, (__force unsigned int)format); } static inline int snd_mask_single(const struct snd_mask *mask) { int i, c = 0; for (i = 0; i < SNDRV_MASK_SIZE; i++) { if (! mask->bits[i]) continue; if (mask->bits[i] & (mask->bits[i] - 1)) return 0; if (c) return 0; c++; } return 1; } static inline int snd_mask_refine(struct snd_mask *mask, const struct snd_mask *v) { struct snd_mask old; snd_mask_copy(&old, mask); snd_mask_intersect(mask, v); if (snd_mask_empty(mask)) return -EINVAL; return !snd_mask_eq(mask, &old); } static inline int snd_mask_refine_first(struct snd_mask *mask) { if (snd_mask_single(mask)) return 0; snd_mask_leave(mask, snd_mask_min(mask)); return 1; } static inline int snd_mask_refine_last(struct snd_mask *mask) { if (snd_mask_single(mask)) return 0; snd_mask_leave(mask, snd_mask_max(mask)); return 1; } static inline int snd_mask_refine_min(struct snd_mask *mask, unsigned int val) { if (snd_mask_min(mask) >= val) return 0; snd_mask_reset_range(mask, 0, val - 1); if (snd_mask_empty(mask)) return -EINVAL; return 1; } static inline int snd_mask_refine_max(struct snd_mask *mask, unsigned int val) { if (snd_mask_max(mask) <= val) return 0; snd_mask_reset_range(mask, val + 1, SNDRV_MASK_BITS); if (snd_mask_empty(mask)) return -EINVAL; return 1; } static inline int snd_mask_refine_set(struct snd_mask *mask, unsigned int val) { int changed; changed = !snd_mask_single(mask); snd_mask_leave(mask, val); if (snd_mask_empty(mask)) return -EINVAL; return changed; } static inline int snd_mask_value(const struct snd_mask *mask) { return snd_mask_min(mask); } static inline void snd_interval_any(struct snd_interval *i) { i->min = 0; i->openmin = 0; i->max = UINT_MAX; i->openmax = 0; i->integer = 0; i->empty = 0; } static inline void snd_interval_none(struct snd_interval *i) { i->empty = 1; } static inline int snd_interval_checkempty(const struct snd_interval *i) { return (i->min > i->max || (i->min == i->max && (i->openmin || i->openmax))); } static inline int snd_interval_empty(const struct snd_interval *i) { return i->empty; } static inline int snd_interval_single(const struct snd_interval *i) { return (i->min == i->max || (i->min + 1 == i->max && (i->openmin || i->openmax))); } static inline int snd_interval_value(const struct snd_interval *i) { if (i->openmin && !i->openmax) return i->max; return i->min; } static inline int snd_interval_min(const struct snd_interval *i) { return i->min; } static inline int snd_interval_max(const struct snd_interval *i) { unsigned int v; v = i->max; if (i->openmax) v--; return v; } static inline int snd_interval_test(const struct snd_interval *i, unsigned int val) { return !((i->min > val || (i->min == val && i->openmin) || i->max < val || (i->max == val && i->openmax))); } static inline void snd_interval_copy(struct snd_interval *d, const struct snd_interval *s) { *d = *s; } static inline int snd_interval_setinteger(struct snd_interval *i) { if (i->integer) return 0; if (i->openmin && i->openmax && i->min == i->max) return -EINVAL; i->integer = 1; return 1; } static inline int snd_interval_eq(const struct snd_interval *i1, const struct snd_interval *i2) { if (i1->empty) return i2->empty; if (i2->empty) return i1->empty; return i1->min == i2->min && i1->openmin == i2->openmin && i1->max == i2->max && i1->openmax == i2->openmax; } /** * params_access - get the access type from the hw params * @p: hw params */ static inline snd_pcm_access_t params_access(const struct snd_pcm_hw_params *p) { return (__force snd_pcm_access_t)snd_mask_min(hw_param_mask_c(p, SNDRV_PCM_HW_PARAM_ACCESS)); } /** * params_format - get the sample format from the hw params * @p: hw params */ static inline snd_pcm_format_t params_format(const struct snd_pcm_hw_params *p) { return (__force snd_pcm_format_t)snd_mask_min(hw_param_mask_c(p, SNDRV_PCM_HW_PARAM_FORMAT)); } /** * params_subformat - get the sample subformat from the hw params * @p: hw params */ static inline snd_pcm_subformat_t params_subformat(const struct snd_pcm_hw_params *p) { return (__force snd_pcm_subformat_t)snd_mask_min(hw_param_mask_c(p, SNDRV_PCM_HW_PARAM_SUBFORMAT)); } /** * params_period_bytes - get the period size (in bytes) from the hw params * @p: hw params */ static inline unsigned int params_period_bytes(const struct snd_pcm_hw_params *p) { return hw_param_interval_c(p, SNDRV_PCM_HW_PARAM_PERIOD_BYTES)->min; } /** * params_width - get the number of bits of the sample format from the hw params * @p: hw params * * This function returns the number of bits per sample that the selected sample * format of the hw params has. */ static inline int params_width(const struct snd_pcm_hw_params *p) { return snd_pcm_format_width(params_format(p)); } /* * params_physical_width - get the storage size of the sample format from the hw params * @p: hw params * * This functions returns the number of bits per sample that the selected sample * format of the hw params takes up in memory. This will be equal or larger than * params_width(). */ static inline int params_physical_width(const struct snd_pcm_hw_params *p) { return snd_pcm_format_physical_width(params_format(p)); } int snd_pcm_hw_params_bits(const struct snd_pcm_hw_params *p); static inline void params_set_format(struct snd_pcm_hw_params *p, snd_pcm_format_t fmt) { snd_mask_set_format(hw_param_mask(p, SNDRV_PCM_HW_PARAM_FORMAT), fmt); } #endif /* __SOUND_PCM_PARAMS_H */ |
| 6 2 17 17 8 9 23 8 9 14 9 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 | /* * Copyright (c) 2016 Tom Herbert <tom@herbertland.com> * Copyright (c) 2016-2017, Mellanox Technologies. All rights reserved. * Copyright (c) 2016-2017, Dave Watson <davejwatson@fb.com>. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifndef _TLS_INT_H #define _TLS_INT_H #include <asm/byteorder.h> #include <linux/types.h> #include <linux/skmsg.h> #include <net/tls.h> #include <net/tls_prot.h> #define TLS_PAGE_ORDER (min_t(unsigned int, PAGE_ALLOC_COSTLY_ORDER, \ TLS_MAX_PAYLOAD_SIZE >> PAGE_SHIFT)) #define __TLS_INC_STATS(net, field) \ __SNMP_INC_STATS((net)->mib.tls_statistics, field) #define TLS_INC_STATS(net, field) \ SNMP_INC_STATS((net)->mib.tls_statistics, field) #define TLS_DEC_STATS(net, field) \ SNMP_DEC_STATS((net)->mib.tls_statistics, field) struct tls_cipher_desc { unsigned int nonce; unsigned int iv; unsigned int key; unsigned int salt; unsigned int tag; unsigned int rec_seq; unsigned int iv_offset; unsigned int key_offset; unsigned int salt_offset; unsigned int rec_seq_offset; char *cipher_name; bool offloadable; size_t crypto_info; }; #define TLS_CIPHER_MIN TLS_CIPHER_AES_GCM_128 #define TLS_CIPHER_MAX TLS_CIPHER_ARIA_GCM_256 extern const struct tls_cipher_desc tls_cipher_desc[TLS_CIPHER_MAX + 1 - TLS_CIPHER_MIN]; static inline const struct tls_cipher_desc *get_cipher_desc(u16 cipher_type) { if (cipher_type < TLS_CIPHER_MIN || cipher_type > TLS_CIPHER_MAX) return NULL; return &tls_cipher_desc[cipher_type - TLS_CIPHER_MIN]; } static inline char *crypto_info_iv(struct tls_crypto_info *crypto_info, const struct tls_cipher_desc *cipher_desc) { return (char *)crypto_info + cipher_desc->iv_offset; } static inline char *crypto_info_key(struct tls_crypto_info *crypto_info, const struct tls_cipher_desc *cipher_desc) { return (char *)crypto_info + cipher_desc->key_offset; } static inline char *crypto_info_salt(struct tls_crypto_info *crypto_info, const struct tls_cipher_desc *cipher_desc) { return (char *)crypto_info + cipher_desc->salt_offset; } static inline char *crypto_info_rec_seq(struct tls_crypto_info *crypto_info, const struct tls_cipher_desc *cipher_desc) { return (char *)crypto_info + cipher_desc->rec_seq_offset; } /* TLS records are maintained in 'struct tls_rec'. It stores the memory pages * allocated or mapped for each TLS record. After encryption, the records are * stores in a linked list. */ struct tls_rec { struct list_head list; int tx_ready; int tx_flags; struct sk_msg msg_plaintext; struct sk_msg msg_encrypted; /* AAD | msg_plaintext.sg.data | sg_tag */ struct scatterlist sg_aead_in[2]; /* AAD | msg_encrypted.sg.data (data contains overhead for hdr & iv & tag) */ struct scatterlist sg_aead_out[2]; char content_type; struct scatterlist sg_content_type; struct sock *sk; char aad_space[TLS_AAD_SPACE_SIZE]; u8 iv_data[TLS_MAX_IV_SIZE]; struct aead_request aead_req; u8 aead_req_ctx[]; }; int __net_init tls_proc_init(struct net *net); void __net_exit tls_proc_fini(struct net *net); struct tls_context *tls_ctx_create(struct sock *sk); void tls_ctx_free(struct sock *sk, struct tls_context *ctx); void update_sk_prot(struct sock *sk, struct tls_context *ctx); int wait_on_pending_writer(struct sock *sk, long *timeo); void tls_err_abort(struct sock *sk, int err); int init_prot_info(struct tls_prot_info *prot, const struct tls_crypto_info *crypto_info, const struct tls_cipher_desc *cipher_desc); int tls_set_sw_offload(struct sock *sk, int tx, struct tls_crypto_info *new_crypto_info); void tls_update_rx_zc_capable(struct tls_context *tls_ctx); void tls_sw_strparser_arm(struct sock *sk, struct tls_context *ctx); void tls_sw_strparser_done(struct tls_context *tls_ctx); int tls_sw_sendmsg(struct sock *sk, struct msghdr *msg, size_t size); void tls_sw_splice_eof(struct socket *sock); void tls_sw_cancel_work_tx(struct tls_context *tls_ctx); void tls_sw_release_resources_tx(struct sock *sk); void tls_sw_free_ctx_tx(struct tls_context *tls_ctx); void tls_sw_free_resources_rx(struct sock *sk); void tls_sw_release_resources_rx(struct sock *sk); void tls_sw_free_ctx_rx(struct tls_context *tls_ctx); int tls_sw_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, int flags, int *addr_len); bool tls_sw_sock_is_readable(struct sock *sk); ssize_t tls_sw_splice_read(struct socket *sock, loff_t *ppos, struct pipe_inode_info *pipe, size_t len, unsigned int flags); int tls_sw_read_sock(struct sock *sk, read_descriptor_t *desc, sk_read_actor_t read_actor); int tls_device_sendmsg(struct sock *sk, struct msghdr *msg, size_t size); void tls_device_splice_eof(struct socket *sock); int tls_tx_records(struct sock *sk, int flags); void tls_sw_write_space(struct sock *sk, struct tls_context *ctx); void tls_device_write_space(struct sock *sk, struct tls_context *ctx); int tls_process_cmsg(struct sock *sk, struct msghdr *msg, unsigned char *record_type); int decrypt_skb(struct sock *sk, struct scatterlist *sgout); int tls_sw_fallback_init(struct sock *sk, struct tls_offload_context_tx *offload_ctx, struct tls_crypto_info *crypto_info); int tls_strp_dev_init(void); void tls_strp_dev_exit(void); void tls_strp_done(struct tls_strparser *strp); void tls_strp_stop(struct tls_strparser *strp); int tls_strp_init(struct tls_strparser *strp, struct sock *sk); void tls_strp_data_ready(struct tls_strparser *strp); void tls_strp_check_rcv(struct tls_strparser *strp); void tls_strp_msg_done(struct tls_strparser *strp); int tls_rx_msg_size(struct tls_strparser *strp, struct sk_buff *skb); void tls_rx_msg_ready(struct tls_strparser *strp); void tls_strp_msg_load(struct tls_strparser *strp, bool force_refresh); int tls_strp_msg_cow(struct tls_sw_context_rx *ctx); struct sk_buff *tls_strp_msg_detach(struct tls_sw_context_rx *ctx); int tls_strp_msg_hold(struct tls_strparser *strp, struct sk_buff_head *dst); static inline struct tls_msg *tls_msg(struct sk_buff *skb) { struct sk_skb_cb *scb = (struct sk_skb_cb *)skb->cb; return &scb->tls; } static inline struct sk_buff *tls_strp_msg(struct tls_sw_context_rx *ctx) { DEBUG_NET_WARN_ON_ONCE(!ctx->strp.msg_ready || !ctx->strp.anchor->len); return ctx->strp.anchor; } static inline bool tls_strp_msg_ready(struct tls_sw_context_rx *ctx) { return READ_ONCE(ctx->strp.msg_ready); } static inline bool tls_strp_msg_mixed_decrypted(struct tls_sw_context_rx *ctx) { return ctx->strp.mixed_decrypted; } #ifdef CONFIG_TLS_DEVICE int tls_device_init(void); void tls_device_cleanup(void); int tls_set_device_offload(struct sock *sk); void tls_device_free_resources_tx(struct sock *sk); int tls_set_device_offload_rx(struct sock *sk, struct tls_context *ctx); void tls_device_offload_cleanup_rx(struct sock *sk); void tls_device_rx_resync_new_rec(struct sock *sk, u32 rcd_len, u32 seq); int tls_device_decrypted(struct sock *sk, struct tls_context *tls_ctx); #else static inline int tls_device_init(void) { return 0; } static inline void tls_device_cleanup(void) {} static inline int tls_set_device_offload(struct sock *sk) { return -EOPNOTSUPP; } static inline void tls_device_free_resources_tx(struct sock *sk) {} static inline int tls_set_device_offload_rx(struct sock *sk, struct tls_context *ctx) { return -EOPNOTSUPP; } static inline void tls_device_offload_cleanup_rx(struct sock *sk) {} static inline void tls_device_rx_resync_new_rec(struct sock *sk, u32 rcd_len, u32 seq) {} static inline int tls_device_decrypted(struct sock *sk, struct tls_context *tls_ctx) { return 0; } #endif int tls_push_sg(struct sock *sk, struct tls_context *ctx, struct scatterlist *sg, u16 first_offset, int flags); int tls_push_partial_record(struct sock *sk, struct tls_context *ctx, int flags); void tls_free_partial_record(struct sock *sk, struct tls_context *ctx); static inline bool tls_is_partially_sent_record(struct tls_context *ctx) { return !!ctx->partially_sent_record; } static inline bool tls_is_pending_open_record(struct tls_context *tls_ctx) { return tls_ctx->pending_open_record_frags; } static inline bool tls_bigint_increment(unsigned char *seq, int len) { int i; for (i = len - 1; i >= 0; i--) { ++seq[i]; if (seq[i] != 0) break; } return (i == -1); } static inline void tls_bigint_subtract(unsigned char *seq, int n) { u64 rcd_sn; __be64 *p; BUILD_BUG_ON(TLS_MAX_REC_SEQ_SIZE != 8); p = (__be64 *)seq; rcd_sn = be64_to_cpu(*p); *p = cpu_to_be64(rcd_sn - n); } static inline void tls_advance_record_sn(struct sock *sk, struct tls_prot_info *prot, struct cipher_context *ctx) { if (tls_bigint_increment(ctx->rec_seq, prot->rec_seq_size)) tls_err_abort(sk, -EBADMSG); if (prot->version != TLS_1_3_VERSION && prot->cipher_type != TLS_CIPHER_CHACHA20_POLY1305) tls_bigint_increment(ctx->iv + prot->salt_size, prot->iv_size); } static inline void tls_xor_iv_with_seq(struct tls_prot_info *prot, char *iv, char *seq) { int i; if (prot->version == TLS_1_3_VERSION || prot->cipher_type == TLS_CIPHER_CHACHA20_POLY1305) { for (i = 0; i < 8; i++) iv[i + 4] ^= seq[i]; } } static inline void tls_fill_prepend(struct tls_context *ctx, char *buf, size_t plaintext_len, unsigned char record_type) { struct tls_prot_info *prot = &ctx->prot_info; size_t pkt_len, iv_size = prot->iv_size; pkt_len = plaintext_len + prot->tag_size; if (prot->version != TLS_1_3_VERSION && prot->cipher_type != TLS_CIPHER_CHACHA20_POLY1305) { pkt_len += iv_size; memcpy(buf + TLS_NONCE_OFFSET, ctx->tx.iv + prot->salt_size, iv_size); } /* we cover nonce explicit here as well, so buf should be of * size KTLS_DTLS_HEADER_SIZE + KTLS_DTLS_NONCE_EXPLICIT_SIZE */ buf[0] = prot->version == TLS_1_3_VERSION ? TLS_RECORD_TYPE_DATA : record_type; /* Note that VERSION must be TLS_1_2 for both TLS1.2 and TLS1.3 */ buf[1] = TLS_1_2_VERSION_MINOR; buf[2] = TLS_1_2_VERSION_MAJOR; /* we can use IV for nonce explicit according to spec */ buf[3] = pkt_len >> 8; buf[4] = pkt_len & 0xFF; } static inline void tls_make_aad(char *buf, size_t size, char *record_sequence, unsigned char record_type, struct tls_prot_info *prot) { if (prot->version != TLS_1_3_VERSION) { memcpy(buf, record_sequence, prot->rec_seq_size); buf += 8; } else { size += prot->tag_size; } buf[0] = prot->version == TLS_1_3_VERSION ? TLS_RECORD_TYPE_DATA : record_type; buf[1] = TLS_1_2_VERSION_MAJOR; buf[2] = TLS_1_2_VERSION_MINOR; buf[3] = size >> 8; buf[4] = size & 0xFF; } #endif |
| 74 40 75 112 40 68 74 74 112 74 59 75 119 121 4 4 4 64 64 20 44 46 46 46 123 2 8 2 93 48 18 1 2 123 2 107 2 61 108 91 16 2 107 110 109 14 1 109 110 44 63 63 43 63 16 1 63 63 33 51 2 10 3 1 3 110 2 14 92 75 75 30 31 30 902 909 902 24 4 5 35 35 35 36 36 48 48 3 3 3 3 2 2 2 2 16 16 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 | /* netfilter.c: look after the filters for various protocols. * Heavily influenced by the old firewall.c by David Bonn and Alan Cox. * * Thanks to Rob `CmdrTaco' Malda for not influencing this code in any * way. * * This code is GPL. */ #include <linux/kernel.h> #include <linux/netfilter.h> #include <net/protocol.h> #include <linux/init.h> #include <linux/skbuff.h> #include <linux/wait.h> #include <linux/module.h> #include <linux/interrupt.h> #include <linux/if.h> #include <linux/netdevice.h> #include <linux/netfilter_ipv6.h> #include <linux/inetdevice.h> #include <linux/proc_fs.h> #include <linux/mutex.h> #include <linux/mm.h> #include <linux/rcupdate.h> #include <net/net_namespace.h> #include <net/netfilter/nf_queue.h> #include <net/sock.h> #include "nf_internals.h" const struct nf_ipv6_ops __rcu *nf_ipv6_ops __read_mostly; EXPORT_SYMBOL_GPL(nf_ipv6_ops); #ifdef CONFIG_JUMP_LABEL struct static_key nf_hooks_needed[NFPROTO_NUMPROTO][NF_MAX_HOOKS]; EXPORT_SYMBOL(nf_hooks_needed); #endif static DEFINE_MUTEX(nf_hook_mutex); /* max hooks per family/hooknum */ #define MAX_HOOK_COUNT 1024 #define nf_entry_dereference(e) \ rcu_dereference_protected(e, lockdep_is_held(&nf_hook_mutex)) static struct nf_hook_entries *allocate_hook_entries_size(u16 num) { struct nf_hook_entries *e; size_t alloc = sizeof(*e) + sizeof(struct nf_hook_entry) * num + sizeof(struct nf_hook_ops *) * num + sizeof(struct nf_hook_entries_rcu_head); if (num == 0) return NULL; e = kvzalloc(alloc, GFP_KERNEL_ACCOUNT); if (e) e->num_hook_entries = num; return e; } static void __nf_hook_entries_free(struct rcu_head *h) { struct nf_hook_entries_rcu_head *head; head = container_of(h, struct nf_hook_entries_rcu_head, head); kvfree(head->allocation); } static void nf_hook_entries_free(struct nf_hook_entries *e) { struct nf_hook_entries_rcu_head *head; struct nf_hook_ops **ops; unsigned int num; if (!e) return; num = e->num_hook_entries; ops = nf_hook_entries_get_hook_ops(e); head = (void *)&ops[num]; head->allocation = e; call_rcu(&head->head, __nf_hook_entries_free); } static unsigned int accept_all(void *priv, struct sk_buff *skb, const struct nf_hook_state *state) { return NF_ACCEPT; /* ACCEPT makes nf_hook_slow call next hook */ } static const struct nf_hook_ops dummy_ops = { .hook = accept_all, .priority = INT_MIN, }; static struct nf_hook_entries * nf_hook_entries_grow(const struct nf_hook_entries *old, const struct nf_hook_ops *reg) { unsigned int i, alloc_entries, nhooks, old_entries; struct nf_hook_ops **orig_ops = NULL; struct nf_hook_ops **new_ops; struct nf_hook_entries *new; bool inserted = false; alloc_entries = 1; old_entries = old ? old->num_hook_entries : 0; if (old) { orig_ops = nf_hook_entries_get_hook_ops(old); for (i = 0; i < old_entries; i++) { if (orig_ops[i] != &dummy_ops) alloc_entries++; /* Restrict BPF hook type to force a unique priority, not * shared at attach time. * * This is mainly to avoid ordering issues between two * different bpf programs, this doesn't prevent a normal * hook at same priority as a bpf one (we don't want to * prevent defrag, conntrack, iptables etc from attaching). */ if (reg->priority == orig_ops[i]->priority && reg->hook_ops_type == NF_HOOK_OP_BPF) return ERR_PTR(-EBUSY); } } if (alloc_entries > MAX_HOOK_COUNT) return ERR_PTR(-E2BIG); new = allocate_hook_entries_size(alloc_entries); if (!new) return ERR_PTR(-ENOMEM); new_ops = nf_hook_entries_get_hook_ops(new); i = 0; nhooks = 0; while (i < old_entries) { if (orig_ops[i] == &dummy_ops) { ++i; continue; } if (inserted || reg->priority > orig_ops[i]->priority) { new_ops[nhooks] = (void *)orig_ops[i]; new->hooks[nhooks] = old->hooks[i]; i++; } else { new_ops[nhooks] = (void *)reg; new->hooks[nhooks].hook = reg->hook; new->hooks[nhooks].priv = reg->priv; inserted = true; } nhooks++; } if (!inserted) { new_ops[nhooks] = (void *)reg; new->hooks[nhooks].hook = reg->hook; new->hooks[nhooks].priv = reg->priv; } return new; } static void hooks_validate(const struct nf_hook_entries *hooks) { #ifdef CONFIG_DEBUG_MISC struct nf_hook_ops **orig_ops; int prio = INT_MIN; size_t i = 0; orig_ops = nf_hook_entries_get_hook_ops(hooks); for (i = 0; i < hooks->num_hook_entries; i++) { if (orig_ops[i] == &dummy_ops) continue; WARN_ON(orig_ops[i]->priority < prio); if (orig_ops[i]->priority > prio) prio = orig_ops[i]->priority; } #endif } int nf_hook_entries_insert_raw(struct nf_hook_entries __rcu **pp, const struct nf_hook_ops *reg) { struct nf_hook_entries *new_hooks; struct nf_hook_entries *p; p = rcu_dereference_raw(*pp); new_hooks = nf_hook_entries_grow(p, reg); if (IS_ERR(new_hooks)) return PTR_ERR(new_hooks); hooks_validate(new_hooks); rcu_assign_pointer(*pp, new_hooks); BUG_ON(p == new_hooks); nf_hook_entries_free(p); return 0; } EXPORT_SYMBOL_GPL(nf_hook_entries_insert_raw); /* * __nf_hook_entries_try_shrink - try to shrink hook array * * @old -- current hook blob at @pp * @pp -- location of hook blob * * Hook unregistration must always succeed, so to-be-removed hooks * are replaced by a dummy one that will just move to next hook. * * This counts the current dummy hooks, attempts to allocate new blob, * copies the live hooks, then replaces and discards old one. * * return values: * * Returns address to free, or NULL. */ static void *__nf_hook_entries_try_shrink(struct nf_hook_entries *old, struct nf_hook_entries __rcu **pp) { unsigned int i, j, skip = 0, hook_entries; struct nf_hook_entries *new = NULL; struct nf_hook_ops **orig_ops; struct nf_hook_ops **new_ops; if (WARN_ON_ONCE(!old)) return NULL; orig_ops = nf_hook_entries_get_hook_ops(old); for (i = 0; i < old->num_hook_entries; i++) { if (orig_ops[i] == &dummy_ops) skip++; } /* if skip == hook_entries all hooks have been removed */ hook_entries = old->num_hook_entries; if (skip == hook_entries) goto out_assign; if (skip == 0) return NULL; hook_entries -= skip; new = allocate_hook_entries_size(hook_entries); if (!new) return NULL; new_ops = nf_hook_entries_get_hook_ops(new); for (i = 0, j = 0; i < old->num_hook_entries; i++) { if (orig_ops[i] == &dummy_ops) continue; new->hooks[j] = old->hooks[i]; new_ops[j] = (void *)orig_ops[i]; j++; } hooks_validate(new); out_assign: rcu_assign_pointer(*pp, new); return old; } static struct nf_hook_entries __rcu ** nf_hook_entry_head(struct net *net, int pf, unsigned int hooknum, struct net_device *dev) { switch (pf) { case NFPROTO_NETDEV: break; #ifdef CONFIG_NETFILTER_FAMILY_ARP case NFPROTO_ARP: if (WARN_ON_ONCE(ARRAY_SIZE(net->nf.hooks_arp) <= hooknum)) return NULL; return net->nf.hooks_arp + hooknum; #endif #ifdef CONFIG_NETFILTER_FAMILY_BRIDGE case NFPROTO_BRIDGE: if (WARN_ON_ONCE(ARRAY_SIZE(net->nf.hooks_bridge) <= hooknum)) return NULL; return net->nf.hooks_bridge + hooknum; #endif #ifdef CONFIG_NETFILTER_INGRESS case NFPROTO_INET: if (WARN_ON_ONCE(hooknum != NF_INET_INGRESS)) return NULL; if (!dev || dev_net(dev) != net) { WARN_ON_ONCE(1); return NULL; } return &dev->nf_hooks_ingress; #endif case NFPROTO_IPV4: if (WARN_ON_ONCE(ARRAY_SIZE(net->nf.hooks_ipv4) <= hooknum)) return NULL; return net->nf.hooks_ipv4 + hooknum; case NFPROTO_IPV6: if (WARN_ON_ONCE(ARRAY_SIZE(net->nf.hooks_ipv6) <= hooknum)) return NULL; return net->nf.hooks_ipv6 + hooknum; default: WARN_ON_ONCE(1); return NULL; } #ifdef CONFIG_NETFILTER_INGRESS if (hooknum == NF_NETDEV_INGRESS) { if (dev && dev_net(dev) == net) return &dev->nf_hooks_ingress; } #endif #ifdef CONFIG_NETFILTER_EGRESS if (hooknum == NF_NETDEV_EGRESS) { if (dev && dev_net(dev) == net) return &dev->nf_hooks_egress; } #endif WARN_ON_ONCE(1); return NULL; } static int nf_ingress_check(struct net *net, const struct nf_hook_ops *reg, int hooknum) { #ifndef CONFIG_NETFILTER_INGRESS if (reg->hooknum == hooknum) return -EOPNOTSUPP; #endif if (reg->hooknum != hooknum || !reg->dev || dev_net(reg->dev) != net) return -EINVAL; return 0; } static inline bool __maybe_unused nf_ingress_hook(const struct nf_hook_ops *reg, int pf) { if ((pf == NFPROTO_NETDEV && reg->hooknum == NF_NETDEV_INGRESS) || (pf == NFPROTO_INET && reg->hooknum == NF_INET_INGRESS)) return true; return false; } static inline bool __maybe_unused nf_egress_hook(const struct nf_hook_ops *reg, int pf) { return pf == NFPROTO_NETDEV && reg->hooknum == NF_NETDEV_EGRESS; } static void nf_static_key_inc(const struct nf_hook_ops *reg, int pf) { #ifdef CONFIG_JUMP_LABEL int hooknum; if (pf == NFPROTO_INET && reg->hooknum == NF_INET_INGRESS) { pf = NFPROTO_NETDEV; hooknum = NF_NETDEV_INGRESS; } else { hooknum = reg->hooknum; } static_key_slow_inc(&nf_hooks_needed[pf][hooknum]); #endif } static void nf_static_key_dec(const struct nf_hook_ops *reg, int pf) { #ifdef CONFIG_JUMP_LABEL int hooknum; if (pf == NFPROTO_INET && reg->hooknum == NF_INET_INGRESS) { pf = NFPROTO_NETDEV; hooknum = NF_NETDEV_INGRESS; } else { hooknum = reg->hooknum; } static_key_slow_dec(&nf_hooks_needed[pf][hooknum]); #endif } static int __nf_register_net_hook(struct net *net, int pf, const struct nf_hook_ops *reg) { struct nf_hook_entries *p, *new_hooks; struct nf_hook_entries __rcu **pp; int err; switch (pf) { case NFPROTO_NETDEV: #ifndef CONFIG_NETFILTER_INGRESS if (reg->hooknum == NF_NETDEV_INGRESS) return -EOPNOTSUPP; #endif #ifndef CONFIG_NETFILTER_EGRESS if (reg->hooknum == NF_NETDEV_EGRESS) return -EOPNOTSUPP; #endif if ((reg->hooknum != NF_NETDEV_INGRESS && reg->hooknum != NF_NETDEV_EGRESS) || !reg->dev || dev_net(reg->dev) != net) return -EINVAL; break; case NFPROTO_INET: if (reg->hooknum != NF_INET_INGRESS) break; err = nf_ingress_check(net, reg, NF_INET_INGRESS); if (err < 0) return err; break; } pp = nf_hook_entry_head(net, pf, reg->hooknum, reg->dev); if (!pp) return -EINVAL; mutex_lock(&nf_hook_mutex); p = nf_entry_dereference(*pp); new_hooks = nf_hook_entries_grow(p, reg); if (!IS_ERR(new_hooks)) { hooks_validate(new_hooks); rcu_assign_pointer(*pp, new_hooks); } mutex_unlock(&nf_hook_mutex); if (IS_ERR(new_hooks)) return PTR_ERR(new_hooks); #ifdef CONFIG_NETFILTER_INGRESS if (nf_ingress_hook(reg, pf)) net_inc_ingress_queue(); #endif #ifdef CONFIG_NETFILTER_EGRESS if (nf_egress_hook(reg, pf)) net_inc_egress_queue(); #endif nf_static_key_inc(reg, pf); BUG_ON(p == new_hooks); nf_hook_entries_free(p); return 0; } /* * nf_remove_net_hook - remove a hook from blob * * @oldp: current address of hook blob * @unreg: hook to unregister * * This cannot fail, hook unregistration must always succeed. * Therefore replace the to-be-removed hook with a dummy hook. */ static bool nf_remove_net_hook(struct nf_hook_entries *old, const struct nf_hook_ops *unreg) { struct nf_hook_ops **orig_ops; unsigned int i; orig_ops = nf_hook_entries_get_hook_ops(old); for (i = 0; i < old->num_hook_entries; i++) { if (orig_ops[i] != unreg) continue; WRITE_ONCE(old->hooks[i].hook, accept_all); WRITE_ONCE(orig_ops[i], (void *)&dummy_ops); return true; } return false; } static void __nf_unregister_net_hook(struct net *net, int pf, const struct nf_hook_ops *reg) { struct nf_hook_entries __rcu **pp; struct nf_hook_entries *p; pp = nf_hook_entry_head(net, pf, reg->hooknum, reg->dev); if (!pp) return; mutex_lock(&nf_hook_mutex); p = nf_entry_dereference(*pp); if (WARN_ON_ONCE(!p)) { mutex_unlock(&nf_hook_mutex); return; } if (nf_remove_net_hook(p, reg)) { #ifdef CONFIG_NETFILTER_INGRESS if (nf_ingress_hook(reg, pf)) net_dec_ingress_queue(); #endif #ifdef CONFIG_NETFILTER_EGRESS if (nf_egress_hook(reg, pf)) net_dec_egress_queue(); #endif nf_static_key_dec(reg, pf); } else { WARN_ONCE(1, "hook not found, pf %d num %d", pf, reg->hooknum); } p = __nf_hook_entries_try_shrink(p, pp); mutex_unlock(&nf_hook_mutex); if (!p) return; nf_queue_nf_hook_drop(net); nf_hook_entries_free(p); } void nf_unregister_net_hook(struct net *net, const struct nf_hook_ops *reg) { if (reg->pf == NFPROTO_INET) { if (reg->hooknum == NF_INET_INGRESS) { __nf_unregister_net_hook(net, NFPROTO_INET, reg); } else { __nf_unregister_net_hook(net, NFPROTO_IPV4, reg); __nf_unregister_net_hook(net, NFPROTO_IPV6, reg); } } else { __nf_unregister_net_hook(net, reg->pf, reg); } } EXPORT_SYMBOL(nf_unregister_net_hook); void nf_hook_entries_delete_raw(struct nf_hook_entries __rcu **pp, const struct nf_hook_ops *reg) { struct nf_hook_entries *p; p = rcu_dereference_raw(*pp); if (nf_remove_net_hook(p, reg)) { p = __nf_hook_entries_try_shrink(p, pp); nf_hook_entries_free(p); } } EXPORT_SYMBOL_GPL(nf_hook_entries_delete_raw); int nf_register_net_hook(struct net *net, const struct nf_hook_ops *reg) { int err; if (reg->pf == NFPROTO_INET) { if (reg->hooknum == NF_INET_INGRESS) { err = __nf_register_net_hook(net, NFPROTO_INET, reg); if (err < 0) return err; } else { err = __nf_register_net_hook(net, NFPROTO_IPV4, reg); if (err < 0) return err; err = __nf_register_net_hook(net, NFPROTO_IPV6, reg); if (err < 0) { __nf_unregister_net_hook(net, NFPROTO_IPV4, reg); return err; } } } else { err = __nf_register_net_hook(net, reg->pf, reg); if (err < 0) return err; } return 0; } EXPORT_SYMBOL(nf_register_net_hook); int nf_register_net_hooks(struct net *net, const struct nf_hook_ops *reg, unsigned int n) { unsigned int i; int err = 0; for (i = 0; i < n; i++) { err = nf_register_net_hook(net, ®[i]); if (err) goto err; } return err; err: if (i > 0) nf_unregister_net_hooks(net, reg, i); return err; } EXPORT_SYMBOL(nf_register_net_hooks); void nf_unregister_net_hooks(struct net *net, const struct nf_hook_ops *reg, unsigned int hookcount) { unsigned int i; for (i = 0; i < hookcount; i++) nf_unregister_net_hook(net, ®[i]); } EXPORT_SYMBOL(nf_unregister_net_hooks); /* Returns 1 if okfn() needs to be executed by the caller, * -EPERM for NF_DROP, 0 otherwise. Caller must hold rcu_read_lock. */ int nf_hook_slow(struct sk_buff *skb, struct nf_hook_state *state, const struct nf_hook_entries *e, unsigned int s) { unsigned int verdict; int ret; for (; s < e->num_hook_entries; s++) { verdict = nf_hook_entry_hookfn(&e->hooks[s], skb, state); switch (verdict & NF_VERDICT_MASK) { case NF_ACCEPT: break; case NF_DROP: kfree_skb_reason(skb, SKB_DROP_REASON_NETFILTER_DROP); ret = NF_DROP_GETERR(verdict); if (ret == 0) ret = -EPERM; return ret; case NF_QUEUE: ret = nf_queue(skb, state, s, verdict); if (ret == 1) continue; return ret; case NF_STOLEN: return NF_DROP_GETERR(verdict); default: WARN_ON_ONCE(1); return 0; } } return 1; } EXPORT_SYMBOL(nf_hook_slow); void nf_hook_slow_list(struct list_head *head, struct nf_hook_state *state, const struct nf_hook_entries *e) { struct sk_buff *skb, *next; LIST_HEAD(sublist); int ret; list_for_each_entry_safe(skb, next, head, list) { skb_list_del_init(skb); ret = nf_hook_slow(skb, state, e, 0); if (ret == 1) list_add_tail(&skb->list, &sublist); } /* Put passed packets back on main list */ list_splice(&sublist, head); } EXPORT_SYMBOL(nf_hook_slow_list); /* This needs to be compiled in any case to avoid dependencies between the * nfnetlink_queue code and nf_conntrack. */ const struct nfnl_ct_hook __rcu *nfnl_ct_hook __read_mostly; EXPORT_SYMBOL_GPL(nfnl_ct_hook); const struct nf_ct_hook __rcu *nf_ct_hook __read_mostly; EXPORT_SYMBOL_GPL(nf_ct_hook); const struct nf_defrag_hook __rcu *nf_defrag_v4_hook __read_mostly; EXPORT_SYMBOL_GPL(nf_defrag_v4_hook); const struct nf_defrag_hook __rcu *nf_defrag_v6_hook __read_mostly; EXPORT_SYMBOL_GPL(nf_defrag_v6_hook); #if IS_ENABLED(CONFIG_NF_CONNTRACK) u8 nf_ctnetlink_has_listener; EXPORT_SYMBOL_GPL(nf_ctnetlink_has_listener); const struct nf_nat_hook __rcu *nf_nat_hook __read_mostly; EXPORT_SYMBOL_GPL(nf_nat_hook); /* This does not belong here, but locally generated errors need it if connection * tracking in use: without this, connection may not be in hash table, and hence * manufactured ICMP or RST packets will not be associated with it. */ void nf_ct_attach(struct sk_buff *new, const struct sk_buff *skb) { const struct nf_ct_hook *ct_hook; if (skb->_nfct) { rcu_read_lock(); ct_hook = rcu_dereference(nf_ct_hook); if (ct_hook) ct_hook->attach(new, skb); rcu_read_unlock(); } } EXPORT_SYMBOL(nf_ct_attach); void nf_conntrack_destroy(struct nf_conntrack *nfct) { const struct nf_ct_hook *ct_hook; rcu_read_lock(); ct_hook = rcu_dereference(nf_ct_hook); if (ct_hook) ct_hook->destroy(nfct); rcu_read_unlock(); WARN_ON(!ct_hook); } EXPORT_SYMBOL(nf_conntrack_destroy); void nf_ct_set_closing(struct nf_conntrack *nfct) { const struct nf_ct_hook *ct_hook; if (!nfct) return; rcu_read_lock(); ct_hook = rcu_dereference(nf_ct_hook); if (ct_hook) ct_hook->set_closing(nfct); rcu_read_unlock(); } EXPORT_SYMBOL_GPL(nf_ct_set_closing); bool nf_ct_get_tuple_skb(struct nf_conntrack_tuple *dst_tuple, const struct sk_buff *skb) { const struct nf_ct_hook *ct_hook; bool ret = false; rcu_read_lock(); ct_hook = rcu_dereference(nf_ct_hook); if (ct_hook) ret = ct_hook->get_tuple_skb(dst_tuple, skb); rcu_read_unlock(); return ret; } EXPORT_SYMBOL(nf_ct_get_tuple_skb); /* Built-in default zone used e.g. by modules. */ const struct nf_conntrack_zone nf_ct_zone_dflt = { .id = NF_CT_DEFAULT_ZONE_ID, .dir = NF_CT_DEFAULT_ZONE_DIR, }; EXPORT_SYMBOL_GPL(nf_ct_zone_dflt); #endif /* CONFIG_NF_CONNTRACK */ static void __net_init __netfilter_net_init(struct nf_hook_entries __rcu **e, int max) { int h; for (h = 0; h < max; h++) RCU_INIT_POINTER(e[h], NULL); } static int __net_init netfilter_net_init(struct net *net) { __netfilter_net_init(net->nf.hooks_ipv4, ARRAY_SIZE(net->nf.hooks_ipv4)); __netfilter_net_init(net->nf.hooks_ipv6, ARRAY_SIZE(net->nf.hooks_ipv6)); #ifdef CONFIG_NETFILTER_FAMILY_ARP __netfilter_net_init(net->nf.hooks_arp, ARRAY_SIZE(net->nf.hooks_arp)); #endif #ifdef CONFIG_NETFILTER_FAMILY_BRIDGE __netfilter_net_init(net->nf.hooks_bridge, ARRAY_SIZE(net->nf.hooks_bridge)); #endif #ifdef CONFIG_PROC_FS net->nf.proc_netfilter = proc_net_mkdir(net, "netfilter", net->proc_net); if (!net->nf.proc_netfilter) { if (!net_eq(net, &init_net)) pr_err("cannot create netfilter proc entry"); return -ENOMEM; } #endif return 0; } static void __net_exit netfilter_net_exit(struct net *net) { remove_proc_entry("netfilter", net->proc_net); } static struct pernet_operations netfilter_net_ops = { .init = netfilter_net_init, .exit = netfilter_net_exit, }; int __init netfilter_init(void) { int ret; ret = register_pernet_subsys(&netfilter_net_ops); if (ret < 0) goto err; #ifdef CONFIG_LWTUNNEL ret = netfilter_lwtunnel_init(); if (ret < 0) goto err_lwtunnel_pernet; #endif ret = netfilter_log_init(); if (ret < 0) goto err_log_pernet; return 0; err_log_pernet: #ifdef CONFIG_LWTUNNEL netfilter_lwtunnel_fini(); err_lwtunnel_pernet: #endif unregister_pernet_subsys(&netfilter_net_ops); err: return ret; } |
| 2 2 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 | // SPDX-License-Identifier: GPL-2.0-or-later /* * IPVS: Overflow-Connection Scheduling module * * Authors: Raducu Deaconu <rhadoo_io@yahoo.com> * * Scheduler implements "overflow" loadbalancing according to number of active * connections , will keep all connections to the node with the highest weight * and overflow to the next node if the number of connections exceeds the node's * weight. * Note that this scheduler might not be suitable for UDP because it only uses * active connections */ #define KMSG_COMPONENT "IPVS" #define pr_fmt(fmt) KMSG_COMPONENT ": " fmt #include <linux/module.h> #include <linux/kernel.h> #include <net/ip_vs.h> /* OVF Connection scheduling */ static struct ip_vs_dest * ip_vs_ovf_schedule(struct ip_vs_service *svc, const struct sk_buff *skb, struct ip_vs_iphdr *iph) { struct ip_vs_dest *dest, *h = NULL; int hw = 0, w; IP_VS_DBG(6, "ip_vs_ovf_schedule(): Scheduling...\n"); /* select the node with highest weight, go to next in line if active * connections exceed weight */ list_for_each_entry_rcu(dest, &svc->destinations, n_list) { w = atomic_read(&dest->weight); if ((dest->flags & IP_VS_DEST_F_OVERLOAD) || atomic_read(&dest->activeconns) > w || w == 0) continue; if (!h || w > hw) { h = dest; hw = w; } } if (h) { IP_VS_DBG_BUF(6, "OVF: server %s:%u active %d w %d\n", IP_VS_DBG_ADDR(h->af, &h->addr), ntohs(h->port), atomic_read(&h->activeconns), atomic_read(&h->weight)); return h; } ip_vs_scheduler_err(svc, "no destination available"); return NULL; } static struct ip_vs_scheduler ip_vs_ovf_scheduler = { .name = "ovf", .refcnt = ATOMIC_INIT(0), .module = THIS_MODULE, .n_list = LIST_HEAD_INIT(ip_vs_ovf_scheduler.n_list), .schedule = ip_vs_ovf_schedule, }; static int __init ip_vs_ovf_init(void) { return register_ip_vs_scheduler(&ip_vs_ovf_scheduler); } static void __exit ip_vs_ovf_cleanup(void) { unregister_ip_vs_scheduler(&ip_vs_ovf_scheduler); synchronize_rcu(); } module_init(ip_vs_ovf_init); module_exit(ip_vs_ovf_cleanup); MODULE_LICENSE("GPL"); MODULE_DESCRIPTION("ipvs overflow connection scheduler"); |
| 7 7 253 226 244 244 27 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 | /* SPDX-License-Identifier: GPL-2.0 */ #ifndef RQ_QOS_H #define RQ_QOS_H #include <linux/kernel.h> #include <linux/blkdev.h> #include <linux/blk_types.h> #include <linux/atomic.h> #include <linux/wait.h> #include <linux/blk-mq.h> #include "blk-mq-debugfs.h" struct blk_mq_debugfs_attr; extern struct static_key_false block_rq_qos; enum rq_qos_id { RQ_QOS_WBT, RQ_QOS_LATENCY, RQ_QOS_COST, }; struct rq_wait { wait_queue_head_t wait; atomic_t inflight; }; struct rq_qos { const struct rq_qos_ops *ops; struct gendisk *disk; enum rq_qos_id id; struct rq_qos *next; #ifdef CONFIG_BLK_DEBUG_FS struct dentry *debugfs_dir; #endif }; struct rq_qos_ops { void (*throttle)(struct rq_qos *, struct bio *); void (*track)(struct rq_qos *, struct request *, struct bio *); void (*merge)(struct rq_qos *, struct request *, struct bio *); void (*issue)(struct rq_qos *, struct request *); void (*requeue)(struct rq_qos *, struct request *); void (*done)(struct rq_qos *, struct request *); void (*done_bio)(struct rq_qos *, struct bio *); void (*cleanup)(struct rq_qos *, struct bio *); void (*queue_depth_changed)(struct rq_qos *); void (*exit)(struct rq_qos *); const struct blk_mq_debugfs_attr *debugfs_attrs; }; struct rq_depth { unsigned int max_depth; int scale_step; bool scaled_max; unsigned int queue_depth; unsigned int default_depth; }; static inline struct rq_qos *rq_qos_id(struct request_queue *q, enum rq_qos_id id) { struct rq_qos *rqos; for (rqos = q->rq_qos; rqos; rqos = rqos->next) { if (rqos->id == id) break; } return rqos; } static inline struct rq_qos *wbt_rq_qos(struct request_queue *q) { return rq_qos_id(q, RQ_QOS_WBT); } static inline struct rq_qos *iolat_rq_qos(struct request_queue *q) { return rq_qos_id(q, RQ_QOS_LATENCY); } static inline void rq_wait_init(struct rq_wait *rq_wait) { atomic_set(&rq_wait->inflight, 0); init_waitqueue_head(&rq_wait->wait); } int rq_qos_add(struct rq_qos *rqos, struct gendisk *disk, enum rq_qos_id id, const struct rq_qos_ops *ops); void rq_qos_del(struct rq_qos *rqos); typedef bool (acquire_inflight_cb_t)(struct rq_wait *rqw, void *private_data); typedef void (cleanup_cb_t)(struct rq_wait *rqw, void *private_data); void rq_qos_wait(struct rq_wait *rqw, void *private_data, acquire_inflight_cb_t *acquire_inflight_cb, cleanup_cb_t *cleanup_cb); bool rq_wait_inc_below(struct rq_wait *rq_wait, unsigned int limit); bool rq_depth_scale_up(struct rq_depth *rqd); bool rq_depth_scale_down(struct rq_depth *rqd, bool hard_throttle); bool rq_depth_calc_max_depth(struct rq_depth *rqd); void __rq_qos_cleanup(struct rq_qos *rqos, struct bio *bio); void __rq_qos_done(struct rq_qos *rqos, struct request *rq); void __rq_qos_issue(struct rq_qos *rqos, struct request *rq); void __rq_qos_requeue(struct rq_qos *rqos, struct request *rq); void __rq_qos_throttle(struct rq_qos *rqos, struct bio *bio); void __rq_qos_track(struct rq_qos *rqos, struct request *rq, struct bio *bio); void __rq_qos_merge(struct rq_qos *rqos, struct request *rq, struct bio *bio); void __rq_qos_done_bio(struct rq_qos *rqos, struct bio *bio); void __rq_qos_queue_depth_changed(struct rq_qos *rqos); static inline void rq_qos_cleanup(struct request_queue *q, struct bio *bio) { if (static_branch_unlikely(&block_rq_qos) && q->rq_qos) __rq_qos_cleanup(q->rq_qos, bio); } static inline void rq_qos_done(struct request_queue *q, struct request *rq) { if (static_branch_unlikely(&block_rq_qos) && q->rq_qos && !blk_rq_is_passthrough(rq)) __rq_qos_done(q->rq_qos, rq); } static inline void rq_qos_issue(struct request_queue *q, struct request *rq) { if (static_branch_unlikely(&block_rq_qos) && q->rq_qos) __rq_qos_issue(q->rq_qos, rq); } static inline void rq_qos_requeue(struct request_queue *q, struct request *rq) { if (static_branch_unlikely(&block_rq_qos) && q->rq_qos) __rq_qos_requeue(q->rq_qos, rq); } static inline void rq_qos_done_bio(struct bio *bio) { if (static_branch_unlikely(&block_rq_qos) && bio->bi_bdev && (bio_flagged(bio, BIO_QOS_THROTTLED) || bio_flagged(bio, BIO_QOS_MERGED))) { struct request_queue *q = bdev_get_queue(bio->bi_bdev); if (q->rq_qos) __rq_qos_done_bio(q->rq_qos, bio); } } static inline void rq_qos_throttle(struct request_queue *q, struct bio *bio) { if (static_branch_unlikely(&block_rq_qos) && q->rq_qos) { bio_set_flag(bio, BIO_QOS_THROTTLED); __rq_qos_throttle(q->rq_qos, bio); } } static inline void rq_qos_track(struct request_queue *q, struct request *rq, struct bio *bio) { if (static_branch_unlikely(&block_rq_qos) && q->rq_qos) __rq_qos_track(q->rq_qos, rq, bio); } static inline void rq_qos_merge(struct request_queue *q, struct request *rq, struct bio *bio) { if (static_branch_unlikely(&block_rq_qos) && q->rq_qos) { bio_set_flag(bio, BIO_QOS_MERGED); __rq_qos_merge(q->rq_qos, rq, bio); } } static inline void rq_qos_queue_depth_changed(struct request_queue *q) { if (static_branch_unlikely(&block_rq_qos) && q->rq_qos) __rq_qos_queue_depth_changed(q->rq_qos); } void rq_qos_exit(struct request_queue *); #endif |
| 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 | /* SPDX-License-Identifier: GPL-2.0-or-later */ /* * Device memory TCP support * * Authors: Mina Almasry <almasrymina@google.com> * Willem de Bruijn <willemb@google.com> * Kaiyuan Zhang <kaiyuanz@google.com> * */ #ifndef _NET_DEVMEM_H #define _NET_DEVMEM_H #include <net/netmem.h> #include <net/netdev_netlink.h> struct netlink_ext_ack; struct net_devmem_dmabuf_binding { struct dma_buf *dmabuf; struct dma_buf_attachment *attachment; struct sg_table *sgt; struct net_device *dev; struct gen_pool *chunk_pool; /* Protect dev */ struct mutex lock; /* The user holds a ref (via the netlink API) for as long as they want * the binding to remain alive. Each page pool using this binding holds * a ref to keep the binding alive. The page_pool does not release the * ref until all the net_iovs allocated from this binding are released * back to the page_pool. * * The binding undos itself and unmaps the underlying dmabuf once all * those refs are dropped and the binding is no longer desired or in * use. * * net_devmem_get_net_iov() on dmabuf net_iovs will increment this * reference, making sure that the binding remains alive until all the * net_iovs are no longer used. net_iovs allocated from this binding * that are stuck in the TX path for any reason (such as awaiting * retransmits) hold a reference to the binding until the skb holding * them is freed. */ refcount_t ref; /* The list of bindings currently active. Used for netlink to notify us * of the user dropping the bind. */ struct list_head list; /* rxq's this binding is active on. */ struct xarray bound_rxqs; /* ID of this binding. Globally unique to all bindings currently * active. */ u32 id; /* Array of net_iov pointers for this binding, sorted by virtual * address. This array is convenient to map the virtual addresses to * net_iovs in the TX path. */ struct net_iov **tx_vec; struct work_struct unbind_w; }; #if defined(CONFIG_NET_DEVMEM) /* Owner of the dma-buf chunks inserted into the gen pool. Each scatterlist * entry from the dmabuf is inserted into the genpool as a chunk, and needs * this owner struct to keep track of some metadata necessary to create * allocations from this chunk. */ struct dmabuf_genpool_chunk_owner { struct net_iov_area area; struct net_devmem_dmabuf_binding *binding; /* dma_addr of the start of the chunk. */ dma_addr_t base_dma_addr; }; void __net_devmem_dmabuf_binding_free(struct work_struct *wq); struct net_devmem_dmabuf_binding * net_devmem_bind_dmabuf(struct net_device *dev, enum dma_data_direction direction, unsigned int dmabuf_fd, struct netdev_nl_sock *priv, struct netlink_ext_ack *extack); struct net_devmem_dmabuf_binding *net_devmem_lookup_dmabuf(u32 id); void net_devmem_unbind_dmabuf(struct net_devmem_dmabuf_binding *binding); int net_devmem_bind_dmabuf_to_queue(struct net_device *dev, u32 rxq_idx, struct net_devmem_dmabuf_binding *binding, struct netlink_ext_ack *extack); void net_devmem_bind_tx_release(struct sock *sk); static inline struct dmabuf_genpool_chunk_owner * net_devmem_iov_to_chunk_owner(const struct net_iov *niov) { struct net_iov_area *owner = net_iov_owner(niov); return container_of(owner, struct dmabuf_genpool_chunk_owner, area); } static inline struct net_devmem_dmabuf_binding * net_devmem_iov_binding(const struct net_iov *niov) { return net_devmem_iov_to_chunk_owner(niov)->binding; } static inline u32 net_devmem_iov_binding_id(const struct net_iov *niov) { return net_devmem_iov_binding(niov)->id; } static inline unsigned long net_iov_virtual_addr(const struct net_iov *niov) { struct net_iov_area *owner = net_iov_owner(niov); return owner->base_virtual + ((unsigned long)net_iov_idx(niov) << PAGE_SHIFT); } static inline bool net_devmem_dmabuf_binding_get(struct net_devmem_dmabuf_binding *binding) { return refcount_inc_not_zero(&binding->ref); } static inline void net_devmem_dmabuf_binding_put(struct net_devmem_dmabuf_binding *binding) { if (!refcount_dec_and_test(&binding->ref)) return; INIT_WORK(&binding->unbind_w, __net_devmem_dmabuf_binding_free); schedule_work(&binding->unbind_w); } void net_devmem_get_net_iov(struct net_iov *niov); void net_devmem_put_net_iov(struct net_iov *niov); struct net_iov * net_devmem_alloc_dmabuf(struct net_devmem_dmabuf_binding *binding); void net_devmem_free_dmabuf(struct net_iov *ppiov); bool net_is_devmem_iov(struct net_iov *niov); struct net_devmem_dmabuf_binding * net_devmem_get_binding(struct sock *sk, unsigned int dmabuf_id); struct net_iov * net_devmem_get_niov_at(struct net_devmem_dmabuf_binding *binding, size_t addr, size_t *off, size_t *size); #else struct net_devmem_dmabuf_binding; static inline void net_devmem_dmabuf_binding_put(struct net_devmem_dmabuf_binding *binding) { } static inline void net_devmem_get_net_iov(struct net_iov *niov) { } static inline void net_devmem_put_net_iov(struct net_iov *niov) { } static inline void __net_devmem_dmabuf_binding_free(struct work_struct *wq) { } static inline struct net_devmem_dmabuf_binding * net_devmem_bind_dmabuf(struct net_device *dev, enum dma_data_direction direction, unsigned int dmabuf_fd, struct netdev_nl_sock *priv, struct netlink_ext_ack *extack) { return ERR_PTR(-EOPNOTSUPP); } static inline struct net_devmem_dmabuf_binding *net_devmem_lookup_dmabuf(u32 id) { return NULL; } static inline void net_devmem_unbind_dmabuf(struct net_devmem_dmabuf_binding *binding) { } static inline int net_devmem_bind_dmabuf_to_queue(struct net_device *dev, u32 rxq_idx, struct net_devmem_dmabuf_binding *binding, struct netlink_ext_ack *extack) { return -EOPNOTSUPP; } static inline struct net_iov * net_devmem_alloc_dmabuf(struct net_devmem_dmabuf_binding *binding) { return NULL; } static inline void net_devmem_free_dmabuf(struct net_iov *ppiov) { } static inline unsigned long net_iov_virtual_addr(const struct net_iov *niov) { return 0; } static inline u32 net_devmem_iov_binding_id(const struct net_iov *niov) { return 0; } static inline bool net_is_devmem_iov(struct net_iov *niov) { return false; } static inline struct net_devmem_dmabuf_binding * net_devmem_get_binding(struct sock *sk, unsigned int dmabuf_id) { return ERR_PTR(-EOPNOTSUPP); } static inline struct net_iov * net_devmem_get_niov_at(struct net_devmem_dmabuf_binding *binding, size_t addr, size_t *off, size_t *size) { return NULL; } static inline struct net_devmem_dmabuf_binding * net_devmem_iov_binding(const struct net_iov *niov) { return NULL; } #endif #endif /* _NET_DEVMEM_H */ |
| 4 4 4 2 4 3 1 1 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 | /* SPDX-License-Identifier: GPL-2.0-or-later */ /* FS-Cache tracepoints * * Copyright (C) 2021 Red Hat, Inc. All Rights Reserved. * Written by David Howells (dhowells@redhat.com) */ #undef TRACE_SYSTEM #define TRACE_SYSTEM fscache #if !defined(_TRACE_FSCACHE_H) || defined(TRACE_HEADER_MULTI_READ) #define _TRACE_FSCACHE_H #include <linux/fscache.h> #include <linux/tracepoint.h> /* * Define enums for tracing information. */ #ifndef __FSCACHE_DECLARE_TRACE_ENUMS_ONCE_ONLY #define __FSCACHE_DECLARE_TRACE_ENUMS_ONCE_ONLY enum fscache_cache_trace { fscache_cache_collision, fscache_cache_get_acquire, fscache_cache_new_acquire, fscache_cache_put_alloc_volume, fscache_cache_put_cache, fscache_cache_put_prep_failed, fscache_cache_put_relinquish, fscache_cache_put_volume, }; enum fscache_volume_trace { fscache_volume_collision, fscache_volume_get_cookie, fscache_volume_get_create_work, fscache_volume_get_hash_collision, fscache_volume_get_withdraw, fscache_volume_free, fscache_volume_new_acquire, fscache_volume_put_cookie, fscache_volume_put_create_work, fscache_volume_put_hash_collision, fscache_volume_put_relinquish, fscache_volume_put_withdraw, fscache_volume_see_create_work, fscache_volume_see_hash_wake, fscache_volume_wait_create_work, }; enum fscache_cookie_trace { fscache_cookie_collision, fscache_cookie_discard, fscache_cookie_failed, fscache_cookie_get_attach_object, fscache_cookie_get_end_access, fscache_cookie_get_hash_collision, fscache_cookie_get_inval_work, fscache_cookie_get_lru, fscache_cookie_get_use_work, fscache_cookie_new_acquire, fscache_cookie_put_hash_collision, fscache_cookie_put_lru, fscache_cookie_put_object, fscache_cookie_put_over_queued, fscache_cookie_put_relinquish, fscache_cookie_put_withdrawn, fscache_cookie_put_work, fscache_cookie_see_active, fscache_cookie_see_lru_discard, fscache_cookie_see_lru_discard_clear, fscache_cookie_see_lru_do_one, fscache_cookie_see_relinquish, fscache_cookie_see_withdraw, fscache_cookie_see_work, }; enum fscache_active_trace { fscache_active_use, fscache_active_use_modify, fscache_active_unuse, }; enum fscache_access_trace { fscache_access_acquire_volume, fscache_access_acquire_volume_end, fscache_access_cache_pin, fscache_access_cache_unpin, fscache_access_invalidate_cookie, fscache_access_invalidate_cookie_end, fscache_access_io_end, fscache_access_io_not_live, fscache_access_io_read, fscache_access_io_resize, fscache_access_io_wait, fscache_access_io_write, fscache_access_lookup_cookie, fscache_access_lookup_cookie_end, fscache_access_lookup_cookie_end_failed, fscache_access_relinquish_volume, fscache_access_relinquish_volume_end, fscache_access_unlive, }; #endif /* * Declare tracing information enums and their string mappings for display. */ #define fscache_cache_traces \ EM(fscache_cache_collision, "*COLLIDE*") \ EM(fscache_cache_get_acquire, "GET acq ") \ EM(fscache_cache_new_acquire, "NEW acq ") \ EM(fscache_cache_put_alloc_volume, "PUT alvol") \ EM(fscache_cache_put_cache, "PUT cache") \ EM(fscache_cache_put_prep_failed, "PUT pfail") \ EM(fscache_cache_put_relinquish, "PUT relnq") \ E_(fscache_cache_put_volume, "PUT vol ") #define fscache_volume_traces \ EM(fscache_volume_collision, "*COLLIDE*") \ EM(fscache_volume_get_cookie, "GET cook ") \ EM(fscache_volume_get_create_work, "GET creat") \ EM(fscache_volume_get_hash_collision, "GET hcoll") \ EM(fscache_volume_get_withdraw, "GET withd") \ EM(fscache_volume_free, "FREE ") \ EM(fscache_volume_new_acquire, "NEW acq ") \ EM(fscache_volume_put_cookie, "PUT cook ") \ EM(fscache_volume_put_create_work, "PUT creat") \ EM(fscache_volume_put_hash_collision, "PUT hcoll") \ EM(fscache_volume_put_relinquish, "PUT relnq") \ EM(fscache_volume_put_withdraw, "PUT withd") \ EM(fscache_volume_see_create_work, "SEE creat") \ EM(fscache_volume_see_hash_wake, "SEE hwake") \ E_(fscache_volume_wait_create_work, "WAIT crea") #define fscache_cookie_traces \ EM(fscache_cookie_collision, "*COLLIDE*") \ EM(fscache_cookie_discard, "DISCARD ") \ EM(fscache_cookie_failed, "FAILED ") \ EM(fscache_cookie_get_attach_object, "GET attch") \ EM(fscache_cookie_get_hash_collision, "GET hcoll") \ EM(fscache_cookie_get_end_access, "GQ endac") \ EM(fscache_cookie_get_inval_work, "GQ inval") \ EM(fscache_cookie_get_lru, "GET lru ") \ EM(fscache_cookie_get_use_work, "GQ use ") \ EM(fscache_cookie_new_acquire, "NEW acq ") \ EM(fscache_cookie_put_hash_collision, "PUT hcoll") \ EM(fscache_cookie_put_lru, "PUT lru ") \ EM(fscache_cookie_put_object, "PUT obj ") \ EM(fscache_cookie_put_over_queued, "PQ overq") \ EM(fscache_cookie_put_relinquish, "PUT relnq") \ EM(fscache_cookie_put_withdrawn, "PUT wthdn") \ EM(fscache_cookie_put_work, "PQ work ") \ EM(fscache_cookie_see_active, "- activ") \ EM(fscache_cookie_see_lru_discard, "- x-lru") \ EM(fscache_cookie_see_lru_discard_clear,"- lrudc") \ EM(fscache_cookie_see_lru_do_one, "- lrudo") \ EM(fscache_cookie_see_relinquish, "- x-rlq") \ EM(fscache_cookie_see_withdraw, "- x-wth") \ E_(fscache_cookie_see_work, "- work ") #define fscache_active_traces \ EM(fscache_active_use, "USE ") \ EM(fscache_active_use_modify, "USE-m ") \ E_(fscache_active_unuse, "UNUSE ") #define fscache_access_traces \ EM(fscache_access_acquire_volume, "BEGIN acq_vol") \ EM(fscache_access_acquire_volume_end, "END acq_vol") \ EM(fscache_access_cache_pin, "PIN cache ") \ EM(fscache_access_cache_unpin, "UNPIN cache ") \ EM(fscache_access_invalidate_cookie, "BEGIN inval ") \ EM(fscache_access_invalidate_cookie_end,"END inval ") \ EM(fscache_access_io_end, "END io ") \ EM(fscache_access_io_not_live, "END io_notl") \ EM(fscache_access_io_read, "BEGIN io_read") \ EM(fscache_access_io_resize, "BEGIN io_resz") \ EM(fscache_access_io_wait, "WAIT io ") \ EM(fscache_access_io_write, "BEGIN io_writ") \ EM(fscache_access_lookup_cookie, "BEGIN lookup ") \ EM(fscache_access_lookup_cookie_end, "END lookup ") \ EM(fscache_access_lookup_cookie_end_failed,"END lookupf") \ EM(fscache_access_relinquish_volume, "BEGIN rlq_vol") \ EM(fscache_access_relinquish_volume_end,"END rlq_vol") \ E_(fscache_access_unlive, "END unlive ") /* * Export enum symbols via userspace. */ #undef EM #undef E_ #define EM(a, b) TRACE_DEFINE_ENUM(a); #define E_(a, b) TRACE_DEFINE_ENUM(a); fscache_cache_traces; fscache_volume_traces; fscache_cookie_traces; fscache_access_traces; /* * Now redefine the EM() and E_() macros to map the enums to the strings that * will be printed in the output. */ #undef EM #undef E_ #define EM(a, b) { a, b }, #define E_(a, b) { a, b } TRACE_EVENT(fscache_cache, TP_PROTO(unsigned int cache_debug_id, int usage, enum fscache_cache_trace where), TP_ARGS(cache_debug_id, usage, where), TP_STRUCT__entry( __field(unsigned int, cache ) __field(int, usage ) __field(enum fscache_cache_trace, where ) ), TP_fast_assign( __entry->cache = cache_debug_id; __entry->usage = usage; __entry->where = where; ), TP_printk("C=%08x %s r=%d", __entry->cache, __print_symbolic(__entry->where, fscache_cache_traces), __entry->usage) ); TRACE_EVENT(fscache_volume, TP_PROTO(unsigned int volume_debug_id, int usage, enum fscache_volume_trace where), TP_ARGS(volume_debug_id, usage, where), TP_STRUCT__entry( __field(unsigned int, volume ) __field(int, usage ) __field(enum fscache_volume_trace, where ) ), TP_fast_assign( __entry->volume = volume_debug_id; __entry->usage = usage; __entry->where = where; ), TP_printk("V=%08x %s u=%d", __entry->volume, __print_symbolic(__entry->where, fscache_volume_traces), __entry->usage) ); TRACE_EVENT(fscache_cookie, TP_PROTO(unsigned int cookie_debug_id, int ref, enum fscache_cookie_trace where), TP_ARGS(cookie_debug_id, ref, where), TP_STRUCT__entry( __field(unsigned int, cookie ) __field(int, ref ) __field(enum fscache_cookie_trace, where ) ), TP_fast_assign( __entry->cookie = cookie_debug_id; __entry->ref = ref; __entry->where = where; ), TP_printk("c=%08x %s r=%d", __entry->cookie, __print_symbolic(__entry->where, fscache_cookie_traces), __entry->ref) ); TRACE_EVENT(fscache_active, TP_PROTO(unsigned int cookie_debug_id, int ref, int n_active, int n_accesses, enum fscache_active_trace why), TP_ARGS(cookie_debug_id, ref, n_active, n_accesses, why), TP_STRUCT__entry( __field(unsigned int, cookie ) __field(int, ref ) __field(int, n_active ) __field(int, n_accesses ) __field(enum fscache_active_trace, why ) ), TP_fast_assign( __entry->cookie = cookie_debug_id; __entry->ref = ref; __entry->n_active = n_active; __entry->n_accesses = n_accesses; __entry->why = why; ), TP_printk("c=%08x %s r=%d a=%d c=%d", __entry->cookie, __print_symbolic(__entry->why, fscache_active_traces), __entry->ref, __entry->n_accesses, __entry->n_active) ); TRACE_EVENT(fscache_access_cache, TP_PROTO(unsigned int cache_debug_id, int ref, int n_accesses, enum fscache_access_trace why), TP_ARGS(cache_debug_id, ref, n_accesses, why), TP_STRUCT__entry( __field(unsigned int, cache ) __field(int, ref ) __field(int, n_accesses ) __field(enum fscache_access_trace, why ) ), TP_fast_assign( __entry->cache = cache_debug_id; __entry->ref = ref; __entry->n_accesses = n_accesses; __entry->why = why; ), TP_printk("C=%08x %s r=%d a=%d", __entry->cache, __print_symbolic(__entry->why, fscache_access_traces), __entry->ref, __entry->n_accesses) ); TRACE_EVENT(fscache_access_volume, TP_PROTO(unsigned int volume_debug_id, unsigned int cookie_debug_id, int ref, int n_accesses, enum fscache_access_trace why), TP_ARGS(volume_debug_id, cookie_debug_id, ref, n_accesses, why), TP_STRUCT__entry( __field(unsigned int, volume ) __field(unsigned int, cookie ) __field(int, ref ) __field(int, n_accesses ) __field(enum fscache_access_trace, why ) ), TP_fast_assign( __entry->volume = volume_debug_id; __entry->cookie = cookie_debug_id; __entry->ref = ref; __entry->n_accesses = n_accesses; __entry->why = why; ), TP_printk("V=%08x c=%08x %s r=%d a=%d", __entry->volume, __entry->cookie, __print_symbolic(__entry->why, fscache_access_traces), __entry->ref, __entry->n_accesses) ); TRACE_EVENT(fscache_access, TP_PROTO(unsigned int cookie_debug_id, int ref, int n_accesses, enum fscache_access_trace why), TP_ARGS(cookie_debug_id, ref, n_accesses, why), TP_STRUCT__entry( __field(unsigned int, cookie ) __field(int, ref ) __field(int, n_accesses ) __field(enum fscache_access_trace, why ) ), TP_fast_assign( __entry->cookie = cookie_debug_id; __entry->ref = ref; __entry->n_accesses = n_accesses; __entry->why = why; ), TP_printk("c=%08x %s r=%d a=%d", __entry->cookie, __print_symbolic(__entry->why, fscache_access_traces), __entry->ref, __entry->n_accesses) ); TRACE_EVENT(fscache_acquire, TP_PROTO(struct fscache_cookie *cookie), TP_ARGS(cookie), TP_STRUCT__entry( __field(unsigned int, cookie ) __field(unsigned int, volume ) __field(int, v_ref ) __field(int, v_n_cookies ) ), TP_fast_assign( __entry->cookie = cookie->debug_id; __entry->volume = cookie->volume->debug_id; __entry->v_ref = refcount_read(&cookie->volume->ref); __entry->v_n_cookies = atomic_read(&cookie->volume->n_cookies); ), TP_printk("c=%08x V=%08x vr=%d vc=%d", __entry->cookie, __entry->volume, __entry->v_ref, __entry->v_n_cookies) ); TRACE_EVENT(fscache_relinquish, TP_PROTO(struct fscache_cookie *cookie, bool retire), TP_ARGS(cookie, retire), TP_STRUCT__entry( __field(unsigned int, cookie ) __field(unsigned int, volume ) __field(int, ref ) __field(int, n_active ) __field(u8, flags ) __field(bool, retire ) ), TP_fast_assign( __entry->cookie = cookie->debug_id; __entry->volume = cookie->volume->debug_id; __entry->ref = refcount_read(&cookie->ref); __entry->n_active = atomic_read(&cookie->n_active); __entry->flags = cookie->flags; __entry->retire = retire; ), TP_printk("c=%08x V=%08x r=%d U=%d f=%02x rt=%u", __entry->cookie, __entry->volume, __entry->ref, __entry->n_active, __entry->flags, __entry->retire) ); TRACE_EVENT(fscache_invalidate, TP_PROTO(struct fscache_cookie *cookie, loff_t new_size), TP_ARGS(cookie, new_size), TP_STRUCT__entry( __field(unsigned int, cookie ) __field(loff_t, new_size ) ), TP_fast_assign( __entry->cookie = cookie->debug_id; __entry->new_size = new_size; ), TP_printk("c=%08x sz=%llx", __entry->cookie, __entry->new_size) ); TRACE_EVENT(fscache_resize, TP_PROTO(struct fscache_cookie *cookie, loff_t new_size), TP_ARGS(cookie, new_size), TP_STRUCT__entry( __field(unsigned int, cookie ) __field(loff_t, old_size ) __field(loff_t, new_size ) ), TP_fast_assign( __entry->cookie = cookie->debug_id; __entry->old_size = cookie->object_size; __entry->new_size = new_size; ), TP_printk("c=%08x os=%08llx sz=%08llx", __entry->cookie, __entry->old_size, __entry->new_size) ); #endif /* _TRACE_FSCACHE_H */ /* This part must be outside protection */ #include <trace/define_trace.h> |
| 1 1 1 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 | // SPDX-License-Identifier: LGPL-2.1-or-later /* * Linux driver for digital TV devices equipped with B2C2 FlexcopII(b)/III * flexcop.c - main module part * Copyright (C) 2004-9 Patrick Boettcher <patrick.boettcher@posteo.de> * based on skystar2-driver Copyright (C) 2003 Vadim Catana, skystar@moldova.cc * * Acknowledgements: * John Jurrius from BBTI, Inc. for extensive support * with code examples and data books * Bjarne Steinsbo, bjarne at steinsbo.com (some ideas for rewriting) * * Contributions to the skystar2-driver have been done by * Vincenzo Di Massa, hawk.it at tiscalinet.it (several DiSEqC fixes) * Roberto Ragusa, r.ragusa at libero.it (polishing, restyling the code) * Uwe Bugla, uwe.bugla at gmx.de (doing tests, restyling code, writing docu) * Niklas Peinecke, peinecke at gdv.uni-hannover.de (hardware pid/mac * filtering) */ #include "flexcop.h" #define DRIVER_NAME "B2C2 FlexcopII/II(b)/III digital TV receiver chip" #define DRIVER_AUTHOR "Patrick Boettcher <patrick.boettcher@posteo.de" #ifdef CONFIG_DVB_B2C2_FLEXCOP_DEBUG #define DEBSTATUS "" #else #define DEBSTATUS " (debugging is not enabled)" #endif int b2c2_flexcop_debug; EXPORT_SYMBOL_GPL(b2c2_flexcop_debug); module_param_named(debug, b2c2_flexcop_debug, int, 0644); MODULE_PARM_DESC(debug, "set debug level (1=info,2=tuner,4=i2c,8=ts,16=sram,32=reg,64=i2cdump (|-able))." DEBSTATUS); #undef DEBSTATUS DVB_DEFINE_MOD_OPT_ADAPTER_NR(adapter_nr); /* global zero for ibi values */ flexcop_ibi_value ibi_zero; static int flexcop_dvb_start_feed(struct dvb_demux_feed *dvbdmxfeed) { struct flexcop_device *fc = dvbdmxfeed->demux->priv; return flexcop_pid_feed_control(fc, dvbdmxfeed, 1); } static int flexcop_dvb_stop_feed(struct dvb_demux_feed *dvbdmxfeed) { struct flexcop_device *fc = dvbdmxfeed->demux->priv; return flexcop_pid_feed_control(fc, dvbdmxfeed, 0); } static int flexcop_dvb_init(struct flexcop_device *fc) { int ret = dvb_register_adapter(&fc->dvb_adapter, "FlexCop Digital TV device", fc->owner, fc->dev, adapter_nr); if (ret < 0) { err("error registering DVB adapter"); return ret; } fc->dvb_adapter.priv = fc; fc->demux.dmx.capabilities = (DMX_TS_FILTERING | DMX_SECTION_FILTERING | DMX_MEMORY_BASED_FILTERING); fc->demux.priv = fc; fc->demux.filternum = fc->demux.feednum = FC_MAX_FEED; fc->demux.start_feed = flexcop_dvb_start_feed; fc->demux.stop_feed = flexcop_dvb_stop_feed; fc->demux.write_to_decoder = NULL; ret = dvb_dmx_init(&fc->demux); if (ret < 0) { err("dvb_dmx failed: error %d", ret); goto err_dmx; } fc->hw_frontend.source = DMX_FRONTEND_0; fc->dmxdev.filternum = fc->demux.feednum; fc->dmxdev.demux = &fc->demux.dmx; fc->dmxdev.capabilities = 0; ret = dvb_dmxdev_init(&fc->dmxdev, &fc->dvb_adapter); if (ret < 0) { err("dvb_dmxdev_init failed: error %d", ret); goto err_dmx_dev; } ret = fc->demux.dmx.add_frontend(&fc->demux.dmx, &fc->hw_frontend); if (ret < 0) { err("adding hw_frontend to dmx failed: error %d", ret); goto err_dmx_add_hw_frontend; } fc->mem_frontend.source = DMX_MEMORY_FE; ret = fc->demux.dmx.add_frontend(&fc->demux.dmx, &fc->mem_frontend); if (ret < 0) { err("adding mem_frontend to dmx failed: error %d", ret); goto err_dmx_add_mem_frontend; } ret = fc->demux.dmx.connect_frontend(&fc->demux.dmx, &fc->hw_frontend); if (ret < 0) { err("connect frontend failed: error %d", ret); goto err_connect_frontend; } ret = dvb_net_init(&fc->dvb_adapter, &fc->dvbnet, &fc->demux.dmx); if (ret < 0) { err("dvb_net_init failed: error %d", ret); goto err_net; } fc->init_state |= FC_STATE_DVB_INIT; return 0; err_net: fc->demux.dmx.disconnect_frontend(&fc->demux.dmx); err_connect_frontend: fc->demux.dmx.remove_frontend(&fc->demux.dmx, &fc->mem_frontend); err_dmx_add_mem_frontend: fc->demux.dmx.remove_frontend(&fc->demux.dmx, &fc->hw_frontend); err_dmx_add_hw_frontend: dvb_dmxdev_release(&fc->dmxdev); err_dmx_dev: dvb_dmx_release(&fc->demux); err_dmx: dvb_unregister_adapter(&fc->dvb_adapter); return ret; } static void flexcop_dvb_exit(struct flexcop_device *fc) { if (fc->init_state & FC_STATE_DVB_INIT) { dvb_net_release(&fc->dvbnet); fc->demux.dmx.close(&fc->demux.dmx); fc->demux.dmx.remove_frontend(&fc->demux.dmx, &fc->mem_frontend); fc->demux.dmx.remove_frontend(&fc->demux.dmx, &fc->hw_frontend); dvb_dmxdev_release(&fc->dmxdev); dvb_dmx_release(&fc->demux); dvb_unregister_adapter(&fc->dvb_adapter); deb_info("deinitialized dvb stuff\n"); } fc->init_state &= ~FC_STATE_DVB_INIT; } /* these methods are necessary to achieve the long-term-goal of hiding the * struct flexcop_device from the bus-parts */ void flexcop_pass_dmx_data(struct flexcop_device *fc, u8 *buf, u32 len) { dvb_dmx_swfilter(&fc->demux, buf, len); } EXPORT_SYMBOL(flexcop_pass_dmx_data); void flexcop_pass_dmx_packets(struct flexcop_device *fc, u8 *buf, u32 no) { dvb_dmx_swfilter_packets(&fc->demux, buf, no); } EXPORT_SYMBOL(flexcop_pass_dmx_packets); static void flexcop_reset(struct flexcop_device *fc) { flexcop_ibi_value v210, v204; /* reset the flexcop itself */ fc->write_ibi_reg(fc,ctrl_208,ibi_zero); v210.raw = 0; v210.sw_reset_210.reset_block_000 = 1; v210.sw_reset_210.reset_block_100 = 1; v210.sw_reset_210.reset_block_200 = 1; v210.sw_reset_210.reset_block_300 = 1; v210.sw_reset_210.reset_block_400 = 1; v210.sw_reset_210.reset_block_500 = 1; v210.sw_reset_210.reset_block_600 = 1; v210.sw_reset_210.reset_block_700 = 1; v210.sw_reset_210.Block_reset_enable = 0xb2; v210.sw_reset_210.Special_controls = 0xc259; fc->write_ibi_reg(fc,sw_reset_210,v210); msleep(1); /* reset the periphical devices */ v204 = fc->read_ibi_reg(fc,misc_204); v204.misc_204.Per_reset_sig = 0; fc->write_ibi_reg(fc,misc_204,v204); msleep(1); v204.misc_204.Per_reset_sig = 1; fc->write_ibi_reg(fc,misc_204,v204); } void flexcop_reset_block_300(struct flexcop_device *fc) { flexcop_ibi_value v208_save = fc->read_ibi_reg(fc, ctrl_208), v210 = fc->read_ibi_reg(fc, sw_reset_210); deb_rdump("208: %08x, 210: %08x\n", v208_save.raw, v210.raw); fc->write_ibi_reg(fc,ctrl_208,ibi_zero); v210.sw_reset_210.reset_block_300 = 1; v210.sw_reset_210.Block_reset_enable = 0xb2; fc->write_ibi_reg(fc,sw_reset_210,v210); fc->write_ibi_reg(fc,ctrl_208,v208_save); } struct flexcop_device *flexcop_device_kmalloc(size_t bus_specific_len) { void *bus; struct flexcop_device *fc = kzalloc(sizeof(struct flexcop_device), GFP_KERNEL); if (!fc) { err("no memory"); return NULL; } bus = kzalloc(bus_specific_len, GFP_KERNEL); if (!bus) { err("no memory"); kfree(fc); return NULL; } fc->bus_specific = bus; return fc; } EXPORT_SYMBOL(flexcop_device_kmalloc); void flexcop_device_kfree(struct flexcop_device *fc) { kfree(fc->bus_specific); kfree(fc); } EXPORT_SYMBOL(flexcop_device_kfree); int flexcop_device_initialize(struct flexcop_device *fc) { int ret; ibi_zero.raw = 0; flexcop_reset(fc); flexcop_determine_revision(fc); flexcop_sram_init(fc); flexcop_hw_filter_init(fc); flexcop_smc_ctrl(fc, 0); ret = flexcop_dvb_init(fc); if (ret) goto error; /* i2c has to be done before doing EEProm stuff - * because the EEProm is accessed via i2c */ ret = flexcop_i2c_init(fc); if (ret) goto error; /* do the MAC address reading after initializing the dvb_adapter */ if (fc->get_mac_addr(fc, 0) == 0) { u8 *b = fc->dvb_adapter.proposed_mac; info("MAC address = %pM", b); flexcop_set_mac_filter(fc,b); flexcop_mac_filter_ctrl(fc,1); } else warn("reading of MAC address failed.\n"); ret = flexcop_frontend_init(fc); if (ret) goto error; flexcop_device_name(fc,"initialization of","complete"); return 0; error: flexcop_device_exit(fc); return ret; } EXPORT_SYMBOL(flexcop_device_initialize); void flexcop_device_exit(struct flexcop_device *fc) { flexcop_frontend_exit(fc); flexcop_i2c_exit(fc); flexcop_dvb_exit(fc); } EXPORT_SYMBOL(flexcop_device_exit); static int flexcop_module_init(void) { info(DRIVER_NAME " loaded successfully"); return 0; } static void flexcop_module_cleanup(void) { info(DRIVER_NAME " unloaded successfully"); } module_init(flexcop_module_init); module_exit(flexcop_module_cleanup); MODULE_AUTHOR(DRIVER_AUTHOR); MODULE_DESCRIPTION(DRIVER_NAME); MODULE_LICENSE("GPL"); |
| 1499 119 1495 1072 1498 22 1511 24 24 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 | // SPDX-License-Identifier: GPL-2.0 OR MIT /* * Copyright (C) 2015-2019 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved. * * This is an implementation of the BLAKE2s hash and PRF functions. * * Information: https://blake2.net/ * */ #include <crypto/internal/blake2s.h> #include <linux/types.h> #include <linux/string.h> #include <linux/kernel.h> #include <linux/module.h> #include <linux/init.h> #include <linux/bug.h> static inline void blake2s_set_lastblock(struct blake2s_state *state) { state->f[0] = -1; } void blake2s_update(struct blake2s_state *state, const u8 *in, size_t inlen) { const size_t fill = BLAKE2S_BLOCK_SIZE - state->buflen; if (unlikely(!inlen)) return; if (inlen > fill) { memcpy(state->buf + state->buflen, in, fill); blake2s_compress(state, state->buf, 1, BLAKE2S_BLOCK_SIZE); state->buflen = 0; in += fill; inlen -= fill; } if (inlen > BLAKE2S_BLOCK_SIZE) { const size_t nblocks = DIV_ROUND_UP(inlen, BLAKE2S_BLOCK_SIZE); blake2s_compress(state, in, nblocks - 1, BLAKE2S_BLOCK_SIZE); in += BLAKE2S_BLOCK_SIZE * (nblocks - 1); inlen -= BLAKE2S_BLOCK_SIZE * (nblocks - 1); } memcpy(state->buf + state->buflen, in, inlen); state->buflen += inlen; } EXPORT_SYMBOL(blake2s_update); void blake2s_final(struct blake2s_state *state, u8 *out) { WARN_ON(IS_ENABLED(DEBUG) && !out); blake2s_set_lastblock(state); memset(state->buf + state->buflen, 0, BLAKE2S_BLOCK_SIZE - state->buflen); /* Padding */ blake2s_compress(state, state->buf, 1, state->buflen); cpu_to_le32_array(state->h, ARRAY_SIZE(state->h)); memcpy(out, state->h, state->outlen); memzero_explicit(state, sizeof(*state)); } EXPORT_SYMBOL(blake2s_final); static int __init blake2s_mod_init(void) { if (IS_ENABLED(CONFIG_CRYPTO_SELFTESTS) && WARN_ON(!blake2s_selftest())) return -ENODEV; return 0; } module_init(blake2s_mod_init); MODULE_DESCRIPTION("BLAKE2s hash function"); MODULE_AUTHOR("Jason A. Donenfeld <Jason@zx2c4.com>"); |
| 8 40 17 17 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080 1081 1082 1083 1084 1085 1086 1087 1088 1089 1090 1091 1092 1093 1094 1095 1096 1097 1098 1099 1100 1101 1102 1103 1104 1105 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125 1126 1127 1128 1129 1130 1131 1132 1133 1134 1135 1136 1137 1138 1139 1140 1141 1142 1143 1144 1145 1146 1147 1148 1149 1150 1151 1152 1153 1154 1155 1156 1157 1158 1159 1160 1161 1162 1163 1164 1165 1166 1167 1168 1169 1170 1171 1172 1173 1174 1175 1176 1177 1178 1179 1180 1181 1182 1183 1184 1185 1186 1187 1188 1189 1190 1191 1192 1193 1194 1195 1196 1197 1198 1199 1200 1201 1202 1203 1204 1205 1206 1207 1208 1209 1210 1211 1212 1213 1214 1215 1216 1217 1218 1219 1220 1221 1222 1223 1224 1225 1226 1227 1228 1229 1230 1231 1232 1233 1234 1235 1236 1237 1238 1239 1240 1241 1242 1243 1244 1245 1246 1247 1248 1249 1250 1251 1252 1253 1254 1255 1256 1257 1258 1259 1260 1261 1262 1263 1264 1265 1266 1267 1268 1269 1270 1271 1272 1273 1274 1275 1276 1277 1278 1279 1280 1281 1282 1283 1284 1285 1286 1287 1288 1289 1290 1291 1292 1293 1294 1295 1296 1297 1298 1299 1300 1301 1302 1303 1304 1305 1306 1307 1308 1309 1310 1311 1312 1313 1314 1315 1316 1317 1318 1319 1320 1321 1322 1323 1324 1325 1326 1327 1328 1329 1330 1331 1332 1333 1334 1335 1336 1337 1338 1339 1340 1341 1342 1343 1344 1345 1346 1347 1348 1349 1350 1351 1352 1353 1354 1355 1356 1357 1358 1359 1360 1361 1362 1363 1364 1365 1366 1367 1368 1369 1370 1371 1372 1373 1374 1375 1376 1377 1378 1379 1380 1381 1382 1383 1384 1385 1386 1387 1388 1389 1390 1391 1392 1393 1394 1395 1396 1397 1398 1399 1400 1401 1402 1403 1404 1405 1406 1407 1408 1409 1410 1411 1412 1413 1414 1415 1416 1417 1418 1419 1420 1421 1422 1423 1424 1425 1426 1427 1428 1429 1430 1431 1432 1433 1434 1435 1436 1437 1438 1439 1440 1441 1442 1443 1444 1445 1446 1447 1448 1449 1450 1451 1452 1453 1454 1455 1456 1457 1458 1459 1460 1461 1462 1463 1464 1465 1466 1467 1468 1469 1470 1471 1472 1473 1474 1475 1476 1477 1478 1479 1480 1481 1482 1483 1484 1485 1486 1487 1488 1489 1490 1491 1492 1493 1494 1495 1496 1497 1498 1499 1500 1501 1502 1503 1504 1505 1506 1507 1508 1509 1510 1511 1512 1513 1514 1515 1516 1517 1518 1519 1520 1521 1522 1523 1524 1525 1526 1527 1528 1529 1530 1531 1532 1533 1534 1535 1536 1537 1538 1539 1540 1541 1542 1543 1544 1545 1546 1547 1548 1549 1550 1551 1552 1553 1554 1555 1556 1557 1558 1559 1560 1561 1562 1563 1564 1565 1566 1567 1568 1569 1570 1571 1572 1573 1574 1575 1576 1577 1578 1579 1580 1581 1582 1583 1584 1585 1586 1587 1588 1589 1590 1591 1592 1593 1594 1595 1596 1597 1598 1599 1600 1601 1602 1603 1604 1605 1606 1607 1608 1609 1610 1611 1612 1613 1614 1615 1616 1617 1618 1619 1620 1621 1622 1623 1624 1625 1626 1627 1628 1629 1630 1631 1632 1633 1634 1635 1636 1637 1638 1639 1640 1641 1642 1643 1644 1645 1646 1647 1648 1649 1650 1651 1652 1653 1654 1655 1656 1657 1658 1659 1660 1661 1662 1663 1664 1665 1666 1667 1668 1669 1670 1671 1672 1673 1674 1675 1676 1677 1678 1679 1680 1681 1682 1683 1684 1685 1686 1687 1688 1689 1690 1691 1692 1693 1694 1695 1696 1697 1698 1699 1700 1701 1702 1703 1704 1705 1706 1707 1708 1709 1710 1711 1712 1713 1714 1715 1716 1717 1718 1719 1720 1721 1722 1723 1724 1725 1726 1727 1728 1729 1730 1731 1732 1733 1734 1735 1736 1737 1738 1739 1740 1741 1742 1743 1744 1745 1746 1747 1748 1749 1750 1751 1752 1753 1754 1755 1756 1757 1758 1759 1760 1761 1762 1763 1764 1765 1766 1767 1768 1769 1770 1771 1772 1773 1774 1775 1776 1777 1778 1779 1780 1781 1782 1783 1784 1785 1786 1787 1788 1789 1790 1791 1792 1793 1794 1795 1796 1797 1798 1799 1800 1801 1802 1803 1804 1805 1806 1807 1808 1809 1810 1811 1812 1813 1814 1815 1816 1817 1818 1819 1820 1821 1822 1823 1824 1825 1826 1827 1828 1829 1830 1831 1832 1833 1834 1835 1836 1837 1838 1839 1840 1841 1842 1843 1844 1845 1846 1847 1848 1849 1850 1851 1852 1853 1854 1855 1856 1857 1858 1859 1860 1861 1862 1863 1864 1865 1866 1867 1868 1869 1870 1871 1872 1873 1874 1875 1876 1877 1878 1879 1880 1881 1882 1883 1884 1885 1886 1887 1888 1889 1890 1891 1892 1893 1894 1895 1896 1897 1898 1899 1900 1901 1902 1903 1904 1905 1906 1907 1908 1909 1910 1911 1912 1913 1914 1915 1916 1917 1918 1919 1920 1921 1922 1923 1924 1925 1926 1927 1928 1929 1930 1931 1932 1933 1934 1935 1936 1937 1938 1939 1940 1941 1942 1943 1944 1945 1946 | /* SPDX-License-Identifier: GPL-2.0-or-later */ /* internal AFS stuff * * Copyright (C) 2002, 2007 Red Hat, Inc. All Rights Reserved. * Written by David Howells (dhowells@redhat.com) */ #include <linux/compiler.h> #include <linux/kernel.h> #include <linux/ktime.h> #include <linux/fs.h> #include <linux/filelock.h> #include <linux/pagemap.h> #include <linux/rxrpc.h> #include <linux/key.h> #include <linux/workqueue.h> #include <linux/sched.h> #include <linux/fscache.h> #include <linux/backing-dev.h> #include <linux/uuid.h> #include <linux/mm_types.h> #include <linux/dns_resolver.h> #include <crypto/krb5.h> #include <net/net_namespace.h> #include <net/netns/generic.h> #include <net/sock.h> #include <net/af_rxrpc.h> #include "afs.h" #include "afs_vl.h" #define AFS_CELL_MAX_ADDRS 15 struct pagevec; struct afs_call; struct afs_vnode; struct afs_server_probe; /* * Partial file-locking emulation mode. (The problem being that AFS3 only * allows whole-file locks and no upgrading/downgrading). */ enum afs_flock_mode { afs_flock_mode_unset, afs_flock_mode_local, /* Local locking only */ afs_flock_mode_openafs, /* Don't get server lock for a partial lock */ afs_flock_mode_strict, /* Always get a server lock for a partial lock */ afs_flock_mode_write, /* Get an exclusive server lock for a partial lock */ }; struct afs_fs_context { bool force; /* T to force cell type */ bool autocell; /* T if set auto mount operation */ bool dyn_root; /* T if dynamic root */ bool no_cell; /* T if the source is "none" (for dynroot) */ enum afs_flock_mode flock_mode; /* Partial file-locking emulation mode */ afs_voltype_t type; /* type of volume requested */ unsigned int volnamesz; /* size of volume name */ const char *volname; /* name of volume to mount */ struct afs_net *net; /* the AFS net namespace stuff */ struct afs_cell *cell; /* cell in which to find volume */ struct afs_volume *volume; /* volume record */ struct key *key; /* key to use for secure mounting */ }; enum afs_call_state { AFS_CALL_CL_REQUESTING, /* Client: Request is being sent */ AFS_CALL_CL_AWAIT_REPLY, /* Client: Awaiting reply */ AFS_CALL_CL_PROC_REPLY, /* Client: rxrpc call complete; processing reply */ AFS_CALL_SV_AWAIT_OP_ID, /* Server: Awaiting op ID */ AFS_CALL_SV_AWAIT_REQUEST, /* Server: Awaiting request data */ AFS_CALL_SV_REPLYING, /* Server: Replying */ AFS_CALL_SV_AWAIT_ACK, /* Server: Awaiting final ACK */ AFS_CALL_COMPLETE, /* Completed or failed */ }; /* * Address preferences. */ struct afs_addr_preference { union { struct in_addr ipv4_addr; /* AF_INET address to compare against */ struct in6_addr ipv6_addr; /* AF_INET6 address to compare against */ }; sa_family_t family; /* Which address to use */ u16 prio; /* Priority */ u8 subnet_mask; /* How many bits to compare */ }; struct afs_addr_preference_list { struct rcu_head rcu; u16 version; /* Incremented when prefs list changes */ u8 ipv6_off; /* Offset of IPv6 addresses */ u8 nr; /* Number of addresses in total */ u8 max_prefs; /* Number of prefs allocated */ struct afs_addr_preference prefs[] __counted_by(max_prefs); }; struct afs_address { struct rxrpc_peer *peer; short last_error; /* Last error from this address */ u16 prio; /* Address priority */ }; /* * List of server addresses. */ struct afs_addr_list { struct rcu_head rcu; refcount_t usage; u32 version; /* Version */ unsigned int debug_id; unsigned int addr_pref_version; /* Version of address preference list */ unsigned char max_addrs; unsigned char nr_addrs; unsigned char preferred; /* Preferred address */ unsigned char nr_ipv4; /* Number of IPv4 addresses */ enum dns_record_source source:8; enum dns_lookup_status status:8; unsigned long probe_failed; /* Mask of addrs that failed locally/ICMP */ unsigned long responded; /* Mask of addrs that responded */ struct afs_address addrs[] __counted_by(max_addrs); #define AFS_MAX_ADDRESSES ((unsigned int)(sizeof(unsigned long) * 8)) }; /* * a record of an in-progress RxRPC call */ struct afs_call { const struct afs_call_type *type; /* type of call */ wait_queue_head_t waitq; /* processes awaiting completion */ struct work_struct async_work; /* async I/O processor */ struct work_struct work; /* actual work processor */ struct work_struct free_work; /* Deferred free processor */ struct rxrpc_call *rxcall; /* RxRPC call handle */ struct rxrpc_peer *peer; /* Remote endpoint */ struct key *key; /* security for this call */ struct afs_net *net; /* The network namespace */ struct afs_server *server; /* The fileserver record if fs op (pins ref) */ struct afs_vlserver *vlserver; /* The vlserver record if vl op */ void *request; /* request data (first part) */ size_t iov_len; /* Size of *iter to be used */ struct iov_iter def_iter; /* Default buffer/data iterator */ struct iov_iter *write_iter; /* Iterator defining write to be made */ struct iov_iter *iter; /* Iterator currently in use */ union { /* Convenience for ->def_iter */ struct kvec kvec[1]; struct bio_vec bvec[1]; }; void *buffer; /* reply receive buffer */ union { struct afs_endpoint_state *probe; struct afs_addr_list *vl_probe; struct afs_addr_list *ret_alist; struct afs_vldb_entry *ret_vldb; char *ret_str; }; struct afs_fid fid; /* Primary vnode ID (or all zeroes) */ unsigned char probe_index; /* Address in ->probe_alist */ struct afs_operation *op; unsigned int server_index; refcount_t ref; enum afs_call_state state; spinlock_t state_lock; int error; /* error code */ u32 abort_code; /* Remote abort ID or 0 */ unsigned long long remaining; /* How much is left to receive */ unsigned int max_lifespan; /* Maximum lifespan in secs to set if not 0 */ unsigned request_size; /* size of request data */ unsigned reply_max; /* maximum size of reply */ unsigned count2; /* count used in unmarshalling */ unsigned char unmarshall; /* unmarshalling phase */ bool drop_ref; /* T if need to drop ref for incoming call */ bool need_attention; /* T if RxRPC poked us */ bool async; /* T if asynchronous */ bool upgrade; /* T to request service upgrade */ bool intr; /* T if interruptible */ bool unmarshalling_error; /* T if an unmarshalling error occurred */ bool responded; /* Got a response from the call (may be abort) */ u8 security_ix; /* Security class */ u16 service_id; /* Actual service ID (after upgrade) */ unsigned int debug_id; /* Trace ID */ u32 enctype; /* Security encoding type */ u32 operation_ID; /* operation ID for an incoming call */ u32 count; /* count for use in unmarshalling */ union { /* place to extract temporary data */ struct { __be32 tmp_u; __be32 tmp; } __attribute__((packed)); __be64 tmp64; }; ktime_t issue_time; /* Time of issue of operation */ }; struct afs_call_type { const char *name; unsigned int op; /* Really enum afs_fs_operation */ /* deliver request or reply data to an call * - returning an error will cause the call to be aborted */ int (*deliver)(struct afs_call *call); /* clean up a call */ void (*destructor)(struct afs_call *call); /* Async receive processing function */ void (*async_rx)(struct work_struct *work); /* Work function */ void (*work)(struct work_struct *work); /* Call done function (gets called immediately on success or failure) */ void (*done)(struct afs_call *call); /* Handle a call being immediately cancelled. */ void (*immediate_cancel)(struct afs_call *call); }; /* * Key available for writeback on a file. */ struct afs_wb_key { refcount_t usage; struct key *key; struct list_head vnode_link; /* Link in vnode->wb_keys */ }; /* * AFS open file information record. Pointed to by file->private_data. */ struct afs_file { struct key *key; /* The key this file was opened with */ struct afs_wb_key *wb; /* Writeback key record for this file */ }; static inline struct key *afs_file_key(struct file *file) { struct afs_file *af = file->private_data; return af->key; } /* * AFS superblock private data * - there's one superblock per volume */ struct afs_super_info { struct net *net_ns; /* Network namespace */ struct afs_cell *cell; /* The cell in which the volume resides */ struct afs_volume *volume; /* volume record */ enum afs_flock_mode flock_mode:8; /* File locking emulation mode */ bool dyn_root; /* True if dynamic root */ }; static inline struct afs_super_info *AFS_FS_S(struct super_block *sb) { return sb->s_fs_info; } extern struct file_system_type afs_fs_type; /* * Set of substitutes for @sys. */ struct afs_sysnames { #define AFS_NR_SYSNAME 16 char *subs[AFS_NR_SYSNAME]; refcount_t usage; unsigned short nr; char blank[1]; }; /* * AFS network namespace record. */ struct afs_net { struct net *net; /* Backpointer to the owning net namespace */ struct afs_uuid uuid; bool live; /* F if this namespace is being removed */ /* AF_RXRPC I/O stuff */ struct socket *socket; struct afs_call *spare_incoming_call; struct work_struct charge_preallocation_work; struct work_struct rx_oob_work; struct mutex socket_mutex; atomic_t nr_outstanding_calls; atomic_t nr_superblocks; /* Cell database */ struct rb_root cells; struct idr cells_dyn_ino; /* cell->dynroot_ino mapping */ struct afs_cell __rcu *ws_cell; atomic_t cells_outstanding; struct rw_semaphore cells_lock; struct mutex cells_alias_lock; struct mutex proc_cells_lock; struct hlist_head proc_cells; /* Known servers. Theoretically each fileserver can only be in one * cell, but in practice, people create aliases and subsets and there's * no easy way to distinguish them. */ seqlock_t fs_lock; /* For fs_probe_*, fs_proc */ struct list_head fs_probe_fast; /* List of afs_server to probe at 30s intervals */ struct list_head fs_probe_slow; /* List of afs_server to probe at 5m intervals */ struct hlist_head fs_proc; /* procfs servers list */ struct key *fs_cm_token_key; /* Key for creating CM tokens */ struct work_struct fs_prober; struct timer_list fs_probe_timer; atomic_t servers_outstanding; /* File locking renewal management */ struct mutex lock_manager_mutex; /* Misc */ struct super_block *dynroot_sb; /* Dynamic root mount superblock */ struct proc_dir_entry *proc_afs; /* /proc/net/afs directory */ struct afs_sysnames *sysnames; rwlock_t sysnames_lock; struct afs_addr_preference_list __rcu *address_prefs; u16 address_pref_version; /* Statistics counters */ atomic_t n_lookup; /* Number of lookups done */ atomic_t n_reval; /* Number of dentries needing revalidation */ atomic_t n_inval; /* Number of invalidations by the server */ atomic_t n_relpg; /* Number of invalidations by release_folio */ atomic_t n_read_dir; /* Number of directory pages read */ atomic_t n_dir_cr; /* Number of directory entry creation edits */ atomic_t n_dir_rm; /* Number of directory entry removal edits */ atomic_t n_stores; /* Number of store ops */ atomic_long_t n_store_bytes; /* Number of bytes stored */ atomic_long_t n_fetch_bytes; /* Number of bytes fetched */ atomic_t n_fetches; /* Number of data fetch ops */ }; extern const char afs_init_sysname[]; enum afs_cell_state { AFS_CELL_SETTING_UP, AFS_CELL_ACTIVE, AFS_CELL_REMOVING, AFS_CELL_DEAD, }; /* * AFS cell record. * * This is a tricky concept to get right as it is possible to create aliases * simply by pointing AFSDB/SRV records for two names at the same set of VL * servers; it is also possible to do things like setting up two sets of VL * servers, one of which provides a superset of the volumes provided by the * other (for internal/external division, for example). * * Cells only exist in the sense that (a) a cell's name maps to a set of VL * servers and (b) a cell's name is used by the client to select the key to use * for authentication and encryption. The cell name is not typically used in * the protocol. * * Two cells are determined to be aliases if they have an explicit alias (YFS * only), share any VL servers in common or have at least one volume in common. * "In common" means that the address list of the VL servers or the fileservers * share at least one endpoint. */ struct afs_cell { union { struct rcu_head rcu; struct rb_node net_node; /* Node in net->cells */ }; struct afs_net *net; struct afs_cell *alias_of; /* The cell this is an alias of */ struct afs_volume *root_volume; /* The root.cell volume if there is one */ struct key *anonymous_key; /* anonymous user key for this cell */ struct work_struct destroyer; /* Destroyer for cell */ struct work_struct manager; /* Manager for init/deinit/dns */ struct timer_list management_timer; /* General management timer */ struct hlist_node proc_link; /* /proc cell list link */ time64_t dns_expiry; /* Time AFSDB/SRV record expires */ time64_t last_inactive; /* Time of last drop of usage count */ refcount_t ref; /* Struct refcount */ atomic_t active; /* Active usage counter */ unsigned long flags; #define AFS_CELL_FL_NO_GC 0 /* The cell was added manually, don't auto-gc */ #define AFS_CELL_FL_DO_LOOKUP 1 /* DNS lookup requested */ #define AFS_CELL_FL_CHECK_ALIAS 2 /* Need to check for aliases */ enum afs_cell_state state; short error; enum dns_record_source dns_source:8; /* Latest source of data from lookup */ enum dns_lookup_status dns_status:8; /* Latest status of data from lookup */ unsigned int dns_lookup_count; /* Counter of DNS lookups */ unsigned int debug_id; unsigned int dynroot_ino; /* Inode numbers for dynroot (a pair) */ /* The volumes belonging to this cell */ struct rw_semaphore vs_lock; /* Lock for server->volumes */ struct rb_root volumes; /* Tree of volumes on this server */ struct hlist_head proc_volumes; /* procfs volume list */ seqlock_t volume_lock; /* For volumes */ /* Active fileserver interaction state. */ struct rb_root fs_servers; /* afs_server (by server UUID) */ struct rw_semaphore fs_lock; /* For fs_servers */ /* VL server list. */ rwlock_t vl_servers_lock; /* Lock on vl_servers */ struct afs_vlserver_list __rcu *vl_servers; u8 name_len; /* Length of name */ char *name; /* Cell name, case-flattened and NUL-padded */ }; /* * Volume Location server record. */ struct afs_vlserver { struct rcu_head rcu; struct afs_addr_list __rcu *addresses; /* List of addresses for this VL server */ unsigned long flags; #define AFS_VLSERVER_FL_PROBED 0 /* The VL server has been probed */ #define AFS_VLSERVER_FL_PROBING 1 /* VL server is being probed */ #define AFS_VLSERVER_FL_IS_YFS 2 /* Server is YFS not AFS */ #define AFS_VLSERVER_FL_RESPONDING 3 /* VL server is responding */ rwlock_t lock; /* Lock on addresses */ refcount_t ref; unsigned int rtt; /* Server's current RTT in uS */ unsigned int debug_id; /* Probe state */ wait_queue_head_t probe_wq; atomic_t probe_outstanding; spinlock_t probe_lock; struct { unsigned int rtt; /* Best RTT in uS (or UINT_MAX) */ u32 abort_code; short error; unsigned short flags; #define AFS_VLSERVER_PROBE_RESPONDED 0x01 /* At least once response (may be abort) */ #define AFS_VLSERVER_PROBE_IS_YFS 0x02 /* The peer appears to be YFS */ #define AFS_VLSERVER_PROBE_NOT_YFS 0x04 /* The peer appears not to be YFS */ #define AFS_VLSERVER_PROBE_LOCAL_FAILURE 0x08 /* A local failure prevented a probe */ } probe; u16 service_id; /* Service ID we're using */ u16 port; u16 name_len; /* Length of name */ char name[]; /* Server name, case-flattened */ }; /* * Weighted list of Volume Location servers. */ struct afs_vlserver_entry { u16 priority; /* Preference (as SRV) */ u16 weight; /* Weight (as SRV) */ enum dns_record_source source:8; enum dns_lookup_status status:8; struct afs_vlserver *server; }; struct afs_vlserver_list { struct rcu_head rcu; refcount_t ref; u8 nr_servers; u8 index; /* Server currently in use */ u8 preferred; /* Preferred server */ enum dns_record_source source:8; enum dns_lookup_status status:8; rwlock_t lock; struct afs_vlserver_entry servers[]; }; /* * Cached VLDB entry. * * This is pointed to by cell->vldb_entries, indexed by name. */ struct afs_vldb_entry { afs_volid_t vid[3]; /* Volume IDs for R/W, R/O and Bak volumes */ unsigned long flags; #define AFS_VLDB_HAS_RW 0 /* - R/W volume exists */ #define AFS_VLDB_HAS_RO 1 /* - R/O volume exists */ #define AFS_VLDB_HAS_BAK 2 /* - Backup volume exists */ #define AFS_VLDB_QUERY_VALID 3 /* - Record is valid */ #define AFS_VLDB_QUERY_ERROR 4 /* - VL server returned error */ uuid_t fs_server[AFS_NMAXNSERVERS]; u32 addr_version[AFS_NMAXNSERVERS]; /* Registration change counters */ u8 fs_mask[AFS_NMAXNSERVERS]; #define AFS_VOL_VTM_RW 0x01 /* R/W version of the volume is available (on this server) */ #define AFS_VOL_VTM_RO 0x02 /* R/O version of the volume is available (on this server) */ #define AFS_VOL_VTM_BAK 0x04 /* backup version of the volume is available (on this server) */ u8 vlsf_flags[AFS_NMAXNSERVERS]; short error; u8 nr_servers; /* Number of server records */ u8 name_len; u8 name[AFS_MAXVOLNAME + 1]; /* NUL-padded volume name */ }; /* * Fileserver endpoint state. The records the addresses of a fileserver's * endpoints and the state and result of a round of probing on them. This * allows the rotation algorithm to access those results without them being * erased by a subsequent round of probing. */ struct afs_endpoint_state { struct rcu_head rcu; struct afs_addr_list *addresses; /* The addresses being probed */ unsigned long responsive_set; /* Bitset of responsive endpoints */ unsigned long failed_set; /* Bitset of endpoints we failed to probe */ refcount_t ref; unsigned int server_id; /* Debug ID of server */ unsigned int probe_seq; /* Probe sequence (from server::probe_counter) */ atomic_t nr_probing; /* Number of outstanding probes */ unsigned int rtt; /* Best RTT in uS (or UINT_MAX) */ s32 abort_code; short error; unsigned long flags; #define AFS_ESTATE_RESPONDED 0 /* Set if the server responded */ #define AFS_ESTATE_SUPERSEDED 1 /* Set if this record has been superseded */ #define AFS_ESTATE_IS_YFS 2 /* Set if probe upgraded to YFS */ #define AFS_ESTATE_NOT_YFS 3 /* Set if probe didn't upgrade to YFS */ #define AFS_ESTATE_LOCAL_FAILURE 4 /* Set if there was a local failure (eg. ENOMEM) */ }; /* * Record of fileserver with which we're actively communicating. */ struct afs_server { struct rcu_head rcu; union { uuid_t uuid; /* Server ID */ struct afs_uuid _uuid; }; struct afs_cell *cell; /* Cell to which belongs (pins ref) */ struct rb_node uuid_rb; /* Link in cell->fs_servers */ struct list_head probe_link; /* Link in net->fs_probe_* */ struct hlist_node proc_link; /* Link in net->fs_proc */ struct list_head volumes; /* RCU list of afs_server_entry objects */ struct work_struct destroyer; /* Work item to try and destroy a server */ struct timer_list timer; /* Management timer */ struct mutex cm_token_lock; /* Lock governing creation of appdata */ struct krb5_buffer cm_rxgk_appdata; /* Appdata to be included in RESPONSE packet */ time64_t unuse_time; /* Time at which last unused */ unsigned long flags; #define AFS_SERVER_FL_RESPONDING 0 /* The server is responding */ #define AFS_SERVER_FL_UPDATING 1 #define AFS_SERVER_FL_NEEDS_UPDATE 2 /* Fileserver address list is out of date */ #define AFS_SERVER_FL_UNCREATED 3 /* The record needs creating */ #define AFS_SERVER_FL_CREATING 4 /* The record is being created */ #define AFS_SERVER_FL_EXPIRED 5 /* The record has expired */ #define AFS_SERVER_FL_NOT_FOUND 6 /* VL server says no such server */ #define AFS_SERVER_FL_VL_FAIL 7 /* Failed to access VL server */ #define AFS_SERVER_FL_MAY_HAVE_CB 8 /* May have callbacks on this fileserver */ #define AFS_SERVER_FL_IS_YFS 16 /* Server is YFS not AFS */ #define AFS_SERVER_FL_NO_IBULK 17 /* Fileserver doesn't support FS.InlineBulkStatus */ #define AFS_SERVER_FL_NO_RM2 18 /* Fileserver doesn't support YFS.RemoveFile2 */ #define AFS_SERVER_FL_HAS_FS64 19 /* Fileserver supports FS.{Fetch,Store}Data64 */ refcount_t ref; /* Object refcount */ atomic_t active; /* Active user count */ u32 addr_version; /* Address list version */ u16 service_id; /* Service ID we're using. */ short create_error; /* Creation error */ unsigned int rtt; /* Server's current RTT in uS */ unsigned int debug_id; /* Debugging ID for traces */ /* file service access */ rwlock_t fs_lock; /* access lock */ /* Probe state */ struct afs_endpoint_state __rcu *endpoint_state; /* Latest endpoint/probe state */ unsigned long probed_at; /* Time last probe was dispatched (jiffies) */ wait_queue_head_t probe_wq; unsigned int probe_counter; /* Number of probes issued */ spinlock_t probe_lock; }; enum afs_ro_replicating { AFS_RO_NOT_REPLICATING, /* Not doing replication */ AFS_RO_REPLICATING_USE_OLD, /* Replicating; use old version */ AFS_RO_REPLICATING_USE_NEW, /* Replicating; switch to new version */ } __mode(byte); /* * Replaceable volume server list. */ struct afs_server_entry { struct afs_server *server; struct afs_volume *volume; struct list_head slink; /* Link in server->volumes */ time64_t cb_expires_at; /* Time at which volume-level callback expires */ unsigned long flags; #define AFS_SE_EXCLUDED 0 /* Set if server is to be excluded in rotation */ #define AFS_SE_VOLUME_OFFLINE 1 /* Set if volume offline notice given */ #define AFS_SE_VOLUME_BUSY 2 /* Set if volume busy notice given */ }; struct afs_server_list { struct rcu_head rcu; refcount_t usage; bool attached; /* T if attached to servers */ enum afs_ro_replicating ro_replicating; /* RW->RO update (probably) in progress */ unsigned char nr_servers; unsigned short vnovol_mask; /* Servers to be skipped due to VNOVOL */ unsigned int seq; /* Set to ->servers_seq when installed */ rwlock_t lock; struct afs_server_entry servers[]; }; /* * Live AFS volume management. */ struct afs_volume { struct rcu_head rcu; afs_volid_t vid; /* The volume ID of this volume */ afs_volid_t vids[AFS_MAXTYPES]; /* All associated volume IDs */ refcount_t ref; unsigned int debug_id; /* Debugging ID for traces */ time64_t update_at; /* Time at which to next update */ struct afs_cell *cell; /* Cell to which belongs (pins ref) */ struct rb_node cell_node; /* Link in cell->volumes */ struct hlist_node proc_link; /* Link in cell->proc_volumes */ struct super_block __rcu *sb; /* Superblock on which inodes reside */ struct work_struct destructor; /* Deferred destructor */ unsigned long flags; #define AFS_VOLUME_NEEDS_UPDATE 0 /* - T if an update needs performing */ #define AFS_VOLUME_UPDATING 1 /* - T if an update is in progress */ #define AFS_VOLUME_WAIT 2 /* - T if users must wait for update */ #define AFS_VOLUME_DELETED 3 /* - T if volume appears deleted */ #define AFS_VOLUME_MAYBE_NO_IBULK 4 /* - T if some servers don't have InlineBulkStatus */ #define AFS_VOLUME_RM_TREE 5 /* - Set if volume removed from cell->volumes */ #ifdef CONFIG_AFS_FSCACHE struct fscache_volume *cache; /* Caching cookie */ #endif struct afs_server_list __rcu *servers; /* List of servers on which volume resides */ rwlock_t servers_lock; /* Lock for ->servers */ unsigned int servers_seq; /* Incremented each time ->servers changes */ /* RO release tracking */ struct mutex volsync_lock; /* Time/state evaluation lock */ time64_t creation_time; /* Volume creation time (or TIME64_MIN) */ time64_t update_time; /* Volume update time (or TIME64_MIN) */ /* Callback management */ struct mutex cb_check_lock; /* Lock to control race to check after v_break */ time64_t cb_expires_at; /* Earliest volume callback expiry time */ atomic_t cb_ro_snapshot; /* RO volume update-from-snapshot counter */ atomic_t cb_v_break; /* Volume-break event counter. */ atomic_t cb_v_check; /* Volume-break has-been-checked counter. */ atomic_t cb_scrub; /* Scrub-all-data event counter. */ rwlock_t cb_v_break_lock; struct rw_semaphore open_mmaps_lock; struct list_head open_mmaps; /* List of vnodes that are mmapped */ afs_voltype_t type; /* type of volume */ char type_force; /* force volume type (suppress R/O -> R/W) */ u8 name_len; u8 name[AFS_MAXVOLNAME + 1]; /* NUL-padded volume name */ }; enum afs_lock_state { AFS_VNODE_LOCK_NONE, /* The vnode has no lock on the server */ AFS_VNODE_LOCK_WAITING_FOR_CB, /* We're waiting for the server to break the callback */ AFS_VNODE_LOCK_SETTING, /* We're asking the server for a lock */ AFS_VNODE_LOCK_GRANTED, /* We have a lock on the server */ AFS_VNODE_LOCK_EXTENDING, /* We're extending a lock on the server */ AFS_VNODE_LOCK_NEED_UNLOCK, /* We need to unlock on the server */ AFS_VNODE_LOCK_UNLOCKING, /* We're telling the server to unlock */ AFS_VNODE_LOCK_DELETED, /* The vnode has been deleted whilst we have a lock */ }; /* * AFS inode private data. * * Note that afs_alloc_inode() *must* reset anything that could incorrectly * leak from one inode to another. */ struct afs_vnode { struct netfs_inode netfs; /* Netfslib context and vfs inode */ struct afs_volume *volume; /* volume on which vnode resides */ struct afs_fid fid; /* the file identifier for this inode */ struct afs_file_status status; /* AFS status info for this file */ afs_dataversion_t invalid_before; /* Child dentries are invalid before this */ struct afs_permits __rcu *permit_cache; /* cache of permits so far obtained */ struct list_head io_lock_waiters; /* Threads waiting for the I/O lock */ struct rw_semaphore validate_lock; /* lock for validating this vnode */ struct rw_semaphore rmdir_lock; /* Lock for rmdir vs sillyrename */ struct key *silly_key; /* Silly rename key */ spinlock_t wb_lock; /* lock for wb_keys */ spinlock_t lock; /* waitqueue/flags lock */ unsigned long flags; #define AFS_VNODE_IO_LOCK 0 /* Set if the I/O serialisation lock is held */ #define AFS_VNODE_UNSET 1 /* set if vnode attributes not yet set */ #define AFS_VNODE_DIR_VALID 2 /* Set if dir contents are valid */ #define AFS_VNODE_ZAP_DATA 3 /* set if vnode's data should be invalidated */ #define AFS_VNODE_DELETED 4 /* set if vnode deleted on server */ #define AFS_VNODE_MOUNTPOINT 5 /* set if vnode is a mountpoint symlink */ #define AFS_VNODE_PSEUDODIR 7 /* set if Vnode is a pseudo directory */ #define AFS_VNODE_NEW_CONTENT 8 /* Set if file has new content (create/trunc-0) */ #define AFS_VNODE_SILLY_DELETED 9 /* Set if file has been silly-deleted */ #define AFS_VNODE_MODIFYING 10 /* Set if we're performing a modification op */ #define AFS_VNODE_DIR_READ 11 /* Set if we've read a dir's contents */ struct folio_queue *directory; /* Directory contents */ struct list_head wb_keys; /* List of keys available for writeback */ struct list_head pending_locks; /* locks waiting to be granted */ struct list_head granted_locks; /* locks granted on this file */ struct delayed_work lock_work; /* work to be done in locking */ struct key *lock_key; /* Key to be used in lock ops */ ktime_t locked_at; /* Time at which lock obtained */ enum afs_lock_state lock_state : 8; afs_lock_type_t lock_type : 8; unsigned int directory_size; /* Amount of space in ->directory */ /* outstanding callback notification on this file */ struct work_struct cb_work; /* Work for mmap'd files */ struct list_head cb_mmap_link; /* Link in cell->fs_open_mmaps */ void *cb_server; /* Server with callback/filelock */ atomic_t cb_nr_mmap; /* Number of mmaps */ unsigned int cb_ro_snapshot; /* RO volume release counter on ->volume */ unsigned int cb_scrub; /* Scrub counter on ->volume */ unsigned int cb_break; /* Break counter on vnode */ unsigned int cb_v_check; /* Break check counter on ->volume */ seqlock_t cb_lock; /* Lock for ->cb_server, ->status, ->cb_*break */ atomic64_t cb_expires_at; /* time at which callback expires */ #define AFS_NO_CB_PROMISE TIME64_MIN }; static inline struct fscache_cookie *afs_vnode_cache(struct afs_vnode *vnode) { #ifdef CONFIG_AFS_FSCACHE return netfs_i_cookie(&vnode->netfs); #else return NULL; #endif } static inline void afs_vnode_set_cache(struct afs_vnode *vnode, struct fscache_cookie *cookie) { #ifdef CONFIG_AFS_FSCACHE vnode->netfs.cache = cookie; if (cookie) mapping_set_release_always(vnode->netfs.inode.i_mapping); #endif } /* * cached security record for one user's attempt to access a vnode */ struct afs_permit { struct key *key; /* RxRPC ticket holding a security context */ afs_access_t access; /* CallerAccess value for this key */ }; /* * Immutable cache of CallerAccess records from attempts to access vnodes. * These may be shared between multiple vnodes. */ struct afs_permits { struct rcu_head rcu; struct hlist_node hash_node; /* Link in hash */ unsigned long h; /* Hash value for this permit list */ refcount_t usage; unsigned short nr_permits; /* Number of records */ bool invalidated; /* Invalidated due to key change */ struct afs_permit permits[] __counted_by(nr_permits); /* List of permits sorted by key pointer */ }; /* * Error prioritisation and accumulation. */ struct afs_error { s32 abort_code; /* Cumulative abort code */ short error; /* Cumulative error */ bool responded; /* T if server responded */ bool aborted; /* T if ->error is from an abort */ }; /* * Cursor for iterating over a set of volume location servers. */ struct afs_vl_cursor { struct afs_cell *cell; /* The cell we're querying */ struct afs_vlserver_list *server_list; /* Current server list (pins ref) */ struct afs_vlserver *server; /* Server on which this resides */ struct afs_addr_list *alist; /* Current address list (pins ref) */ struct key *key; /* Key for the server */ unsigned long untried_servers; /* Bitmask of untried servers */ unsigned long addr_tried; /* Tried addresses */ struct afs_error cumul_error; /* Cumulative error */ unsigned int debug_id; s32 call_abort_code; short call_error; /* Error from single call */ short server_index; /* Current server */ signed char addr_index; /* Current address */ unsigned short flags; #define AFS_VL_CURSOR_STOP 0x0001 /* Set to cease iteration */ #define AFS_VL_CURSOR_RETRY 0x0002 /* Set to do a retry */ #define AFS_VL_CURSOR_RETRIED 0x0004 /* Set if started a retry */ short nr_iterations; /* Number of server iterations */ bool call_responded; /* T if the current address responded */ }; /* * Fileserver state tracking for an operation. An array of these is kept, * indexed by server index. */ struct afs_server_state { /* Tracking of fileserver probe state. Other operations may interfere * by probing a fileserver when accessing other volumes. */ unsigned int probe_seq; unsigned long untried_addrs; /* Addresses we haven't tried yet */ struct wait_queue_entry probe_waiter; struct afs_endpoint_state *endpoint_state; /* Endpoint state being monitored */ }; /* * Fileserver operation methods. */ struct afs_operation_ops { void (*issue_afs_rpc)(struct afs_operation *op); void (*issue_yfs_rpc)(struct afs_operation *op); void (*success)(struct afs_operation *op); void (*aborted)(struct afs_operation *op); void (*failed)(struct afs_operation *op); void (*edit_dir)(struct afs_operation *op); void (*put)(struct afs_operation *op); }; struct afs_vnode_param { struct afs_vnode *vnode; struct afs_fid fid; /* Fid to access */ struct afs_status_cb scb; /* Returned status and callback promise */ afs_dataversion_t dv_before; /* Data version before the call */ unsigned int cb_break_before; /* cb_break before the call */ u8 dv_delta; /* Expected change in data version */ bool put_vnode:1; /* T if we have a ref on the vnode */ bool need_io_lock:1; /* T if we need the I/O lock on this */ bool update_ctime:1; /* Need to update the ctime */ bool set_size:1; /* Must update i_size */ bool op_unlinked:1; /* True if file was unlinked by op */ bool speculative:1; /* T if speculative status fetch (no vnode lock) */ bool modification:1; /* Set if the content gets modified */ }; /* * Fileserver operation wrapper, handling server and address rotation * asynchronously. May make simultaneous calls to multiple servers. */ struct afs_operation { struct afs_net *net; /* Network namespace */ struct key *key; /* Key for the cell */ const struct afs_call_type *type; /* Type of call done */ const struct afs_operation_ops *ops; /* Parameters/results for the operation */ struct afs_volume *volume; /* Volume being accessed */ struct afs_vnode_param file[2]; struct afs_vnode_param *more_files; struct afs_volsync pre_volsync; /* Volsync before op */ struct afs_volsync volsync; /* Volsync returned by op */ struct dentry *dentry; /* Dentry to be altered */ struct dentry *dentry_2; /* Second dentry to be altered */ struct timespec64 mtime; /* Modification time to record */ struct timespec64 ctime; /* Change time to set */ struct afs_error cumul_error; /* Cumulative error */ short nr_files; /* Number of entries in file[], more_files */ unsigned int debug_id; unsigned int cb_v_break; /* Volume break counter before op */ union { struct { int which; /* Which ->file[] to fetch for */ } fetch_status; struct { int reason; /* enum afs_edit_dir_reason */ mode_t mode; const char *symlink; } create; struct { bool need_rehash; } unlink; struct { struct dentry *rehash; struct dentry *tmp; bool new_negative; } rename; struct { struct netfs_io_subrequest *subreq; } fetch; struct { afs_lock_type_t type; } lock; struct { struct iov_iter *write_iter; loff_t pos; loff_t size; loff_t i_size; } store; struct { struct iattr *attr; loff_t old_i_size; } setattr; struct afs_acl *acl; struct yfs_acl *yacl; struct { struct afs_volume_status vs; struct kstatfs *buf; } volstatus; }; /* Fileserver iteration state */ struct afs_server_list *server_list; /* Current server list (pins ref) */ struct afs_server *server; /* Server we're using (ref pinned by server_list) */ struct afs_endpoint_state *estate; /* Current endpoint state (doesn't pin ref) */ struct afs_server_state *server_states; /* States of the servers involved */ struct afs_call *call; unsigned long untried_servers; /* Bitmask of untried servers */ unsigned long addr_tried; /* Tried addresses */ s32 call_abort_code; /* Abort code from single call */ short call_error; /* Error from single call */ short server_index; /* Current server */ short nr_iterations; /* Number of server iterations */ signed char addr_index; /* Current address */ bool call_responded; /* T if the current address responded */ unsigned int flags; #define AFS_OPERATION_STOP 0x0001 /* Set to cease iteration */ #define AFS_OPERATION_VBUSY 0x0002 /* Set if seen VBUSY */ #define AFS_OPERATION_VMOVED 0x0004 /* Set if seen VMOVED */ #define AFS_OPERATION_VNOVOL 0x0008 /* Set if seen VNOVOL */ #define AFS_OPERATION_CUR_ONLY 0x0010 /* Set if current server only (file lock held) */ #define AFS_OPERATION_NO_VSLEEP 0x0020 /* Set to prevent sleep on VBUSY, VOFFLINE, ... */ #define AFS_OPERATION_UNINTR 0x0040 /* Set if op is uninterruptible */ #define AFS_OPERATION_DOWNGRADE 0x0080 /* Set to retry with downgraded opcode */ #define AFS_OPERATION_LOCK_0 0x0100 /* Set if have io_lock on file[0] */ #define AFS_OPERATION_LOCK_1 0x0200 /* Set if have io_lock on file[1] */ #define AFS_OPERATION_TRIED_ALL 0x0400 /* Set if we've tried all the fileservers */ #define AFS_OPERATION_RETRY_SERVER 0x0800 /* Set if we should retry the current server */ #define AFS_OPERATION_DIR_CONFLICT 0x1000 /* Set if we detected a 3rd-party dir change */ #define AFS_OPERATION_ASYNC 0x2000 /* Set if should run asynchronously */ }; /* * Cache auxiliary data. */ struct afs_vnode_cache_aux { __be64 data_version; } __packed; static inline void afs_set_cache_aux(struct afs_vnode *vnode, struct afs_vnode_cache_aux *aux) { aux->data_version = cpu_to_be64(vnode->status.data_version); } static inline void afs_invalidate_cache(struct afs_vnode *vnode, unsigned int flags) { struct afs_vnode_cache_aux aux; afs_set_cache_aux(vnode, &aux); fscache_invalidate(afs_vnode_cache(vnode), &aux, i_size_read(&vnode->netfs.inode), flags); } /* * Directory iteration management. */ struct afs_dir_iter { struct afs_vnode *dvnode; union afs_xdr_dir_block *block; struct folio_queue *fq; unsigned int fpos; int fq_slot; unsigned int loop_check; u8 nr_slots; u8 bucket; unsigned int prev_entry; }; #include <trace/events/afs.h> /*****************************************************************************/ /* * addr_list.c */ struct afs_addr_list *afs_get_addrlist(struct afs_addr_list *alist, enum afs_alist_trace reason); extern struct afs_addr_list *afs_alloc_addrlist(unsigned int nr); extern void afs_put_addrlist(struct afs_addr_list *alist, enum afs_alist_trace reason); extern struct afs_vlserver_list *afs_parse_text_addrs(struct afs_net *, const char *, size_t, char, unsigned short, unsigned short); bool afs_addr_list_same(const struct afs_addr_list *a, const struct afs_addr_list *b); extern struct afs_vlserver_list *afs_dns_query(struct afs_cell *, time64_t *); extern int afs_merge_fs_addr4(struct afs_net *net, struct afs_addr_list *addr, __be32 xdr, u16 port); extern int afs_merge_fs_addr6(struct afs_net *net, struct afs_addr_list *addr, __be32 *xdr, u16 port); void afs_set_peer_appdata(struct afs_server *server, struct afs_addr_list *old_alist, struct afs_addr_list *new_alist); /* * addr_prefs.c */ int afs_proc_addr_prefs_write(struct file *file, char *buf, size_t size); void afs_get_address_preferences_rcu(struct afs_net *net, struct afs_addr_list *alist); void afs_get_address_preferences(struct afs_net *net, struct afs_addr_list *alist); /* * callback.c */ extern void afs_invalidate_mmap_work(struct work_struct *); extern void afs_init_callback_state(struct afs_server *); extern void __afs_break_callback(struct afs_vnode *, enum afs_cb_break_reason); extern void afs_break_callback(struct afs_vnode *, enum afs_cb_break_reason); extern void afs_break_callbacks(struct afs_server *, size_t, struct afs_callback_break *); static inline unsigned int afs_calc_vnode_cb_break(struct afs_vnode *vnode) { return vnode->cb_break + vnode->cb_ro_snapshot + vnode->cb_scrub; } static inline bool afs_cb_is_broken(unsigned int cb_break, const struct afs_vnode *vnode) { return cb_break != (vnode->cb_break + atomic_read(&vnode->volume->cb_ro_snapshot) + atomic_read(&vnode->volume->cb_scrub)); } /* * cell.c */ extern int afs_cell_init(struct afs_net *, const char *); extern struct afs_cell *afs_find_cell(struct afs_net *, const char *, unsigned, enum afs_cell_trace); struct afs_cell *afs_lookup_cell(struct afs_net *net, const char *name, unsigned int namesz, const char *vllist, bool excl, enum afs_cell_trace trace); extern struct afs_cell *afs_use_cell(struct afs_cell *, enum afs_cell_trace); void afs_unuse_cell(struct afs_cell *cell, enum afs_cell_trace reason); extern struct afs_cell *afs_get_cell(struct afs_cell *, enum afs_cell_trace); extern void afs_see_cell(struct afs_cell *, enum afs_cell_trace); extern void afs_put_cell(struct afs_cell *, enum afs_cell_trace); extern void afs_queue_cell(struct afs_cell *, enum afs_cell_trace); void afs_set_cell_timer(struct afs_cell *cell, unsigned int delay_secs); extern void __net_exit afs_cell_purge(struct afs_net *); /* * cmservice.c */ extern bool afs_cm_incoming_call(struct afs_call *); /* * cm_security.c */ void afs_process_oob_queue(struct work_struct *work); #ifdef CONFIG_RXGK int afs_create_token_key(struct afs_net *net, struct socket *socket); #else static inline int afs_create_token_key(struct afs_net *net, struct socket *socket) { return 0; } #endif /* * dir.c */ extern const struct file_operations afs_dir_file_operations; extern const struct inode_operations afs_dir_inode_operations; extern const struct address_space_operations afs_dir_aops; extern const struct dentry_operations afs_fs_dentry_operations; ssize_t afs_read_single(struct afs_vnode *dvnode, struct file *file); ssize_t afs_read_dir(struct afs_vnode *dvnode, struct file *file) __acquires(&dvnode->validate_lock); extern void afs_d_release(struct dentry *); extern void afs_check_for_remote_deletion(struct afs_operation *); int afs_single_writepages(struct address_space *mapping, struct writeback_control *wbc); /* * dir_edit.c */ extern void afs_edit_dir_add(struct afs_vnode *, struct qstr *, struct afs_fid *, enum afs_edit_dir_reason); extern void afs_edit_dir_remove(struct afs_vnode *, struct qstr *, enum afs_edit_dir_reason); void afs_edit_dir_update_dotdot(struct afs_vnode *vnode, struct afs_vnode *new_dvnode, enum afs_edit_dir_reason why); void afs_mkdir_init_dir(struct afs_vnode *dvnode, struct afs_vnode *parent_vnode); /* * dir_search.c */ unsigned int afs_dir_hash_name(const struct qstr *name); bool afs_dir_init_iter(struct afs_dir_iter *iter, const struct qstr *name); union afs_xdr_dir_block *afs_dir_find_block(struct afs_dir_iter *iter, size_t block); int afs_dir_search_bucket(struct afs_dir_iter *iter, const struct qstr *name, struct afs_fid *_fid); int afs_dir_search(struct afs_vnode *dvnode, struct qstr *name, struct afs_fid *_fid, afs_dataversion_t *_dir_version); /* * dir_silly.c */ extern int afs_sillyrename(struct afs_vnode *, struct afs_vnode *, struct dentry *, struct key *); extern int afs_silly_iput(struct dentry *, struct inode *); /* * dynroot.c */ extern const struct inode_operations afs_dynroot_inode_operations; extern const struct dentry_operations afs_dynroot_dentry_operations; struct inode *afs_dynroot_iget_root(struct super_block *sb); /* * file.c */ extern const struct address_space_operations afs_file_aops; extern const struct inode_operations afs_file_inode_operations; extern const struct file_operations afs_file_operations; extern const struct afs_operation_ops afs_fetch_data_operation; extern const struct netfs_request_ops afs_req_ops; extern int afs_cache_wb_key(struct afs_vnode *, struct afs_file *); extern void afs_put_wb_key(struct afs_wb_key *); extern int afs_open(struct inode *, struct file *); extern int afs_release(struct inode *, struct file *); void afs_fetch_data_async_rx(struct work_struct *work); void afs_fetch_data_immediate_cancel(struct afs_call *call); /* * flock.c */ extern struct workqueue_struct *afs_lock_manager; extern void afs_lock_op_done(struct afs_call *); extern void afs_lock_work(struct work_struct *); extern void afs_lock_may_be_available(struct afs_vnode *); extern int afs_lock(struct file *, int, struct file_lock *); extern int afs_flock(struct file *, int, struct file_lock *); /* * fsclient.c */ extern void afs_fs_fetch_status(struct afs_operation *); extern void afs_fs_fetch_data(struct afs_operation *); extern void afs_fs_create_file(struct afs_operation *); extern void afs_fs_make_dir(struct afs_operation *); extern void afs_fs_remove_file(struct afs_operation *); extern void afs_fs_remove_dir(struct afs_operation *); extern void afs_fs_link(struct afs_operation *); extern void afs_fs_symlink(struct afs_operation *); extern void afs_fs_rename(struct afs_operation *); extern void afs_fs_store_data(struct afs_operation *); extern void afs_fs_setattr(struct afs_operation *); extern void afs_fs_get_volume_status(struct afs_operation *); extern void afs_fs_set_lock(struct afs_operation *); extern void afs_fs_extend_lock(struct afs_operation *); extern void afs_fs_release_lock(struct afs_operation *); int afs_fs_give_up_all_callbacks(struct afs_net *net, struct afs_server *server, struct afs_address *addr, struct key *key); bool afs_fs_get_capabilities(struct afs_net *net, struct afs_server *server, struct afs_endpoint_state *estate, unsigned int addr_index, struct key *key); extern void afs_fs_inline_bulk_status(struct afs_operation *); struct afs_acl { u32 size; u8 data[] __counted_by(size); }; extern void afs_fs_fetch_acl(struct afs_operation *); extern void afs_fs_store_acl(struct afs_operation *); /* * fs_operation.c */ extern struct afs_operation *afs_alloc_operation(struct key *, struct afs_volume *); extern int afs_put_operation(struct afs_operation *); extern bool afs_begin_vnode_operation(struct afs_operation *); extern void afs_end_vnode_operation(struct afs_operation *op); extern void afs_wait_for_operation(struct afs_operation *); extern int afs_do_sync_operation(struct afs_operation *); static inline void afs_op_set_vnode(struct afs_operation *op, unsigned int n, struct afs_vnode *vnode) { op->file[n].vnode = vnode; op->file[n].need_io_lock = true; } static inline void afs_op_set_fid(struct afs_operation *op, unsigned int n, const struct afs_fid *fid) { op->file[n].fid = *fid; } /* * fs_probe.c */ struct afs_endpoint_state *afs_get_endpoint_state(struct afs_endpoint_state *estate, enum afs_estate_trace where); void afs_put_endpoint_state(struct afs_endpoint_state *estate, enum afs_estate_trace where); extern void afs_fileserver_probe_result(struct afs_call *); int afs_fs_probe_fileserver(struct afs_net *net, struct afs_server *server, struct afs_addr_list *new_alist, struct key *key); int afs_wait_for_fs_probes(struct afs_operation *op, struct afs_server_state *states, bool intr); extern void afs_probe_fileserver(struct afs_net *, struct afs_server *); extern void afs_fs_probe_dispatcher(struct work_struct *); int afs_wait_for_one_fs_probe(struct afs_server *server, struct afs_endpoint_state *estate, unsigned long exclude, bool is_intr); extern void afs_fs_probe_cleanup(struct afs_net *); /* * inode.c */ extern const struct afs_operation_ops afs_fetch_status_operation; void afs_init_new_symlink(struct afs_vnode *vnode, struct afs_operation *op); const char *afs_get_link(struct dentry *dentry, struct inode *inode, struct delayed_call *callback); int afs_readlink(struct dentry *dentry, char __user *buffer, int buflen); extern void afs_vnode_commit_status(struct afs_operation *, struct afs_vnode_param *); extern int afs_fetch_status(struct afs_vnode *, struct key *, bool, afs_access_t *); extern int afs_ilookup5_test_by_fid(struct inode *, void *); extern struct inode *afs_iget(struct afs_operation *, struct afs_vnode_param *); extern struct inode *afs_root_iget(struct super_block *, struct key *); extern int afs_getattr(struct mnt_idmap *idmap, const struct path *, struct kstat *, u32, unsigned int); extern int afs_setattr(struct mnt_idmap *idmap, struct dentry *, struct iattr *); extern void afs_evict_inode(struct inode *); extern int afs_drop_inode(struct inode *); /* * main.c */ extern struct workqueue_struct *afs_wq; extern int afs_net_id; static inline struct afs_net *afs_net(struct net *net) { return net_generic(net, afs_net_id); } static inline struct afs_net *afs_sb2net(struct super_block *sb) { return afs_net(AFS_FS_S(sb)->net_ns); } static inline struct afs_net *afs_d2net(struct dentry *dentry) { return afs_sb2net(dentry->d_sb); } static inline struct afs_net *afs_i2net(struct inode *inode) { return afs_sb2net(inode->i_sb); } static inline struct afs_net *afs_v2net(struct afs_vnode *vnode) { return afs_i2net(&vnode->netfs.inode); } static inline struct afs_net *afs_sock2net(struct sock *sk) { return net_generic(sock_net(sk), afs_net_id); } static inline void __afs_stat(atomic_t *s) { atomic_inc(s); } #define afs_stat_v(vnode, n) __afs_stat(&afs_v2net(vnode)->n) /* * misc.c */ extern int afs_abort_to_error(u32); extern void afs_prioritise_error(struct afs_error *, int, u32); static inline void afs_op_nomem(struct afs_operation *op) { op->cumul_error.error = -ENOMEM; } static inline int afs_op_error(const struct afs_operation *op) { return op->cumul_error.error; } static inline s32 afs_op_abort_code(const struct afs_operation *op) { return op->cumul_error.abort_code; } static inline int afs_op_set_error(struct afs_operation *op, int error) { return op->cumul_error.error = error; } static inline void afs_op_accumulate_error(struct afs_operation *op, int error, s32 abort_code) { afs_prioritise_error(&op->cumul_error, error, abort_code); } /* * mntpt.c */ extern const struct inode_operations afs_mntpt_inode_operations; extern const struct inode_operations afs_autocell_inode_operations; extern const struct file_operations afs_mntpt_file_operations; extern struct vfsmount *afs_d_automount(struct path *); extern void afs_mntpt_kill_timer(void); /* * proc.c */ #ifdef CONFIG_PROC_FS extern int __net_init afs_proc_init(struct afs_net *); extern void __net_exit afs_proc_cleanup(struct afs_net *); extern int afs_proc_cell_setup(struct afs_cell *); extern void afs_proc_cell_remove(struct afs_cell *); extern void afs_put_sysnames(struct afs_sysnames *); #else static inline int afs_proc_init(struct afs_net *net) { return 0; } static inline void afs_proc_cleanup(struct afs_net *net) {} static inline int afs_proc_cell_setup(struct afs_cell *cell) { return 0; } static inline void afs_proc_cell_remove(struct afs_cell *cell) {} static inline void afs_put_sysnames(struct afs_sysnames *sysnames) {} #endif /* * rotate.c */ void afs_clear_server_states(struct afs_operation *op); extern bool afs_select_fileserver(struct afs_operation *); extern void afs_dump_edestaddrreq(const struct afs_operation *); /* * rxrpc.c */ extern struct workqueue_struct *afs_async_calls; extern int __net_init afs_open_socket(struct afs_net *); extern void __net_exit afs_close_socket(struct afs_net *); extern void afs_charge_preallocation(struct work_struct *); extern void afs_put_call(struct afs_call *); void afs_deferred_put_call(struct afs_call *call); void afs_make_call(struct afs_call *call, gfp_t gfp); void afs_deliver_to_call(struct afs_call *call); void afs_wait_for_call_to_complete(struct afs_call *call); extern struct afs_call *afs_alloc_flat_call(struct afs_net *, const struct afs_call_type *, size_t, size_t); extern void afs_flat_call_destructor(struct afs_call *); extern void afs_send_empty_reply(struct afs_call *); extern void afs_send_simple_reply(struct afs_call *, const void *, size_t); extern int afs_extract_data(struct afs_call *, bool); extern int afs_protocol_error(struct afs_call *, enum afs_eproto_cause); static inline struct afs_call *afs_get_call(struct afs_call *call, enum afs_call_trace why) { int r; __refcount_inc(&call->ref, &r); trace_afs_call(call->debug_id, why, r + 1, atomic_read(&call->net->nr_outstanding_calls), __builtin_return_address(0)); return call; } static inline void afs_see_call(struct afs_call *call, enum afs_call_trace why) { int r = refcount_read(&call->ref); trace_afs_call(call->debug_id, why, r, atomic_read(&call->net->nr_outstanding_calls), __builtin_return_address(0)); } static inline void afs_make_op_call(struct afs_operation *op, struct afs_call *call, gfp_t gfp) { struct afs_addr_list *alist = op->estate->addresses; op->call = call; op->type = call->type; call->op = op; call->key = op->key; call->intr = !(op->flags & AFS_OPERATION_UNINTR); call->peer = rxrpc_kernel_get_peer(alist->addrs[op->addr_index].peer); call->service_id = op->server->service_id; afs_make_call(call, gfp); } static inline void afs_extract_begin(struct afs_call *call, void *buf, size_t size) { call->iov_len = size; call->kvec[0].iov_base = buf; call->kvec[0].iov_len = size; iov_iter_kvec(&call->def_iter, ITER_DEST, call->kvec, 1, size); } static inline void afs_extract_to_tmp(struct afs_call *call) { call->iov_len = sizeof(call->tmp); afs_extract_begin(call, &call->tmp, sizeof(call->tmp)); } static inline void afs_extract_to_tmp64(struct afs_call *call) { call->iov_len = sizeof(call->tmp64); afs_extract_begin(call, &call->tmp64, sizeof(call->tmp64)); } static inline void afs_extract_discard(struct afs_call *call, size_t size) { call->iov_len = size; iov_iter_discard(&call->def_iter, ITER_DEST, size); } static inline void afs_extract_to_buf(struct afs_call *call, size_t size) { call->iov_len = size; afs_extract_begin(call, call->buffer, size); } static inline int afs_transfer_reply(struct afs_call *call) { return afs_extract_data(call, false); } static inline bool afs_check_call_state(struct afs_call *call, enum afs_call_state state) { return READ_ONCE(call->state) == state; } static inline bool afs_set_call_state(struct afs_call *call, enum afs_call_state from, enum afs_call_state to) { bool ok = false; spin_lock_bh(&call->state_lock); if (call->state == from) { call->state = to; trace_afs_call_state(call, from, to, 0, 0); ok = true; } spin_unlock_bh(&call->state_lock); return ok; } static inline void afs_set_call_complete(struct afs_call *call, int error, u32 remote_abort) { enum afs_call_state state; bool ok = false; spin_lock_bh(&call->state_lock); state = call->state; if (state != AFS_CALL_COMPLETE) { call->abort_code = remote_abort; call->error = error; call->state = AFS_CALL_COMPLETE; trace_afs_call_state(call, state, AFS_CALL_COMPLETE, error, remote_abort); ok = true; } spin_unlock_bh(&call->state_lock); if (ok) { trace_afs_call_done(call); /* Asynchronous calls have two refs to release - one from the alloc and * one queued with the work item - and we can't just deallocate the * call because the work item may be queued again. */ if (call->drop_ref) afs_put_call(call); } } /* * security.c */ extern void afs_put_permits(struct afs_permits *); extern void afs_clear_permits(struct afs_vnode *); extern void afs_cache_permit(struct afs_vnode *, struct key *, unsigned int, struct afs_status_cb *); extern struct key *afs_request_key(struct afs_cell *); extern struct key *afs_request_key_rcu(struct afs_cell *); extern int afs_check_permit(struct afs_vnode *, struct key *, afs_access_t *); extern int afs_permission(struct mnt_idmap *, struct inode *, int); extern void __exit afs_clean_up_permit_cache(void); /* * server.c */ extern spinlock_t afs_server_peer_lock; struct afs_server *afs_find_server(const struct rxrpc_peer *peer); extern struct afs_server *afs_lookup_server(struct afs_cell *, struct key *, const uuid_t *, u32); extern struct afs_server *afs_get_server(struct afs_server *, enum afs_server_trace); struct afs_server *afs_use_server(struct afs_server *server, bool activate, enum afs_server_trace reason); void afs_unuse_server(struct afs_net *net, struct afs_server *server, enum afs_server_trace reason); void afs_unuse_server_notime(struct afs_net *net, struct afs_server *server, enum afs_server_trace reason); extern void afs_put_server(struct afs_net *, struct afs_server *, enum afs_server_trace); void afs_purge_servers(struct afs_cell *cell); extern void afs_fs_probe_timer(struct timer_list *); void __net_exit afs_wait_for_servers(struct afs_net *net); bool afs_check_server_record(struct afs_operation *op, struct afs_server *server, struct key *key); static inline void afs_see_server(struct afs_server *server, enum afs_server_trace trace) { int r = refcount_read(&server->ref); int a = atomic_read(&server->active); trace_afs_server(server->debug_id, r, a, trace); } static inline void afs_inc_servers_outstanding(struct afs_net *net) { atomic_inc(&net->servers_outstanding); } static inline void afs_dec_servers_outstanding(struct afs_net *net) { if (atomic_dec_and_test(&net->servers_outstanding)) wake_up_var(&net->servers_outstanding); } static inline bool afs_is_probing_server(struct afs_server *server) { return list_empty(&server->probe_link); } /* * server_list.c */ static inline struct afs_server_list *afs_get_serverlist(struct afs_server_list *slist) { refcount_inc(&slist->usage); return slist; } extern void afs_put_serverlist(struct afs_net *, struct afs_server_list *); struct afs_server_list *afs_alloc_server_list(struct afs_volume *volume, struct key *key, struct afs_vldb_entry *vldb); extern bool afs_annotate_server_list(struct afs_server_list *, struct afs_server_list *); void afs_attach_volume_to_servers(struct afs_volume *volume, struct afs_server_list *slist); void afs_reattach_volume_to_servers(struct afs_volume *volume, struct afs_server_list *slist, struct afs_server_list *old); void afs_detach_volume_from_servers(struct afs_volume *volume, struct afs_server_list *slist); /* * super.c */ extern int __init afs_fs_init(void); extern void afs_fs_exit(void); /* * validation.c */ bool afs_check_validity(const struct afs_vnode *vnode); int afs_update_volume_state(struct afs_operation *op); int afs_validate(struct afs_vnode *vnode, struct key *key); /* * vlclient.c */ extern struct afs_vldb_entry *afs_vl_get_entry_by_name_u(struct afs_vl_cursor *, const char *, int); extern struct afs_addr_list *afs_vl_get_addrs_u(struct afs_vl_cursor *, const uuid_t *); struct afs_call *afs_vl_get_capabilities(struct afs_net *net, struct afs_addr_list *alist, unsigned int addr_index, struct key *key, struct afs_vlserver *server, unsigned int server_index); extern struct afs_addr_list *afs_yfsvl_get_endpoints(struct afs_vl_cursor *, const uuid_t *); extern char *afs_yfsvl_get_cell_name(struct afs_vl_cursor *); /* * vl_alias.c */ extern int afs_cell_detect_alias(struct afs_cell *, struct key *); /* * vl_probe.c */ extern void afs_vlserver_probe_result(struct afs_call *); extern int afs_send_vl_probes(struct afs_net *, struct key *, struct afs_vlserver_list *); extern int afs_wait_for_vl_probes(struct afs_vlserver_list *, unsigned long); /* * vl_rotate.c */ extern bool afs_begin_vlserver_operation(struct afs_vl_cursor *, struct afs_cell *, struct key *); extern bool afs_select_vlserver(struct afs_vl_cursor *); extern bool afs_select_current_vlserver(struct afs_vl_cursor *); extern int afs_end_vlserver_operation(struct afs_vl_cursor *); /* * vlserver_list.c */ static inline struct afs_vlserver *afs_get_vlserver(struct afs_vlserver *vlserver) { refcount_inc(&vlserver->ref); return vlserver; } static inline struct afs_vlserver_list *afs_get_vlserverlist(struct afs_vlserver_list *vllist) { if (vllist) refcount_inc(&vllist->ref); return vllist; } extern struct afs_vlserver *afs_alloc_vlserver(const char *, size_t, unsigned short); extern void afs_put_vlserver(struct afs_net *, struct afs_vlserver *); extern struct afs_vlserver_list *afs_alloc_vlserver_list(unsigned int); extern void afs_put_vlserverlist(struct afs_net *, struct afs_vlserver_list *); extern struct afs_vlserver_list *afs_extract_vlserver_list(struct afs_cell *, const void *, size_t); /* * volume.c */ extern struct afs_volume *afs_create_volume(struct afs_fs_context *); extern int afs_activate_volume(struct afs_volume *); extern void afs_deactivate_volume(struct afs_volume *); bool afs_try_get_volume(struct afs_volume *volume, enum afs_volume_trace reason); extern struct afs_volume *afs_get_volume(struct afs_volume *, enum afs_volume_trace); void afs_put_volume(struct afs_volume *volume, enum afs_volume_trace reason); extern int afs_check_volume_status(struct afs_volume *, struct afs_operation *); /* * write.c */ void afs_prepare_write(struct netfs_io_subrequest *subreq); void afs_issue_write(struct netfs_io_subrequest *subreq); void afs_begin_writeback(struct netfs_io_request *wreq); void afs_retry_request(struct netfs_io_request *wreq, struct netfs_io_stream *stream); extern int afs_writepages(struct address_space *, struct writeback_control *); extern int afs_fsync(struct file *, loff_t, loff_t, int); extern vm_fault_t afs_page_mkwrite(struct vm_fault *vmf); extern void afs_prune_wb_keys(struct afs_vnode *); /* * xattr.c */ extern const struct xattr_handler * const afs_xattr_handlers[]; /* * yfsclient.c */ extern void yfs_fs_fetch_data(struct afs_operation *); extern void yfs_fs_create_file(struct afs_operation *); extern void yfs_fs_make_dir(struct afs_operation *); extern void yfs_fs_remove_file2(struct afs_operation *); extern void yfs_fs_remove_file(struct afs_operation *); extern void yfs_fs_remove_dir(struct afs_operation *); extern void yfs_fs_link(struct afs_operation *); extern void yfs_fs_symlink(struct afs_operation *); extern void yfs_fs_rename(struct afs_operation *); extern void yfs_fs_store_data(struct afs_operation *); extern void yfs_fs_setattr(struct afs_operation *); extern void yfs_fs_get_volume_status(struct afs_operation *); extern void yfs_fs_set_lock(struct afs_operation *); extern void yfs_fs_extend_lock(struct afs_operation *); extern void yfs_fs_release_lock(struct afs_operation *); extern void yfs_fs_fetch_status(struct afs_operation *); extern void yfs_fs_inline_bulk_status(struct afs_operation *); struct yfs_acl { struct afs_acl *acl; /* Dir/file/symlink ACL */ struct afs_acl *vol_acl; /* Whole volume ACL */ u32 inherit_flag; /* True if ACL is inherited from parent dir */ u32 num_cleaned; /* Number of ACEs removed due to subject removal */ unsigned int flags; #define YFS_ACL_WANT_ACL 0x01 /* Set if caller wants ->acl */ #define YFS_ACL_WANT_VOL_ACL 0x02 /* Set if caller wants ->vol_acl */ }; extern void yfs_free_opaque_acl(struct yfs_acl *); extern void yfs_fs_fetch_opaque_acl(struct afs_operation *); extern void yfs_fs_store_opaque_acl2(struct afs_operation *); /* * Miscellaneous inline functions. */ static inline struct afs_vnode *AFS_FS_I(struct inode *inode) { return container_of(inode, struct afs_vnode, netfs.inode); } static inline struct inode *AFS_VNODE_TO_I(struct afs_vnode *vnode) { return &vnode->netfs.inode; } /* * Note that a dentry got changed. We need to set d_fsdata to the data version * number derived from the result of the operation. It doesn't matter if * d_fsdata goes backwards as we'll just revalidate. */ static inline void afs_update_dentry_version(struct afs_operation *op, struct afs_vnode_param *dir_vp, struct dentry *dentry) { if (!op->cumul_error.error) dentry->d_fsdata = (void *)(unsigned long)dir_vp->scb.status.data_version; } /* * Set the file size and block count. Estimate the number of 512 bytes blocks * used, rounded up to nearest 1K for consistency with other AFS clients. */ static inline void afs_set_i_size(struct afs_vnode *vnode, u64 size) { i_size_write(&vnode->netfs.inode, size); vnode->netfs.inode.i_blocks = ((size + 1023) >> 10) << 1; } /* * Check for a conflicting operation on a directory that we just unlinked from. * If someone managed to sneak a link or an unlink in on the file we just * unlinked, we won't be able to trust nlink on an AFS file (but not YFS). */ static inline void afs_check_dir_conflict(struct afs_operation *op, struct afs_vnode_param *dvp) { if (dvp->dv_before + dvp->dv_delta != dvp->scb.status.data_version) op->flags |= AFS_OPERATION_DIR_CONFLICT; } static inline int afs_io_error(struct afs_call *call, enum afs_io_error where) { trace_afs_io_error(call->debug_id, -EIO, where); return -EIO; } static inline int afs_bad(struct afs_vnode *vnode, enum afs_file_error where) { trace_afs_file_error(vnode, -EIO, where); return -EIO; } /* * Set the callback promise on a vnode. */ static inline void afs_set_cb_promise(struct afs_vnode *vnode, time64_t expires_at, enum afs_cb_promise_trace trace) { atomic64_set(&vnode->cb_expires_at, expires_at); trace_afs_cb_promise(vnode, trace); } /* * Clear the callback promise on a vnode, returning true if it was promised. */ static inline bool afs_clear_cb_promise(struct afs_vnode *vnode, enum afs_cb_promise_trace trace) { trace_afs_cb_promise(vnode, trace); return atomic64_xchg(&vnode->cb_expires_at, AFS_NO_CB_PROMISE) != AFS_NO_CB_PROMISE; } /* * Mark a directory as being invalid. */ static inline void afs_invalidate_dir(struct afs_vnode *dvnode, enum afs_dir_invalid_trace trace) { if (test_and_clear_bit(AFS_VNODE_DIR_VALID, &dvnode->flags)) { trace_afs_dir_invalid(dvnode, trace); afs_stat_v(dvnode, n_inval); } } /*****************************************************************************/ /* * debug tracing */ extern unsigned afs_debug; #define dbgprintk(FMT,...) \ printk("[%-6.6s] "FMT"\n", current->comm ,##__VA_ARGS__) #define kenter(FMT,...) dbgprintk("==> %s("FMT")",__func__ ,##__VA_ARGS__) #define kleave(FMT,...) dbgprintk("<== %s()"FMT"",__func__ ,##__VA_ARGS__) #define kdebug(FMT,...) dbgprintk(" "FMT ,##__VA_ARGS__) #if defined(__KDEBUG) #define _enter(FMT,...) kenter(FMT,##__VA_ARGS__) #define _leave(FMT,...) kleave(FMT,##__VA_ARGS__) #define _debug(FMT,...) kdebug(FMT,##__VA_ARGS__) #elif defined(CONFIG_AFS_DEBUG) #define AFS_DEBUG_KENTER 0x01 #define AFS_DEBUG_KLEAVE 0x02 #define AFS_DEBUG_KDEBUG 0x04 #define _enter(FMT,...) \ do { \ if (unlikely(afs_debug & AFS_DEBUG_KENTER)) \ kenter(FMT,##__VA_ARGS__); \ } while (0) #define _leave(FMT,...) \ do { \ if (unlikely(afs_debug & AFS_DEBUG_KLEAVE)) \ kleave(FMT,##__VA_ARGS__); \ } while (0) #define _debug(FMT,...) \ do { \ if (unlikely(afs_debug & AFS_DEBUG_KDEBUG)) \ kdebug(FMT,##__VA_ARGS__); \ } while (0) #else #define _enter(FMT,...) no_printk("==> %s("FMT")",__func__ ,##__VA_ARGS__) #define _leave(FMT,...) no_printk("<== %s()"FMT"",__func__ ,##__VA_ARGS__) #define _debug(FMT,...) no_printk(" "FMT ,##__VA_ARGS__) #endif /* * debug assertion checking */ #if 1 // defined(__KDEBUGALL) #define ASSERT(X) \ do { \ if (unlikely(!(X))) { \ printk(KERN_ERR "\n"); \ printk(KERN_ERR "AFS: Assertion failed\n"); \ BUG(); \ } \ } while(0) #define ASSERTCMP(X, OP, Y) \ do { \ if (unlikely(!((X) OP (Y)))) { \ printk(KERN_ERR "\n"); \ printk(KERN_ERR "AFS: Assertion failed\n"); \ printk(KERN_ERR "%lu " #OP " %lu is false\n", \ (unsigned long)(X), (unsigned long)(Y)); \ printk(KERN_ERR "0x%lx " #OP " 0x%lx is false\n", \ (unsigned long)(X), (unsigned long)(Y)); \ BUG(); \ } \ } while(0) #define ASSERTRANGE(L, OP1, N, OP2, H) \ do { \ if (unlikely(!((L) OP1 (N)) || !((N) OP2 (H)))) { \ printk(KERN_ERR "\n"); \ printk(KERN_ERR "AFS: Assertion failed\n"); \ printk(KERN_ERR "%lu "#OP1" %lu "#OP2" %lu is false\n", \ (unsigned long)(L), (unsigned long)(N), \ (unsigned long)(H)); \ printk(KERN_ERR "0x%lx "#OP1" 0x%lx "#OP2" 0x%lx is false\n", \ (unsigned long)(L), (unsigned long)(N), \ (unsigned long)(H)); \ BUG(); \ } \ } while(0) #define ASSERTIF(C, X) \ do { \ if (unlikely((C) && !(X))) { \ printk(KERN_ERR "\n"); \ printk(KERN_ERR "AFS: Assertion failed\n"); \ BUG(); \ } \ } while(0) #define ASSERTIFCMP(C, X, OP, Y) \ do { \ if (unlikely((C) && !((X) OP (Y)))) { \ printk(KERN_ERR "\n"); \ printk(KERN_ERR "AFS: Assertion failed\n"); \ printk(KERN_ERR "%lu " #OP " %lu is false\n", \ (unsigned long)(X), (unsigned long)(Y)); \ printk(KERN_ERR "0x%lx " #OP " 0x%lx is false\n", \ (unsigned long)(X), (unsigned long)(Y)); \ BUG(); \ } \ } while(0) #else #define ASSERT(X) \ do { \ } while(0) #define ASSERTCMP(X, OP, Y) \ do { \ } while(0) #define ASSERTRANGE(L, OP1, N, OP2, H) \ do { \ } while(0) #define ASSERTIF(C, X) \ do { \ } while(0) #define ASSERTIFCMP(C, X, OP, Y) \ do { \ } while(0) #endif /* __KDEBUGALL */ |
| 4 4 4 4 4 2 4 1 3 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 | // SPDX-License-Identifier: GPL-2.0-or-later /* * Randomness driver for virtio * Copyright (C) 2007, 2008 Rusty Russell IBM Corporation */ #include <asm/barrier.h> #include <linux/err.h> #include <linux/hw_random.h> #include <linux/scatterlist.h> #include <linux/spinlock.h> #include <linux/virtio.h> #include <linux/virtio_rng.h> #include <linux/module.h> #include <linux/slab.h> static DEFINE_IDA(rng_index_ida); struct virtrng_info { struct hwrng hwrng; struct virtqueue *vq; char name[25]; int index; bool hwrng_register_done; bool hwrng_removed; /* data transfer */ struct completion have_data; unsigned int data_avail; unsigned int data_idx; /* minimal size returned by rng_buffer_size() */ #if SMP_CACHE_BYTES < 32 u8 data[32]; #else u8 data[SMP_CACHE_BYTES]; #endif }; static void random_recv_done(struct virtqueue *vq) { struct virtrng_info *vi = vq->vdev->priv; unsigned int len; /* We can get spurious callbacks, e.g. shared IRQs + virtio_pci. */ if (!virtqueue_get_buf(vi->vq, &len)) return; smp_store_release(&vi->data_avail, len); complete(&vi->have_data); } static void request_entropy(struct virtrng_info *vi) { struct scatterlist sg; reinit_completion(&vi->have_data); vi->data_idx = 0; sg_init_one(&sg, vi->data, sizeof(vi->data)); /* There should always be room for one buffer. */ virtqueue_add_inbuf(vi->vq, &sg, 1, vi->data, GFP_KERNEL); virtqueue_kick(vi->vq); } static unsigned int copy_data(struct virtrng_info *vi, void *buf, unsigned int size) { size = min_t(unsigned int, size, vi->data_avail); memcpy(buf, vi->data + vi->data_idx, size); vi->data_idx += size; vi->data_avail -= size; if (vi->data_avail == 0) request_entropy(vi); return size; } static int virtio_read(struct hwrng *rng, void *buf, size_t size, bool wait) { int ret; struct virtrng_info *vi = (struct virtrng_info *)rng->priv; unsigned int chunk; size_t read; if (vi->hwrng_removed) return -ENODEV; read = 0; /* copy available data */ if (smp_load_acquire(&vi->data_avail)) { chunk = copy_data(vi, buf, size); size -= chunk; read += chunk; } if (!wait) return read; /* We have already copied available entropy, * so either size is 0 or data_avail is 0 */ while (size != 0) { /* data_avail is 0 but a request is pending */ ret = wait_for_completion_killable(&vi->have_data); if (ret < 0) return ret; /* if vi->data_avail is 0, we have been interrupted * by a cleanup, but buffer stays in the queue */ if (vi->data_avail == 0) return read; chunk = copy_data(vi, buf + read, size); size -= chunk; read += chunk; } return read; } static void virtio_cleanup(struct hwrng *rng) { struct virtrng_info *vi = (struct virtrng_info *)rng->priv; complete(&vi->have_data); } static int probe_common(struct virtio_device *vdev) { int err, index; struct virtrng_info *vi = NULL; vi = kzalloc(sizeof(struct virtrng_info), GFP_KERNEL); if (!vi) return -ENOMEM; vi->index = index = ida_alloc(&rng_index_ida, GFP_KERNEL); if (index < 0) { err = index; goto err_ida; } sprintf(vi->name, "virtio_rng.%d", index); init_completion(&vi->have_data); vi->hwrng = (struct hwrng) { .read = virtio_read, .cleanup = virtio_cleanup, .priv = (unsigned long)vi, .name = vi->name, }; vdev->priv = vi; /* We expect a single virtqueue. */ vi->vq = virtio_find_single_vq(vdev, random_recv_done, "input"); if (IS_ERR(vi->vq)) { err = PTR_ERR(vi->vq); goto err_find; } virtio_device_ready(vdev); /* we always have a pending entropy request */ request_entropy(vi); return 0; err_find: ida_free(&rng_index_ida, index); err_ida: kfree(vi); return err; } static void remove_common(struct virtio_device *vdev) { struct virtrng_info *vi = vdev->priv; vi->hwrng_removed = true; vi->data_avail = 0; vi->data_idx = 0; complete(&vi->have_data); if (vi->hwrng_register_done) hwrng_unregister(&vi->hwrng); virtio_reset_device(vdev); vdev->config->del_vqs(vdev); ida_free(&rng_index_ida, vi->index); kfree(vi); } static int virtrng_probe(struct virtio_device *vdev) { return probe_common(vdev); } static void virtrng_remove(struct virtio_device *vdev) { remove_common(vdev); } static void virtrng_scan(struct virtio_device *vdev) { struct virtrng_info *vi = vdev->priv; int err; err = hwrng_register(&vi->hwrng); if (!err) vi->hwrng_register_done = true; } static int virtrng_freeze(struct virtio_device *vdev) { remove_common(vdev); return 0; } static int virtrng_restore(struct virtio_device *vdev) { int err; err = probe_common(vdev); if (!err) { struct virtrng_info *vi = vdev->priv; /* * Set hwrng_removed to ensure that virtio_read() * does not block waiting for data before the * registration is complete. */ vi->hwrng_removed = true; err = hwrng_register(&vi->hwrng); if (!err) { vi->hwrng_register_done = true; vi->hwrng_removed = false; } } return err; } static const struct virtio_device_id id_table[] = { { VIRTIO_ID_RNG, VIRTIO_DEV_ANY_ID }, { 0 }, }; static struct virtio_driver virtio_rng_driver = { .driver.name = KBUILD_MODNAME, .id_table = id_table, .probe = virtrng_probe, .remove = virtrng_remove, .scan = virtrng_scan, .freeze = pm_sleep_ptr(virtrng_freeze), .restore = pm_sleep_ptr(virtrng_restore), }; module_virtio_driver(virtio_rng_driver); MODULE_DEVICE_TABLE(virtio, id_table); MODULE_DESCRIPTION("Virtio random number driver"); MODULE_LICENSE("GPL"); |
| 1 1 1 1 1 1 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 | // SPDX-License-Identifier: GPL-2.0-or-later /* * corsair-cpro.c - Linux driver for Corsair Commander Pro * Copyright (C) 2020 Marius Zachmann <mail@mariuszachmann.de> * * This driver uses hid reports to communicate with the device to allow hidraw userspace drivers * still being used. The device does not use report ids. When using hidraw and this driver * simultaniously, reports could be switched. */ #include <linux/bitops.h> #include <linux/completion.h> #include <linux/debugfs.h> #include <linux/hid.h> #include <linux/hwmon.h> #include <linux/kernel.h> #include <linux/module.h> #include <linux/mutex.h> #include <linux/seq_file.h> #include <linux/slab.h> #include <linux/spinlock.h> #include <linux/types.h> #define USB_VENDOR_ID_CORSAIR 0x1b1c #define USB_PRODUCT_ID_CORSAIR_COMMANDERPRO 0x0c10 #define USB_PRODUCT_ID_CORSAIR_1000D 0x1d00 #define OUT_BUFFER_SIZE 63 #define IN_BUFFER_SIZE 16 #define LABEL_LENGTH 11 #define REQ_TIMEOUT 300 #define CTL_GET_FW_VER 0x02 /* returns the firmware version in bytes 1-3 */ #define CTL_GET_BL_VER 0x06 /* returns the bootloader version in bytes 1-2 */ #define CTL_GET_TMP_CNCT 0x10 /* * returns in bytes 1-4 for each temp sensor: * 0 not connected * 1 connected */ #define CTL_GET_TMP 0x11 /* * send: byte 1 is channel, rest zero * rcv: returns temp for channel in centi-degree celsius * in bytes 1 and 2 * returns 0x11 in byte 0 if no sensor is connected */ #define CTL_GET_VOLT 0x12 /* * send: byte 1 is rail number: 0 = 12v, 1 = 5v, 2 = 3.3v * rcv: returns millivolt in bytes 1,2 * returns error 0x10 if request is invalid */ #define CTL_GET_FAN_CNCT 0x20 /* * returns in bytes 1-6 for each fan: * 0 not connected * 1 3pin * 2 4pin */ #define CTL_GET_FAN_RPM 0x21 /* * send: byte 1 is channel, rest zero * rcv: returns rpm in bytes 1,2 */ #define CTL_GET_FAN_PWM 0x22 /* * send: byte 1 is channel, rest zero * rcv: returns pwm in byte 1 if it was set * returns error 0x12 if fan is controlled via * fan_target or fan curve */ #define CTL_SET_FAN_FPWM 0x23 /* * set fixed pwm * send: byte 1 is fan number * send: byte 2 is percentage from 0 - 100 */ #define CTL_SET_FAN_TARGET 0x24 /* * set target rpm * send: byte 1 is fan number * send: byte 2-3 is target * device accepts all values from 0x00 - 0xFFFF */ #define NUM_FANS 6 #define NUM_TEMP_SENSORS 4 struct ccp_device { struct hid_device *hdev; struct device *hwmon_dev; struct dentry *debugfs; /* For reinitializing the completion below */ spinlock_t wait_input_report_lock; struct completion wait_input_report; struct mutex mutex; /* whenever buffer is used, lock before send_usb_cmd */ u8 *cmd_buffer; u8 *buffer; int target[6]; DECLARE_BITMAP(temp_cnct, NUM_TEMP_SENSORS); DECLARE_BITMAP(fan_cnct, NUM_FANS); char fan_label[6][LABEL_LENGTH]; u8 firmware_ver[3]; u8 bootloader_ver[2]; }; /* converts response error in buffer to errno */ static int ccp_get_errno(struct ccp_device *ccp) { switch (ccp->buffer[0]) { case 0x00: /* success */ return 0; case 0x01: /* called invalid command */ return -EOPNOTSUPP; case 0x10: /* called GET_VOLT / GET_TMP with invalid arguments */ return -EINVAL; case 0x11: /* requested temps of disconnected sensors */ case 0x12: /* requested pwm of not pwm controlled channels */ return -ENODATA; default: hid_dbg(ccp->hdev, "unknown device response error: %d", ccp->buffer[0]); return -EIO; } } /* send command, check for error in response, response in ccp->buffer */ static int send_usb_cmd(struct ccp_device *ccp, u8 command, u8 byte1, u8 byte2, u8 byte3) { unsigned long t; int ret; memset(ccp->cmd_buffer, 0x00, OUT_BUFFER_SIZE); ccp->cmd_buffer[0] = command; ccp->cmd_buffer[1] = byte1; ccp->cmd_buffer[2] = byte2; ccp->cmd_buffer[3] = byte3; /* * Disable raw event parsing for a moment to safely reinitialize the * completion. Reinit is done because hidraw could have triggered * the raw event parsing and marked the ccp->wait_input_report * completion as done. */ spin_lock_bh(&ccp->wait_input_report_lock); reinit_completion(&ccp->wait_input_report); spin_unlock_bh(&ccp->wait_input_report_lock); ret = hid_hw_output_report(ccp->hdev, ccp->cmd_buffer, OUT_BUFFER_SIZE); if (ret < 0) return ret; t = wait_for_completion_timeout(&ccp->wait_input_report, msecs_to_jiffies(REQ_TIMEOUT)); if (!t) return -ETIMEDOUT; return ccp_get_errno(ccp); } static int ccp_raw_event(struct hid_device *hdev, struct hid_report *report, u8 *data, int size) { struct ccp_device *ccp = hid_get_drvdata(hdev); /* only copy buffer when requested */ spin_lock(&ccp->wait_input_report_lock); if (!completion_done(&ccp->wait_input_report)) { memcpy(ccp->buffer, data, min(IN_BUFFER_SIZE, size)); complete_all(&ccp->wait_input_report); } spin_unlock(&ccp->wait_input_report_lock); return 0; } /* requests and returns single data values depending on channel */ static int get_data(struct ccp_device *ccp, int command, int channel, bool two_byte_data) { int ret; mutex_lock(&ccp->mutex); ret = send_usb_cmd(ccp, command, channel, 0, 0); if (ret) goto out_unlock; ret = ccp->buffer[1]; if (two_byte_data) ret = (ret << 8) + ccp->buffer[2]; out_unlock: mutex_unlock(&ccp->mutex); return ret; } static int set_pwm(struct ccp_device *ccp, int channel, long val) { int ret; if (val < 0 || val > 255) return -EINVAL; /* The Corsair Commander Pro uses values from 0-100 */ val = DIV_ROUND_CLOSEST(val * 100, 255); mutex_lock(&ccp->mutex); ret = send_usb_cmd(ccp, CTL_SET_FAN_FPWM, channel, val, 0); if (!ret) ccp->target[channel] = -ENODATA; mutex_unlock(&ccp->mutex); return ret; } static int set_target(struct ccp_device *ccp, int channel, long val) { int ret; val = clamp_val(val, 0, 0xFFFF); ccp->target[channel] = val; mutex_lock(&ccp->mutex); ret = send_usb_cmd(ccp, CTL_SET_FAN_TARGET, channel, val >> 8, val); mutex_unlock(&ccp->mutex); return ret; } static int ccp_read_string(struct device *dev, enum hwmon_sensor_types type, u32 attr, int channel, const char **str) { struct ccp_device *ccp = dev_get_drvdata(dev); switch (type) { case hwmon_fan: switch (attr) { case hwmon_fan_label: *str = ccp->fan_label[channel]; return 0; default: break; } break; default: break; } return -EOPNOTSUPP; } static int ccp_read(struct device *dev, enum hwmon_sensor_types type, u32 attr, int channel, long *val) { struct ccp_device *ccp = dev_get_drvdata(dev); int ret; switch (type) { case hwmon_temp: switch (attr) { case hwmon_temp_input: ret = get_data(ccp, CTL_GET_TMP, channel, true); if (ret < 0) return ret; *val = ret * 10; return 0; default: break; } break; case hwmon_fan: switch (attr) { case hwmon_fan_input: ret = get_data(ccp, CTL_GET_FAN_RPM, channel, true); if (ret < 0) return ret; *val = ret; return 0; case hwmon_fan_target: /* how to read target values from the device is unknown */ /* driver returns last set value or 0 */ if (ccp->target[channel] < 0) return -ENODATA; *val = ccp->target[channel]; return 0; default: break; } break; case hwmon_pwm: switch (attr) { case hwmon_pwm_input: ret = get_data(ccp, CTL_GET_FAN_PWM, channel, false); if (ret < 0) return ret; *val = DIV_ROUND_CLOSEST(ret * 255, 100); return 0; default: break; } break; case hwmon_in: switch (attr) { case hwmon_in_input: ret = get_data(ccp, CTL_GET_VOLT, channel, true); if (ret < 0) return ret; *val = ret; return 0; default: break; } break; default: break; } return -EOPNOTSUPP; }; static int ccp_write(struct device *dev, enum hwmon_sensor_types type, u32 attr, int channel, long val) { struct ccp_device *ccp = dev_get_drvdata(dev); switch (type) { case hwmon_pwm: switch (attr) { case hwmon_pwm_input: return set_pwm(ccp, channel, val); default: break; } break; case hwmon_fan: switch (attr) { case hwmon_fan_target: return set_target(ccp, channel, val); default: break; } break; default: break; } return -EOPNOTSUPP; }; static umode_t ccp_is_visible(const void *data, enum hwmon_sensor_types type, u32 attr, int channel) { const struct ccp_device *ccp = data; switch (type) { case hwmon_temp: if (!test_bit(channel, ccp->temp_cnct)) break; switch (attr) { case hwmon_temp_input: return 0444; case hwmon_temp_label: return 0444; default: break; } break; case hwmon_fan: if (!test_bit(channel, ccp->fan_cnct)) break; switch (attr) { case hwmon_fan_input: return 0444; case hwmon_fan_label: return 0444; case hwmon_fan_target: return 0644; default: break; } break; case hwmon_pwm: if (!test_bit(channel, ccp->fan_cnct)) break; switch (attr) { case hwmon_pwm_input: return 0644; default: break; } break; case hwmon_in: switch (attr) { case hwmon_in_input: return 0444; default: break; } break; default: break; } return 0; }; static const struct hwmon_ops ccp_hwmon_ops = { .is_visible = ccp_is_visible, .read = ccp_read, .read_string = ccp_read_string, .write = ccp_write, }; static const struct hwmon_channel_info * const ccp_info[] = { HWMON_CHANNEL_INFO(chip, HWMON_C_REGISTER_TZ), HWMON_CHANNEL_INFO(temp, HWMON_T_INPUT, HWMON_T_INPUT, HWMON_T_INPUT, HWMON_T_INPUT ), HWMON_CHANNEL_INFO(fan, HWMON_F_INPUT | HWMON_F_LABEL | HWMON_F_TARGET, HWMON_F_INPUT | HWMON_F_LABEL | HWMON_F_TARGET, HWMON_F_INPUT | HWMON_F_LABEL | HWMON_F_TARGET, HWMON_F_INPUT | HWMON_F_LABEL | HWMON_F_TARGET, HWMON_F_INPUT | HWMON_F_LABEL | HWMON_F_TARGET, HWMON_F_INPUT | HWMON_F_LABEL | HWMON_F_TARGET ), HWMON_CHANNEL_INFO(pwm, HWMON_PWM_INPUT, HWMON_PWM_INPUT, HWMON_PWM_INPUT, HWMON_PWM_INPUT, HWMON_PWM_INPUT, HWMON_PWM_INPUT ), HWMON_CHANNEL_INFO(in, HWMON_I_INPUT, HWMON_I_INPUT, HWMON_I_INPUT ), NULL }; static const struct hwmon_chip_info ccp_chip_info = { .ops = &ccp_hwmon_ops, .info = ccp_info, }; /* read fan connection status and set labels */ static int get_fan_cnct(struct ccp_device *ccp) { int channel; int mode; int ret; ret = send_usb_cmd(ccp, CTL_GET_FAN_CNCT, 0, 0, 0); if (ret) return ret; for (channel = 0; channel < NUM_FANS; channel++) { mode = ccp->buffer[channel + 1]; if (mode == 0) continue; set_bit(channel, ccp->fan_cnct); ccp->target[channel] = -ENODATA; switch (mode) { case 1: scnprintf(ccp->fan_label[channel], LABEL_LENGTH, "fan%d 3pin", channel + 1); break; case 2: scnprintf(ccp->fan_label[channel], LABEL_LENGTH, "fan%d 4pin", channel + 1); break; default: scnprintf(ccp->fan_label[channel], LABEL_LENGTH, "fan%d other", channel + 1); break; } } return 0; } /* read temp sensor connection status */ static int get_temp_cnct(struct ccp_device *ccp) { int channel; int mode; int ret; ret = send_usb_cmd(ccp, CTL_GET_TMP_CNCT, 0, 0, 0); if (ret) return ret; for (channel = 0; channel < NUM_TEMP_SENSORS; channel++) { mode = ccp->buffer[channel + 1]; if (mode == 0) continue; set_bit(channel, ccp->temp_cnct); } return 0; } /* read firmware version */ static int get_fw_version(struct ccp_device *ccp) { int ret; ret = send_usb_cmd(ccp, CTL_GET_FW_VER, 0, 0, 0); if (ret) { hid_notice(ccp->hdev, "Failed to read firmware version.\n"); return ret; } ccp->firmware_ver[0] = ccp->buffer[1]; ccp->firmware_ver[1] = ccp->buffer[2]; ccp->firmware_ver[2] = ccp->buffer[3]; return 0; } /* read bootloader version */ static int get_bl_version(struct ccp_device *ccp) { int ret; ret = send_usb_cmd(ccp, CTL_GET_BL_VER, 0, 0, 0); if (ret) { hid_notice(ccp->hdev, "Failed to read bootloader version.\n"); return ret; } ccp->bootloader_ver[0] = ccp->buffer[1]; ccp->bootloader_ver[1] = ccp->buffer[2]; return 0; } static int firmware_show(struct seq_file *seqf, void *unused) { struct ccp_device *ccp = seqf->private; seq_printf(seqf, "%d.%d.%d\n", ccp->firmware_ver[0], ccp->firmware_ver[1], ccp->firmware_ver[2]); return 0; } DEFINE_SHOW_ATTRIBUTE(firmware); static int bootloader_show(struct seq_file *seqf, void *unused) { struct ccp_device *ccp = seqf->private; seq_printf(seqf, "%d.%d\n", ccp->bootloader_ver[0], ccp->bootloader_ver[1]); return 0; } DEFINE_SHOW_ATTRIBUTE(bootloader); static void ccp_debugfs_init(struct ccp_device *ccp) { char name[32]; int ret; scnprintf(name, sizeof(name), "corsaircpro-%s", dev_name(&ccp->hdev->dev)); ccp->debugfs = debugfs_create_dir(name, NULL); ret = get_fw_version(ccp); if (!ret) debugfs_create_file("firmware_version", 0444, ccp->debugfs, ccp, &firmware_fops); ret = get_bl_version(ccp); if (!ret) debugfs_create_file("bootloader_version", 0444, ccp->debugfs, ccp, &bootloader_fops); } static int ccp_probe(struct hid_device *hdev, const struct hid_device_id *id) { struct ccp_device *ccp; int ret; ccp = devm_kzalloc(&hdev->dev, sizeof(*ccp), GFP_KERNEL); if (!ccp) return -ENOMEM; ccp->cmd_buffer = devm_kmalloc(&hdev->dev, OUT_BUFFER_SIZE, GFP_KERNEL); if (!ccp->cmd_buffer) return -ENOMEM; ccp->buffer = devm_kmalloc(&hdev->dev, IN_BUFFER_SIZE, GFP_KERNEL); if (!ccp->buffer) return -ENOMEM; ret = hid_parse(hdev); if (ret) return ret; ret = hid_hw_start(hdev, HID_CONNECT_HIDRAW); if (ret) return ret; ret = hid_hw_open(hdev); if (ret) goto out_hw_stop; ccp->hdev = hdev; hid_set_drvdata(hdev, ccp); mutex_init(&ccp->mutex); spin_lock_init(&ccp->wait_input_report_lock); init_completion(&ccp->wait_input_report); hid_device_io_start(hdev); /* temp and fan connection status only updates when device is powered on */ ret = get_temp_cnct(ccp); if (ret) goto out_hw_close; ret = get_fan_cnct(ccp); if (ret) goto out_hw_close; ccp_debugfs_init(ccp); ccp->hwmon_dev = hwmon_device_register_with_info(&hdev->dev, "corsaircpro", ccp, &ccp_chip_info, NULL); if (IS_ERR(ccp->hwmon_dev)) { ret = PTR_ERR(ccp->hwmon_dev); goto out_hw_close; } return 0; out_hw_close: hid_hw_close(hdev); out_hw_stop: hid_hw_stop(hdev); return ret; } static void ccp_remove(struct hid_device *hdev) { struct ccp_device *ccp = hid_get_drvdata(hdev); debugfs_remove_recursive(ccp->debugfs); hwmon_device_unregister(ccp->hwmon_dev); hid_hw_close(hdev); hid_hw_stop(hdev); } static const struct hid_device_id ccp_devices[] = { { HID_USB_DEVICE(USB_VENDOR_ID_CORSAIR, USB_PRODUCT_ID_CORSAIR_COMMANDERPRO) }, { HID_USB_DEVICE(USB_VENDOR_ID_CORSAIR, USB_PRODUCT_ID_CORSAIR_1000D) }, { } }; static struct hid_driver ccp_driver = { .name = "corsair-cpro", .id_table = ccp_devices, .probe = ccp_probe, .remove = ccp_remove, .raw_event = ccp_raw_event, }; MODULE_DEVICE_TABLE(hid, ccp_devices); MODULE_DESCRIPTION("Corsair Commander Pro controller driver"); MODULE_LICENSE("GPL"); static int __init ccp_init(void) { return hid_register_driver(&ccp_driver); } static void __exit ccp_exit(void) { hid_unregister_driver(&ccp_driver); } /* * When compiling this driver as built-in, hwmon initcalls will get called before the * hid driver and this driver would fail to register. late_initcall solves this. */ late_initcall(ccp_init); module_exit(ccp_exit); |
| 34 26 8 6 6 26 27 32 33 32 32 33 17 22 16 8 14 31 32 14 8 29 9 15 6 7 2 28 7 23 30 23 28 31 24 6 31 1 30 12 19 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 | // SPDX-License-Identifier: GPL-2.0-or-later /* mpi-pow.c - MPI functions * Copyright (C) 1994, 1996, 1998, 2000 Free Software Foundation, Inc. * * This file is part of GnuPG. * * Note: This code is heavily based on the GNU MP Library. * Actually it's the same code with only minor changes in the * way the data is stored; this is to support the abstraction * of an optional secure memory allocation which may be used * to avoid revealing of sensitive data due to paging etc. * The GNU MP Library itself is published under the LGPL; * however I decided to publish this code under the plain GPL. */ #include <linux/sched.h> #include <linux/string.h> #include "mpi-internal.h" #include "longlong.h" /**************** * RES = BASE ^ EXP mod MOD */ int mpi_powm(MPI res, MPI base, MPI exp, MPI mod) { mpi_ptr_t mp_marker = NULL, bp_marker = NULL, ep_marker = NULL; struct karatsuba_ctx karactx = {}; mpi_ptr_t xp_marker = NULL; mpi_ptr_t tspace = NULL; mpi_ptr_t rp, ep, mp, bp; mpi_size_t esize, msize, bsize, rsize; int msign, bsign, rsign; mpi_size_t size; int mod_shift_cnt; int negative_result; int assign_rp = 0; mpi_size_t tsize = 0; /* to avoid compiler warning */ /* fixme: we should check that the warning is void */ int rc = -ENOMEM; esize = exp->nlimbs; msize = mod->nlimbs; size = 2 * msize; msign = mod->sign; rp = res->d; ep = exp->d; if (!msize) return -EINVAL; if (!esize) { /* Exponent is zero, result is 1 mod MOD, i.e., 1 or 0 * depending on if MOD equals 1. */ res->nlimbs = (msize == 1 && mod->d[0] == 1) ? 0 : 1; if (res->nlimbs) { if (mpi_resize(res, 1) < 0) goto enomem; rp = res->d; rp[0] = 1; } res->sign = 0; goto leave; } /* Normalize MOD (i.e. make its most significant bit set) as required by * mpn_divrem. This will make the intermediate values in the calculation * slightly larger, but the correct result is obtained after a final * reduction using the original MOD value. */ mp = mp_marker = mpi_alloc_limb_space(msize); if (!mp) goto enomem; mod_shift_cnt = count_leading_zeros(mod->d[msize - 1]); if (mod_shift_cnt) mpihelp_lshift(mp, mod->d, msize, mod_shift_cnt); else MPN_COPY(mp, mod->d, msize); bsize = base->nlimbs; bsign = base->sign; if (bsize > msize) { /* The base is larger than the module. Reduce it. */ /* Allocate (BSIZE + 1) with space for remainder and quotient. * (The quotient is (bsize - msize + 1) limbs.) */ bp = bp_marker = mpi_alloc_limb_space(bsize + 1); if (!bp) goto enomem; MPN_COPY(bp, base->d, bsize); /* We don't care about the quotient, store it above the remainder, * at BP + MSIZE. */ mpihelp_divrem(bp + msize, 0, bp, bsize, mp, msize); bsize = msize; /* Canonicalize the base, since we are going to multiply with it * quite a few times. */ MPN_NORMALIZE(bp, bsize); } else bp = base->d; if (!bsize) { res->nlimbs = 0; res->sign = 0; goto leave; } if (res->alloced < size) { /* We have to allocate more space for RES. If any of the input * parameters are identical to RES, defer deallocation of the old * space. */ if (rp == ep || rp == mp || rp == bp) { rp = mpi_alloc_limb_space(size); if (!rp) goto enomem; assign_rp = 1; } else { if (mpi_resize(res, size) < 0) goto enomem; rp = res->d; } } else { /* Make BASE, EXP and MOD not overlap with RES. */ if (rp == bp) { /* RES and BASE are identical. Allocate temp. space for BASE. */ BUG_ON(bp_marker); bp = bp_marker = mpi_alloc_limb_space(bsize); if (!bp) goto enomem; MPN_COPY(bp, rp, bsize); } if (rp == ep) { /* RES and EXP are identical. Allocate temp. space for EXP. */ ep = ep_marker = mpi_alloc_limb_space(esize); if (!ep) goto enomem; MPN_COPY(ep, rp, esize); } if (rp == mp) { /* RES and MOD are identical. Allocate temporary space for MOD. */ BUG_ON(mp_marker); mp = mp_marker = mpi_alloc_limb_space(msize); if (!mp) goto enomem; MPN_COPY(mp, rp, msize); } } MPN_COPY(rp, bp, bsize); rsize = bsize; rsign = bsign; { mpi_size_t i; mpi_ptr_t xp; int c; mpi_limb_t e; mpi_limb_t carry_limb; xp = xp_marker = mpi_alloc_limb_space(2 * (msize + 1)); if (!xp) goto enomem; negative_result = (ep[0] & 1) && base->sign; i = esize - 1; e = ep[i]; c = count_leading_zeros(e); e = (e << c) << 1; /* shift the exp bits to the left, lose msb */ c = BITS_PER_MPI_LIMB - 1 - c; /* Main loop. * * Make the result be pointed to alternately by XP and RP. This * helps us avoid block copying, which would otherwise be necessary * with the overlap restrictions of mpihelp_divmod. With 50% probability * the result after this loop will be in the area originally pointed * by RP (==RES->d), and with 50% probability in the area originally * pointed to by XP. */ for (;;) { while (c) { mpi_size_t xsize; /*if (mpihelp_mul_n(xp, rp, rp, rsize) < 0) goto enomem */ if (rsize < KARATSUBA_THRESHOLD) mpih_sqr_n_basecase(xp, rp, rsize); else { if (!tspace) { tsize = 2 * rsize; tspace = mpi_alloc_limb_space(tsize); if (!tspace) goto enomem; } else if (tsize < (2 * rsize)) { mpi_free_limb_space(tspace); tsize = 2 * rsize; tspace = mpi_alloc_limb_space(tsize); if (!tspace) goto enomem; } mpih_sqr_n(xp, rp, rsize, tspace); } xsize = 2 * rsize; if (xsize > msize) { mpihelp_divrem(xp + msize, 0, xp, xsize, mp, msize); xsize = msize; } swap(rp, xp); rsize = xsize; if ((mpi_limb_signed_t) e < 0) { /*mpihelp_mul( xp, rp, rsize, bp, bsize ); */ if (bsize < KARATSUBA_THRESHOLD) { mpi_limb_t tmp; if (mpihelp_mul (xp, rp, rsize, bp, bsize, &tmp) < 0) goto enomem; } else { if (mpihelp_mul_karatsuba_case (xp, rp, rsize, bp, bsize, &karactx) < 0) goto enomem; } xsize = rsize + bsize; if (xsize > msize) { mpihelp_divrem(xp + msize, 0, xp, xsize, mp, msize); xsize = msize; } swap(rp, xp); rsize = xsize; } e <<= 1; c--; cond_resched(); } i--; if (i < 0) break; e = ep[i]; c = BITS_PER_MPI_LIMB; } /* We shifted MOD, the modulo reduction argument, left MOD_SHIFT_CNT * steps. Adjust the result by reducing it with the original MOD. * * Also make sure the result is put in RES->d (where it already * might be, see above). */ if (mod_shift_cnt) { carry_limb = mpihelp_lshift(res->d, rp, rsize, mod_shift_cnt); rp = res->d; if (carry_limb) { rp[rsize] = carry_limb; rsize++; } } else { MPN_COPY(res->d, rp, rsize); rp = res->d; } if (rsize >= msize) { mpihelp_divrem(rp + msize, 0, rp, rsize, mp, msize); rsize = msize; } /* Remove any leading zero words from the result. */ if (mod_shift_cnt) mpihelp_rshift(rp, rp, rsize, mod_shift_cnt); MPN_NORMALIZE(rp, rsize); } if (negative_result && rsize) { if (mod_shift_cnt) mpihelp_rshift(mp, mp, msize, mod_shift_cnt); mpihelp_sub(rp, mp, msize, rp, rsize); rsize = msize; rsign = msign; MPN_NORMALIZE(rp, rsize); } res->nlimbs = rsize; res->sign = rsign; leave: rc = 0; enomem: mpihelp_release_karatsuba_ctx(&karactx); if (assign_rp) mpi_assign_limb_space(res, rp, size); if (mp_marker) mpi_free_limb_space(mp_marker); if (bp_marker) mpi_free_limb_space(bp_marker); if (ep_marker) mpi_free_limb_space(ep_marker); if (xp_marker) mpi_free_limb_space(xp_marker); if (tspace) mpi_free_limb_space(tspace); return rc; } EXPORT_SYMBOL_GPL(mpi_powm); |
| 1 6 1 5 5 1 4 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 | // SPDX-License-Identifier: GPL-2.0+ /* * Copyright (C) 2003-2008 Takahiro Hirofuchi * Copyright (C) 2015-2016 Nobuo Iwata */ #include <linux/kthread.h> #include <linux/file.h> #include <linux/net.h> #include <linux/platform_device.h> #include <linux/slab.h> /* Hardening for Spectre-v1 */ #include <linux/nospec.h> #include "usbip_common.h" #include "vhci.h" /* TODO: refine locking ?*/ /* * output example: * hub port sta spd dev sockfd local_busid * hs 0000 004 000 00000000 000003 1-2.3 * ................................................ * ss 0008 004 000 00000000 000004 2-3.4 * ................................................ * * Output includes socket fd instead of socket pointer address to avoid * leaking kernel memory address in: * /sys/devices/platform/vhci_hcd.0/status and in debug output. * The socket pointer address is not used at the moment and it was made * visible as a convenient way to find IP address from socket pointer * address by looking up /proc/net/{tcp,tcp6}. As this opens a security * hole, the change is made to use sockfd instead. * */ static void port_show_vhci(char **out, int hub, int port, struct vhci_device *vdev) { if (hub == HUB_SPEED_HIGH) *out += sprintf(*out, "hs %04u %03u ", port, vdev->ud.status); else /* hub == HUB_SPEED_SUPER */ *out += sprintf(*out, "ss %04u %03u ", port, vdev->ud.status); if (vdev->ud.status == VDEV_ST_USED) { *out += sprintf(*out, "%03u %08x ", vdev->speed, vdev->devid); *out += sprintf(*out, "%06u %s", vdev->ud.sockfd, dev_name(&vdev->udev->dev)); } else { *out += sprintf(*out, "000 00000000 "); *out += sprintf(*out, "000000 0-0"); } *out += sprintf(*out, "\n"); } /* Sysfs entry to show port status */ static ssize_t status_show_vhci(int pdev_nr, char *out) { struct platform_device *pdev = vhcis[pdev_nr].pdev; struct vhci *vhci; struct usb_hcd *hcd; struct vhci_hcd *vhci_hcd; char *s = out; int i; unsigned long flags; if (!pdev || !out) { usbip_dbg_vhci_sysfs("show status error\n"); return 0; } hcd = platform_get_drvdata(pdev); vhci_hcd = hcd_to_vhci_hcd(hcd); vhci = vhci_hcd->vhci; spin_lock_irqsave(&vhci->lock, flags); for (i = 0; i < VHCI_HC_PORTS; i++) { struct vhci_device *vdev = &vhci->vhci_hcd_hs->vdev[i]; spin_lock(&vdev->ud.lock); port_show_vhci(&out, HUB_SPEED_HIGH, pdev_nr * VHCI_PORTS + i, vdev); spin_unlock(&vdev->ud.lock); } for (i = 0; i < VHCI_HC_PORTS; i++) { struct vhci_device *vdev = &vhci->vhci_hcd_ss->vdev[i]; spin_lock(&vdev->ud.lock); port_show_vhci(&out, HUB_SPEED_SUPER, pdev_nr * VHCI_PORTS + VHCI_HC_PORTS + i, vdev); spin_unlock(&vdev->ud.lock); } spin_unlock_irqrestore(&vhci->lock, flags); return out - s; } static ssize_t status_show_not_ready(int pdev_nr, char *out) { char *s = out; int i = 0; for (i = 0; i < VHCI_HC_PORTS; i++) { out += sprintf(out, "hs %04u %03u ", (pdev_nr * VHCI_PORTS) + i, VDEV_ST_NOTASSIGNED); out += sprintf(out, "000 00000000 0000000000000000 0-0"); out += sprintf(out, "\n"); } for (i = 0; i < VHCI_HC_PORTS; i++) { out += sprintf(out, "ss %04u %03u ", (pdev_nr * VHCI_PORTS) + VHCI_HC_PORTS + i, VDEV_ST_NOTASSIGNED); out += sprintf(out, "000 00000000 0000000000000000 0-0"); out += sprintf(out, "\n"); } return out - s; } static int status_name_to_id(const char *name) { char *c; long val; int ret; c = strchr(name, '.'); if (c == NULL) return 0; ret = kstrtol(c+1, 10, &val); if (ret < 0) return ret; return val; } static ssize_t status_show(struct device *dev, struct device_attribute *attr, char *out) { char *s = out; int pdev_nr; out += sprintf(out, "hub port sta spd dev sockfd local_busid\n"); pdev_nr = status_name_to_id(attr->attr.name); if (pdev_nr < 0) out += status_show_not_ready(pdev_nr, out); else out += status_show_vhci(pdev_nr, out); return out - s; } static ssize_t nports_show(struct device *dev, struct device_attribute *attr, char *out) { char *s = out; /* * Half the ports are for SPEED_HIGH and half for SPEED_SUPER, * thus the * 2. */ out += sprintf(out, "%d\n", VHCI_PORTS * vhci_num_controllers); return out - s; } static DEVICE_ATTR_RO(nports); /* Sysfs entry to shutdown a virtual connection */ static int vhci_port_disconnect(struct vhci_hcd *vhci_hcd, __u32 rhport) { struct vhci_device *vdev = &vhci_hcd->vdev[rhport]; struct vhci *vhci = vhci_hcd->vhci; unsigned long flags; usbip_dbg_vhci_sysfs("enter\n"); mutex_lock(&vdev->ud.sysfs_lock); /* lock */ spin_lock_irqsave(&vhci->lock, flags); spin_lock(&vdev->ud.lock); if (vdev->ud.status == VDEV_ST_NULL) { pr_err("not connected %d\n", vdev->ud.status); /* unlock */ spin_unlock(&vdev->ud.lock); spin_unlock_irqrestore(&vhci->lock, flags); mutex_unlock(&vdev->ud.sysfs_lock); return -EINVAL; } /* unlock */ spin_unlock(&vdev->ud.lock); spin_unlock_irqrestore(&vhci->lock, flags); usbip_event_add(&vdev->ud, VDEV_EVENT_DOWN); mutex_unlock(&vdev->ud.sysfs_lock); return 0; } static int valid_port(__u32 *pdev_nr, __u32 *rhport) { if (*pdev_nr >= vhci_num_controllers) { pr_err("pdev %u\n", *pdev_nr); return 0; } *pdev_nr = array_index_nospec(*pdev_nr, vhci_num_controllers); if (*rhport >= VHCI_HC_PORTS) { pr_err("rhport %u\n", *rhport); return 0; } *rhport = array_index_nospec(*rhport, VHCI_HC_PORTS); return 1; } static ssize_t detach_store(struct device *dev, struct device_attribute *attr, const char *buf, size_t count) { __u32 port = 0, pdev_nr = 0, rhport = 0; struct usb_hcd *hcd; struct vhci_hcd *vhci_hcd; int ret; if (kstrtoint(buf, 10, &port) < 0) return -EINVAL; pdev_nr = port_to_pdev_nr(port); rhport = port_to_rhport(port); if (!valid_port(&pdev_nr, &rhport)) return -EINVAL; hcd = platform_get_drvdata(vhcis[pdev_nr].pdev); if (hcd == NULL) { dev_err(dev, "port is not ready %u\n", port); return -EAGAIN; } usbip_dbg_vhci_sysfs("rhport %d\n", rhport); if ((port / VHCI_HC_PORTS) % 2) vhci_hcd = hcd_to_vhci_hcd(hcd)->vhci->vhci_hcd_ss; else vhci_hcd = hcd_to_vhci_hcd(hcd)->vhci->vhci_hcd_hs; ret = vhci_port_disconnect(vhci_hcd, rhport); if (ret < 0) return -EINVAL; usbip_dbg_vhci_sysfs("Leave\n"); return count; } static DEVICE_ATTR_WO(detach); static int valid_args(__u32 *pdev_nr, __u32 *rhport, enum usb_device_speed speed) { if (!valid_port(pdev_nr, rhport)) { return 0; } switch (speed) { case USB_SPEED_LOW: case USB_SPEED_FULL: case USB_SPEED_HIGH: case USB_SPEED_WIRELESS: case USB_SPEED_SUPER: case USB_SPEED_SUPER_PLUS: break; default: pr_err("Failed attach request for unsupported USB speed: %s\n", usb_speed_string(speed)); return 0; } return 1; } /* Sysfs entry to establish a virtual connection */ /* * To start a new USB/IP attachment, a userland program needs to setup a TCP * connection and then write its socket descriptor with remote device * information into this sysfs file. * * A remote device is virtually attached to the root-hub port of @rhport with * @speed. @devid is embedded into a request to specify the remote device in a * server host. * * write() returns 0 on success, else negative errno. */ static ssize_t attach_store(struct device *dev, struct device_attribute *attr, const char *buf, size_t count) { struct socket *socket; int sockfd = 0; __u32 port = 0, pdev_nr = 0, rhport = 0, devid = 0, speed = 0; struct usb_hcd *hcd; struct vhci_hcd *vhci_hcd; struct vhci_device *vdev; struct vhci *vhci; int err; unsigned long flags; struct task_struct *tcp_rx = NULL; struct task_struct *tcp_tx = NULL; /* * @rhport: port number of vhci_hcd * @sockfd: socket descriptor of an established TCP connection * @devid: unique device identifier in a remote host * @speed: usb device speed in a remote host */ if (sscanf(buf, "%u %u %u %u", &port, &sockfd, &devid, &speed) != 4) return -EINVAL; pdev_nr = port_to_pdev_nr(port); rhport = port_to_rhport(port); usbip_dbg_vhci_sysfs("port(%u) pdev(%d) rhport(%u)\n", port, pdev_nr, rhport); usbip_dbg_vhci_sysfs("sockfd(%u) devid(%u) speed(%u)\n", sockfd, devid, speed); /* check received parameters */ if (!valid_args(&pdev_nr, &rhport, speed)) return -EINVAL; hcd = platform_get_drvdata(vhcis[pdev_nr].pdev); if (hcd == NULL) { dev_err(dev, "port %d is not ready\n", port); return -EAGAIN; } vhci_hcd = hcd_to_vhci_hcd(hcd); vhci = vhci_hcd->vhci; if (speed >= USB_SPEED_SUPER) vdev = &vhci->vhci_hcd_ss->vdev[rhport]; else vdev = &vhci->vhci_hcd_hs->vdev[rhport]; mutex_lock(&vdev->ud.sysfs_lock); /* Extract socket from fd. */ socket = sockfd_lookup(sockfd, &err); if (!socket) { dev_err(dev, "failed to lookup sock"); err = -EINVAL; goto unlock_mutex; } if (socket->type != SOCK_STREAM) { dev_err(dev, "Expecting SOCK_STREAM - found %d", socket->type); sockfd_put(socket); err = -EINVAL; goto unlock_mutex; } /* create threads before locking */ tcp_rx = kthread_create(vhci_rx_loop, &vdev->ud, "vhci_rx"); if (IS_ERR(tcp_rx)) { sockfd_put(socket); err = -EINVAL; goto unlock_mutex; } tcp_tx = kthread_create(vhci_tx_loop, &vdev->ud, "vhci_tx"); if (IS_ERR(tcp_tx)) { kthread_stop(tcp_rx); sockfd_put(socket); err = -EINVAL; goto unlock_mutex; } /* get task structs now */ get_task_struct(tcp_rx); get_task_struct(tcp_tx); /* now begin lock until setting vdev status set */ spin_lock_irqsave(&vhci->lock, flags); spin_lock(&vdev->ud.lock); if (vdev->ud.status != VDEV_ST_NULL) { /* end of the lock */ spin_unlock(&vdev->ud.lock); spin_unlock_irqrestore(&vhci->lock, flags); sockfd_put(socket); kthread_stop_put(tcp_rx); kthread_stop_put(tcp_tx); dev_err(dev, "port %d already used\n", rhport); /* * Will be retried from userspace * if there's another free port. */ err = -EBUSY; goto unlock_mutex; } dev_info(dev, "pdev(%u) rhport(%u) sockfd(%d)\n", pdev_nr, rhport, sockfd); dev_info(dev, "devid(%u) speed(%u) speed_str(%s)\n", devid, speed, usb_speed_string(speed)); vdev->devid = devid; vdev->speed = speed; vdev->ud.sockfd = sockfd; vdev->ud.tcp_socket = socket; vdev->ud.tcp_rx = tcp_rx; vdev->ud.tcp_tx = tcp_tx; vdev->ud.status = VDEV_ST_NOTASSIGNED; usbip_kcov_handle_init(&vdev->ud); spin_unlock(&vdev->ud.lock); spin_unlock_irqrestore(&vhci->lock, flags); /* end the lock */ wake_up_process(vdev->ud.tcp_rx); wake_up_process(vdev->ud.tcp_tx); rh_port_connect(vdev, speed); dev_info(dev, "Device attached\n"); mutex_unlock(&vdev->ud.sysfs_lock); return count; unlock_mutex: mutex_unlock(&vdev->ud.sysfs_lock); return err; } static DEVICE_ATTR_WO(attach); #define MAX_STATUS_NAME 16 struct status_attr { struct device_attribute attr; char name[MAX_STATUS_NAME+1]; }; static struct status_attr *status_attrs; static void set_status_attr(int id) { struct status_attr *status; status = status_attrs + id; if (id == 0) strcpy(status->name, "status"); else snprintf(status->name, MAX_STATUS_NAME+1, "status.%d", id); status->attr.attr.name = status->name; status->attr.attr.mode = S_IRUGO; status->attr.show = status_show; sysfs_attr_init(&status->attr.attr); } static int init_status_attrs(void) { int id; status_attrs = kcalloc(vhci_num_controllers, sizeof(struct status_attr), GFP_KERNEL); if (status_attrs == NULL) return -ENOMEM; for (id = 0; id < vhci_num_controllers; id++) set_status_attr(id); return 0; } static void finish_status_attrs(void) { kfree(status_attrs); } struct attribute_group vhci_attr_group = { .attrs = NULL, }; int vhci_init_attr_group(void) { struct attribute **attrs; int ret, i; attrs = kcalloc((vhci_num_controllers + 5), sizeof(struct attribute *), GFP_KERNEL); if (attrs == NULL) return -ENOMEM; ret = init_status_attrs(); if (ret) { kfree(attrs); return ret; } *attrs = &dev_attr_nports.attr; *(attrs + 1) = &dev_attr_detach.attr; *(attrs + 2) = &dev_attr_attach.attr; *(attrs + 3) = &dev_attr_usbip_debug.attr; for (i = 0; i < vhci_num_controllers; i++) *(attrs + i + 4) = &((status_attrs + i)->attr.attr); vhci_attr_group.attrs = attrs; return 0; } void vhci_finish_attr_group(void) { finish_status_attrs(); kfree(vhci_attr_group.attrs); } |
| 1 1 1 1 52 47 13 48 1 2 48 1 1 1 31 37 1 49 45 18 1 2 44 17 5 8 8 5 7 1 6 6 69 70 12 29 29 29 29 23 23 28 29 4 3 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 | // SPDX-License-Identifier: GPL-2.0-or-later /* * ALSA sequencer Memory Manager * Copyright (c) 1998 by Frank van de Pol <fvdpol@coil.demon.nl> * Jaroslav Kysela <perex@perex.cz> * 2000 by Takashi Iwai <tiwai@suse.de> */ #include <linux/init.h> #include <linux/export.h> #include <linux/slab.h> #include <linux/sched/signal.h> #include <linux/mm.h> #include <sound/core.h> #include <sound/seq_kernel.h> #include "seq_memory.h" #include "seq_queue.h" #include "seq_info.h" #include "seq_lock.h" static inline int snd_seq_pool_available(struct snd_seq_pool *pool) { return pool->total_elements - atomic_read(&pool->counter); } static inline int snd_seq_output_ok(struct snd_seq_pool *pool) { return snd_seq_pool_available(pool) >= pool->room; } /* * Variable length event: * The event like sysex uses variable length type. * The external data may be stored in three different formats. * 1) kernel space * This is the normal case. * ext.data.len = length * ext.data.ptr = buffer pointer * 2) user space * When an event is generated via read(), the external data is * kept in user space until expanded. * ext.data.len = length | SNDRV_SEQ_EXT_USRPTR * ext.data.ptr = userspace pointer * 3) chained cells * When the variable length event is enqueued (in prioq or fifo), * the external data is decomposed to several cells. * ext.data.len = length | SNDRV_SEQ_EXT_CHAINED * ext.data.ptr = the additiona cell head * -> cell.next -> cell.next -> .. */ /* * exported: * call dump function to expand external data. */ static int get_var_len(const struct snd_seq_event *event) { if ((event->flags & SNDRV_SEQ_EVENT_LENGTH_MASK) != SNDRV_SEQ_EVENT_LENGTH_VARIABLE) return -EINVAL; return event->data.ext.len & ~SNDRV_SEQ_EXT_MASK; } static int dump_var_event(const struct snd_seq_event *event, snd_seq_dump_func_t func, void *private_data, int offset, int maxlen) { int len, err; struct snd_seq_event_cell *cell; len = get_var_len(event); if (len <= 0) return len; if (len <= offset) return 0; if (maxlen && len > offset + maxlen) len = offset + maxlen; if (event->data.ext.len & SNDRV_SEQ_EXT_USRPTR) { char buf[32]; char __user *curptr = (char __force __user *)event->data.ext.ptr; curptr += offset; len -= offset; while (len > 0) { int size = sizeof(buf); if (len < size) size = len; if (copy_from_user(buf, curptr, size)) return -EFAULT; err = func(private_data, buf, size); if (err < 0) return err; curptr += size; len -= size; } return 0; } if (!(event->data.ext.len & SNDRV_SEQ_EXT_CHAINED)) return func(private_data, event->data.ext.ptr + offset, len - offset); cell = (struct snd_seq_event_cell *)event->data.ext.ptr; for (; len > 0 && cell; cell = cell->next) { int size = sizeof(struct snd_seq_event); char *curptr = (char *)&cell->event; if (offset >= size) { offset -= size; len -= size; continue; } if (len < size) size = len; err = func(private_data, curptr + offset, size - offset); if (err < 0) return err; offset = 0; len -= size; } return 0; } int snd_seq_dump_var_event(const struct snd_seq_event *event, snd_seq_dump_func_t func, void *private_data) { return dump_var_event(event, func, private_data, 0, 0); } EXPORT_SYMBOL(snd_seq_dump_var_event); /* * exported: * expand the variable length event to linear buffer space. */ static int seq_copy_in_kernel(void *ptr, void *src, int size) { char **bufptr = ptr; memcpy(*bufptr, src, size); *bufptr += size; return 0; } static int seq_copy_in_user(void *ptr, void *src, int size) { char __user **bufptr = ptr; if (copy_to_user(*bufptr, src, size)) return -EFAULT; *bufptr += size; return 0; } static int expand_var_event(const struct snd_seq_event *event, int offset, int size, char *buf, bool in_kernel) { if (event->data.ext.len & SNDRV_SEQ_EXT_USRPTR) { if (! in_kernel) return -EINVAL; if (copy_from_user(buf, (char __force __user *)event->data.ext.ptr + offset, size)) return -EFAULT; return 0; } return dump_var_event(event, in_kernel ? seq_copy_in_kernel : seq_copy_in_user, &buf, offset, size); } int snd_seq_expand_var_event(const struct snd_seq_event *event, int count, char *buf, int in_kernel, int size_aligned) { int len, newlen, err; len = get_var_len(event); if (len < 0) return len; newlen = len; if (size_aligned > 0) newlen = roundup(len, size_aligned); if (count < newlen) return -EAGAIN; err = expand_var_event(event, 0, len, buf, in_kernel); if (err < 0) return err; if (len != newlen) { if (in_kernel) memset(buf + len, 0, newlen - len); else if (clear_user((__force void __user *)buf + len, newlen - len)) return -EFAULT; } return newlen; } EXPORT_SYMBOL(snd_seq_expand_var_event); int snd_seq_expand_var_event_at(const struct snd_seq_event *event, int count, char *buf, int offset) { int len, err; len = get_var_len(event); if (len < 0) return len; if (len <= offset) return 0; len -= offset; if (len > count) len = count; err = expand_var_event(event, offset, count, buf, true); if (err < 0) return err; return len; } EXPORT_SYMBOL_GPL(snd_seq_expand_var_event_at); /* * release this cell, free extended data if available */ static inline void free_cell(struct snd_seq_pool *pool, struct snd_seq_event_cell *cell) { cell->next = pool->free; pool->free = cell; atomic_dec(&pool->counter); } void snd_seq_cell_free(struct snd_seq_event_cell * cell) { struct snd_seq_pool *pool; if (snd_BUG_ON(!cell)) return; pool = cell->pool; if (snd_BUG_ON(!pool)) return; guard(spinlock_irqsave)(&pool->lock); free_cell(pool, cell); if (snd_seq_ev_is_variable(&cell->event)) { if (cell->event.data.ext.len & SNDRV_SEQ_EXT_CHAINED) { struct snd_seq_event_cell *curp, *nextptr; curp = cell->event.data.ext.ptr; for (; curp; curp = nextptr) { nextptr = curp->next; curp->next = pool->free; free_cell(pool, curp); } } } if (waitqueue_active(&pool->output_sleep)) { /* has enough space now? */ if (snd_seq_output_ok(pool)) wake_up(&pool->output_sleep); } } /* * allocate an event cell. */ static int snd_seq_cell_alloc(struct snd_seq_pool *pool, struct snd_seq_event_cell **cellp, int nonblock, struct file *file, struct mutex *mutexp) { struct snd_seq_event_cell *cell; unsigned long flags; int err = -EAGAIN; wait_queue_entry_t wait; if (pool == NULL) return -EINVAL; *cellp = NULL; init_waitqueue_entry(&wait, current); spin_lock_irqsave(&pool->lock, flags); if (pool->ptr == NULL) { /* not initialized */ pr_debug("ALSA: seq: pool is not initialized\n"); err = -EINVAL; goto __error; } while (pool->free == NULL && ! nonblock && ! pool->closing) { set_current_state(TASK_INTERRUPTIBLE); add_wait_queue(&pool->output_sleep, &wait); spin_unlock_irqrestore(&pool->lock, flags); if (mutexp) mutex_unlock(mutexp); schedule(); if (mutexp) mutex_lock(mutexp); spin_lock_irqsave(&pool->lock, flags); remove_wait_queue(&pool->output_sleep, &wait); /* interrupted? */ if (signal_pending(current)) { err = -ERESTARTSYS; goto __error; } } if (pool->closing) { /* closing.. */ err = -ENOMEM; goto __error; } cell = pool->free; if (cell) { int used; pool->free = cell->next; atomic_inc(&pool->counter); used = atomic_read(&pool->counter); if (pool->max_used < used) pool->max_used = used; pool->event_alloc_success++; /* clear cell pointers */ cell->next = NULL; err = 0; } else pool->event_alloc_failures++; *cellp = cell; __error: spin_unlock_irqrestore(&pool->lock, flags); return err; } /* * duplicate the event to a cell. * if the event has external data, the data is decomposed to additional * cells. */ int snd_seq_event_dup(struct snd_seq_pool *pool, struct snd_seq_event *event, struct snd_seq_event_cell **cellp, int nonblock, struct file *file, struct mutex *mutexp) { int ncells, err; unsigned int extlen; struct snd_seq_event_cell *cell; int size; *cellp = NULL; ncells = 0; extlen = 0; if (snd_seq_ev_is_variable(event)) { extlen = event->data.ext.len & ~SNDRV_SEQ_EXT_MASK; ncells = DIV_ROUND_UP(extlen, sizeof(struct snd_seq_event)); } if (ncells >= pool->total_elements) return -ENOMEM; err = snd_seq_cell_alloc(pool, &cell, nonblock, file, mutexp); if (err < 0) return err; /* copy the event */ size = snd_seq_event_packet_size(event); memcpy(&cell->ump, event, size); #if IS_ENABLED(CONFIG_SND_SEQ_UMP) if (size < sizeof(cell->event)) cell->ump.raw.extra = 0; #endif /* decompose */ if (snd_seq_ev_is_variable(event)) { int len = extlen; int is_chained = event->data.ext.len & SNDRV_SEQ_EXT_CHAINED; int is_usrptr = event->data.ext.len & SNDRV_SEQ_EXT_USRPTR; struct snd_seq_event_cell *src, *tmp, *tail; char *buf; cell->event.data.ext.len = extlen | SNDRV_SEQ_EXT_CHAINED; cell->event.data.ext.ptr = NULL; src = (struct snd_seq_event_cell *)event->data.ext.ptr; buf = (char *)event->data.ext.ptr; tail = NULL; while (ncells-- > 0) { size = sizeof(struct snd_seq_event); if (len < size) size = len; err = snd_seq_cell_alloc(pool, &tmp, nonblock, file, mutexp); if (err < 0) goto __error; if (cell->event.data.ext.ptr == NULL) cell->event.data.ext.ptr = tmp; if (tail) tail->next = tmp; tail = tmp; /* copy chunk */ if (is_chained && src) { tmp->event = src->event; src = src->next; } else if (is_usrptr) { if (copy_from_user(&tmp->event, (char __force __user *)buf, size)) { err = -EFAULT; goto __error; } } else { memcpy(&tmp->event, buf, size); } buf += size; len -= size; } } *cellp = cell; return 0; __error: snd_seq_cell_free(cell); return err; } /* poll wait */ int snd_seq_pool_poll_wait(struct snd_seq_pool *pool, struct file *file, poll_table *wait) { poll_wait(file, &pool->output_sleep, wait); guard(spinlock_irq)(&pool->lock); return snd_seq_output_ok(pool); } /* allocate room specified number of events */ int snd_seq_pool_init(struct snd_seq_pool *pool) { int cell; struct snd_seq_event_cell *cellptr; if (snd_BUG_ON(!pool)) return -EINVAL; cellptr = kvmalloc_array(pool->size, sizeof(struct snd_seq_event_cell), GFP_KERNEL); if (!cellptr) return -ENOMEM; /* add new cells to the free cell list */ guard(spinlock_irq)(&pool->lock); if (pool->ptr) { kvfree(cellptr); return 0; } pool->ptr = cellptr; pool->free = NULL; for (cell = 0; cell < pool->size; cell++) { cellptr = pool->ptr + cell; cellptr->pool = pool; cellptr->next = pool->free; pool->free = cellptr; } pool->room = (pool->size + 1) / 2; /* init statistics */ pool->max_used = 0; pool->total_elements = pool->size; return 0; } /* refuse the further insertion to the pool */ void snd_seq_pool_mark_closing(struct snd_seq_pool *pool) { if (snd_BUG_ON(!pool)) return; guard(spinlock_irqsave)(&pool->lock); pool->closing = 1; } /* remove events */ int snd_seq_pool_done(struct snd_seq_pool *pool) { struct snd_seq_event_cell *ptr; if (snd_BUG_ON(!pool)) return -EINVAL; /* wait for closing all threads */ if (waitqueue_active(&pool->output_sleep)) wake_up(&pool->output_sleep); while (atomic_read(&pool->counter) > 0) schedule_timeout_uninterruptible(1); /* release all resources */ scoped_guard(spinlock_irq, &pool->lock) { ptr = pool->ptr; pool->ptr = NULL; pool->free = NULL; pool->total_elements = 0; } kvfree(ptr); guard(spinlock_irq)(&pool->lock); pool->closing = 0; return 0; } /* init new memory pool */ struct snd_seq_pool *snd_seq_pool_new(int poolsize) { struct snd_seq_pool *pool; /* create pool block */ pool = kzalloc(sizeof(*pool), GFP_KERNEL); if (!pool) return NULL; spin_lock_init(&pool->lock); pool->ptr = NULL; pool->free = NULL; pool->total_elements = 0; atomic_set(&pool->counter, 0); pool->closing = 0; init_waitqueue_head(&pool->output_sleep); pool->size = poolsize; /* init statistics */ pool->max_used = 0; return pool; } /* remove memory pool */ int snd_seq_pool_delete(struct snd_seq_pool **ppool) { struct snd_seq_pool *pool = *ppool; *ppool = NULL; if (pool == NULL) return 0; snd_seq_pool_mark_closing(pool); snd_seq_pool_done(pool); kfree(pool); return 0; } /* exported to seq_clientmgr.c */ void snd_seq_info_pool(struct snd_info_buffer *buffer, struct snd_seq_pool *pool, char *space) { if (pool == NULL) return; snd_iprintf(buffer, "%sPool size : %d\n", space, pool->total_elements); snd_iprintf(buffer, "%sCells in use : %d\n", space, atomic_read(&pool->counter)); snd_iprintf(buffer, "%sPeak cells in use : %d\n", space, pool->max_used); snd_iprintf(buffer, "%sAlloc success : %d\n", space, pool->event_alloc_success); snd_iprintf(buffer, "%sAlloc failures : %d\n", space, pool->event_alloc_failures); } |
| 3 1 1 1 1 1 1 1 1 1 2 7 1 6 3 1 2 2 1 2 3 3 1 2 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 | // SPDX-License-Identifier: GPL-2.0-only /* * VFIO-KVM bridge pseudo device * * Copyright (C) 2013 Red Hat, Inc. All rights reserved. * Author: Alex Williamson <alex.williamson@redhat.com> */ #include <linux/errno.h> #include <linux/file.h> #include <linux/kvm_host.h> #include <linux/list.h> #include <linux/module.h> #include <linux/mutex.h> #include <linux/slab.h> #include <linux/uaccess.h> #include <linux/vfio.h> #include "vfio.h" #ifdef CONFIG_SPAPR_TCE_IOMMU #include <asm/kvm_ppc.h> #endif struct kvm_vfio_file { struct list_head node; struct file *file; #ifdef CONFIG_SPAPR_TCE_IOMMU struct iommu_group *iommu_group; #endif }; struct kvm_vfio { struct list_head file_list; struct mutex lock; bool noncoherent; }; static void kvm_vfio_file_set_kvm(struct file *file, struct kvm *kvm) { void (*fn)(struct file *file, struct kvm *kvm); fn = symbol_get(vfio_file_set_kvm); if (!fn) return; fn(file, kvm); symbol_put(vfio_file_set_kvm); } static bool kvm_vfio_file_enforced_coherent(struct file *file) { bool (*fn)(struct file *file); bool ret; fn = symbol_get(vfio_file_enforced_coherent); if (!fn) return false; ret = fn(file); symbol_put(vfio_file_enforced_coherent); return ret; } static bool kvm_vfio_file_is_valid(struct file *file) { bool (*fn)(struct file *file); bool ret; fn = symbol_get(vfio_file_is_valid); if (!fn) return false; ret = fn(file); symbol_put(vfio_file_is_valid); return ret; } #ifdef CONFIG_SPAPR_TCE_IOMMU static struct iommu_group *kvm_vfio_file_iommu_group(struct file *file) { struct iommu_group *(*fn)(struct file *file); struct iommu_group *ret; fn = symbol_get(vfio_file_iommu_group); if (!fn) return NULL; ret = fn(file); symbol_put(vfio_file_iommu_group); return ret; } static void kvm_spapr_tce_release_vfio_group(struct kvm *kvm, struct kvm_vfio_file *kvf) { if (WARN_ON_ONCE(!kvf->iommu_group)) return; kvm_spapr_tce_release_iommu_group(kvm, kvf->iommu_group); iommu_group_put(kvf->iommu_group); kvf->iommu_group = NULL; } #endif /* * Groups/devices can use the same or different IOMMU domains. If the same * then adding a new group/device may change the coherency of groups/devices * we've previously been told about. We don't want to care about any of * that so we retest each group/device and bail as soon as we find one that's * noncoherent. This means we only ever [un]register_noncoherent_dma once * for the whole device. */ static void kvm_vfio_update_coherency(struct kvm_device *dev) { struct kvm_vfio *kv = dev->private; bool noncoherent = false; struct kvm_vfio_file *kvf; list_for_each_entry(kvf, &kv->file_list, node) { if (!kvm_vfio_file_enforced_coherent(kvf->file)) { noncoherent = true; break; } } if (noncoherent != kv->noncoherent) { kv->noncoherent = noncoherent; if (kv->noncoherent) kvm_arch_register_noncoherent_dma(dev->kvm); else kvm_arch_unregister_noncoherent_dma(dev->kvm); } } static int kvm_vfio_file_add(struct kvm_device *dev, unsigned int fd) { struct kvm_vfio *kv = dev->private; struct kvm_vfio_file *kvf; struct file *filp; int ret = 0; filp = fget(fd); if (!filp) return -EBADF; /* Ensure the FD is a vfio FD. */ if (!kvm_vfio_file_is_valid(filp)) { ret = -EINVAL; goto out_fput; } mutex_lock(&kv->lock); list_for_each_entry(kvf, &kv->file_list, node) { if (kvf->file == filp) { ret = -EEXIST; goto out_unlock; } } kvf = kzalloc(sizeof(*kvf), GFP_KERNEL_ACCOUNT); if (!kvf) { ret = -ENOMEM; goto out_unlock; } kvf->file = get_file(filp); list_add_tail(&kvf->node, &kv->file_list); kvm_arch_start_assignment(dev->kvm); kvm_vfio_file_set_kvm(kvf->file, dev->kvm); kvm_vfio_update_coherency(dev); out_unlock: mutex_unlock(&kv->lock); out_fput: fput(filp); return ret; } static int kvm_vfio_file_del(struct kvm_device *dev, unsigned int fd) { struct kvm_vfio *kv = dev->private; struct kvm_vfio_file *kvf; CLASS(fd, f)(fd); int ret; if (fd_empty(f)) return -EBADF; ret = -ENOENT; mutex_lock(&kv->lock); list_for_each_entry(kvf, &kv->file_list, node) { if (kvf->file != fd_file(f)) continue; list_del(&kvf->node); kvm_arch_end_assignment(dev->kvm); #ifdef CONFIG_SPAPR_TCE_IOMMU kvm_spapr_tce_release_vfio_group(dev->kvm, kvf); #endif kvm_vfio_file_set_kvm(kvf->file, NULL); fput(kvf->file); kfree(kvf); ret = 0; break; } kvm_vfio_update_coherency(dev); mutex_unlock(&kv->lock); return ret; } #ifdef CONFIG_SPAPR_TCE_IOMMU static int kvm_vfio_file_set_spapr_tce(struct kvm_device *dev, void __user *arg) { struct kvm_vfio_spapr_tce param; struct kvm_vfio *kv = dev->private; struct kvm_vfio_file *kvf; int ret; if (copy_from_user(¶m, arg, sizeof(struct kvm_vfio_spapr_tce))) return -EFAULT; CLASS(fd, f)(param.groupfd); if (fd_empty(f)) return -EBADF; ret = -ENOENT; mutex_lock(&kv->lock); list_for_each_entry(kvf, &kv->file_list, node) { if (kvf->file != fd_file(f)) continue; if (!kvf->iommu_group) { kvf->iommu_group = kvm_vfio_file_iommu_group(kvf->file); if (WARN_ON_ONCE(!kvf->iommu_group)) { ret = -EIO; goto err_fdput; } } ret = kvm_spapr_tce_attach_iommu_group(dev->kvm, param.tablefd, kvf->iommu_group); break; } err_fdput: mutex_unlock(&kv->lock); return ret; } #endif static int kvm_vfio_set_file(struct kvm_device *dev, long attr, void __user *arg) { int32_t __user *argp = arg; int32_t fd; switch (attr) { case KVM_DEV_VFIO_FILE_ADD: if (get_user(fd, argp)) return -EFAULT; return kvm_vfio_file_add(dev, fd); case KVM_DEV_VFIO_FILE_DEL: if (get_user(fd, argp)) return -EFAULT; return kvm_vfio_file_del(dev, fd); #ifdef CONFIG_SPAPR_TCE_IOMMU case KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE: return kvm_vfio_file_set_spapr_tce(dev, arg); #endif } return -ENXIO; } static int kvm_vfio_set_attr(struct kvm_device *dev, struct kvm_device_attr *attr) { switch (attr->group) { case KVM_DEV_VFIO_FILE: return kvm_vfio_set_file(dev, attr->attr, u64_to_user_ptr(attr->addr)); } return -ENXIO; } static int kvm_vfio_has_attr(struct kvm_device *dev, struct kvm_device_attr *attr) { switch (attr->group) { case KVM_DEV_VFIO_FILE: switch (attr->attr) { case KVM_DEV_VFIO_FILE_ADD: case KVM_DEV_VFIO_FILE_DEL: #ifdef CONFIG_SPAPR_TCE_IOMMU case KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE: #endif return 0; } break; } return -ENXIO; } static void kvm_vfio_release(struct kvm_device *dev) { struct kvm_vfio *kv = dev->private; struct kvm_vfio_file *kvf, *tmp; list_for_each_entry_safe(kvf, tmp, &kv->file_list, node) { #ifdef CONFIG_SPAPR_TCE_IOMMU kvm_spapr_tce_release_vfio_group(dev->kvm, kvf); #endif kvm_vfio_file_set_kvm(kvf->file, NULL); fput(kvf->file); list_del(&kvf->node); kfree(kvf); kvm_arch_end_assignment(dev->kvm); } kvm_vfio_update_coherency(dev); kfree(kv); kfree(dev); /* alloc by kvm_ioctl_create_device, free by .release */ } static int kvm_vfio_create(struct kvm_device *dev, u32 type); static const struct kvm_device_ops kvm_vfio_ops = { .name = "kvm-vfio", .create = kvm_vfio_create, .release = kvm_vfio_release, .set_attr = kvm_vfio_set_attr, .has_attr = kvm_vfio_has_attr, }; static int kvm_vfio_create(struct kvm_device *dev, u32 type) { struct kvm_device *tmp; struct kvm_vfio *kv; lockdep_assert_held(&dev->kvm->lock); /* Only one VFIO "device" per VM */ list_for_each_entry(tmp, &dev->kvm->devices, vm_node) if (tmp->ops == &kvm_vfio_ops) return -EBUSY; kv = kzalloc(sizeof(*kv), GFP_KERNEL_ACCOUNT); if (!kv) return -ENOMEM; INIT_LIST_HEAD(&kv->file_list); mutex_init(&kv->lock); dev->private = kv; return 0; } int kvm_vfio_ops_init(void) { return kvm_register_device_ops(&kvm_vfio_ops, KVM_DEV_TYPE_VFIO); } void kvm_vfio_ops_exit(void) { kvm_unregister_device_ops(KVM_DEV_TYPE_VFIO); } |
| 1 90 89 3 3 3 3 2 90 89 1 14 73 93 1 94 1 92 85 75 96 96 2 5 15 16 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 | /* * Copyright (c) 2017 Mellanox Technologies Inc. All rights reserved. * Copyright (c) 2010 Voltaire Inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #define pr_fmt(fmt) "%s:%s: " fmt, KBUILD_MODNAME, __func__ #include <linux/export.h> #include <net/netlink.h> #include <net/net_namespace.h> #include <net/netns/generic.h> #include <net/sock.h> #include <rdma/rdma_netlink.h> #include <linux/module.h> #include "core_priv.h" static struct { const struct rdma_nl_cbs *cb_table; /* Synchronizes between ongoing netlink commands and netlink client * unregistration. */ struct rw_semaphore sem; } rdma_nl_types[RDMA_NL_NUM_CLIENTS]; bool rdma_nl_chk_listeners(unsigned int group) { struct rdma_dev_net *rnet = rdma_net_to_dev_net(&init_net); return netlink_has_listeners(rnet->nl_sock, group); } EXPORT_SYMBOL(rdma_nl_chk_listeners); static bool is_nl_msg_valid(unsigned int type, unsigned int op) { static const unsigned int max_num_ops[RDMA_NL_NUM_CLIENTS] = { [RDMA_NL_IWCM] = RDMA_NL_IWPM_NUM_OPS, [RDMA_NL_LS] = RDMA_NL_LS_NUM_OPS, [RDMA_NL_NLDEV] = RDMA_NLDEV_NUM_OPS, }; /* * This BUILD_BUG_ON is intended to catch addition of new * RDMA netlink protocol without updating the array above. */ BUILD_BUG_ON(RDMA_NL_NUM_CLIENTS != 6); if (type >= RDMA_NL_NUM_CLIENTS) return false; return op < max_num_ops[type]; } static const struct rdma_nl_cbs * get_cb_table(const struct sk_buff *skb, unsigned int type, unsigned int op) { const struct rdma_nl_cbs *cb_table; /* * Currently only NLDEV client is supporting netlink commands in * non init_net net namespace. */ if (sock_net(skb->sk) != &init_net && type != RDMA_NL_NLDEV) return NULL; cb_table = READ_ONCE(rdma_nl_types[type].cb_table); if (!cb_table) { /* * Didn't get valid reference of the table, attempt module * load once. */ up_read(&rdma_nl_types[type].sem); request_module("rdma-netlink-subsys-%u", type); down_read(&rdma_nl_types[type].sem); cb_table = READ_ONCE(rdma_nl_types[type].cb_table); } if (!cb_table || (!cb_table[op].dump && !cb_table[op].doit)) return NULL; return cb_table; } void rdma_nl_register(unsigned int index, const struct rdma_nl_cbs cb_table[]) { if (WARN_ON(!is_nl_msg_valid(index, 0)) || WARN_ON(READ_ONCE(rdma_nl_types[index].cb_table))) return; /* Pairs with the READ_ONCE in is_nl_valid() */ smp_store_release(&rdma_nl_types[index].cb_table, cb_table); } EXPORT_SYMBOL(rdma_nl_register); void rdma_nl_unregister(unsigned int index) { down_write(&rdma_nl_types[index].sem); rdma_nl_types[index].cb_table = NULL; up_write(&rdma_nl_types[index].sem); } EXPORT_SYMBOL(rdma_nl_unregister); void *ibnl_put_msg(struct sk_buff *skb, struct nlmsghdr **nlh, int seq, int len, int client, int op, int flags) { *nlh = nlmsg_put(skb, 0, seq, RDMA_NL_GET_TYPE(client, op), len, flags); if (!*nlh) return NULL; return nlmsg_data(*nlh); } EXPORT_SYMBOL(ibnl_put_msg); int ibnl_put_attr(struct sk_buff *skb, struct nlmsghdr *nlh, int len, void *data, int type) { if (nla_put(skb, type, len, data)) { nlmsg_cancel(skb, nlh); return -EMSGSIZE; } return 0; } EXPORT_SYMBOL(ibnl_put_attr); static int rdma_nl_rcv_msg(struct sk_buff *skb, struct nlmsghdr *nlh, struct netlink_ext_ack *extack) { int type = nlh->nlmsg_type; unsigned int index = RDMA_NL_GET_CLIENT(type); unsigned int op = RDMA_NL_GET_OP(type); const struct rdma_nl_cbs *cb_table; int err = -EINVAL; if (!is_nl_msg_valid(index, op)) return -EINVAL; down_read(&rdma_nl_types[index].sem); cb_table = get_cb_table(skb, index, op); if (!cb_table) goto done; if ((cb_table[op].flags & RDMA_NL_ADMIN_PERM) && !netlink_capable(skb, CAP_NET_ADMIN)) { err = -EPERM; goto done; } /* * LS responses overload the 0x100 (NLM_F_ROOT) flag. Don't * mistakenly call the .dump() function. */ if (index == RDMA_NL_LS) { if (cb_table[op].doit) err = cb_table[op].doit(skb, nlh, extack); goto done; } /* FIXME: Convert IWCM to properly handle doit callbacks */ if ((nlh->nlmsg_flags & NLM_F_DUMP) || index == RDMA_NL_IWCM) { struct netlink_dump_control c = { .dump = cb_table[op].dump, }; if (c.dump) err = netlink_dump_start(skb->sk, skb, nlh, &c); goto done; } if (cb_table[op].doit) err = cb_table[op].doit(skb, nlh, extack); done: up_read(&rdma_nl_types[index].sem); return err; } /* * This function is similar to netlink_rcv_skb with one exception: * It calls to the callback for the netlink messages without NLM_F_REQUEST * flag. These messages are intended for RDMA_NL_LS consumer, so it is allowed * for that consumer only. */ static int rdma_nl_rcv_skb(struct sk_buff *skb, int (*cb)(struct sk_buff *, struct nlmsghdr *, struct netlink_ext_ack *)) { struct netlink_ext_ack extack = {}; struct nlmsghdr *nlh; int err; while (skb->len >= nlmsg_total_size(0)) { int msglen; nlh = nlmsg_hdr(skb); err = 0; if (nlh->nlmsg_len < NLMSG_HDRLEN || skb->len < nlh->nlmsg_len) return 0; /* * Generally speaking, the only requests are handled * by the kernel, but RDMA_NL_LS is different, because it * runs backward netlink scheme. Kernel initiates messages * and waits for reply with data to keep pathrecord cache * in sync. */ if (!(nlh->nlmsg_flags & NLM_F_REQUEST) && (RDMA_NL_GET_CLIENT(nlh->nlmsg_type) != RDMA_NL_LS)) goto ack; /* Skip control messages */ if (nlh->nlmsg_type < NLMSG_MIN_TYPE) goto ack; err = cb(skb, nlh, &extack); if (err == -EINTR) goto skip; ack: if (nlh->nlmsg_flags & NLM_F_ACK || err) netlink_ack(skb, nlh, err, &extack); skip: msglen = NLMSG_ALIGN(nlh->nlmsg_len); if (msglen > skb->len) msglen = skb->len; skb_pull(skb, msglen); } return 0; } static void rdma_nl_rcv(struct sk_buff *skb) { rdma_nl_rcv_skb(skb, &rdma_nl_rcv_msg); } int rdma_nl_unicast(struct net *net, struct sk_buff *skb, u32 pid) { struct rdma_dev_net *rnet = rdma_net_to_dev_net(net); int err; err = netlink_unicast(rnet->nl_sock, skb, pid, MSG_DONTWAIT); return (err < 0) ? err : 0; } EXPORT_SYMBOL(rdma_nl_unicast); int rdma_nl_unicast_wait(struct net *net, struct sk_buff *skb, __u32 pid) { struct rdma_dev_net *rnet = rdma_net_to_dev_net(net); int err; err = netlink_unicast(rnet->nl_sock, skb, pid, 0); return (err < 0) ? err : 0; } EXPORT_SYMBOL(rdma_nl_unicast_wait); int rdma_nl_multicast(struct net *net, struct sk_buff *skb, unsigned int group, gfp_t flags) { struct rdma_dev_net *rnet = rdma_net_to_dev_net(net); return nlmsg_multicast(rnet->nl_sock, skb, 0, group, flags); } EXPORT_SYMBOL(rdma_nl_multicast); void rdma_nl_init(void) { int idx; for (idx = 0; idx < RDMA_NL_NUM_CLIENTS; idx++) init_rwsem(&rdma_nl_types[idx].sem); } void rdma_nl_exit(void) { int idx; for (idx = 0; idx < RDMA_NL_NUM_CLIENTS; idx++) WARN(rdma_nl_types[idx].cb_table, "Netlink client %d wasn't released prior to unloading %s\n", idx, KBUILD_MODNAME); } int rdma_nl_net_init(struct rdma_dev_net *rnet) { struct net *net = read_pnet(&rnet->net); struct netlink_kernel_cfg cfg = { .input = rdma_nl_rcv, .flags = NL_CFG_F_NONROOT_RECV, }; struct sock *nls; nls = netlink_kernel_create(net, NETLINK_RDMA, &cfg); if (!nls) return -ENOMEM; nls->sk_sndtimeo = 10 * HZ; rnet->nl_sock = nls; return 0; } void rdma_nl_net_exit(struct rdma_dev_net *rnet) { netlink_kernel_release(rnet->nl_sock); } MODULE_ALIAS_NET_PF_PROTO(PF_NETLINK, NETLINK_RDMA); |
| 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 | /* SPDX-License-Identifier: GPL-2.0 */ /* * linux/fs/ufs/util.h * * Copyright (C) 1998 * Daniel Pirkl <daniel.pirkl@email.cz> * Charles University, Faculty of Mathematics and Physics */ #include <linux/buffer_head.h> #include <linux/fs.h> #include "swab.h" /* * functions used for retyping */ static inline struct ufs_buffer_head *UCPI_UBH(struct ufs_cg_private_info *cpi) { return &cpi->c_ubh; } static inline struct ufs_buffer_head *USPI_UBH(struct ufs_sb_private_info *spi) { return &spi->s_ubh; } /* * macros used for accessing structures */ static inline s32 ufs_get_fs_state(struct super_block *sb, struct ufs_super_block_first *usb1, struct ufs_super_block_third *usb3) { switch (UFS_SB(sb)->s_flags & UFS_ST_MASK) { case UFS_ST_SUNOS: if (fs32_to_cpu(sb, usb3->fs_postblformat) == UFS_42POSTBLFMT) return fs32_to_cpu(sb, usb1->fs_u0.fs_sun.fs_state); fallthrough; /* to UFS_ST_SUN */ case UFS_ST_SUN: return fs32_to_cpu(sb, usb3->fs_un2.fs_sun.fs_state); case UFS_ST_SUNx86: return fs32_to_cpu(sb, usb1->fs_u1.fs_sunx86.fs_state); case UFS_ST_44BSD: default: return fs32_to_cpu(sb, usb3->fs_un2.fs_44.fs_state); } } static inline void ufs_set_fs_state(struct super_block *sb, struct ufs_super_block_first *usb1, struct ufs_super_block_third *usb3, s32 value) { switch (UFS_SB(sb)->s_flags & UFS_ST_MASK) { case UFS_ST_SUNOS: if (fs32_to_cpu(sb, usb3->fs_postblformat) == UFS_42POSTBLFMT) { usb1->fs_u0.fs_sun.fs_state = cpu_to_fs32(sb, value); break; } fallthrough; /* to UFS_ST_SUN */ case UFS_ST_SUN: usb3->fs_un2.fs_sun.fs_state = cpu_to_fs32(sb, value); break; case UFS_ST_SUNx86: usb1->fs_u1.fs_sunx86.fs_state = cpu_to_fs32(sb, value); break; case UFS_ST_44BSD: usb3->fs_un2.fs_44.fs_state = cpu_to_fs32(sb, value); break; } } static inline u32 ufs_get_fs_npsect(struct super_block *sb, struct ufs_super_block_first *usb1, struct ufs_super_block_third *usb3) { if ((UFS_SB(sb)->s_flags & UFS_ST_MASK) == UFS_ST_SUNx86) return fs32_to_cpu(sb, usb3->fs_un2.fs_sunx86.fs_npsect); else return fs32_to_cpu(sb, usb1->fs_u1.fs_sun.fs_npsect); } static inline u64 ufs_get_fs_qbmask(struct super_block *sb, struct ufs_super_block_third *usb3) { __fs64 tmp; switch (UFS_SB(sb)->s_flags & UFS_ST_MASK) { case UFS_ST_SUNOS: case UFS_ST_SUN: ((__fs32 *)&tmp)[0] = usb3->fs_un2.fs_sun.fs_qbmask[0]; ((__fs32 *)&tmp)[1] = usb3->fs_un2.fs_sun.fs_qbmask[1]; break; case UFS_ST_SUNx86: ((__fs32 *)&tmp)[0] = usb3->fs_un2.fs_sunx86.fs_qbmask[0]; ((__fs32 *)&tmp)[1] = usb3->fs_un2.fs_sunx86.fs_qbmask[1]; break; case UFS_ST_44BSD: ((__fs32 *)&tmp)[0] = usb3->fs_un2.fs_44.fs_qbmask[0]; ((__fs32 *)&tmp)[1] = usb3->fs_un2.fs_44.fs_qbmask[1]; break; } return fs64_to_cpu(sb, tmp); } static inline u64 ufs_get_fs_qfmask(struct super_block *sb, struct ufs_super_block_third *usb3) { __fs64 tmp; switch (UFS_SB(sb)->s_flags & UFS_ST_MASK) { case UFS_ST_SUNOS: case UFS_ST_SUN: ((__fs32 *)&tmp)[0] = usb3->fs_un2.fs_sun.fs_qfmask[0]; ((__fs32 *)&tmp)[1] = usb3->fs_un2.fs_sun.fs_qfmask[1]; break; case UFS_ST_SUNx86: ((__fs32 *)&tmp)[0] = usb3->fs_un2.fs_sunx86.fs_qfmask[0]; ((__fs32 *)&tmp)[1] = usb3->fs_un2.fs_sunx86.fs_qfmask[1]; break; case UFS_ST_44BSD: ((__fs32 *)&tmp)[0] = usb3->fs_un2.fs_44.fs_qfmask[0]; ((__fs32 *)&tmp)[1] = usb3->fs_un2.fs_44.fs_qfmask[1]; break; } return fs64_to_cpu(sb, tmp); } static inline u16 ufs_get_de_namlen(struct super_block *sb, struct ufs_dir_entry *de) { if ((UFS_SB(sb)->s_flags & UFS_DE_MASK) == UFS_DE_OLD) return fs16_to_cpu(sb, de->d_u.d_namlen); else return de->d_u.d_44.d_namlen; /* XXX this seems wrong */ } static inline void ufs_set_de_namlen(struct super_block *sb, struct ufs_dir_entry *de, u16 value) { if ((UFS_SB(sb)->s_flags & UFS_DE_MASK) == UFS_DE_OLD) de->d_u.d_namlen = cpu_to_fs16(sb, value); else de->d_u.d_44.d_namlen = value; /* XXX this seems wrong */ } static inline void ufs_set_de_type(struct super_block *sb, struct ufs_dir_entry *de, int mode) { if ((UFS_SB(sb)->s_flags & UFS_DE_MASK) != UFS_DE_44BSD) return; /* * TODO turn this into a table lookup */ switch (mode & S_IFMT) { case S_IFSOCK: de->d_u.d_44.d_type = DT_SOCK; break; case S_IFLNK: de->d_u.d_44.d_type = DT_LNK; break; case S_IFREG: de->d_u.d_44.d_type = DT_REG; break; case S_IFBLK: de->d_u.d_44.d_type = DT_BLK; break; case S_IFDIR: de->d_u.d_44.d_type = DT_DIR; break; case S_IFCHR: de->d_u.d_44.d_type = DT_CHR; break; case S_IFIFO: de->d_u.d_44.d_type = DT_FIFO; break; default: de->d_u.d_44.d_type = DT_UNKNOWN; } } static inline u32 ufs_get_inode_uid(struct super_block *sb, struct ufs_inode *inode) { switch (UFS_SB(sb)->s_flags & UFS_UID_MASK) { case UFS_UID_44BSD: return fs32_to_cpu(sb, inode->ui_u3.ui_44.ui_uid); case UFS_UID_EFT: if (inode->ui_u1.oldids.ui_suid == 0xFFFF) return fs32_to_cpu(sb, inode->ui_u3.ui_sun.ui_uid); fallthrough; default: return fs16_to_cpu(sb, inode->ui_u1.oldids.ui_suid); } } static inline void ufs_set_inode_uid(struct super_block *sb, struct ufs_inode *inode, u32 value) { switch (UFS_SB(sb)->s_flags & UFS_UID_MASK) { case UFS_UID_44BSD: inode->ui_u3.ui_44.ui_uid = cpu_to_fs32(sb, value); inode->ui_u1.oldids.ui_suid = cpu_to_fs16(sb, value); break; case UFS_UID_EFT: inode->ui_u3.ui_sun.ui_uid = cpu_to_fs32(sb, value); if (value > 0xFFFF) value = 0xFFFF; fallthrough; default: inode->ui_u1.oldids.ui_suid = cpu_to_fs16(sb, value); break; } } static inline u32 ufs_get_inode_gid(struct super_block *sb, struct ufs_inode *inode) { switch (UFS_SB(sb)->s_flags & UFS_UID_MASK) { case UFS_UID_44BSD: return fs32_to_cpu(sb, inode->ui_u3.ui_44.ui_gid); case UFS_UID_EFT: if (inode->ui_u1.oldids.ui_sgid == 0xFFFF) return fs32_to_cpu(sb, inode->ui_u3.ui_sun.ui_gid); fallthrough; default: return fs16_to_cpu(sb, inode->ui_u1.oldids.ui_sgid); } } static inline void ufs_set_inode_gid(struct super_block *sb, struct ufs_inode *inode, u32 value) { switch (UFS_SB(sb)->s_flags & UFS_UID_MASK) { case UFS_UID_44BSD: inode->ui_u3.ui_44.ui_gid = cpu_to_fs32(sb, value); inode->ui_u1.oldids.ui_sgid = cpu_to_fs16(sb, value); break; case UFS_UID_EFT: inode->ui_u3.ui_sun.ui_gid = cpu_to_fs32(sb, value); if (value > 0xFFFF) value = 0xFFFF; fallthrough; default: inode->ui_u1.oldids.ui_sgid = cpu_to_fs16(sb, value); break; } } dev_t ufs_get_inode_dev(struct super_block *, struct ufs_inode_info *); void ufs_set_inode_dev(struct super_block *, struct ufs_inode_info *, dev_t); int ufs_prepare_chunk(struct folio *folio, loff_t pos, unsigned len); /* * These functions manipulate ufs buffers */ #define ubh_bread(sb,fragment,size) _ubh_bread_(uspi,sb,fragment,size) extern struct ufs_buffer_head * _ubh_bread_(struct ufs_sb_private_info *, struct super_block *, u64 , u64); extern struct ufs_buffer_head * ubh_bread_uspi(struct ufs_sb_private_info *, struct super_block *, u64, u64); extern void ubh_brelse (struct ufs_buffer_head *); extern void ubh_brelse_uspi (struct ufs_sb_private_info *); extern void ubh_mark_buffer_dirty (struct ufs_buffer_head *); extern void ubh_sync_block(struct ufs_buffer_head *); extern void ubh_bforget (struct ufs_buffer_head *); extern int ubh_buffer_dirty (struct ufs_buffer_head *); /* This functions works with cache pages*/ struct folio *ufs_get_locked_folio(struct address_space *mapping, pgoff_t index); static inline void ufs_put_locked_folio(struct folio *folio) { folio_unlock(folio); folio_put(folio); } /* * macros and inline function to get important structures from ufs_sb_private_info */ static inline void *get_usb_offset(struct ufs_sb_private_info *uspi, unsigned int offset) { unsigned int index; index = offset >> uspi->s_fshift; offset &= ~uspi->s_fmask; return uspi->s_ubh.bh[index]->b_data + offset; } #define ubh_get_usb_first(uspi) \ ((struct ufs_super_block_first *)get_usb_offset((uspi), 0)) #define ubh_get_usb_second(uspi) \ ((struct ufs_super_block_second *)get_usb_offset((uspi), UFS_SECTOR_SIZE)) #define ubh_get_usb_third(uspi) \ ((struct ufs_super_block_third *)get_usb_offset((uspi), 2*UFS_SECTOR_SIZE)) #define ubh_get_ucg(ubh) \ ((struct ufs_cylinder_group *)((ubh)->bh[0]->b_data)) /* * Extract byte from ufs_buffer_head * Extract the bits for a block from a map inside ufs_buffer_head */ #define ubh_get_addr8(ubh,begin) \ ((u8*)(ubh)->bh[(begin) >> uspi->s_fshift]->b_data + \ ((begin) & ~uspi->s_fmask)) #define ubh_get_addr16(ubh,begin) \ (((__fs16*)((ubh)->bh[(begin) >> (uspi->s_fshift-1)]->b_data)) + \ ((begin) & ((uspi->fsize>>1) - 1))) #define ubh_get_addr32(ubh,begin) \ (((__fs32*)((ubh)->bh[(begin) >> (uspi->s_fshift-2)]->b_data)) + \ ((begin) & ((uspi->s_fsize>>2) - 1))) #define ubh_get_addr64(ubh,begin) \ (((__fs64*)((ubh)->bh[(begin) >> (uspi->s_fshift-3)]->b_data)) + \ ((begin) & ((uspi->s_fsize>>3) - 1))) #define ubh_get_addr ubh_get_addr8 static inline void *ubh_get_data_ptr(struct ufs_sb_private_info *uspi, struct ufs_buffer_head *ubh, u64 blk) { if (uspi->fs_magic == UFS2_MAGIC) return ubh_get_addr64(ubh, blk); else return ubh_get_addr32(ubh, blk); } #define ubh_blkmap(ubh,begin,bit) \ ((*ubh_get_addr(ubh, (begin) + ((bit) >> 3)) >> ((bit) & 7)) & (0xff >> (UFS_MAXFRAG - uspi->s_fpb))) static inline u64 ufs_freefrags(struct ufs_sb_private_info *uspi) { return ufs_blkstofrags(uspi->cs_total.cs_nbfree) + uspi->cs_total.cs_nffree; } /* * Macros to access cylinder group array structures */ #define ubh_cg_blktot(ucpi,cylno) \ (*((__fs32*)ubh_get_addr(UCPI_UBH(ucpi), (ucpi)->c_btotoff + ((cylno) << 2)))) #define ubh_cg_blks(ucpi,cylno,rpos) \ (*((__fs16*)ubh_get_addr(UCPI_UBH(ucpi), \ (ucpi)->c_boff + (((cylno) * uspi->s_nrpos + (rpos)) << 1 )))) /* * Bitmap operations * These functions work like classical bitmap operations. * The difference is that we don't have the whole bitmap * in one contiguous chunk of memory, but in several buffers. * The parameters of each function are super_block, ufs_buffer_head and * position of the beginning of the bitmap. */ #define ubh_setbit(ubh,begin,bit) \ (*ubh_get_addr(ubh, (begin) + ((bit) >> 3)) |= (1 << ((bit) & 7))) #define ubh_clrbit(ubh,begin,bit) \ (*ubh_get_addr (ubh, (begin) + ((bit) >> 3)) &= ~(1 << ((bit) & 7))) #define ubh_isset(ubh,begin,bit) \ (*ubh_get_addr (ubh, (begin) + ((bit) >> 3)) & (1 << ((bit) & 7))) #define ubh_isclr(ubh,begin,bit) (!ubh_isset(ubh,begin,bit)) #define ubh_find_first_zero_bit(ubh,begin,size) _ubh_find_next_zero_bit_(uspi,ubh,begin,size,0) #define ubh_find_next_zero_bit(ubh,begin,size,offset) _ubh_find_next_zero_bit_(uspi,ubh,begin,size,offset) static inline unsigned _ubh_find_next_zero_bit_( struct ufs_sb_private_info * uspi, struct ufs_buffer_head * ubh, unsigned begin, unsigned size, unsigned offset) { unsigned base, count, pos; size -= offset; begin <<= 3; offset += begin; base = offset >> uspi->s_bpfshift; offset &= uspi->s_bpfmask; for (;;) { count = min_t(unsigned int, size + offset, uspi->s_bpf); size -= count - offset; pos = find_next_zero_bit_le(ubh->bh[base]->b_data, count, offset); if (pos < count || !size) break; base++; offset = 0; } return (base << uspi->s_bpfshift) + pos - begin; } static inline unsigned find_last_zero_bit (unsigned char * bitmap, unsigned size, unsigned offset) { unsigned bit, i; unsigned char * mapp; unsigned char map; mapp = bitmap + (size >> 3); map = *mapp--; bit = 1 << (size & 7); for (i = size; i > offset; i--) { if ((map & bit) == 0) break; if ((i & 7) != 0) { bit >>= 1; } else { map = *mapp--; bit = 1 << 7; } } return i; } #define ubh_find_last_zero_bit(ubh,begin,size,offset) _ubh_find_last_zero_bit_(uspi,ubh,begin,size,offset) static inline unsigned _ubh_find_last_zero_bit_( struct ufs_sb_private_info * uspi, struct ufs_buffer_head * ubh, unsigned begin, unsigned start, unsigned end) { unsigned base, count, pos, size; size = start - end; begin <<= 3; start += begin; base = start >> uspi->s_bpfshift; start &= uspi->s_bpfmask; for (;;) { count = min_t(unsigned int, size + (uspi->s_bpf - start), uspi->s_bpf) - (uspi->s_bpf - start); size -= count; pos = find_last_zero_bit (ubh->bh[base]->b_data, start, start - count); if (pos > start - count || !size) break; base--; start = uspi->s_bpf; } return (base << uspi->s_bpfshift) + pos - begin; } static inline int ubh_isblockset(struct ufs_sb_private_info *uspi, struct ufs_cg_private_info *ucpi, unsigned int frag) { struct ufs_buffer_head *ubh = UCPI_UBH(ucpi); u8 *p = ubh_get_addr(ubh, ucpi->c_freeoff + (frag >> 3)); u8 mask; switch (uspi->s_fpb) { case 8: return *p == 0xff; case 4: mask = 0x0f << (frag & 4); return (*p & mask) == mask; case 2: mask = 0x03 << (frag & 6); return (*p & mask) == mask; case 1: mask = 0x01 << (frag & 7); return (*p & mask) == mask; } return 0; } static inline void ubh_clrblock(struct ufs_sb_private_info *uspi, struct ufs_cg_private_info *ucpi, unsigned int frag) { struct ufs_buffer_head *ubh = UCPI_UBH(ucpi); u8 *p = ubh_get_addr(ubh, ucpi->c_freeoff + (frag >> 3)); switch (uspi->s_fpb) { case 8: *p = 0x00; return; case 4: *p &= ~(0x0f << (frag & 4)); return; case 2: *p &= ~(0x03 << (frag & 6)); return; case 1: *p &= ~(0x01 << (frag & 7)); return; } } static inline void ubh_setblock(struct ufs_sb_private_info * uspi, struct ufs_cg_private_info *ucpi, unsigned int frag) { struct ufs_buffer_head *ubh = UCPI_UBH(ucpi); u8 *p = ubh_get_addr(ubh, ucpi->c_freeoff + (frag >> 3)); switch (uspi->s_fpb) { case 8: *p = 0xff; return; case 4: *p |= 0x0f << (frag & 4); return; case 2: *p |= 0x03 << (frag & 6); return; case 1: *p |= 0x01 << (frag & 7); return; } } static inline void ufs_fragacct (struct super_block * sb, unsigned blockmap, __fs32 * fraglist, int cnt) { struct ufs_sb_private_info * uspi; unsigned fragsize, pos; uspi = UFS_SB(sb)->s_uspi; fragsize = 0; for (pos = 0; pos < uspi->s_fpb; pos++) { if (blockmap & (1 << pos)) { fragsize++; } else if (fragsize > 0) { fs32_add(sb, &fraglist[fragsize], cnt); fragsize = 0; } } if (fragsize > 0 && fragsize < uspi->s_fpb) fs32_add(sb, &fraglist[fragsize], cnt); } static inline void *ufs_get_direct_data_ptr(struct ufs_sb_private_info *uspi, struct ufs_inode_info *ufsi, unsigned blk) { BUG_ON(blk > UFS_TIND_BLOCK); return uspi->fs_magic == UFS2_MAGIC ? (void *)&ufsi->i_u1.u2_i_data[blk] : (void *)&ufsi->i_u1.i_data[blk]; } static inline u64 ufs_data_ptr_to_cpu(struct super_block *sb, void *p) { return UFS_SB(sb)->s_uspi->fs_magic == UFS2_MAGIC ? fs64_to_cpu(sb, *(__fs64 *)p) : fs32_to_cpu(sb, *(__fs32 *)p); } static inline void ufs_cpu_to_data_ptr(struct super_block *sb, void *p, u64 val) { if (UFS_SB(sb)->s_uspi->fs_magic == UFS2_MAGIC) *(__fs64 *)p = cpu_to_fs64(sb, val); else *(__fs32 *)p = cpu_to_fs32(sb, val); } static inline void ufs_data_ptr_clear(struct ufs_sb_private_info *uspi, void *p) { if (uspi->fs_magic == UFS2_MAGIC) *(__fs64 *)p = 0; else *(__fs32 *)p = 0; } static inline int ufs_is_data_ptr_zero(struct ufs_sb_private_info *uspi, void *p) { if (uspi->fs_magic == UFS2_MAGIC) return *(__fs64 *)p == 0; else return *(__fs32 *)p == 0; } static inline __fs32 ufs_get_seconds(struct super_block *sbp) { time64_t now = ktime_get_real_seconds(); /* Signed 32-bit interpretation wraps around in 2038, which * happens in ufs1 inode stamps but not ufs2 using 64-bits * stamps. For superblock and blockgroup, let's assume * unsigned 32-bit stamps, which are good until y2106. * Wrap around rather than clamp here to make the dirty * file system detection work in the superblock stamp. */ return cpu_to_fs32(sbp, lower_32_bits(now)); } |
| 2 1 1 4 3 1 223 224 17 16 17 8 9 240 22 225 10 4 4 4 4 1 1 12 12 2 2 1 1 2 12 12 8 8 12 12 2 12 1 1 1 12 12 1 12 12 11 12 8 8 1 1 12 1 12 12 12 1 1 12 8 12 40 38 38 34 36 36 35 35 36 33 2 4 24 1 34 3 3 3 1 1 2 3 38 38 38 37 38 13 13 11 3 3 12 12 12 12 12 2 2 12 11 12 12 12 11 3 13 17 10 15 5 10 1 9 16 16 16 2 16 1 3 14 3 13 17 7 13 18 22 22 22 25 25 2 25 22 2 1 2 2 2 2 2 1 1 2 1 1 2 2 2 22 3 3 23 3 2 6 18 18 18 18 18 18 26 26 2 24 24 11 9 6 2 6 2 23 1 1 24 24 3 24 18 24 1 24 23 4 18 24 24 24 24 24 24 10 10 10 9 1 1 1 1 16 23 23 23 23 23 23 23 1 14 6 6 14 14 26 27 28 26 4 4 27 27 25 1 25 4 25 24 25 8 28 6 26 26 28 7 27 47 1 41 45 34 1 33 45 28 28 18 14 40 45 25 25 14 27 41 41 13 33 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080 1081 1082 1083 1084 1085 1086 1087 1088 1089 1090 1091 1092 1093 1094 1095 1096 1097 1098 1099 1100 1101 1102 1103 1104 1105 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125 1126 1127 1128 1129 1130 1131 1132 1133 1134 1135 1136 1137 1138 1139 1140 1141 1142 1143 1144 1145 1146 1147 1148 1149 1150 1151 1152 1153 1154 1155 1156 1157 1158 1159 1160 1161 1162 1163 1164 1165 1166 1167 1168 1169 1170 1171 1172 1173 1174 1175 1176 1177 1178 1179 1180 1181 1182 1183 1184 1185 1186 1187 1188 1189 1190 1191 1192 1193 1194 1195 1196 1197 1198 1199 1200 1201 1202 1203 1204 1205 1206 1207 1208 1209 1210 1211 1212 1213 1214 1215 1216 1217 1218 1219 1220 1221 1222 1223 1224 1225 1226 1227 1228 1229 1230 1231 1232 1233 1234 1235 1236 1237 1238 1239 1240 1241 1242 1243 1244 1245 1246 1247 1248 1249 1250 1251 1252 1253 1254 1255 1256 1257 1258 1259 1260 1261 1262 1263 1264 1265 1266 1267 1268 1269 1270 1271 1272 1273 1274 1275 1276 1277 1278 1279 1280 1281 1282 1283 1284 1285 1286 1287 1288 1289 1290 1291 1292 1293 1294 1295 1296 1297 1298 1299 1300 1301 1302 1303 1304 1305 1306 1307 1308 1309 1310 1311 1312 1313 1314 1315 1316 1317 1318 1319 1320 1321 1322 1323 1324 1325 1326 1327 1328 1329 1330 1331 1332 1333 1334 1335 1336 1337 1338 1339 1340 1341 1342 1343 1344 1345 1346 1347 1348 1349 1350 1351 1352 1353 1354 1355 1356 1357 1358 1359 1360 1361 1362 1363 1364 1365 1366 1367 1368 1369 1370 1371 1372 1373 1374 1375 1376 1377 1378 1379 1380 1381 1382 1383 1384 1385 1386 1387 1388 1389 1390 1391 1392 1393 1394 1395 1396 1397 1398 1399 1400 1401 1402 1403 1404 1405 1406 1407 1408 1409 1410 1411 1412 1413 1414 1415 1416 1417 1418 1419 1420 1421 1422 1423 1424 1425 1426 1427 1428 1429 1430 1431 1432 1433 1434 1435 1436 1437 1438 1439 1440 1441 1442 1443 1444 1445 1446 1447 1448 1449 1450 1451 1452 1453 1454 1455 1456 1457 1458 1459 1460 1461 1462 1463 1464 1465 1466 1467 1468 1469 1470 1471 1472 1473 1474 1475 1476 1477 1478 1479 1480 1481 1482 1483 1484 1485 1486 1487 1488 1489 1490 1491 1492 1493 1494 1495 1496 1497 1498 1499 1500 1501 1502 1503 1504 1505 1506 1507 1508 1509 1510 1511 1512 1513 1514 1515 1516 1517 1518 1519 1520 1521 1522 1523 1524 1525 1526 1527 1528 1529 1530 1531 1532 1533 1534 1535 1536 1537 1538 1539 1540 1541 1542 1543 1544 1545 1546 1547 1548 1549 1550 1551 1552 1553 1554 1555 1556 1557 1558 1559 1560 1561 1562 1563 1564 1565 1566 1567 1568 1569 1570 1571 1572 1573 1574 1575 1576 1577 1578 1579 1580 1581 1582 1583 1584 1585 1586 1587 1588 1589 1590 1591 1592 1593 1594 1595 1596 1597 1598 1599 1600 1601 1602 1603 1604 1605 1606 1607 1608 1609 1610 1611 1612 1613 1614 1615 1616 1617 1618 1619 1620 1621 1622 1623 1624 1625 1626 1627 1628 1629 1630 1631 1632 1633 1634 1635 1636 1637 1638 1639 1640 1641 1642 1643 1644 1645 1646 1647 1648 1649 1650 1651 1652 1653 1654 1655 1656 1657 1658 1659 1660 1661 1662 1663 1664 1665 1666 1667 1668 1669 1670 1671 1672 1673 1674 1675 1676 1677 1678 1679 1680 1681 1682 1683 1684 1685 1686 1687 1688 1689 1690 1691 1692 1693 1694 1695 1696 1697 1698 1699 1700 1701 1702 1703 1704 1705 1706 1707 1708 1709 1710 1711 1712 1713 1714 1715 1716 1717 1718 1719 1720 1721 1722 1723 1724 1725 1726 1727 1728 1729 1730 1731 1732 1733 1734 1735 1736 1737 1738 1739 1740 1741 1742 1743 1744 1745 1746 1747 1748 1749 1750 1751 1752 1753 1754 1755 1756 1757 1758 1759 1760 1761 1762 1763 1764 1765 1766 1767 1768 1769 1770 1771 1772 1773 1774 1775 1776 1777 1778 1779 1780 1781 1782 1783 1784 1785 1786 1787 1788 1789 1790 1791 1792 1793 1794 1795 1796 1797 1798 1799 1800 1801 1802 1803 1804 1805 1806 1807 1808 1809 1810 1811 1812 1813 1814 1815 1816 1817 1818 1819 1820 1821 1822 1823 1824 1825 1826 1827 1828 1829 1830 1831 1832 1833 1834 1835 1836 1837 1838 1839 1840 1841 1842 1843 1844 1845 1846 1847 1848 1849 1850 1851 1852 1853 1854 1855 1856 1857 1858 1859 1860 1861 1862 1863 1864 1865 1866 1867 1868 1869 1870 1871 1872 1873 1874 1875 1876 1877 1878 1879 1880 1881 1882 1883 1884 1885 1886 1887 1888 1889 1890 1891 1892 1893 1894 1895 1896 1897 1898 1899 1900 1901 1902 1903 1904 1905 1906 1907 1908 1909 1910 1911 1912 1913 1914 1915 1916 1917 1918 1919 1920 1921 1922 1923 1924 1925 1926 1927 1928 1929 1930 1931 1932 1933 1934 1935 1936 1937 1938 1939 1940 1941 1942 1943 1944 1945 1946 1947 1948 1949 1950 1951 1952 1953 1954 1955 1956 1957 1958 1959 1960 1961 1962 1963 1964 1965 1966 1967 1968 1969 1970 1971 1972 1973 1974 1975 1976 1977 1978 1979 1980 1981 1982 1983 1984 1985 1986 1987 1988 1989 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020 2021 2022 2023 2024 2025 2026 2027 2028 2029 2030 2031 2032 2033 2034 2035 2036 2037 2038 2039 2040 2041 2042 2043 2044 2045 2046 2047 2048 2049 2050 2051 2052 2053 2054 2055 2056 2057 2058 2059 2060 2061 2062 2063 2064 2065 2066 2067 2068 2069 2070 2071 2072 2073 2074 2075 2076 2077 2078 2079 2080 2081 2082 2083 2084 2085 2086 2087 2088 2089 2090 2091 2092 2093 2094 2095 2096 2097 2098 2099 2100 2101 2102 2103 2104 2105 2106 2107 2108 2109 2110 2111 2112 2113 2114 2115 2116 2117 2118 2119 2120 2121 2122 2123 2124 2125 2126 2127 2128 2129 2130 2131 2132 2133 2134 2135 2136 2137 2138 2139 2140 2141 2142 2143 2144 2145 2146 2147 2148 2149 2150 2151 2152 2153 2154 2155 2156 2157 2158 2159 2160 2161 2162 2163 2164 2165 2166 2167 2168 2169 2170 2171 2172 2173 2174 2175 2176 2177 2178 2179 2180 2181 2182 2183 2184 2185 2186 2187 2188 2189 2190 2191 2192 2193 2194 2195 2196 2197 2198 2199 2200 2201 2202 2203 2204 2205 2206 2207 2208 2209 2210 2211 2212 2213 2214 2215 2216 2217 2218 2219 2220 2221 2222 2223 2224 2225 2226 2227 2228 2229 2230 2231 2232 2233 2234 2235 2236 2237 2238 2239 2240 2241 2242 2243 2244 2245 2246 2247 2248 2249 2250 2251 2252 2253 2254 2255 2256 2257 2258 2259 2260 2261 2262 2263 2264 2265 2266 2267 2268 2269 2270 2271 2272 2273 2274 2275 2276 2277 2278 2279 2280 2281 2282 2283 2284 2285 2286 2287 2288 2289 2290 2291 2292 2293 2294 2295 2296 2297 2298 2299 2300 2301 2302 2303 2304 2305 2306 2307 2308 2309 2310 2311 2312 2313 2314 2315 2316 2317 2318 2319 2320 2321 2322 2323 2324 2325 2326 2327 2328 2329 2330 2331 2332 2333 2334 2335 2336 2337 2338 2339 2340 2341 2342 2343 2344 2345 2346 2347 2348 2349 2350 2351 2352 2353 2354 2355 2356 2357 2358 2359 2360 2361 2362 2363 2364 2365 2366 2367 2368 2369 2370 2371 2372 2373 2374 2375 2376 2377 2378 2379 2380 2381 2382 2383 2384 2385 2386 2387 2388 2389 2390 2391 2392 2393 2394 2395 2396 2397 2398 2399 2400 2401 2402 2403 2404 2405 2406 2407 2408 2409 2410 2411 2412 2413 2414 2415 2416 2417 2418 2419 2420 2421 2422 2423 2424 2425 2426 2427 2428 2429 2430 2431 2432 2433 2434 2435 2436 2437 2438 2439 2440 2441 2442 2443 2444 2445 2446 2447 2448 2449 2450 2451 2452 2453 2454 2455 2456 2457 2458 2459 2460 2461 2462 2463 2464 2465 2466 2467 2468 2469 2470 2471 2472 2473 2474 2475 2476 2477 2478 2479 2480 2481 2482 2483 2484 2485 2486 2487 2488 2489 2490 2491 2492 2493 2494 2495 2496 2497 2498 2499 2500 2501 2502 2503 2504 2505 2506 2507 2508 2509 2510 2511 2512 2513 2514 2515 2516 2517 2518 2519 2520 2521 2522 2523 2524 2525 2526 2527 2528 2529 2530 2531 2532 2533 2534 2535 2536 2537 2538 2539 2540 2541 2542 2543 2544 2545 2546 2547 2548 2549 2550 2551 2552 2553 2554 2555 2556 2557 2558 2559 2560 2561 2562 2563 2564 2565 2566 2567 2568 2569 2570 2571 2572 2573 2574 2575 2576 2577 2578 2579 2580 2581 2582 2583 2584 2585 2586 2587 2588 2589 2590 2591 2592 2593 2594 2595 2596 2597 2598 2599 2600 2601 2602 2603 2604 2605 2606 2607 2608 2609 2610 2611 2612 2613 2614 2615 2616 2617 2618 2619 2620 2621 2622 2623 2624 2625 2626 2627 2628 2629 2630 2631 2632 2633 2634 2635 2636 2637 2638 2639 2640 2641 2642 2643 2644 2645 2646 2647 2648 2649 2650 2651 2652 2653 2654 2655 2656 2657 2658 2659 2660 2661 2662 2663 2664 2665 2666 2667 2668 2669 2670 2671 2672 2673 2674 2675 2676 2677 2678 2679 2680 2681 2682 2683 2684 2685 2686 2687 2688 2689 2690 2691 2692 2693 2694 2695 2696 2697 2698 2699 2700 2701 2702 2703 2704 2705 2706 2707 2708 2709 2710 2711 2712 2713 2714 2715 2716 2717 2718 2719 2720 2721 2722 2723 2724 2725 2726 2727 2728 2729 2730 2731 2732 2733 2734 2735 2736 2737 2738 2739 2740 2741 2742 2743 2744 2745 2746 2747 2748 2749 2750 2751 2752 2753 2754 2755 2756 2757 2758 2759 2760 2761 2762 2763 2764 2765 2766 2767 2768 2769 2770 2771 2772 2773 2774 2775 2776 2777 2778 2779 2780 2781 2782 2783 2784 2785 2786 2787 2788 2789 2790 2791 2792 2793 2794 2795 2796 2797 2798 2799 2800 2801 2802 2803 2804 2805 2806 2807 2808 2809 2810 2811 2812 2813 2814 2815 2816 2817 2818 2819 2820 2821 2822 2823 2824 2825 2826 2827 2828 2829 2830 2831 2832 2833 2834 2835 2836 2837 2838 2839 2840 | // SPDX-License-Identifier: GPL-2.0 #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt #include <linux/mm.h> #include <linux/sched.h> #include <linux/sched/mm.h> #include <linux/mmu_notifier.h> #include <linux/rmap.h> #include <linux/swap.h> #include <linux/mm_inline.h> #include <linux/kthread.h> #include <linux/khugepaged.h> #include <linux/freezer.h> #include <linux/mman.h> #include <linux/hashtable.h> #include <linux/userfaultfd_k.h> #include <linux/page_idle.h> #include <linux/page_table_check.h> #include <linux/rcupdate_wait.h> #include <linux/swapops.h> #include <linux/shmem_fs.h> #include <linux/dax.h> #include <linux/ksm.h> #include <asm/tlb.h> #include <asm/pgalloc.h> #include "internal.h" #include "mm_slot.h" enum scan_result { SCAN_FAIL, SCAN_SUCCEED, SCAN_PMD_NULL, SCAN_PMD_NONE, SCAN_PMD_MAPPED, SCAN_EXCEED_NONE_PTE, SCAN_EXCEED_SWAP_PTE, SCAN_EXCEED_SHARED_PTE, SCAN_PTE_NON_PRESENT, SCAN_PTE_UFFD_WP, SCAN_PTE_MAPPED_HUGEPAGE, SCAN_PAGE_RO, SCAN_LACK_REFERENCED_PAGE, SCAN_PAGE_NULL, SCAN_SCAN_ABORT, SCAN_PAGE_COUNT, SCAN_PAGE_LRU, SCAN_PAGE_LOCK, SCAN_PAGE_ANON, SCAN_PAGE_COMPOUND, SCAN_ANY_PROCESS, SCAN_VMA_NULL, SCAN_VMA_CHECK, SCAN_ADDRESS_RANGE, SCAN_DEL_PAGE_LRU, SCAN_ALLOC_HUGE_PAGE_FAIL, SCAN_CGROUP_CHARGE_FAIL, SCAN_TRUNCATED, SCAN_PAGE_HAS_PRIVATE, SCAN_STORE_FAILED, SCAN_COPY_MC, SCAN_PAGE_FILLED, }; #define CREATE_TRACE_POINTS #include <trace/events/huge_memory.h> static struct task_struct *khugepaged_thread __read_mostly; static DEFINE_MUTEX(khugepaged_mutex); /* default scan 8*512 pte (or vmas) every 30 second */ static unsigned int khugepaged_pages_to_scan __read_mostly; static unsigned int khugepaged_pages_collapsed; static unsigned int khugepaged_full_scans; static unsigned int khugepaged_scan_sleep_millisecs __read_mostly = 10000; /* during fragmentation poll the hugepage allocator once every minute */ static unsigned int khugepaged_alloc_sleep_millisecs __read_mostly = 60000; static unsigned long khugepaged_sleep_expire; static DEFINE_SPINLOCK(khugepaged_mm_lock); static DECLARE_WAIT_QUEUE_HEAD(khugepaged_wait); /* * default collapse hugepages if there is at least one pte mapped like * it would have happened if the vma was large enough during page * fault. * * Note that these are only respected if collapse was initiated by khugepaged. */ unsigned int khugepaged_max_ptes_none __read_mostly; static unsigned int khugepaged_max_ptes_swap __read_mostly; static unsigned int khugepaged_max_ptes_shared __read_mostly; #define MM_SLOTS_HASH_BITS 10 static DEFINE_READ_MOSTLY_HASHTABLE(mm_slots_hash, MM_SLOTS_HASH_BITS); static struct kmem_cache *mm_slot_cache __ro_after_init; struct collapse_control { bool is_khugepaged; /* Num pages scanned per node */ u32 node_load[MAX_NUMNODES]; /* nodemask for allocation fallback */ nodemask_t alloc_nmask; }; /** * struct khugepaged_mm_slot - khugepaged information per mm that is being scanned * @slot: hash lookup from mm to mm_slot */ struct khugepaged_mm_slot { struct mm_slot slot; }; /** * struct khugepaged_scan - cursor for scanning * @mm_head: the head of the mm list to scan * @mm_slot: the current mm_slot we are scanning * @address: the next address inside that to be scanned * * There is only the one khugepaged_scan instance of this cursor structure. */ struct khugepaged_scan { struct list_head mm_head; struct khugepaged_mm_slot *mm_slot; unsigned long address; }; static struct khugepaged_scan khugepaged_scan = { .mm_head = LIST_HEAD_INIT(khugepaged_scan.mm_head), }; #ifdef CONFIG_SYSFS static ssize_t scan_sleep_millisecs_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf) { return sysfs_emit(buf, "%u\n", khugepaged_scan_sleep_millisecs); } static ssize_t scan_sleep_millisecs_store(struct kobject *kobj, struct kobj_attribute *attr, const char *buf, size_t count) { unsigned int msecs; int err; err = kstrtouint(buf, 10, &msecs); if (err) return -EINVAL; khugepaged_scan_sleep_millisecs = msecs; khugepaged_sleep_expire = 0; wake_up_interruptible(&khugepaged_wait); return count; } static struct kobj_attribute scan_sleep_millisecs_attr = __ATTR_RW(scan_sleep_millisecs); static ssize_t alloc_sleep_millisecs_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf) { return sysfs_emit(buf, "%u\n", khugepaged_alloc_sleep_millisecs); } static ssize_t alloc_sleep_millisecs_store(struct kobject *kobj, struct kobj_attribute *attr, const char *buf, size_t count) { unsigned int msecs; int err; err = kstrtouint(buf, 10, &msecs); if (err) return -EINVAL; khugepaged_alloc_sleep_millisecs = msecs; khugepaged_sleep_expire = 0; wake_up_interruptible(&khugepaged_wait); return count; } static struct kobj_attribute alloc_sleep_millisecs_attr = __ATTR_RW(alloc_sleep_millisecs); static ssize_t pages_to_scan_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf) { return sysfs_emit(buf, "%u\n", khugepaged_pages_to_scan); } static ssize_t pages_to_scan_store(struct kobject *kobj, struct kobj_attribute *attr, const char *buf, size_t count) { unsigned int pages; int err; err = kstrtouint(buf, 10, &pages); if (err || !pages) return -EINVAL; khugepaged_pages_to_scan = pages; return count; } static struct kobj_attribute pages_to_scan_attr = __ATTR_RW(pages_to_scan); static ssize_t pages_collapsed_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf) { return sysfs_emit(buf, "%u\n", khugepaged_pages_collapsed); } static struct kobj_attribute pages_collapsed_attr = __ATTR_RO(pages_collapsed); static ssize_t full_scans_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf) { return sysfs_emit(buf, "%u\n", khugepaged_full_scans); } static struct kobj_attribute full_scans_attr = __ATTR_RO(full_scans); static ssize_t defrag_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf) { return single_hugepage_flag_show(kobj, attr, buf, TRANSPARENT_HUGEPAGE_DEFRAG_KHUGEPAGED_FLAG); } static ssize_t defrag_store(struct kobject *kobj, struct kobj_attribute *attr, const char *buf, size_t count) { return single_hugepage_flag_store(kobj, attr, buf, count, TRANSPARENT_HUGEPAGE_DEFRAG_KHUGEPAGED_FLAG); } static struct kobj_attribute khugepaged_defrag_attr = __ATTR_RW(defrag); /* * max_ptes_none controls if khugepaged should collapse hugepages over * any unmapped ptes in turn potentially increasing the memory * footprint of the vmas. When max_ptes_none is 0 khugepaged will not * reduce the available free memory in the system as it * runs. Increasing max_ptes_none will instead potentially reduce the * free memory in the system during the khugepaged scan. */ static ssize_t max_ptes_none_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf) { return sysfs_emit(buf, "%u\n", khugepaged_max_ptes_none); } static ssize_t max_ptes_none_store(struct kobject *kobj, struct kobj_attribute *attr, const char *buf, size_t count) { int err; unsigned long max_ptes_none; err = kstrtoul(buf, 10, &max_ptes_none); if (err || max_ptes_none > HPAGE_PMD_NR - 1) return -EINVAL; khugepaged_max_ptes_none = max_ptes_none; return count; } static struct kobj_attribute khugepaged_max_ptes_none_attr = __ATTR_RW(max_ptes_none); static ssize_t max_ptes_swap_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf) { return sysfs_emit(buf, "%u\n", khugepaged_max_ptes_swap); } static ssize_t max_ptes_swap_store(struct kobject *kobj, struct kobj_attribute *attr, const char *buf, size_t count) { int err; unsigned long max_ptes_swap; err = kstrtoul(buf, 10, &max_ptes_swap); if (err || max_ptes_swap > HPAGE_PMD_NR - 1) return -EINVAL; khugepaged_max_ptes_swap = max_ptes_swap; return count; } static struct kobj_attribute khugepaged_max_ptes_swap_attr = __ATTR_RW(max_ptes_swap); static ssize_t max_ptes_shared_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf) { return sysfs_emit(buf, "%u\n", khugepaged_max_ptes_shared); } static ssize_t max_ptes_shared_store(struct kobject *kobj, struct kobj_attribute *attr, const char *buf, size_t count) { int err; unsigned long max_ptes_shared; err = kstrtoul(buf, 10, &max_ptes_shared); if (err || max_ptes_shared > HPAGE_PMD_NR - 1) return -EINVAL; khugepaged_max_ptes_shared = max_ptes_shared; return count; } static struct kobj_attribute khugepaged_max_ptes_shared_attr = __ATTR_RW(max_ptes_shared); static struct attribute *khugepaged_attr[] = { &khugepaged_defrag_attr.attr, &khugepaged_max_ptes_none_attr.attr, &khugepaged_max_ptes_swap_attr.attr, &khugepaged_max_ptes_shared_attr.attr, &pages_to_scan_attr.attr, &pages_collapsed_attr.attr, &full_scans_attr.attr, &scan_sleep_millisecs_attr.attr, &alloc_sleep_millisecs_attr.attr, NULL, }; struct attribute_group khugepaged_attr_group = { .attrs = khugepaged_attr, .name = "khugepaged", }; #endif /* CONFIG_SYSFS */ int hugepage_madvise(struct vm_area_struct *vma, unsigned long *vm_flags, int advice) { switch (advice) { case MADV_HUGEPAGE: #ifdef CONFIG_S390 /* * qemu blindly sets MADV_HUGEPAGE on all allocations, but s390 * can't handle this properly after s390_enable_sie, so we simply * ignore the madvise to prevent qemu from causing a SIGSEGV. */ if (mm_has_pgste(vma->vm_mm)) return 0; #endif *vm_flags &= ~VM_NOHUGEPAGE; *vm_flags |= VM_HUGEPAGE; /* * If the vma become good for khugepaged to scan, * register it here without waiting a page fault that * may not happen any time soon. */ khugepaged_enter_vma(vma, *vm_flags); break; case MADV_NOHUGEPAGE: *vm_flags &= ~VM_HUGEPAGE; *vm_flags |= VM_NOHUGEPAGE; /* * Setting VM_NOHUGEPAGE will prevent khugepaged from scanning * this vma even if we leave the mm registered in khugepaged if * it got registered before VM_NOHUGEPAGE was set. */ break; } return 0; } int __init khugepaged_init(void) { mm_slot_cache = KMEM_CACHE(khugepaged_mm_slot, 0); if (!mm_slot_cache) return -ENOMEM; khugepaged_pages_to_scan = HPAGE_PMD_NR * 8; khugepaged_max_ptes_none = HPAGE_PMD_NR - 1; khugepaged_max_ptes_swap = HPAGE_PMD_NR / 8; khugepaged_max_ptes_shared = HPAGE_PMD_NR / 2; return 0; } void __init khugepaged_destroy(void) { kmem_cache_destroy(mm_slot_cache); } static inline int hpage_collapse_test_exit(struct mm_struct *mm) { return atomic_read(&mm->mm_users) == 0; } static inline int hpage_collapse_test_exit_or_disable(struct mm_struct *mm) { return hpage_collapse_test_exit(mm) || test_bit(MMF_DISABLE_THP, &mm->flags); } static bool hugepage_pmd_enabled(void) { /* * We cover the anon, shmem and the file-backed case here; file-backed * hugepages, when configured in, are determined by the global control. * Anon pmd-sized hugepages are determined by the pmd-size control. * Shmem pmd-sized hugepages are also determined by its pmd-size control, * except when the global shmem_huge is set to SHMEM_HUGE_DENY. */ if (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) && hugepage_global_enabled()) return true; if (test_bit(PMD_ORDER, &huge_anon_orders_always)) return true; if (test_bit(PMD_ORDER, &huge_anon_orders_madvise)) return true; if (test_bit(PMD_ORDER, &huge_anon_orders_inherit) && hugepage_global_enabled()) return true; if (IS_ENABLED(CONFIG_SHMEM) && shmem_hpage_pmd_enabled()) return true; return false; } void __khugepaged_enter(struct mm_struct *mm) { struct khugepaged_mm_slot *mm_slot; struct mm_slot *slot; int wakeup; /* __khugepaged_exit() must not run from under us */ VM_BUG_ON_MM(hpage_collapse_test_exit(mm), mm); if (unlikely(test_and_set_bit(MMF_VM_HUGEPAGE, &mm->flags))) return; mm_slot = mm_slot_alloc(mm_slot_cache); if (!mm_slot) return; slot = &mm_slot->slot; spin_lock(&khugepaged_mm_lock); mm_slot_insert(mm_slots_hash, mm, slot); /* * Insert just behind the scanning cursor, to let the area settle * down a little. */ wakeup = list_empty(&khugepaged_scan.mm_head); list_add_tail(&slot->mm_node, &khugepaged_scan.mm_head); spin_unlock(&khugepaged_mm_lock); mmgrab(mm); if (wakeup) wake_up_interruptible(&khugepaged_wait); } void khugepaged_enter_vma(struct vm_area_struct *vma, unsigned long vm_flags) { if (!test_bit(MMF_VM_HUGEPAGE, &vma->vm_mm->flags) && hugepage_pmd_enabled()) { if (thp_vma_allowable_order(vma, vm_flags, TVA_ENFORCE_SYSFS, PMD_ORDER)) __khugepaged_enter(vma->vm_mm); } } void __khugepaged_exit(struct mm_struct *mm) { struct khugepaged_mm_slot *mm_slot; struct mm_slot *slot; int free = 0; spin_lock(&khugepaged_mm_lock); slot = mm_slot_lookup(mm_slots_hash, mm); mm_slot = mm_slot_entry(slot, struct khugepaged_mm_slot, slot); if (mm_slot && khugepaged_scan.mm_slot != mm_slot) { hash_del(&slot->hash); list_del(&slot->mm_node); free = 1; } spin_unlock(&khugepaged_mm_lock); if (free) { clear_bit(MMF_VM_HUGEPAGE, &mm->flags); mm_slot_free(mm_slot_cache, mm_slot); mmdrop(mm); } else if (mm_slot) { /* * This is required to serialize against * hpage_collapse_test_exit() (which is guaranteed to run * under mmap sem read mode). Stop here (after we return all * pagetables will be destroyed) until khugepaged has finished * working on the pagetables under the mmap_lock. */ mmap_write_lock(mm); mmap_write_unlock(mm); } } static void release_pte_folio(struct folio *folio) { node_stat_mod_folio(folio, NR_ISOLATED_ANON + folio_is_file_lru(folio), -folio_nr_pages(folio)); folio_unlock(folio); folio_putback_lru(folio); } static void release_pte_pages(pte_t *pte, pte_t *_pte, struct list_head *compound_pagelist) { struct folio *folio, *tmp; while (--_pte >= pte) { pte_t pteval = ptep_get(_pte); unsigned long pfn; if (pte_none(pteval)) continue; pfn = pte_pfn(pteval); if (is_zero_pfn(pfn)) continue; folio = pfn_folio(pfn); if (folio_test_large(folio)) continue; release_pte_folio(folio); } list_for_each_entry_safe(folio, tmp, compound_pagelist, lru) { list_del(&folio->lru); release_pte_folio(folio); } } static int __collapse_huge_page_isolate(struct vm_area_struct *vma, unsigned long address, pte_t *pte, struct collapse_control *cc, struct list_head *compound_pagelist) { struct page *page = NULL; struct folio *folio = NULL; pte_t *_pte; int none_or_zero = 0, shared = 0, result = SCAN_FAIL, referenced = 0; bool writable = false; for (_pte = pte; _pte < pte + HPAGE_PMD_NR; _pte++, address += PAGE_SIZE) { pte_t pteval = ptep_get(_pte); if (pte_none(pteval) || (pte_present(pteval) && is_zero_pfn(pte_pfn(pteval)))) { ++none_or_zero; if (!userfaultfd_armed(vma) && (!cc->is_khugepaged || none_or_zero <= khugepaged_max_ptes_none)) { continue; } else { result = SCAN_EXCEED_NONE_PTE; count_vm_event(THP_SCAN_EXCEED_NONE_PTE); goto out; } } if (!pte_present(pteval)) { result = SCAN_PTE_NON_PRESENT; goto out; } if (pte_uffd_wp(pteval)) { result = SCAN_PTE_UFFD_WP; goto out; } page = vm_normal_page(vma, address, pteval); if (unlikely(!page) || unlikely(is_zone_device_page(page))) { result = SCAN_PAGE_NULL; goto out; } folio = page_folio(page); VM_BUG_ON_FOLIO(!folio_test_anon(folio), folio); /* See hpage_collapse_scan_pmd(). */ if (folio_maybe_mapped_shared(folio)) { ++shared; if (cc->is_khugepaged && shared > khugepaged_max_ptes_shared) { result = SCAN_EXCEED_SHARED_PTE; count_vm_event(THP_SCAN_EXCEED_SHARED_PTE); goto out; } } if (folio_test_large(folio)) { struct folio *f; /* * Check if we have dealt with the compound page * already */ list_for_each_entry(f, compound_pagelist, lru) { if (folio == f) goto next; } } /* * We can do it before folio_isolate_lru because the * folio can't be freed from under us. NOTE: PG_lock * is needed to serialize against split_huge_page * when invoked from the VM. */ if (!folio_trylock(folio)) { result = SCAN_PAGE_LOCK; goto out; } /* * Check if the page has any GUP (or other external) pins. * * The page table that maps the page has been already unlinked * from the page table tree and this process cannot get * an additional pin on the page. * * New pins can come later if the page is shared across fork, * but not from this process. The other process cannot write to * the page, only trigger CoW. */ if (folio_expected_ref_count(folio) != folio_ref_count(folio)) { folio_unlock(folio); result = SCAN_PAGE_COUNT; goto out; } /* * Isolate the page to avoid collapsing an hugepage * currently in use by the VM. */ if (!folio_isolate_lru(folio)) { folio_unlock(folio); result = SCAN_DEL_PAGE_LRU; goto out; } node_stat_mod_folio(folio, NR_ISOLATED_ANON + folio_is_file_lru(folio), folio_nr_pages(folio)); VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio); VM_BUG_ON_FOLIO(folio_test_lru(folio), folio); if (folio_test_large(folio)) list_add_tail(&folio->lru, compound_pagelist); next: /* * If collapse was initiated by khugepaged, check that there is * enough young pte to justify collapsing the page */ if (cc->is_khugepaged && (pte_young(pteval) || folio_test_young(folio) || folio_test_referenced(folio) || mmu_notifier_test_young(vma->vm_mm, address))) referenced++; if (pte_write(pteval)) writable = true; } if (unlikely(!writable)) { result = SCAN_PAGE_RO; } else if (unlikely(cc->is_khugepaged && !referenced)) { result = SCAN_LACK_REFERENCED_PAGE; } else { result = SCAN_SUCCEED; trace_mm_collapse_huge_page_isolate(folio, none_or_zero, referenced, writable, result); return result; } out: release_pte_pages(pte, _pte, compound_pagelist); trace_mm_collapse_huge_page_isolate(folio, none_or_zero, referenced, writable, result); return result; } static void __collapse_huge_page_copy_succeeded(pte_t *pte, struct vm_area_struct *vma, unsigned long address, spinlock_t *ptl, struct list_head *compound_pagelist) { struct folio *src, *tmp; pte_t *_pte; pte_t pteval; for (_pte = pte; _pte < pte + HPAGE_PMD_NR; _pte++, address += PAGE_SIZE) { pteval = ptep_get(_pte); if (pte_none(pteval) || is_zero_pfn(pte_pfn(pteval))) { add_mm_counter(vma->vm_mm, MM_ANONPAGES, 1); if (is_zero_pfn(pte_pfn(pteval))) { /* * ptl mostly unnecessary. */ spin_lock(ptl); ptep_clear(vma->vm_mm, address, _pte); spin_unlock(ptl); ksm_might_unmap_zero_page(vma->vm_mm, pteval); } } else { struct page *src_page = pte_page(pteval); src = page_folio(src_page); if (!folio_test_large(src)) release_pte_folio(src); /* * ptl mostly unnecessary, but preempt has to * be disabled to update the per-cpu stats * inside folio_remove_rmap_pte(). */ spin_lock(ptl); ptep_clear(vma->vm_mm, address, _pte); folio_remove_rmap_pte(src, src_page, vma); spin_unlock(ptl); free_folio_and_swap_cache(src); } } list_for_each_entry_safe(src, tmp, compound_pagelist, lru) { list_del(&src->lru); node_stat_sub_folio(src, NR_ISOLATED_ANON + folio_is_file_lru(src)); folio_unlock(src); free_swap_cache(src); folio_putback_lru(src); } } static void __collapse_huge_page_copy_failed(pte_t *pte, pmd_t *pmd, pmd_t orig_pmd, struct vm_area_struct *vma, struct list_head *compound_pagelist) { spinlock_t *pmd_ptl; /* * Re-establish the PMD to point to the original page table * entry. Restoring PMD needs to be done prior to releasing * pages. Since pages are still isolated and locked here, * acquiring anon_vma_lock_write is unnecessary. */ pmd_ptl = pmd_lock(vma->vm_mm, pmd); pmd_populate(vma->vm_mm, pmd, pmd_pgtable(orig_pmd)); spin_unlock(pmd_ptl); /* * Release both raw and compound pages isolated * in __collapse_huge_page_isolate. */ release_pte_pages(pte, pte + HPAGE_PMD_NR, compound_pagelist); } /* * __collapse_huge_page_copy - attempts to copy memory contents from raw * pages to a hugepage. Cleans up the raw pages if copying succeeds; * otherwise restores the original page table and releases isolated raw pages. * Returns SCAN_SUCCEED if copying succeeds, otherwise returns SCAN_COPY_MC. * * @pte: starting of the PTEs to copy from * @folio: the new hugepage to copy contents to * @pmd: pointer to the new hugepage's PMD * @orig_pmd: the original raw pages' PMD * @vma: the original raw pages' virtual memory area * @address: starting address to copy * @ptl: lock on raw pages' PTEs * @compound_pagelist: list that stores compound pages */ static int __collapse_huge_page_copy(pte_t *pte, struct folio *folio, pmd_t *pmd, pmd_t orig_pmd, struct vm_area_struct *vma, unsigned long address, spinlock_t *ptl, struct list_head *compound_pagelist) { unsigned int i; int result = SCAN_SUCCEED; /* * Copying pages' contents is subject to memory poison at any iteration. */ for (i = 0; i < HPAGE_PMD_NR; i++) { pte_t pteval = ptep_get(pte + i); struct page *page = folio_page(folio, i); unsigned long src_addr = address + i * PAGE_SIZE; struct page *src_page; if (pte_none(pteval) || is_zero_pfn(pte_pfn(pteval))) { clear_user_highpage(page, src_addr); continue; } src_page = pte_page(pteval); if (copy_mc_user_highpage(page, src_page, src_addr, vma) > 0) { result = SCAN_COPY_MC; break; } } if (likely(result == SCAN_SUCCEED)) __collapse_huge_page_copy_succeeded(pte, vma, address, ptl, compound_pagelist); else __collapse_huge_page_copy_failed(pte, pmd, orig_pmd, vma, compound_pagelist); return result; } static void khugepaged_alloc_sleep(void) { DEFINE_WAIT(wait); add_wait_queue(&khugepaged_wait, &wait); __set_current_state(TASK_INTERRUPTIBLE|TASK_FREEZABLE); schedule_timeout(msecs_to_jiffies(khugepaged_alloc_sleep_millisecs)); remove_wait_queue(&khugepaged_wait, &wait); } struct collapse_control khugepaged_collapse_control = { .is_khugepaged = true, }; static bool hpage_collapse_scan_abort(int nid, struct collapse_control *cc) { int i; /* * If node_reclaim_mode is disabled, then no extra effort is made to * allocate memory locally. */ if (!node_reclaim_enabled()) return false; /* If there is a count for this node already, it must be acceptable */ if (cc->node_load[nid]) return false; for (i = 0; i < MAX_NUMNODES; i++) { if (!cc->node_load[i]) continue; if (node_distance(nid, i) > node_reclaim_distance) return true; } return false; } #define khugepaged_defrag() \ (transparent_hugepage_flags & \ (1<<TRANSPARENT_HUGEPAGE_DEFRAG_KHUGEPAGED_FLAG)) /* Defrag for khugepaged will enter direct reclaim/compaction if necessary */ static inline gfp_t alloc_hugepage_khugepaged_gfpmask(void) { return khugepaged_defrag() ? GFP_TRANSHUGE : GFP_TRANSHUGE_LIGHT; } #ifdef CONFIG_NUMA static int hpage_collapse_find_target_node(struct collapse_control *cc) { int nid, target_node = 0, max_value = 0; /* find first node with max normal pages hit */ for (nid = 0; nid < MAX_NUMNODES; nid++) if (cc->node_load[nid] > max_value) { max_value = cc->node_load[nid]; target_node = nid; } for_each_online_node(nid) { if (max_value == cc->node_load[nid]) node_set(nid, cc->alloc_nmask); } return target_node; } #else static int hpage_collapse_find_target_node(struct collapse_control *cc) { return 0; } #endif /* * If mmap_lock temporarily dropped, revalidate vma * before taking mmap_lock. * Returns enum scan_result value. */ static int hugepage_vma_revalidate(struct mm_struct *mm, unsigned long address, bool expect_anon, struct vm_area_struct **vmap, struct collapse_control *cc) { struct vm_area_struct *vma; unsigned long tva_flags = cc->is_khugepaged ? TVA_ENFORCE_SYSFS : 0; if (unlikely(hpage_collapse_test_exit_or_disable(mm))) return SCAN_ANY_PROCESS; *vmap = vma = find_vma(mm, address); if (!vma) return SCAN_VMA_NULL; if (!thp_vma_suitable_order(vma, address, PMD_ORDER)) return SCAN_ADDRESS_RANGE; if (!thp_vma_allowable_order(vma, vma->vm_flags, tva_flags, PMD_ORDER)) return SCAN_VMA_CHECK; /* * Anon VMA expected, the address may be unmapped then * remapped to file after khugepaged reaquired the mmap_lock. * * thp_vma_allowable_order may return true for qualified file * vmas. */ if (expect_anon && (!(*vmap)->anon_vma || !vma_is_anonymous(*vmap))) return SCAN_PAGE_ANON; return SCAN_SUCCEED; } static inline int check_pmd_state(pmd_t *pmd) { pmd_t pmde = pmdp_get_lockless(pmd); if (pmd_none(pmde)) return SCAN_PMD_NONE; if (!pmd_present(pmde)) return SCAN_PMD_NULL; if (pmd_trans_huge(pmde)) return SCAN_PMD_MAPPED; if (pmd_devmap(pmde)) return SCAN_PMD_NULL; if (pmd_bad(pmde)) return SCAN_PMD_NULL; return SCAN_SUCCEED; } static int find_pmd_or_thp_or_none(struct mm_struct *mm, unsigned long address, pmd_t **pmd) { *pmd = mm_find_pmd(mm, address); if (!*pmd) return SCAN_PMD_NULL; return check_pmd_state(*pmd); } static int check_pmd_still_valid(struct mm_struct *mm, unsigned long address, pmd_t *pmd) { pmd_t *new_pmd; int result = find_pmd_or_thp_or_none(mm, address, &new_pmd); if (result != SCAN_SUCCEED) return result; if (new_pmd != pmd) return SCAN_FAIL; return SCAN_SUCCEED; } /* * Bring missing pages in from swap, to complete THP collapse. * Only done if hpage_collapse_scan_pmd believes it is worthwhile. * * Called and returns without pte mapped or spinlocks held. * Returns result: if not SCAN_SUCCEED, mmap_lock has been released. */ static int __collapse_huge_page_swapin(struct mm_struct *mm, struct vm_area_struct *vma, unsigned long haddr, pmd_t *pmd, int referenced) { int swapped_in = 0; vm_fault_t ret = 0; unsigned long address, end = haddr + (HPAGE_PMD_NR * PAGE_SIZE); int result; pte_t *pte = NULL; spinlock_t *ptl; for (address = haddr; address < end; address += PAGE_SIZE) { struct vm_fault vmf = { .vma = vma, .address = address, .pgoff = linear_page_index(vma, address), .flags = FAULT_FLAG_ALLOW_RETRY, .pmd = pmd, }; if (!pte++) { /* * Here the ptl is only used to check pte_same() in * do_swap_page(), so readonly version is enough. */ pte = pte_offset_map_ro_nolock(mm, pmd, address, &ptl); if (!pte) { mmap_read_unlock(mm); result = SCAN_PMD_NULL; goto out; } } vmf.orig_pte = ptep_get_lockless(pte); if (!is_swap_pte(vmf.orig_pte)) continue; vmf.pte = pte; vmf.ptl = ptl; ret = do_swap_page(&vmf); /* Which unmaps pte (after perhaps re-checking the entry) */ pte = NULL; /* * do_swap_page returns VM_FAULT_RETRY with released mmap_lock. * Note we treat VM_FAULT_RETRY as VM_FAULT_ERROR here because * we do not retry here and swap entry will remain in pagetable * resulting in later failure. */ if (ret & VM_FAULT_RETRY) { /* Likely, but not guaranteed, that page lock failed */ result = SCAN_PAGE_LOCK; goto out; } if (ret & VM_FAULT_ERROR) { mmap_read_unlock(mm); result = SCAN_FAIL; goto out; } swapped_in++; } if (pte) pte_unmap(pte); /* Drain LRU cache to remove extra pin on the swapped in pages */ if (swapped_in) lru_add_drain(); result = SCAN_SUCCEED; out: trace_mm_collapse_huge_page_swapin(mm, swapped_in, referenced, result); return result; } static int alloc_charge_folio(struct folio **foliop, struct mm_struct *mm, struct collapse_control *cc) { gfp_t gfp = (cc->is_khugepaged ? alloc_hugepage_khugepaged_gfpmask() : GFP_TRANSHUGE); int node = hpage_collapse_find_target_node(cc); struct folio *folio; folio = __folio_alloc(gfp, HPAGE_PMD_ORDER, node, &cc->alloc_nmask); if (!folio) { *foliop = NULL; count_vm_event(THP_COLLAPSE_ALLOC_FAILED); return SCAN_ALLOC_HUGE_PAGE_FAIL; } count_vm_event(THP_COLLAPSE_ALLOC); if (unlikely(mem_cgroup_charge(folio, mm, gfp))) { folio_put(folio); *foliop = NULL; return SCAN_CGROUP_CHARGE_FAIL; } count_memcg_folio_events(folio, THP_COLLAPSE_ALLOC, 1); *foliop = folio; return SCAN_SUCCEED; } static int collapse_huge_page(struct mm_struct *mm, unsigned long address, int referenced, int unmapped, struct collapse_control *cc) { LIST_HEAD(compound_pagelist); pmd_t *pmd, _pmd; pte_t *pte; pgtable_t pgtable; struct folio *folio; spinlock_t *pmd_ptl, *pte_ptl; int result = SCAN_FAIL; struct vm_area_struct *vma; struct mmu_notifier_range range; VM_BUG_ON(address & ~HPAGE_PMD_MASK); /* * Before allocating the hugepage, release the mmap_lock read lock. * The allocation can take potentially a long time if it involves * sync compaction, and we do not need to hold the mmap_lock during * that. We will recheck the vma after taking it again in write mode. */ mmap_read_unlock(mm); result = alloc_charge_folio(&folio, mm, cc); if (result != SCAN_SUCCEED) goto out_nolock; mmap_read_lock(mm); result = hugepage_vma_revalidate(mm, address, true, &vma, cc); if (result != SCAN_SUCCEED) { mmap_read_unlock(mm); goto out_nolock; } result = find_pmd_or_thp_or_none(mm, address, &pmd); if (result != SCAN_SUCCEED) { mmap_read_unlock(mm); goto out_nolock; } if (unmapped) { /* * __collapse_huge_page_swapin will return with mmap_lock * released when it fails. So we jump out_nolock directly in * that case. Continuing to collapse causes inconsistency. */ result = __collapse_huge_page_swapin(mm, vma, address, pmd, referenced); if (result != SCAN_SUCCEED) goto out_nolock; } mmap_read_unlock(mm); /* * Prevent all access to pagetables with the exception of * gup_fast later handled by the ptep_clear_flush and the VM * handled by the anon_vma lock + PG_lock. * * UFFDIO_MOVE is prevented to race as well thanks to the * mmap_lock. */ mmap_write_lock(mm); result = hugepage_vma_revalidate(mm, address, true, &vma, cc); if (result != SCAN_SUCCEED) goto out_up_write; /* check if the pmd is still valid */ result = check_pmd_still_valid(mm, address, pmd); if (result != SCAN_SUCCEED) goto out_up_write; vma_start_write(vma); anon_vma_lock_write(vma->anon_vma); mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, mm, address, address + HPAGE_PMD_SIZE); mmu_notifier_invalidate_range_start(&range); pmd_ptl = pmd_lock(mm, pmd); /* probably unnecessary */ /* * This removes any huge TLB entry from the CPU so we won't allow * huge and small TLB entries for the same virtual address to * avoid the risk of CPU bugs in that area. * * Parallel GUP-fast is fine since GUP-fast will back off when * it detects PMD is changed. */ _pmd = pmdp_collapse_flush(vma, address, pmd); spin_unlock(pmd_ptl); mmu_notifier_invalidate_range_end(&range); tlb_remove_table_sync_one(); pte = pte_offset_map_lock(mm, &_pmd, address, &pte_ptl); if (pte) { result = __collapse_huge_page_isolate(vma, address, pte, cc, &compound_pagelist); spin_unlock(pte_ptl); } else { result = SCAN_PMD_NULL; } if (unlikely(result != SCAN_SUCCEED)) { if (pte) pte_unmap(pte); spin_lock(pmd_ptl); BUG_ON(!pmd_none(*pmd)); /* * We can only use set_pmd_at when establishing * hugepmds and never for establishing regular pmds that * points to regular pagetables. Use pmd_populate for that */ pmd_populate(mm, pmd, pmd_pgtable(_pmd)); spin_unlock(pmd_ptl); anon_vma_unlock_write(vma->anon_vma); goto out_up_write; } /* * All pages are isolated and locked so anon_vma rmap * can't run anymore. */ anon_vma_unlock_write(vma->anon_vma); result = __collapse_huge_page_copy(pte, folio, pmd, _pmd, vma, address, pte_ptl, &compound_pagelist); pte_unmap(pte); if (unlikely(result != SCAN_SUCCEED)) goto out_up_write; /* * The smp_wmb() inside __folio_mark_uptodate() ensures the * copy_huge_page writes become visible before the set_pmd_at() * write. */ __folio_mark_uptodate(folio); pgtable = pmd_pgtable(_pmd); _pmd = folio_mk_pmd(folio, vma->vm_page_prot); _pmd = maybe_pmd_mkwrite(pmd_mkdirty(_pmd), vma); spin_lock(pmd_ptl); BUG_ON(!pmd_none(*pmd)); folio_add_new_anon_rmap(folio, vma, address, RMAP_EXCLUSIVE); folio_add_lru_vma(folio, vma); pgtable_trans_huge_deposit(mm, pmd, pgtable); set_pmd_at(mm, address, pmd, _pmd); update_mmu_cache_pmd(vma, address, pmd); deferred_split_folio(folio, false); spin_unlock(pmd_ptl); folio = NULL; result = SCAN_SUCCEED; out_up_write: mmap_write_unlock(mm); out_nolock: if (folio) folio_put(folio); trace_mm_collapse_huge_page(mm, result == SCAN_SUCCEED, result); return result; } static int hpage_collapse_scan_pmd(struct mm_struct *mm, struct vm_area_struct *vma, unsigned long address, bool *mmap_locked, struct collapse_control *cc) { pmd_t *pmd; pte_t *pte, *_pte; int result = SCAN_FAIL, referenced = 0; int none_or_zero = 0, shared = 0; struct page *page = NULL; struct folio *folio = NULL; unsigned long _address; spinlock_t *ptl; int node = NUMA_NO_NODE, unmapped = 0; bool writable = false; VM_BUG_ON(address & ~HPAGE_PMD_MASK); result = find_pmd_or_thp_or_none(mm, address, &pmd); if (result != SCAN_SUCCEED) goto out; memset(cc->node_load, 0, sizeof(cc->node_load)); nodes_clear(cc->alloc_nmask); pte = pte_offset_map_lock(mm, pmd, address, &ptl); if (!pte) { result = SCAN_PMD_NULL; goto out; } for (_address = address, _pte = pte; _pte < pte + HPAGE_PMD_NR; _pte++, _address += PAGE_SIZE) { pte_t pteval = ptep_get(_pte); if (is_swap_pte(pteval)) { ++unmapped; if (!cc->is_khugepaged || unmapped <= khugepaged_max_ptes_swap) { /* * Always be strict with uffd-wp * enabled swap entries. Please see * comment below for pte_uffd_wp(). */ if (pte_swp_uffd_wp_any(pteval)) { result = SCAN_PTE_UFFD_WP; goto out_unmap; } continue; } else { result = SCAN_EXCEED_SWAP_PTE; count_vm_event(THP_SCAN_EXCEED_SWAP_PTE); goto out_unmap; } } if (pte_none(pteval) || is_zero_pfn(pte_pfn(pteval))) { ++none_or_zero; if (!userfaultfd_armed(vma) && (!cc->is_khugepaged || none_or_zero <= khugepaged_max_ptes_none)) { continue; } else { result = SCAN_EXCEED_NONE_PTE; count_vm_event(THP_SCAN_EXCEED_NONE_PTE); goto out_unmap; } } if (pte_uffd_wp(pteval)) { /* * Don't collapse the page if any of the small * PTEs are armed with uffd write protection. * Here we can also mark the new huge pmd as * write protected if any of the small ones is * marked but that could bring unknown * userfault messages that falls outside of * the registered range. So, just be simple. */ result = SCAN_PTE_UFFD_WP; goto out_unmap; } if (pte_write(pteval)) writable = true; page = vm_normal_page(vma, _address, pteval); if (unlikely(!page) || unlikely(is_zone_device_page(page))) { result = SCAN_PAGE_NULL; goto out_unmap; } folio = page_folio(page); if (!folio_test_anon(folio)) { result = SCAN_PAGE_ANON; goto out_unmap; } /* * We treat a single page as shared if any part of the THP * is shared. */ if (folio_maybe_mapped_shared(folio)) { ++shared; if (cc->is_khugepaged && shared > khugepaged_max_ptes_shared) { result = SCAN_EXCEED_SHARED_PTE; count_vm_event(THP_SCAN_EXCEED_SHARED_PTE); goto out_unmap; } } /* * Record which node the original page is from and save this * information to cc->node_load[]. * Khugepaged will allocate hugepage from the node has the max * hit record. */ node = folio_nid(folio); if (hpage_collapse_scan_abort(node, cc)) { result = SCAN_SCAN_ABORT; goto out_unmap; } cc->node_load[node]++; if (!folio_test_lru(folio)) { result = SCAN_PAGE_LRU; goto out_unmap; } if (folio_test_locked(folio)) { result = SCAN_PAGE_LOCK; goto out_unmap; } /* * Check if the page has any GUP (or other external) pins. * * Here the check may be racy: * it may see folio_mapcount() > folio_ref_count(). * But such case is ephemeral we could always retry collapse * later. However it may report false positive if the page * has excessive GUP pins (i.e. 512). Anyway the same check * will be done again later the risk seems low. */ if (folio_expected_ref_count(folio) != folio_ref_count(folio)) { result = SCAN_PAGE_COUNT; goto out_unmap; } /* * If collapse was initiated by khugepaged, check that there is * enough young pte to justify collapsing the page */ if (cc->is_khugepaged && (pte_young(pteval) || folio_test_young(folio) || folio_test_referenced(folio) || mmu_notifier_test_young(vma->vm_mm, address))) referenced++; } if (!writable) { result = SCAN_PAGE_RO; } else if (cc->is_khugepaged && (!referenced || (unmapped && referenced < HPAGE_PMD_NR / 2))) { result = SCAN_LACK_REFERENCED_PAGE; } else { result = SCAN_SUCCEED; } out_unmap: pte_unmap_unlock(pte, ptl); if (result == SCAN_SUCCEED) { result = collapse_huge_page(mm, address, referenced, unmapped, cc); /* collapse_huge_page will return with the mmap_lock released */ *mmap_locked = false; } out: trace_mm_khugepaged_scan_pmd(mm, folio, writable, referenced, none_or_zero, result, unmapped); return result; } static void collect_mm_slot(struct khugepaged_mm_slot *mm_slot) { struct mm_slot *slot = &mm_slot->slot; struct mm_struct *mm = slot->mm; lockdep_assert_held(&khugepaged_mm_lock); if (hpage_collapse_test_exit(mm)) { /* free mm_slot */ hash_del(&slot->hash); list_del(&slot->mm_node); /* * Not strictly needed because the mm exited already. * * clear_bit(MMF_VM_HUGEPAGE, &mm->flags); */ /* khugepaged_mm_lock actually not necessary for the below */ mm_slot_free(mm_slot_cache, mm_slot); mmdrop(mm); } } /* folio must be locked, and mmap_lock must be held */ static int set_huge_pmd(struct vm_area_struct *vma, unsigned long addr, pmd_t *pmdp, struct folio *folio, struct page *page) { struct vm_fault vmf = { .vma = vma, .address = addr, .flags = 0, .pmd = pmdp, }; mmap_assert_locked(vma->vm_mm); if (do_set_pmd(&vmf, folio, page)) return SCAN_FAIL; folio_get(folio); return SCAN_SUCCEED; } /** * collapse_pte_mapped_thp - Try to collapse a pte-mapped THP for mm at * address haddr. * * @mm: process address space where collapse happens * @addr: THP collapse address * @install_pmd: If a huge PMD should be installed * * This function checks whether all the PTEs in the PMD are pointing to the * right THP. If so, retract the page table so the THP can refault in with * as pmd-mapped. Possibly install a huge PMD mapping the THP. */ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr, bool install_pmd) { struct mmu_notifier_range range; bool notified = false; unsigned long haddr = addr & HPAGE_PMD_MASK; struct vm_area_struct *vma = vma_lookup(mm, haddr); struct folio *folio; pte_t *start_pte, *pte; pmd_t *pmd, pgt_pmd; spinlock_t *pml = NULL, *ptl; int nr_ptes = 0, result = SCAN_FAIL; int i; mmap_assert_locked(mm); /* First check VMA found, in case page tables are being torn down */ if (!vma || !vma->vm_file || !range_in_vma(vma, haddr, haddr + HPAGE_PMD_SIZE)) return SCAN_VMA_CHECK; /* Fast check before locking page if already PMD-mapped */ result = find_pmd_or_thp_or_none(mm, haddr, &pmd); if (result == SCAN_PMD_MAPPED) return result; /* * If we are here, we've succeeded in replacing all the native pages * in the page cache with a single hugepage. If a mm were to fault-in * this memory (mapped by a suitably aligned VMA), we'd get the hugepage * and map it by a PMD, regardless of sysfs THP settings. As such, let's * analogously elide sysfs THP settings here. */ if (!thp_vma_allowable_order(vma, vma->vm_flags, 0, PMD_ORDER)) return SCAN_VMA_CHECK; /* Keep pmd pgtable for uffd-wp; see comment in retract_page_tables() */ if (userfaultfd_wp(vma)) return SCAN_PTE_UFFD_WP; folio = filemap_lock_folio(vma->vm_file->f_mapping, linear_page_index(vma, haddr)); if (IS_ERR(folio)) return SCAN_PAGE_NULL; if (folio_order(folio) != HPAGE_PMD_ORDER) { result = SCAN_PAGE_COMPOUND; goto drop_folio; } result = find_pmd_or_thp_or_none(mm, haddr, &pmd); switch (result) { case SCAN_SUCCEED: break; case SCAN_PMD_NONE: /* * All pte entries have been removed and pmd cleared. * Skip all the pte checks and just update the pmd mapping. */ goto maybe_install_pmd; default: goto drop_folio; } result = SCAN_FAIL; start_pte = pte_offset_map_lock(mm, pmd, haddr, &ptl); if (!start_pte) /* mmap_lock + page lock should prevent this */ goto drop_folio; /* step 1: check all mapped PTEs are to the right huge page */ for (i = 0, addr = haddr, pte = start_pte; i < HPAGE_PMD_NR; i++, addr += PAGE_SIZE, pte++) { struct page *page; pte_t ptent = ptep_get(pte); /* empty pte, skip */ if (pte_none(ptent)) continue; /* page swapped out, abort */ if (!pte_present(ptent)) { result = SCAN_PTE_NON_PRESENT; goto abort; } page = vm_normal_page(vma, addr, ptent); if (WARN_ON_ONCE(page && is_zone_device_page(page))) page = NULL; /* * Note that uprobe, debugger, or MAP_PRIVATE may change the * page table, but the new page will not be a subpage of hpage. */ if (folio_page(folio, i) != page) goto abort; } pte_unmap_unlock(start_pte, ptl); mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, mm, haddr, haddr + HPAGE_PMD_SIZE); mmu_notifier_invalidate_range_start(&range); notified = true; /* * pmd_lock covers a wider range than ptl, and (if split from mm's * page_table_lock) ptl nests inside pml. The less time we hold pml, * the better; but userfaultfd's mfill_atomic_pte() on a private VMA * inserts a valid as-if-COWed PTE without even looking up page cache. * So page lock of folio does not protect from it, so we must not drop * ptl before pgt_pmd is removed, so uffd private needs pml taken now. */ if (userfaultfd_armed(vma) && !(vma->vm_flags & VM_SHARED)) pml = pmd_lock(mm, pmd); start_pte = pte_offset_map_rw_nolock(mm, pmd, haddr, &pgt_pmd, &ptl); if (!start_pte) /* mmap_lock + page lock should prevent this */ goto abort; if (!pml) spin_lock(ptl); else if (ptl != pml) spin_lock_nested(ptl, SINGLE_DEPTH_NESTING); if (unlikely(!pmd_same(pgt_pmd, pmdp_get_lockless(pmd)))) goto abort; /* step 2: clear page table and adjust rmap */ for (i = 0, addr = haddr, pte = start_pte; i < HPAGE_PMD_NR; i++, addr += PAGE_SIZE, pte++) { struct page *page; pte_t ptent = ptep_get(pte); if (pte_none(ptent)) continue; /* * We dropped ptl after the first scan, to do the mmu_notifier: * page lock stops more PTEs of the folio being faulted in, but * does not stop write faults COWing anon copies from existing * PTEs; and does not stop those being swapped out or migrated. */ if (!pte_present(ptent)) { result = SCAN_PTE_NON_PRESENT; goto abort; } page = vm_normal_page(vma, addr, ptent); if (folio_page(folio, i) != page) goto abort; /* * Must clear entry, or a racing truncate may re-remove it. * TLB flush can be left until pmdp_collapse_flush() does it. * PTE dirty? Shmem page is already dirty; file is read-only. */ ptep_clear(mm, addr, pte); folio_remove_rmap_pte(folio, page, vma); nr_ptes++; } if (!pml) spin_unlock(ptl); /* step 3: set proper refcount and mm_counters. */ if (nr_ptes) { folio_ref_sub(folio, nr_ptes); add_mm_counter(mm, mm_counter_file(folio), -nr_ptes); } /* step 4: remove empty page table */ if (!pml) { pml = pmd_lock(mm, pmd); if (ptl != pml) { spin_lock_nested(ptl, SINGLE_DEPTH_NESTING); if (unlikely(!pmd_same(pgt_pmd, pmdp_get_lockless(pmd)))) { flush_tlb_mm(mm); goto unlock; } } } pgt_pmd = pmdp_collapse_flush(vma, haddr, pmd); pmdp_get_lockless_sync(); pte_unmap_unlock(start_pte, ptl); if (ptl != pml) spin_unlock(pml); mmu_notifier_invalidate_range_end(&range); mm_dec_nr_ptes(mm); page_table_check_pte_clear_range(mm, haddr, pgt_pmd); pte_free_defer(mm, pmd_pgtable(pgt_pmd)); maybe_install_pmd: /* step 5: install pmd entry */ result = install_pmd ? set_huge_pmd(vma, haddr, pmd, folio, &folio->page) : SCAN_SUCCEED; goto drop_folio; abort: if (nr_ptes) { flush_tlb_mm(mm); folio_ref_sub(folio, nr_ptes); add_mm_counter(mm, mm_counter_file(folio), -nr_ptes); } unlock: if (start_pte) pte_unmap_unlock(start_pte, ptl); if (pml && pml != ptl) spin_unlock(pml); if (notified) mmu_notifier_invalidate_range_end(&range); drop_folio: folio_unlock(folio); folio_put(folio); return result; } static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff) { struct vm_area_struct *vma; i_mmap_lock_read(mapping); vma_interval_tree_foreach(vma, &mapping->i_mmap, pgoff, pgoff) { struct mmu_notifier_range range; struct mm_struct *mm; unsigned long addr; pmd_t *pmd, pgt_pmd; spinlock_t *pml; spinlock_t *ptl; bool success = false; /* * Check vma->anon_vma to exclude MAP_PRIVATE mappings that * got written to. These VMAs are likely not worth removing * page tables from, as PMD-mapping is likely to be split later. */ if (READ_ONCE(vma->anon_vma)) continue; addr = vma->vm_start + ((pgoff - vma->vm_pgoff) << PAGE_SHIFT); if (addr & ~HPAGE_PMD_MASK || vma->vm_end < addr + HPAGE_PMD_SIZE) continue; mm = vma->vm_mm; if (find_pmd_or_thp_or_none(mm, addr, &pmd) != SCAN_SUCCEED) continue; if (hpage_collapse_test_exit(mm)) continue; /* * When a vma is registered with uffd-wp, we cannot recycle * the page table because there may be pte markers installed. * Other vmas can still have the same file mapped hugely, but * skip this one: it will always be mapped in small page size * for uffd-wp registered ranges. */ if (userfaultfd_wp(vma)) continue; /* PTEs were notified when unmapped; but now for the PMD? */ mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, mm, addr, addr + HPAGE_PMD_SIZE); mmu_notifier_invalidate_range_start(&range); pml = pmd_lock(mm, pmd); /* * The lock of new_folio is still held, we will be blocked in * the page fault path, which prevents the pte entries from * being set again. So even though the old empty PTE page may be * concurrently freed and a new PTE page is filled into the pmd * entry, it is still empty and can be removed. * * So here we only need to recheck if the state of pmd entry * still meets our requirements, rather than checking pmd_same() * like elsewhere. */ if (check_pmd_state(pmd) != SCAN_SUCCEED) goto drop_pml; ptl = pte_lockptr(mm, pmd); if (ptl != pml) spin_lock_nested(ptl, SINGLE_DEPTH_NESTING); /* * Huge page lock is still held, so normally the page table * must remain empty; and we have already skipped anon_vma * and userfaultfd_wp() vmas. But since the mmap_lock is not * held, it is still possible for a racing userfaultfd_ioctl() * to have inserted ptes or markers. Now that we hold ptlock, * repeating the anon_vma check protects from one category, * and repeating the userfaultfd_wp() check from another. */ if (likely(!vma->anon_vma && !userfaultfd_wp(vma))) { pgt_pmd = pmdp_collapse_flush(vma, addr, pmd); pmdp_get_lockless_sync(); success = true; } if (ptl != pml) spin_unlock(ptl); drop_pml: spin_unlock(pml); mmu_notifier_invalidate_range_end(&range); if (success) { mm_dec_nr_ptes(mm); page_table_check_pte_clear_range(mm, addr, pgt_pmd); pte_free_defer(mm, pmd_pgtable(pgt_pmd)); } } i_mmap_unlock_read(mapping); } /** * collapse_file - collapse filemap/tmpfs/shmem pages into huge one. * * @mm: process address space where collapse happens * @addr: virtual collapse start address * @file: file that collapse on * @start: collapse start address * @cc: collapse context and scratchpad * * Basic scheme is simple, details are more complex: * - allocate and lock a new huge page; * - scan page cache, locking old pages * + swap/gup in pages if necessary; * - copy data to new page * - handle shmem holes * + re-validate that holes weren't filled by someone else * + check for userfaultfd * - finalize updates to the page cache; * - if replacing succeeds: * + unlock huge page; * + free old pages; * - if replacing failed; * + unlock old pages * + unlock and free huge page; */ static int collapse_file(struct mm_struct *mm, unsigned long addr, struct file *file, pgoff_t start, struct collapse_control *cc) { struct address_space *mapping = file->f_mapping; struct page *dst; struct folio *folio, *tmp, *new_folio; pgoff_t index = 0, end = start + HPAGE_PMD_NR; LIST_HEAD(pagelist); XA_STATE_ORDER(xas, &mapping->i_pages, start, HPAGE_PMD_ORDER); int nr_none = 0, result = SCAN_SUCCEED; bool is_shmem = shmem_file(file); VM_BUG_ON(!IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) && !is_shmem); VM_BUG_ON(start & (HPAGE_PMD_NR - 1)); result = alloc_charge_folio(&new_folio, mm, cc); if (result != SCAN_SUCCEED) goto out; mapping_set_update(&xas, mapping); __folio_set_locked(new_folio); if (is_shmem) __folio_set_swapbacked(new_folio); new_folio->index = start; new_folio->mapping = mapping; /* * Ensure we have slots for all the pages in the range. This is * almost certainly a no-op because most of the pages must be present */ do { xas_lock_irq(&xas); xas_create_range(&xas); if (!xas_error(&xas)) break; xas_unlock_irq(&xas); if (!xas_nomem(&xas, GFP_KERNEL)) { result = SCAN_FAIL; goto rollback; } } while (1); for (index = start; index < end;) { xas_set(&xas, index); folio = xas_load(&xas); VM_BUG_ON(index != xas.xa_index); if (is_shmem) { if (!folio) { /* * Stop if extent has been truncated or * hole-punched, and is now completely * empty. */ if (index == start) { if (!xas_next_entry(&xas, end - 1)) { result = SCAN_TRUNCATED; goto xa_locked; } } nr_none++; index++; continue; } if (xa_is_value(folio) || !folio_test_uptodate(folio)) { xas_unlock_irq(&xas); /* swap in or instantiate fallocated page */ if (shmem_get_folio(mapping->host, index, 0, &folio, SGP_NOALLOC)) { result = SCAN_FAIL; goto xa_unlocked; } /* drain lru cache to help folio_isolate_lru() */ lru_add_drain(); } else if (folio_trylock(folio)) { folio_get(folio); xas_unlock_irq(&xas); } else { result = SCAN_PAGE_LOCK; goto xa_locked; } } else { /* !is_shmem */ if (!folio || xa_is_value(folio)) { xas_unlock_irq(&xas); page_cache_sync_readahead(mapping, &file->f_ra, file, index, end - index); /* drain lru cache to help folio_isolate_lru() */ lru_add_drain(); folio = filemap_lock_folio(mapping, index); if (IS_ERR(folio)) { result = SCAN_FAIL; goto xa_unlocked; } } else if (folio_test_dirty(folio)) { /* * khugepaged only works on read-only fd, * so this page is dirty because it hasn't * been flushed since first write. There * won't be new dirty pages. * * Trigger async flush here and hope the * writeback is done when khugepaged * revisits this page. * * This is a one-off situation. We are not * forcing writeback in loop. */ xas_unlock_irq(&xas); filemap_flush(mapping); result = SCAN_FAIL; goto xa_unlocked; } else if (folio_test_writeback(folio)) { xas_unlock_irq(&xas); result = SCAN_FAIL; goto xa_unlocked; } else if (folio_trylock(folio)) { folio_get(folio); xas_unlock_irq(&xas); } else { result = SCAN_PAGE_LOCK; goto xa_locked; } } /* * The folio must be locked, so we can drop the i_pages lock * without racing with truncate. */ VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio); /* make sure the folio is up to date */ if (unlikely(!folio_test_uptodate(folio))) { result = SCAN_FAIL; goto out_unlock; } /* * If file was truncated then extended, or hole-punched, before * we locked the first folio, then a THP might be there already. * This will be discovered on the first iteration. */ if (folio_order(folio) == HPAGE_PMD_ORDER && folio->index == start) { /* Maybe PMD-mapped */ result = SCAN_PTE_MAPPED_HUGEPAGE; goto out_unlock; } if (folio_mapping(folio) != mapping) { result = SCAN_TRUNCATED; goto out_unlock; } if (!is_shmem && (folio_test_dirty(folio) || folio_test_writeback(folio))) { /* * khugepaged only works on read-only fd, so this * folio is dirty because it hasn't been flushed * since first write. */ result = SCAN_FAIL; goto out_unlock; } if (!folio_isolate_lru(folio)) { result = SCAN_DEL_PAGE_LRU; goto out_unlock; } if (!filemap_release_folio(folio, GFP_KERNEL)) { result = SCAN_PAGE_HAS_PRIVATE; folio_putback_lru(folio); goto out_unlock; } if (folio_mapped(folio)) try_to_unmap(folio, TTU_IGNORE_MLOCK | TTU_BATCH_FLUSH); xas_lock_irq(&xas); VM_BUG_ON_FOLIO(folio != xa_load(xas.xa, index), folio); /* * We control 2 + nr_pages references to the folio: * - we hold a pin on it; * - nr_pages reference from page cache; * - one from lru_isolate_folio; * If those are the only references, then any new usage * of the folio will have to fetch it from the page * cache. That requires locking the folio to handle * truncate, so any new usage will be blocked until we * unlock folio after collapse/during rollback. */ if (folio_ref_count(folio) != 2 + folio_nr_pages(folio)) { result = SCAN_PAGE_COUNT; xas_unlock_irq(&xas); folio_putback_lru(folio); goto out_unlock; } /* * Accumulate the folios that are being collapsed. */ list_add_tail(&folio->lru, &pagelist); index += folio_nr_pages(folio); continue; out_unlock: folio_unlock(folio); folio_put(folio); goto xa_unlocked; } if (!is_shmem) { filemap_nr_thps_inc(mapping); /* * Paired with the fence in do_dentry_open() -> get_write_access() * to ensure i_writecount is up to date and the update to nr_thps * is visible. Ensures the page cache will be truncated if the * file is opened writable. */ smp_mb(); if (inode_is_open_for_write(mapping->host)) { result = SCAN_FAIL; filemap_nr_thps_dec(mapping); } } xa_locked: xas_unlock_irq(&xas); xa_unlocked: /* * If collapse is successful, flush must be done now before copying. * If collapse is unsuccessful, does flush actually need to be done? * Do it anyway, to clear the state. */ try_to_unmap_flush(); if (result == SCAN_SUCCEED && nr_none && !shmem_charge(mapping->host, nr_none)) result = SCAN_FAIL; if (result != SCAN_SUCCEED) { nr_none = 0; goto rollback; } /* * The old folios are locked, so they won't change anymore. */ index = start; dst = folio_page(new_folio, 0); list_for_each_entry(folio, &pagelist, lru) { int i, nr_pages = folio_nr_pages(folio); while (index < folio->index) { clear_highpage(dst); index++; dst++; } for (i = 0; i < nr_pages; i++) { if (copy_mc_highpage(dst, folio_page(folio, i)) > 0) { result = SCAN_COPY_MC; goto rollback; } index++; dst++; } } while (index < end) { clear_highpage(dst); index++; dst++; } if (nr_none) { struct vm_area_struct *vma; int nr_none_check = 0; i_mmap_lock_read(mapping); xas_lock_irq(&xas); xas_set(&xas, start); for (index = start; index < end; index++) { if (!xas_next(&xas)) { xas_store(&xas, XA_RETRY_ENTRY); if (xas_error(&xas)) { result = SCAN_STORE_FAILED; goto immap_locked; } nr_none_check++; } } if (nr_none != nr_none_check) { result = SCAN_PAGE_FILLED; goto immap_locked; } /* * If userspace observed a missing page in a VMA with * a MODE_MISSING userfaultfd, then it might expect a * UFFD_EVENT_PAGEFAULT for that page. If so, we need to * roll back to avoid suppressing such an event. Since * wp/minor userfaultfds don't give userspace any * guarantees that the kernel doesn't fill a missing * page with a zero page, so they don't matter here. * * Any userfaultfds registered after this point will * not be able to observe any missing pages due to the * previously inserted retry entries. */ vma_interval_tree_foreach(vma, &mapping->i_mmap, start, end) { if (userfaultfd_missing(vma)) { result = SCAN_EXCEED_NONE_PTE; goto immap_locked; } } immap_locked: i_mmap_unlock_read(mapping); if (result != SCAN_SUCCEED) { xas_set(&xas, start); for (index = start; index < end; index++) { if (xas_next(&xas) == XA_RETRY_ENTRY) xas_store(&xas, NULL); } xas_unlock_irq(&xas); goto rollback; } } else { xas_lock_irq(&xas); } if (is_shmem) __lruvec_stat_mod_folio(new_folio, NR_SHMEM_THPS, HPAGE_PMD_NR); else __lruvec_stat_mod_folio(new_folio, NR_FILE_THPS, HPAGE_PMD_NR); if (nr_none) { __lruvec_stat_mod_folio(new_folio, NR_FILE_PAGES, nr_none); /* nr_none is always 0 for non-shmem. */ __lruvec_stat_mod_folio(new_folio, NR_SHMEM, nr_none); } /* * Mark new_folio as uptodate before inserting it into the * page cache so that it isn't mistaken for an fallocated but * unwritten page. */ folio_mark_uptodate(new_folio); folio_ref_add(new_folio, HPAGE_PMD_NR - 1); if (is_shmem) folio_mark_dirty(new_folio); folio_add_lru(new_folio); /* Join all the small entries into a single multi-index entry. */ xas_set_order(&xas, start, HPAGE_PMD_ORDER); xas_store(&xas, new_folio); WARN_ON_ONCE(xas_error(&xas)); xas_unlock_irq(&xas); /* * Remove pte page tables, so we can re-fault the page as huge. * If MADV_COLLAPSE, adjust result to call collapse_pte_mapped_thp(). */ retract_page_tables(mapping, start); if (cc && !cc->is_khugepaged) result = SCAN_PTE_MAPPED_HUGEPAGE; folio_unlock(new_folio); /* * The collapse has succeeded, so free the old folios. */ list_for_each_entry_safe(folio, tmp, &pagelist, lru) { list_del(&folio->lru); folio->mapping = NULL; folio_clear_active(folio); folio_clear_unevictable(folio); folio_unlock(folio); folio_put_refs(folio, 2 + folio_nr_pages(folio)); } goto out; rollback: /* Something went wrong: roll back page cache changes */ if (nr_none) { xas_lock_irq(&xas); mapping->nrpages -= nr_none; xas_unlock_irq(&xas); shmem_uncharge(mapping->host, nr_none); } list_for_each_entry_safe(folio, tmp, &pagelist, lru) { list_del(&folio->lru); folio_unlock(folio); folio_putback_lru(folio); folio_put(folio); } /* * Undo the updates of filemap_nr_thps_inc for non-SHMEM * file only. This undo is not needed unless failure is * due to SCAN_COPY_MC. */ if (!is_shmem && result == SCAN_COPY_MC) { filemap_nr_thps_dec(mapping); /* * Paired with the fence in do_dentry_open() -> get_write_access() * to ensure the update to nr_thps is visible. */ smp_mb(); } new_folio->mapping = NULL; folio_unlock(new_folio); folio_put(new_folio); out: VM_BUG_ON(!list_empty(&pagelist)); trace_mm_khugepaged_collapse_file(mm, new_folio, index, addr, is_shmem, file, HPAGE_PMD_NR, result); return result; } static int hpage_collapse_scan_file(struct mm_struct *mm, unsigned long addr, struct file *file, pgoff_t start, struct collapse_control *cc) { struct folio *folio = NULL; struct address_space *mapping = file->f_mapping; XA_STATE(xas, &mapping->i_pages, start); int present, swap; int node = NUMA_NO_NODE; int result = SCAN_SUCCEED; present = 0; swap = 0; memset(cc->node_load, 0, sizeof(cc->node_load)); nodes_clear(cc->alloc_nmask); rcu_read_lock(); xas_for_each(&xas, folio, start + HPAGE_PMD_NR - 1) { if (xas_retry(&xas, folio)) continue; if (xa_is_value(folio)) { swap += 1 << xas_get_order(&xas); if (cc->is_khugepaged && swap > khugepaged_max_ptes_swap) { result = SCAN_EXCEED_SWAP_PTE; count_vm_event(THP_SCAN_EXCEED_SWAP_PTE); break; } continue; } if (!folio_try_get(folio)) { xas_reset(&xas); continue; } if (unlikely(folio != xas_reload(&xas))) { folio_put(folio); xas_reset(&xas); continue; } if (folio_order(folio) == HPAGE_PMD_ORDER && folio->index == start) { /* Maybe PMD-mapped */ result = SCAN_PTE_MAPPED_HUGEPAGE; /* * For SCAN_PTE_MAPPED_HUGEPAGE, further processing * by the caller won't touch the page cache, and so * it's safe to skip LRU and refcount checks before * returning. */ folio_put(folio); break; } node = folio_nid(folio); if (hpage_collapse_scan_abort(node, cc)) { result = SCAN_SCAN_ABORT; folio_put(folio); break; } cc->node_load[node]++; if (!folio_test_lru(folio)) { result = SCAN_PAGE_LRU; folio_put(folio); break; } if (folio_expected_ref_count(folio) + 1 != folio_ref_count(folio)) { result = SCAN_PAGE_COUNT; folio_put(folio); break; } /* * We probably should check if the folio is referenced * here, but nobody would transfer pte_young() to * folio_test_referenced() for us. And rmap walk here * is just too costly... */ present += folio_nr_pages(folio); folio_put(folio); if (need_resched()) { xas_pause(&xas); cond_resched_rcu(); } } rcu_read_unlock(); if (result == SCAN_SUCCEED) { if (cc->is_khugepaged && present < HPAGE_PMD_NR - khugepaged_max_ptes_none) { result = SCAN_EXCEED_NONE_PTE; count_vm_event(THP_SCAN_EXCEED_NONE_PTE); } else { result = collapse_file(mm, addr, file, start, cc); } } trace_mm_khugepaged_scan_file(mm, folio, file, present, swap, result); return result; } static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result, struct collapse_control *cc) __releases(&khugepaged_mm_lock) __acquires(&khugepaged_mm_lock) { struct vma_iterator vmi; struct khugepaged_mm_slot *mm_slot; struct mm_slot *slot; struct mm_struct *mm; struct vm_area_struct *vma; int progress = 0; VM_BUG_ON(!pages); lockdep_assert_held(&khugepaged_mm_lock); *result = SCAN_FAIL; if (khugepaged_scan.mm_slot) { mm_slot = khugepaged_scan.mm_slot; slot = &mm_slot->slot; } else { slot = list_entry(khugepaged_scan.mm_head.next, struct mm_slot, mm_node); mm_slot = mm_slot_entry(slot, struct khugepaged_mm_slot, slot); khugepaged_scan.address = 0; khugepaged_scan.mm_slot = mm_slot; } spin_unlock(&khugepaged_mm_lock); mm = slot->mm; /* * Don't wait for semaphore (to avoid long wait times). Just move to * the next mm on the list. */ vma = NULL; if (unlikely(!mmap_read_trylock(mm))) goto breakouterloop_mmap_lock; progress++; if (unlikely(hpage_collapse_test_exit_or_disable(mm))) goto breakouterloop; vma_iter_init(&vmi, mm, khugepaged_scan.address); for_each_vma(vmi, vma) { unsigned long hstart, hend; cond_resched(); if (unlikely(hpage_collapse_test_exit_or_disable(mm))) { progress++; break; } if (!thp_vma_allowable_order(vma, vma->vm_flags, TVA_ENFORCE_SYSFS, PMD_ORDER)) { skip: progress++; continue; } hstart = round_up(vma->vm_start, HPAGE_PMD_SIZE); hend = round_down(vma->vm_end, HPAGE_PMD_SIZE); if (khugepaged_scan.address > hend) goto skip; if (khugepaged_scan.address < hstart) khugepaged_scan.address = hstart; VM_BUG_ON(khugepaged_scan.address & ~HPAGE_PMD_MASK); while (khugepaged_scan.address < hend) { bool mmap_locked = true; cond_resched(); if (unlikely(hpage_collapse_test_exit_or_disable(mm))) goto breakouterloop; VM_BUG_ON(khugepaged_scan.address < hstart || khugepaged_scan.address + HPAGE_PMD_SIZE > hend); if (!vma_is_anonymous(vma)) { struct file *file = get_file(vma->vm_file); pgoff_t pgoff = linear_page_index(vma, khugepaged_scan.address); mmap_read_unlock(mm); mmap_locked = false; *result = hpage_collapse_scan_file(mm, khugepaged_scan.address, file, pgoff, cc); fput(file); if (*result == SCAN_PTE_MAPPED_HUGEPAGE) { mmap_read_lock(mm); if (hpage_collapse_test_exit_or_disable(mm)) goto breakouterloop; *result = collapse_pte_mapped_thp(mm, khugepaged_scan.address, false); if (*result == SCAN_PMD_MAPPED) *result = SCAN_SUCCEED; mmap_read_unlock(mm); } } else { *result = hpage_collapse_scan_pmd(mm, vma, khugepaged_scan.address, &mmap_locked, cc); } if (*result == SCAN_SUCCEED) ++khugepaged_pages_collapsed; /* move to next address */ khugepaged_scan.address += HPAGE_PMD_SIZE; progress += HPAGE_PMD_NR; if (!mmap_locked) /* * We released mmap_lock so break loop. Note * that we drop mmap_lock before all hugepage * allocations, so if allocation fails, we are * guaranteed to break here and report the * correct result back to caller. */ goto breakouterloop_mmap_lock; if (progress >= pages) goto breakouterloop; } } breakouterloop: mmap_read_unlock(mm); /* exit_mmap will destroy ptes after this */ breakouterloop_mmap_lock: spin_lock(&khugepaged_mm_lock); VM_BUG_ON(khugepaged_scan.mm_slot != mm_slot); /* * Release the current mm_slot if this mm is about to die, or * if we scanned all vmas of this mm. */ if (hpage_collapse_test_exit(mm) || !vma) { /* * Make sure that if mm_users is reaching zero while * khugepaged runs here, khugepaged_exit will find * mm_slot not pointing to the exiting mm. */ if (slot->mm_node.next != &khugepaged_scan.mm_head) { slot = list_entry(slot->mm_node.next, struct mm_slot, mm_node); khugepaged_scan.mm_slot = mm_slot_entry(slot, struct khugepaged_mm_slot, slot); khugepaged_scan.address = 0; } else { khugepaged_scan.mm_slot = NULL; khugepaged_full_scans++; } collect_mm_slot(mm_slot); } return progress; } static int khugepaged_has_work(void) { return !list_empty(&khugepaged_scan.mm_head) && hugepage_pmd_enabled(); } static int khugepaged_wait_event(void) { return !list_empty(&khugepaged_scan.mm_head) || kthread_should_stop(); } static void khugepaged_do_scan(struct collapse_control *cc) { unsigned int progress = 0, pass_through_head = 0; unsigned int pages = READ_ONCE(khugepaged_pages_to_scan); bool wait = true; int result = SCAN_SUCCEED; lru_add_drain_all(); while (true) { cond_resched(); if (unlikely(kthread_should_stop())) break; spin_lock(&khugepaged_mm_lock); if (!khugepaged_scan.mm_slot) pass_through_head++; if (khugepaged_has_work() && pass_through_head < 2) progress += khugepaged_scan_mm_slot(pages - progress, &result, cc); else progress = pages; spin_unlock(&khugepaged_mm_lock); if (progress >= pages) break; if (result == SCAN_ALLOC_HUGE_PAGE_FAIL) { /* * If fail to allocate the first time, try to sleep for * a while. When hit again, cancel the scan. */ if (!wait) break; wait = false; khugepaged_alloc_sleep(); } } } static bool khugepaged_should_wakeup(void) { return kthread_should_stop() || time_after_eq(jiffies, khugepaged_sleep_expire); } static void khugepaged_wait_work(void) { if (khugepaged_has_work()) { const unsigned long scan_sleep_jiffies = msecs_to_jiffies(khugepaged_scan_sleep_millisecs); if (!scan_sleep_jiffies) return; khugepaged_sleep_expire = jiffies + scan_sleep_jiffies; wait_event_freezable_timeout(khugepaged_wait, khugepaged_should_wakeup(), scan_sleep_jiffies); return; } if (hugepage_pmd_enabled()) wait_event_freezable(khugepaged_wait, khugepaged_wait_event()); } static int khugepaged(void *none) { struct khugepaged_mm_slot *mm_slot; set_freezable(); set_user_nice(current, MAX_NICE); while (!kthread_should_stop()) { khugepaged_do_scan(&khugepaged_collapse_control); khugepaged_wait_work(); } spin_lock(&khugepaged_mm_lock); mm_slot = khugepaged_scan.mm_slot; khugepaged_scan.mm_slot = NULL; if (mm_slot) collect_mm_slot(mm_slot); spin_unlock(&khugepaged_mm_lock); return 0; } static void set_recommended_min_free_kbytes(void) { struct zone *zone; int nr_zones = 0; unsigned long recommended_min; if (!hugepage_pmd_enabled()) { calculate_min_free_kbytes(); goto update_wmarks; } for_each_populated_zone(zone) { /* * We don't need to worry about fragmentation of * ZONE_MOVABLE since it only has movable pages. */ if (zone_idx(zone) > gfp_zone(GFP_USER)) continue; nr_zones++; } /* Ensure 2 pageblocks are free to assist fragmentation avoidance */ recommended_min = pageblock_nr_pages * nr_zones * 2; /* * Make sure that on average at least two pageblocks are almost free * of another type, one for a migratetype to fall back to and a * second to avoid subsequent fallbacks of other types There are 3 * MIGRATE_TYPES we care about. */ recommended_min += pageblock_nr_pages * nr_zones * MIGRATE_PCPTYPES * MIGRATE_PCPTYPES; /* don't ever allow to reserve more than 5% of the lowmem */ recommended_min = min(recommended_min, (unsigned long) nr_free_buffer_pages() / 20); recommended_min <<= (PAGE_SHIFT-10); if (recommended_min > min_free_kbytes) { if (user_min_free_kbytes >= 0) pr_info("raising min_free_kbytes from %d to %lu to help transparent hugepage allocations\n", min_free_kbytes, recommended_min); min_free_kbytes = recommended_min; } update_wmarks: setup_per_zone_wmarks(); } int start_stop_khugepaged(void) { int err = 0; mutex_lock(&khugepaged_mutex); if (hugepage_pmd_enabled()) { if (!khugepaged_thread) khugepaged_thread = kthread_run(khugepaged, NULL, "khugepaged"); if (IS_ERR(khugepaged_thread)) { pr_err("khugepaged: kthread_run(khugepaged) failed\n"); err = PTR_ERR(khugepaged_thread); khugepaged_thread = NULL; goto fail; } if (!list_empty(&khugepaged_scan.mm_head)) wake_up_interruptible(&khugepaged_wait); } else if (khugepaged_thread) { kthread_stop(khugepaged_thread); khugepaged_thread = NULL; } set_recommended_min_free_kbytes(); fail: mutex_unlock(&khugepaged_mutex); return err; } void khugepaged_min_free_kbytes_update(void) { mutex_lock(&khugepaged_mutex); if (hugepage_pmd_enabled() && khugepaged_thread) set_recommended_min_free_kbytes(); mutex_unlock(&khugepaged_mutex); } bool current_is_khugepaged(void) { return kthread_func(current) == khugepaged; } static int madvise_collapse_errno(enum scan_result r) { /* * MADV_COLLAPSE breaks from existing madvise(2) conventions to provide * actionable feedback to caller, so they may take an appropriate * fallback measure depending on the nature of the failure. */ switch (r) { case SCAN_ALLOC_HUGE_PAGE_FAIL: return -ENOMEM; case SCAN_CGROUP_CHARGE_FAIL: case SCAN_EXCEED_NONE_PTE: return -EBUSY; /* Resource temporary unavailable - trying again might succeed */ case SCAN_PAGE_COUNT: case SCAN_PAGE_LOCK: case SCAN_PAGE_LRU: case SCAN_DEL_PAGE_LRU: case SCAN_PAGE_FILLED: return -EAGAIN; /* * Other: Trying again likely not to succeed / error intrinsic to * specified memory range. khugepaged likely won't be able to collapse * either. */ default: return -EINVAL; } } int madvise_collapse(struct vm_area_struct *vma, struct vm_area_struct **prev, unsigned long start, unsigned long end) { struct collapse_control *cc; struct mm_struct *mm = vma->vm_mm; unsigned long hstart, hend, addr; int thps = 0, last_fail = SCAN_FAIL; bool mmap_locked = true; BUG_ON(vma->vm_start > start); BUG_ON(vma->vm_end < end); *prev = vma; if (!thp_vma_allowable_order(vma, vma->vm_flags, 0, PMD_ORDER)) return -EINVAL; cc = kmalloc(sizeof(*cc), GFP_KERNEL); if (!cc) return -ENOMEM; cc->is_khugepaged = false; mmgrab(mm); lru_add_drain_all(); hstart = (start + ~HPAGE_PMD_MASK) & HPAGE_PMD_MASK; hend = end & HPAGE_PMD_MASK; for (addr = hstart; addr < hend; addr += HPAGE_PMD_SIZE) { int result = SCAN_FAIL; if (!mmap_locked) { cond_resched(); mmap_read_lock(mm); mmap_locked = true; result = hugepage_vma_revalidate(mm, addr, false, &vma, cc); if (result != SCAN_SUCCEED) { last_fail = result; goto out_nolock; } hend = min(hend, vma->vm_end & HPAGE_PMD_MASK); } mmap_assert_locked(mm); memset(cc->node_load, 0, sizeof(cc->node_load)); nodes_clear(cc->alloc_nmask); if (!vma_is_anonymous(vma)) { struct file *file = get_file(vma->vm_file); pgoff_t pgoff = linear_page_index(vma, addr); mmap_read_unlock(mm); mmap_locked = false; result = hpage_collapse_scan_file(mm, addr, file, pgoff, cc); fput(file); } else { result = hpage_collapse_scan_pmd(mm, vma, addr, &mmap_locked, cc); } if (!mmap_locked) *prev = NULL; /* Tell caller we dropped mmap_lock */ handle_result: switch (result) { case SCAN_SUCCEED: case SCAN_PMD_MAPPED: ++thps; break; case SCAN_PTE_MAPPED_HUGEPAGE: BUG_ON(mmap_locked); BUG_ON(*prev); mmap_read_lock(mm); result = collapse_pte_mapped_thp(mm, addr, true); mmap_read_unlock(mm); goto handle_result; /* Whitelisted set of results where continuing OK */ case SCAN_PMD_NULL: case SCAN_PTE_NON_PRESENT: case SCAN_PTE_UFFD_WP: case SCAN_PAGE_RO: case SCAN_LACK_REFERENCED_PAGE: case SCAN_PAGE_NULL: case SCAN_PAGE_COUNT: case SCAN_PAGE_LOCK: case SCAN_PAGE_COMPOUND: case SCAN_PAGE_LRU: case SCAN_DEL_PAGE_LRU: last_fail = result; break; default: last_fail = result; /* Other error, exit */ goto out_maybelock; } } out_maybelock: /* Caller expects us to hold mmap_lock on return */ if (!mmap_locked) mmap_read_lock(mm); out_nolock: mmap_assert_locked(mm); mmdrop(mm); kfree(cc); return thps == ((hend - hstart) >> HPAGE_PMD_SHIFT) ? 0 : madvise_collapse_errno(last_fail); } |
| 5 5 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 | // SPDX-License-Identifier: GPL-2.0-or-later /* * Copyright (C) 2004, 2005 Oracle. All rights reserved. */ #include <linux/module.h> #include <linux/kernel.h> #include <linux/proc_fs.h> #include <linux/seq_file.h> #include <linux/string.h> #include <linux/uaccess.h> #include "masklog.h" struct mlog_bits mlog_and_bits = MLOG_BITS_RHS(MLOG_INITIAL_AND_MASK); EXPORT_SYMBOL_GPL(mlog_and_bits); struct mlog_bits mlog_not_bits = MLOG_BITS_RHS(0); EXPORT_SYMBOL_GPL(mlog_not_bits); static ssize_t mlog_mask_show(u64 mask, char *buf) { char *state; if (__mlog_test_u64(mask, mlog_and_bits)) state = "allow"; else if (__mlog_test_u64(mask, mlog_not_bits)) state = "deny"; else state = "off"; return snprintf(buf, PAGE_SIZE, "%s\n", state); } static ssize_t mlog_mask_store(u64 mask, const char *buf, size_t count) { if (!strncasecmp(buf, "allow", 5)) { __mlog_set_u64(mask, mlog_and_bits); __mlog_clear_u64(mask, mlog_not_bits); } else if (!strncasecmp(buf, "deny", 4)) { __mlog_set_u64(mask, mlog_not_bits); __mlog_clear_u64(mask, mlog_and_bits); } else if (!strncasecmp(buf, "off", 3)) { __mlog_clear_u64(mask, mlog_not_bits); __mlog_clear_u64(mask, mlog_and_bits); } else return -EINVAL; return count; } void __mlog_printk(const u64 *mask, const char *func, int line, const char *fmt, ...) { struct va_format vaf; va_list args; const char *level; const char *prefix = ""; if (!__mlog_test_u64(*mask, mlog_and_bits) || __mlog_test_u64(*mask, mlog_not_bits)) return; if (*mask & ML_ERROR) { level = KERN_ERR; prefix = "ERROR: "; } else if (*mask & ML_NOTICE) { level = KERN_NOTICE; } else { level = KERN_INFO; } va_start(args, fmt); vaf.fmt = fmt; vaf.va = &args; printk("%s(%s,%u,%u):%s:%d %s%pV", level, current->comm, task_pid_nr(current), raw_smp_processor_id(), func, line, prefix, &vaf); va_end(args); } EXPORT_SYMBOL_GPL(__mlog_printk); struct mlog_attribute { struct attribute attr; u64 mask; }; #define to_mlog_attr(_attr) container_of(_attr, struct mlog_attribute, attr) #define define_mask(_name) { \ .attr = { \ .name = #_name, \ .mode = S_IRUGO | S_IWUSR, \ }, \ .mask = ML_##_name, \ } static struct mlog_attribute mlog_attrs[MLOG_MAX_BITS] = { define_mask(TCP), define_mask(MSG), define_mask(SOCKET), define_mask(HEARTBEAT), define_mask(HB_BIO), define_mask(DLMFS), define_mask(DLM), define_mask(DLM_DOMAIN), define_mask(DLM_THREAD), define_mask(DLM_MASTER), define_mask(DLM_RECOVERY), define_mask(DLM_GLUE), define_mask(VOTE), define_mask(CONN), define_mask(QUORUM), define_mask(BASTS), define_mask(CLUSTER), define_mask(ERROR), define_mask(NOTICE), define_mask(KTHREAD), }; static struct attribute *mlog_default_attrs[MLOG_MAX_BITS] = {NULL, }; ATTRIBUTE_GROUPS(mlog_default); static ssize_t mlog_show(struct kobject *obj, struct attribute *attr, char *buf) { struct mlog_attribute *mlog_attr = to_mlog_attr(attr); return mlog_mask_show(mlog_attr->mask, buf); } static ssize_t mlog_store(struct kobject *obj, struct attribute *attr, const char *buf, size_t count) { struct mlog_attribute *mlog_attr = to_mlog_attr(attr); return mlog_mask_store(mlog_attr->mask, buf, count); } static const struct sysfs_ops mlog_attr_ops = { .show = mlog_show, .store = mlog_store, }; static struct kobj_type mlog_ktype = { .default_groups = mlog_default_groups, .sysfs_ops = &mlog_attr_ops, }; static struct kset mlog_kset = { .kobj = {.ktype = &mlog_ktype}, }; int mlog_sys_init(struct kset *o2cb_kset) { int i = 0; while (mlog_attrs[i].attr.mode) { mlog_default_attrs[i] = &mlog_attrs[i].attr; i++; } mlog_default_attrs[i] = NULL; kobject_set_name(&mlog_kset.kobj, "logmask"); mlog_kset.kobj.kset = o2cb_kset; return kset_register(&mlog_kset); } void mlog_sys_shutdown(void) { kset_unregister(&mlog_kset); } |
| 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 | /* SPDX-License-Identifier: GPL-2.0 */ #ifndef _SCSI_SCSI_DEVICE_H #define _SCSI_SCSI_DEVICE_H #include <linux/list.h> #include <linux/spinlock.h> #include <linux/workqueue.h> #include <linux/blk-mq.h> #include <scsi/scsi.h> #include <linux/atomic.h> #include <linux/sbitmap.h> struct bsg_device; struct device; struct request_queue; struct scsi_cmnd; struct scsi_lun; struct scsi_sense_hdr; typedef __u64 __bitwise blist_flags_t; #define SCSI_SENSE_BUFFERSIZE 96 struct scsi_mode_data { __u32 length; __u16 block_descriptor_length; __u8 medium_type; __u8 device_specific; __u8 header_length; __u8 longlba:1; }; /* * sdev state: If you alter this, you also need to alter scsi_sysfs.c * (for the ascii descriptions) and the state model enforcer: * scsi_lib:scsi_device_set_state(). */ enum scsi_device_state { SDEV_CREATED = 1, /* device created but not added to sysfs * Only internal commands allowed (for inq) */ SDEV_RUNNING, /* device properly configured * All commands allowed */ SDEV_CANCEL, /* beginning to delete device * Only error handler commands allowed */ SDEV_DEL, /* device deleted * no commands allowed */ SDEV_QUIESCE, /* Device quiescent. No block commands * will be accepted, only specials (which * originate in the mid-layer) */ SDEV_OFFLINE, /* Device offlined (by error handling or * user request */ SDEV_TRANSPORT_OFFLINE, /* Offlined by transport class error handler */ SDEV_BLOCK, /* Device blocked by scsi lld. No * scsi commands from user or midlayer * should be issued to the scsi * lld. */ SDEV_CREATED_BLOCK, /* same as above but for created devices */ }; enum scsi_scan_mode { SCSI_SCAN_INITIAL = 0, SCSI_SCAN_RESCAN, SCSI_SCAN_MANUAL, }; enum scsi_device_event { SDEV_EVT_MEDIA_CHANGE = 1, /* media has changed */ SDEV_EVT_INQUIRY_CHANGE_REPORTED, /* 3F 03 UA reported */ SDEV_EVT_CAPACITY_CHANGE_REPORTED, /* 2A 09 UA reported */ SDEV_EVT_SOFT_THRESHOLD_REACHED_REPORTED, /* 38 07 UA reported */ SDEV_EVT_MODE_PARAMETER_CHANGE_REPORTED, /* 2A 01 UA reported */ SDEV_EVT_LUN_CHANGE_REPORTED, /* 3F 0E UA reported */ SDEV_EVT_ALUA_STATE_CHANGE_REPORTED, /* 2A 06 UA reported */ SDEV_EVT_POWER_ON_RESET_OCCURRED, /* 29 00 UA reported */ SDEV_EVT_FIRST = SDEV_EVT_MEDIA_CHANGE, SDEV_EVT_LAST = SDEV_EVT_POWER_ON_RESET_OCCURRED, SDEV_EVT_MAXBITS = SDEV_EVT_LAST + 1 }; struct scsi_event { enum scsi_device_event evt_type; struct list_head node; /* put union of data structures, for non-simple event types, * here */ }; /** * struct scsi_vpd - SCSI Vital Product Data * @rcu: For kfree_rcu(). * @len: Length in bytes of @data. * @data: VPD data as defined in various T10 SCSI standard documents. */ struct scsi_vpd { struct rcu_head rcu; int len; unsigned char data[]; }; struct scsi_device { struct Scsi_Host *host; struct request_queue *request_queue; /* the next two are protected by the host->host_lock */ struct list_head siblings; /* list of all devices on this host */ struct list_head same_target_siblings; /* just the devices sharing same target id */ struct sbitmap budget_map; atomic_t device_blocked; /* Device returned QUEUE_FULL. */ atomic_t restarts; spinlock_t list_lock; struct list_head starved_entry; unsigned short queue_depth; /* How deep of a queue we want */ unsigned short max_queue_depth; /* max queue depth */ unsigned short last_queue_full_depth; /* These two are used by */ unsigned short last_queue_full_count; /* scsi_track_queue_full() */ unsigned long last_queue_full_time; /* last queue full time */ unsigned long queue_ramp_up_period; /* ramp up period in jiffies */ #define SCSI_DEFAULT_RAMP_UP_PERIOD (120 * HZ) unsigned long last_queue_ramp_up; /* last queue ramp up time */ unsigned int id, channel; u64 lun; unsigned int manufacturer; /* Manufacturer of device, for using * vendor-specific cmd's */ unsigned sector_size; /* size in bytes */ void *hostdata; /* available to low-level driver */ unsigned char type; char scsi_level; char inq_periph_qual; /* PQ from INQUIRY data */ struct mutex inquiry_mutex; unsigned char inquiry_len; /* valid bytes in 'inquiry' */ unsigned char * inquiry; /* INQUIRY response data */ const char * vendor; /* [back_compat] point into 'inquiry' ... */ const char * model; /* ... after scan; point to static string */ const char * rev; /* ... "nullnullnullnull" before scan */ #define SCSI_DEFAULT_VPD_LEN 255 /* default SCSI VPD page size (max) */ struct scsi_vpd __rcu *vpd_pg0; struct scsi_vpd __rcu *vpd_pg83; struct scsi_vpd __rcu *vpd_pg80; struct scsi_vpd __rcu *vpd_pg89; struct scsi_vpd __rcu *vpd_pgb0; struct scsi_vpd __rcu *vpd_pgb1; struct scsi_vpd __rcu *vpd_pgb2; struct scsi_vpd __rcu *vpd_pgb7; struct scsi_target *sdev_target; blist_flags_t sdev_bflags; /* black/white flags as also found in * scsi_devinfo.[hc]. For now used only to * pass settings from sdev_init to scsi * core. */ unsigned int eh_timeout; /* Error handling timeout */ /* * If true, let the high-level device driver (sd) manage the device * power state for system suspend/resume (suspend to RAM and * hibernation) operations. */ unsigned manage_system_start_stop:1; /* * If true, let the high-level device driver (sd) manage the device * power state for runtime device suspand and resume operations. */ unsigned manage_runtime_start_stop:1; /* * If true, let the high-level device driver (sd) manage the device * power state for system shutdown (power off) operations. */ unsigned manage_shutdown:1; /* * If set and if the device is runtime suspended, ask the high-level * device driver (sd) to force a runtime resume of the device. */ unsigned force_runtime_start_on_system_start:1; unsigned removable:1; unsigned changed:1; /* Data invalid due to media change */ unsigned busy:1; /* Used to prevent races */ unsigned lockable:1; /* Able to prevent media removal */ unsigned locked:1; /* Media removal disabled */ unsigned borken:1; /* Tell the Seagate driver to be * painfully slow on this device */ unsigned disconnect:1; /* can disconnect */ unsigned soft_reset:1; /* Uses soft reset option */ unsigned sdtr:1; /* Device supports SDTR messages */ unsigned wdtr:1; /* Device supports WDTR messages */ unsigned ppr:1; /* Device supports PPR messages */ unsigned tagged_supported:1; /* Supports SCSI-II tagged queuing */ unsigned simple_tags:1; /* simple queue tag messages are enabled */ unsigned was_reset:1; /* There was a bus reset on the bus for * this device */ unsigned expecting_cc_ua:1; /* Expecting a CHECK_CONDITION/UNIT_ATTN * because we did a bus reset. */ unsigned use_10_for_rw:1; /* first try 10-byte read / write */ unsigned use_10_for_ms:1; /* first try 10-byte mode sense/select */ unsigned set_dbd_for_ms:1; /* Set "DBD" field in mode sense */ unsigned read_before_ms:1; /* perform a READ before MODE SENSE */ unsigned no_report_opcodes:1; /* no REPORT SUPPORTED OPERATION CODES */ unsigned no_write_same:1; /* no WRITE SAME command */ unsigned use_16_for_rw:1; /* Use read/write(16) over read/write(10) */ unsigned use_16_for_sync:1; /* Use sync (16) over sync (10) */ unsigned skip_ms_page_8:1; /* do not use MODE SENSE page 0x08 */ unsigned skip_ms_page_3f:1; /* do not use MODE SENSE page 0x3f */ unsigned skip_vpd_pages:1; /* do not read VPD pages */ unsigned try_vpd_pages:1; /* attempt to read VPD pages */ unsigned use_192_bytes_for_3f:1; /* ask for 192 bytes from page 0x3f */ unsigned no_start_on_add:1; /* do not issue start on add */ unsigned allow_restart:1; /* issue START_UNIT in error handler */ unsigned start_stop_pwr_cond:1; /* Set power cond. in START_STOP_UNIT */ unsigned no_uld_attach:1; /* disable connecting to upper level drivers */ unsigned select_no_atn:1; unsigned fix_capacity:1; /* READ_CAPACITY is too high by 1 */ unsigned guess_capacity:1; /* READ_CAPACITY might be too high by 1 */ unsigned retry_hwerror:1; /* Retry HARDWARE_ERROR */ unsigned last_sector_bug:1; /* do not use multisector accesses on SD_LAST_BUGGY_SECTORS */ unsigned no_read_disc_info:1; /* Avoid READ_DISC_INFO cmds */ unsigned no_read_capacity_16:1; /* Avoid READ_CAPACITY_16 cmds */ unsigned try_rc_10_first:1; /* Try READ_CAPACACITY_10 first */ unsigned security_supported:1; /* Supports Security Protocols */ unsigned is_visible:1; /* is the device visible in sysfs */ unsigned wce_default_on:1; /* Cache is ON by default */ unsigned no_dif:1; /* T10 PI (DIF) should be disabled */ unsigned broken_fua:1; /* Don't set FUA bit */ unsigned lun_in_cdb:1; /* Store LUN bits in CDB[1] */ unsigned unmap_limit_for_ws:1; /* Use the UNMAP limit for WRITE SAME */ unsigned rpm_autosuspend:1; /* Enable runtime autosuspend at device * creation time */ unsigned ignore_media_change:1; /* Ignore MEDIA CHANGE on resume */ unsigned silence_suspend:1; /* Do not print runtime PM related messages */ unsigned no_vpd_size:1; /* No VPD size reported in header */ unsigned cdl_supported:1; /* Command duration limits supported */ unsigned cdl_enable:1; /* Enable/disable Command duration limits */ unsigned int queue_stopped; /* request queue is quiesced */ bool offline_already; /* Device offline message logged */ unsigned int ua_new_media_ctr; /* Counter for New Media UNIT ATTENTIONs */ unsigned int ua_por_ctr; /* Counter for Power On / Reset UAs */ atomic_t disk_events_disable_depth; /* disable depth for disk events */ DECLARE_BITMAP(supported_events, SDEV_EVT_MAXBITS); /* supported events */ DECLARE_BITMAP(pending_events, SDEV_EVT_MAXBITS); /* pending events */ struct list_head event_list; /* asserted events */ struct work_struct event_work; unsigned int max_device_blocked; /* what device_blocked counts down from */ #define SCSI_DEFAULT_DEVICE_BLOCKED 3 atomic_t iorequest_cnt; atomic_t iodone_cnt; atomic_t ioerr_cnt; atomic_t iotmo_cnt; struct device sdev_gendev, sdev_dev; struct work_struct requeue_work; struct scsi_device_handler *handler; void *handler_data; size_t dma_drain_len; void *dma_drain_buf; unsigned int sg_timeout; unsigned int sg_reserved_size; struct bsg_device *bsg_dev; unsigned char access_state; struct mutex state_mutex; enum scsi_device_state sdev_state; struct task_struct *quiesced_by; unsigned long sdev_data[]; } __attribute__((aligned(sizeof(unsigned long)))); #define to_scsi_device(d) \ container_of(d, struct scsi_device, sdev_gendev) #define class_to_sdev(d) \ container_of(d, struct scsi_device, sdev_dev) #define transport_class_to_sdev(class_dev) \ to_scsi_device(class_dev->parent) #define sdev_dbg(sdev, fmt, a...) \ dev_dbg(&(sdev)->sdev_gendev, fmt, ##a) /* * like scmd_printk, but the device name is passed in * as a string pointer */ __printf(4, 5) void sdev_prefix_printk(const char *, const struct scsi_device *, const char *, const char *, ...); #define sdev_printk(l, sdev, fmt, a...) \ sdev_prefix_printk(l, sdev, NULL, fmt, ##a) __printf(3, 4) void scmd_printk(const char *, const struct scsi_cmnd *, const char *, ...); #define scmd_dbg(scmd, fmt, a...) \ do { \ struct request *__rq = scsi_cmd_to_rq((scmd)); \ \ if (__rq->q->disk) \ sdev_dbg((scmd)->device, "[%s] " fmt, \ __rq->q->disk->disk_name, ##a); \ else \ sdev_dbg((scmd)->device, fmt, ##a); \ } while (0) enum scsi_target_state { STARGET_CREATED = 1, STARGET_RUNNING, STARGET_REMOVE, STARGET_CREATED_REMOVE, STARGET_DEL, }; /* * scsi_target: representation of a scsi target, for now, this is only * used for single_lun devices. If no one has active IO to the target, * starget_sdev_user is NULL, else it points to the active sdev. */ struct scsi_target { struct scsi_device *starget_sdev_user; struct list_head siblings; struct list_head devices; struct device dev; struct kref reap_ref; /* last put renders target invisible */ unsigned int channel; unsigned int id; /* target id ... replace * scsi_device.id eventually */ unsigned int create:1; /* signal that it needs to be added */ unsigned int single_lun:1; /* Indicates we should only * allow I/O to one of the luns * for the device at a time. */ unsigned int pdt_1f_for_no_lun:1; /* PDT = 0x1f * means no lun present. */ unsigned int no_report_luns:1; /* Don't use * REPORT LUNS for scanning. */ unsigned int expecting_lun_change:1; /* A device has reported * a 3F/0E UA, other devices on * the same target will also. */ /* commands actually active on LLD. */ atomic_t target_busy; atomic_t target_blocked; /* * LLDs should set this in the sdev_init host template callout. * If set to zero then there is not limit. */ unsigned int can_queue; unsigned int max_target_blocked; #define SCSI_DEFAULT_TARGET_BLOCKED 3 char scsi_level; enum scsi_target_state state; void *hostdata; /* available to low-level driver */ unsigned long starget_data[]; /* for the transport */ /* starget_data must be the last element!!!! */ } __attribute__((aligned(sizeof(unsigned long)))); #define to_scsi_target(d) container_of(d, struct scsi_target, dev) static inline struct scsi_target *scsi_target(struct scsi_device *sdev) { return to_scsi_target(sdev->sdev_gendev.parent); } #define transport_class_to_starget(class_dev) \ to_scsi_target(class_dev->parent) #define starget_printk(prefix, starget, fmt, a...) \ dev_printk(prefix, &(starget)->dev, fmt, ##a) extern struct scsi_device *__scsi_add_device(struct Scsi_Host *, uint, uint, u64, void *hostdata); extern int scsi_add_device(struct Scsi_Host *host, uint channel, uint target, u64 lun); extern int scsi_register_device_handler(struct scsi_device_handler *scsi_dh); extern void scsi_remove_device(struct scsi_device *); extern int scsi_unregister_device_handler(struct scsi_device_handler *scsi_dh); void scsi_attach_vpd(struct scsi_device *sdev); void scsi_cdl_check(struct scsi_device *sdev); int scsi_cdl_enable(struct scsi_device *sdev, bool enable); extern struct scsi_device *scsi_device_from_queue(struct request_queue *q); extern int __must_check scsi_device_get(struct scsi_device *); extern void scsi_device_put(struct scsi_device *); extern struct scsi_device *scsi_device_lookup(struct Scsi_Host *, uint, uint, u64); extern struct scsi_device *__scsi_device_lookup(struct Scsi_Host *, uint, uint, u64); extern struct scsi_device *scsi_device_lookup_by_target(struct scsi_target *, u64); extern struct scsi_device *__scsi_device_lookup_by_target(struct scsi_target *, u64); extern void starget_for_each_device(struct scsi_target *, void *, void (*fn)(struct scsi_device *, void *)); extern void __starget_for_each_device(struct scsi_target *, void *, void (*fn)(struct scsi_device *, void *)); /* only exposed to implement shost_for_each_device */ extern struct scsi_device *__scsi_iterate_devices(struct Scsi_Host *, struct scsi_device *); /** * shost_for_each_device - iterate over all devices of a host * @sdev: the &struct scsi_device to use as a cursor * @shost: the &struct scsi_host to iterate over * * Iterator that returns each device attached to @shost. This loop * takes a reference on each device and releases it at the end. If * you break out of the loop, you must call scsi_device_put(sdev). */ #define shost_for_each_device(sdev, shost) \ for ((sdev) = __scsi_iterate_devices((shost), NULL); \ (sdev); \ (sdev) = __scsi_iterate_devices((shost), (sdev))) /** * __shost_for_each_device - iterate over all devices of a host (UNLOCKED) * @sdev: the &struct scsi_device to use as a cursor * @shost: the &struct scsi_host to iterate over * * Iterator that returns each device attached to @shost. It does _not_ * take a reference on the scsi_device, so the whole loop must be * protected by shost->host_lock. * * Note: The only reason to use this is because you need to access the * device list in interrupt context. Otherwise you really want to use * shost_for_each_device instead. */ #define __shost_for_each_device(sdev, shost) \ list_for_each_entry((sdev), &((shost)->__devices), siblings) extern int scsi_change_queue_depth(struct scsi_device *, int); extern int scsi_track_queue_full(struct scsi_device *, int); extern int scsi_set_medium_removal(struct scsi_device *, char); int scsi_mode_sense(struct scsi_device *sdev, int dbd, int modepage, int subpage, unsigned char *buffer, int len, int timeout, int retries, struct scsi_mode_data *data, struct scsi_sense_hdr *); extern int scsi_mode_select(struct scsi_device *sdev, int pf, int sp, unsigned char *buffer, int len, int timeout, int retries, struct scsi_mode_data *data, struct scsi_sense_hdr *); extern int scsi_test_unit_ready(struct scsi_device *sdev, int timeout, int retries, struct scsi_sense_hdr *sshdr); extern int scsi_get_vpd_page(struct scsi_device *, u8 page, unsigned char *buf, int buf_len); int scsi_report_opcode(struct scsi_device *sdev, unsigned char *buffer, unsigned int len, unsigned char opcode, unsigned short sa); extern int scsi_device_set_state(struct scsi_device *sdev, enum scsi_device_state state); extern struct scsi_event *sdev_evt_alloc(enum scsi_device_event evt_type, gfp_t gfpflags); extern void sdev_evt_send(struct scsi_device *sdev, struct scsi_event *evt); extern void sdev_evt_send_simple(struct scsi_device *sdev, enum scsi_device_event evt_type, gfp_t gfpflags); extern int scsi_device_quiesce(struct scsi_device *sdev); extern void scsi_device_resume(struct scsi_device *sdev); extern void scsi_target_quiesce(struct scsi_target *); extern void scsi_target_resume(struct scsi_target *); extern void scsi_scan_target(struct device *parent, unsigned int channel, unsigned int id, u64 lun, enum scsi_scan_mode rescan); extern void scsi_target_reap(struct scsi_target *); void scsi_block_targets(struct Scsi_Host *shost, struct device *dev); extern void scsi_target_unblock(struct device *, enum scsi_device_state); extern void scsi_remove_target(struct device *); extern const char *scsi_device_state_name(enum scsi_device_state); extern int scsi_is_sdev_device(const struct device *); extern int scsi_is_target_device(const struct device *); extern void scsi_sanitize_inquiry_string(unsigned char *s, int len); /* * scsi_execute_cmd users can set scsi_failure.result to have * scsi_check_passthrough fail/retry a command. scsi_failure.result can be a * specific host byte or message code, or SCMD_FAILURE_RESULT_ANY can be used * to match any host or message code. */ #define SCMD_FAILURE_RESULT_ANY 0x7fffffff /* * Set scsi_failure.result to SCMD_FAILURE_STAT_ANY to fail/retry any failure * scsi_status_is_good returns false for. */ #define SCMD_FAILURE_STAT_ANY 0xff /* * The following can be set to the scsi_failure sense, asc and ascq fields to * match on any sense, ASC, or ASCQ value. */ #define SCMD_FAILURE_SENSE_ANY 0xff #define SCMD_FAILURE_ASC_ANY 0xff #define SCMD_FAILURE_ASCQ_ANY 0xff /* Always retry a matching failure. */ #define SCMD_FAILURE_NO_LIMIT -1 struct scsi_failure { int result; u8 sense; u8 asc; u8 ascq; /* * Number of times scsi_execute_cmd will retry the failure. It does * not count for the total_allowed. */ s8 allowed; /* Number of times the failure has been retried. */ s8 retries; }; struct scsi_failures { /* * If a scsi_failure does not have a retry limit setup this limit will * be used. */ int total_allowed; int total_retries; struct scsi_failure *failure_definitions; }; /* Optional arguments to scsi_execute_cmd */ struct scsi_exec_args { unsigned char *sense; /* sense buffer */ unsigned int sense_len; /* sense buffer len */ struct scsi_sense_hdr *sshdr; /* decoded sense header */ blk_mq_req_flags_t req_flags; /* BLK_MQ_REQ flags */ int scmd_flags; /* SCMD flags */ int *resid; /* residual length */ struct scsi_failures *failures; /* failures to retry */ }; int scsi_execute_cmd(struct scsi_device *sdev, const unsigned char *cmd, blk_opf_t opf, void *buffer, unsigned int bufflen, int timeout, int retries, const struct scsi_exec_args *args); void scsi_failures_reset_retries(struct scsi_failures *failures); extern void sdev_disable_disk_events(struct scsi_device *sdev); extern void sdev_enable_disk_events(struct scsi_device *sdev); extern int scsi_vpd_lun_id(struct scsi_device *, char *, size_t); extern int scsi_vpd_tpg_id(struct scsi_device *, int *); #ifdef CONFIG_PM extern int scsi_autopm_get_device(struct scsi_device *); extern void scsi_autopm_put_device(struct scsi_device *); #else static inline int scsi_autopm_get_device(struct scsi_device *d) { return 0; } static inline void scsi_autopm_put_device(struct scsi_device *d) {} #endif /* CONFIG_PM */ static inline int __must_check scsi_device_reprobe(struct scsi_device *sdev) { return device_reprobe(&sdev->sdev_gendev); } static inline unsigned int sdev_channel(struct scsi_device *sdev) { return sdev->channel; } static inline unsigned int sdev_id(struct scsi_device *sdev) { return sdev->id; } #define scmd_id(scmd) sdev_id((scmd)->device) #define scmd_channel(scmd) sdev_channel((scmd)->device) /* * checks for positions of the SCSI state machine */ static inline int scsi_device_online(struct scsi_device *sdev) { return (sdev->sdev_state != SDEV_OFFLINE && sdev->sdev_state != SDEV_TRANSPORT_OFFLINE && sdev->sdev_state != SDEV_DEL); } static inline int scsi_device_blocked(struct scsi_device *sdev) { return sdev->sdev_state == SDEV_BLOCK || sdev->sdev_state == SDEV_CREATED_BLOCK; } static inline int scsi_device_created(struct scsi_device *sdev) { return sdev->sdev_state == SDEV_CREATED || sdev->sdev_state == SDEV_CREATED_BLOCK; } int scsi_internal_device_block_nowait(struct scsi_device *sdev); int scsi_internal_device_unblock_nowait(struct scsi_device *sdev, enum scsi_device_state new_state); /* accessor functions for the SCSI parameters */ static inline int scsi_device_sync(struct scsi_device *sdev) { return sdev->sdtr; } static inline int scsi_device_wide(struct scsi_device *sdev) { return sdev->wdtr; } static inline int scsi_device_dt(struct scsi_device *sdev) { return sdev->ppr; } static inline int scsi_device_dt_only(struct scsi_device *sdev) { if (sdev->inquiry_len < 57) return 0; return (sdev->inquiry[56] & 0x0c) == 0x04; } static inline int scsi_device_ius(struct scsi_device *sdev) { if (sdev->inquiry_len < 57) return 0; return sdev->inquiry[56] & 0x01; } static inline int scsi_device_qas(struct scsi_device *sdev) { if (sdev->inquiry_len < 57) return 0; return sdev->inquiry[56] & 0x02; } static inline int scsi_device_enclosure(struct scsi_device *sdev) { return sdev->inquiry ? (sdev->inquiry[6] & (1<<6)) : 1; } static inline int scsi_device_protection(struct scsi_device *sdev) { if (sdev->no_dif) return 0; return sdev->scsi_level > SCSI_2 && sdev->inquiry[5] & (1<<0); } static inline int scsi_device_tpgs(struct scsi_device *sdev) { return sdev->inquiry ? (sdev->inquiry[5] >> 4) & 0x3 : 0; } /** * scsi_device_supports_vpd - test if a device supports VPD pages * @sdev: the &struct scsi_device to test * * If the 'try_vpd_pages' flag is set it takes precedence. * Otherwise we will assume VPD pages are supported if the * SCSI level is at least SPC-3 and 'skip_vpd_pages' is not set. */ static inline int scsi_device_supports_vpd(struct scsi_device *sdev) { /* Attempt VPD inquiry if the device blacklist explicitly calls * for it. */ if (sdev->try_vpd_pages) return 1; /* * Although VPD inquiries can go to SCSI-2 type devices, * some USB ones crash on receiving them, and the pages * we currently ask for are mandatory for SPC-2 and beyond */ if (sdev->scsi_level >= SCSI_SPC_2 && !sdev->skip_vpd_pages) return 1; return 0; } static inline int scsi_device_busy(struct scsi_device *sdev) { return sbitmap_weight(&sdev->budget_map); } /* Macros to access the UNIT ATTENTION counters */ #define scsi_get_ua_new_media_ctr(sdev) \ ((const unsigned int)(sdev->ua_new_media_ctr)) #define scsi_get_ua_por_ctr(sdev) \ ((const unsigned int)(sdev->ua_por_ctr)) #define MODULE_ALIAS_SCSI_DEVICE(type) \ MODULE_ALIAS("scsi:t-" __stringify(type) "*") #define SCSI_DEVICE_MODALIAS_FMT "scsi:t-0x%02x" #endif /* _SCSI_SCSI_DEVICE_H */ |
| 1 1 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 | // SPDX-License-Identifier: GPL-2.0-only /* * Driver for the Diolan DLN-2 USB adapter * * Copyright (c) 2014 Intel Corporation * * Derived from: * i2c-diolan-u2c.c * Copyright (c) 2010-2011 Ericsson AB */ #include <linux/kernel.h> #include <linux/module.h> #include <linux/types.h> #include <linux/slab.h> #include <linux/usb.h> #include <linux/mutex.h> #include <linux/platform_device.h> #include <linux/mfd/core.h> #include <linux/mfd/dln2.h> #include <linux/rculist.h> struct dln2_header { __le16 size; __le16 id; __le16 echo; __le16 handle; }; struct dln2_response { struct dln2_header hdr; __le16 result; }; #define DLN2_GENERIC_MODULE_ID 0x00 #define DLN2_GENERIC_CMD(cmd) DLN2_CMD(cmd, DLN2_GENERIC_MODULE_ID) #define CMD_GET_DEVICE_VER DLN2_GENERIC_CMD(0x30) #define CMD_GET_DEVICE_SN DLN2_GENERIC_CMD(0x31) #define DLN2_HW_ID 0x200 #define DLN2_USB_TIMEOUT 200 /* in ms */ #define DLN2_MAX_RX_SLOTS 16 #define DLN2_MAX_URBS 16 #define DLN2_RX_BUF_SIZE 512 enum dln2_handle { DLN2_HANDLE_EVENT = 0, /* don't change, hardware defined */ DLN2_HANDLE_CTRL, DLN2_HANDLE_GPIO, DLN2_HANDLE_I2C, DLN2_HANDLE_SPI, DLN2_HANDLE_ADC, DLN2_HANDLES }; /* * Receive context used between the receive demultiplexer and the transfer * routine. While sending a request the transfer routine will look for a free * receive context and use it to wait for a response and to receive the URB and * thus the response data. */ struct dln2_rx_context { /* completion used to wait for a response */ struct completion done; /* if non-NULL the URB contains the response */ struct urb *urb; /* if true then this context is used to wait for a response */ bool in_use; }; /* * Receive contexts for a particular DLN2 module (i2c, gpio, etc.). We use the * handle header field to identify the module in dln2_dev.mod_rx_slots and then * the echo header field to index the slots field and find the receive context * for a particular request. */ struct dln2_mod_rx_slots { /* RX slots bitmap */ DECLARE_BITMAP(bmap, DLN2_MAX_RX_SLOTS); /* used to wait for a free RX slot */ wait_queue_head_t wq; /* used to wait for an RX operation to complete */ struct dln2_rx_context slots[DLN2_MAX_RX_SLOTS]; /* avoid races between alloc/free_rx_slot and dln2_rx_transfer */ spinlock_t lock; }; struct dln2_dev { struct usb_device *usb_dev; struct usb_interface *interface; u8 ep_in; u8 ep_out; struct urb *rx_urb[DLN2_MAX_URBS]; void *rx_buf[DLN2_MAX_URBS]; struct dln2_mod_rx_slots mod_rx_slots[DLN2_HANDLES]; struct list_head event_cb_list; spinlock_t event_cb_lock; bool disconnect; int active_transfers; wait_queue_head_t disconnect_wq; spinlock_t disconnect_lock; }; struct dln2_event_cb_entry { struct list_head list; u16 id; struct platform_device *pdev; dln2_event_cb_t callback; }; int dln2_register_event_cb(struct platform_device *pdev, u16 id, dln2_event_cb_t event_cb) { struct dln2_dev *dln2 = dev_get_drvdata(pdev->dev.parent); struct dln2_event_cb_entry *i, *entry; unsigned long flags; int ret = 0; entry = kzalloc(sizeof(*entry), GFP_KERNEL); if (!entry) return -ENOMEM; entry->id = id; entry->callback = event_cb; entry->pdev = pdev; spin_lock_irqsave(&dln2->event_cb_lock, flags); list_for_each_entry(i, &dln2->event_cb_list, list) { if (i->id == id) { ret = -EBUSY; break; } } if (!ret) list_add_rcu(&entry->list, &dln2->event_cb_list); spin_unlock_irqrestore(&dln2->event_cb_lock, flags); if (ret) kfree(entry); return ret; } EXPORT_SYMBOL(dln2_register_event_cb); void dln2_unregister_event_cb(struct platform_device *pdev, u16 id) { struct dln2_dev *dln2 = dev_get_drvdata(pdev->dev.parent); struct dln2_event_cb_entry *i; unsigned long flags; bool found = false; spin_lock_irqsave(&dln2->event_cb_lock, flags); list_for_each_entry(i, &dln2->event_cb_list, list) { if (i->id == id) { list_del_rcu(&i->list); found = true; break; } } spin_unlock_irqrestore(&dln2->event_cb_lock, flags); if (found) { synchronize_rcu(); kfree(i); } } EXPORT_SYMBOL(dln2_unregister_event_cb); /* * Returns true if a valid transfer slot is found. In this case the URB must not * be resubmitted immediately in dln2_rx as we need the data when dln2_transfer * is woke up. It will be resubmitted there. */ static bool dln2_transfer_complete(struct dln2_dev *dln2, struct urb *urb, u16 handle, u16 rx_slot) { struct device *dev = &dln2->interface->dev; struct dln2_mod_rx_slots *rxs = &dln2->mod_rx_slots[handle]; struct dln2_rx_context *rxc; unsigned long flags; bool valid_slot = false; if (rx_slot >= DLN2_MAX_RX_SLOTS) goto out; rxc = &rxs->slots[rx_slot]; spin_lock_irqsave(&rxs->lock, flags); if (rxc->in_use && !rxc->urb) { rxc->urb = urb; complete(&rxc->done); valid_slot = true; } spin_unlock_irqrestore(&rxs->lock, flags); out: if (!valid_slot) dev_warn(dev, "bad/late response %d/%d\n", handle, rx_slot); return valid_slot; } static void dln2_run_event_callbacks(struct dln2_dev *dln2, u16 id, u16 echo, void *data, int len) { struct dln2_event_cb_entry *i; rcu_read_lock(); list_for_each_entry_rcu(i, &dln2->event_cb_list, list) { if (i->id == id) { i->callback(i->pdev, echo, data, len); break; } } rcu_read_unlock(); } static void dln2_rx(struct urb *urb) { struct dln2_dev *dln2 = urb->context; struct dln2_header *hdr = urb->transfer_buffer; struct device *dev = &dln2->interface->dev; u16 id, echo, handle, size; u8 *data; int len; int err; switch (urb->status) { case 0: /* success */ break; case -ECONNRESET: case -ENOENT: case -ESHUTDOWN: case -EPIPE: /* this urb is terminated, clean up */ dev_dbg(dev, "urb shutting down with status %d\n", urb->status); return; default: dev_dbg(dev, "nonzero urb status received %d\n", urb->status); goto out; } if (urb->actual_length < sizeof(struct dln2_header)) { dev_err(dev, "short response: %d\n", urb->actual_length); goto out; } handle = le16_to_cpu(hdr->handle); id = le16_to_cpu(hdr->id); echo = le16_to_cpu(hdr->echo); size = le16_to_cpu(hdr->size); if (size != urb->actual_length) { dev_err(dev, "size mismatch: handle %x cmd %x echo %x size %d actual %d\n", handle, id, echo, size, urb->actual_length); goto out; } if (handle >= DLN2_HANDLES) { dev_warn(dev, "invalid handle %d\n", handle); goto out; } data = urb->transfer_buffer + sizeof(struct dln2_header); len = urb->actual_length - sizeof(struct dln2_header); if (handle == DLN2_HANDLE_EVENT) { unsigned long flags; spin_lock_irqsave(&dln2->event_cb_lock, flags); dln2_run_event_callbacks(dln2, id, echo, data, len); spin_unlock_irqrestore(&dln2->event_cb_lock, flags); } else { /* URB will be re-submitted in _dln2_transfer (free_rx_slot) */ if (dln2_transfer_complete(dln2, urb, handle, echo)) return; } out: err = usb_submit_urb(urb, GFP_ATOMIC); if (err < 0) dev_err(dev, "failed to resubmit RX URB: %d\n", err); } static void *dln2_prep_buf(u16 handle, u16 cmd, u16 echo, const void *obuf, int *obuf_len, gfp_t gfp) { int len; void *buf; struct dln2_header *hdr; len = *obuf_len + sizeof(*hdr); buf = kmalloc(len, gfp); if (!buf) return NULL; hdr = (struct dln2_header *)buf; hdr->id = cpu_to_le16(cmd); hdr->size = cpu_to_le16(len); hdr->echo = cpu_to_le16(echo); hdr->handle = cpu_to_le16(handle); memcpy(buf + sizeof(*hdr), obuf, *obuf_len); *obuf_len = len; return buf; } static int dln2_send_wait(struct dln2_dev *dln2, u16 handle, u16 cmd, u16 echo, const void *obuf, int obuf_len) { int ret = 0; int len = obuf_len; void *buf; int actual; buf = dln2_prep_buf(handle, cmd, echo, obuf, &len, GFP_KERNEL); if (!buf) return -ENOMEM; ret = usb_bulk_msg(dln2->usb_dev, usb_sndbulkpipe(dln2->usb_dev, dln2->ep_out), buf, len, &actual, DLN2_USB_TIMEOUT); kfree(buf); return ret; } static bool find_free_slot(struct dln2_dev *dln2, u16 handle, int *slot) { struct dln2_mod_rx_slots *rxs; unsigned long flags; if (dln2->disconnect) { *slot = -ENODEV; return true; } rxs = &dln2->mod_rx_slots[handle]; spin_lock_irqsave(&rxs->lock, flags); *slot = find_first_zero_bit(rxs->bmap, DLN2_MAX_RX_SLOTS); if (*slot < DLN2_MAX_RX_SLOTS) { struct dln2_rx_context *rxc = &rxs->slots[*slot]; set_bit(*slot, rxs->bmap); rxc->in_use = true; } spin_unlock_irqrestore(&rxs->lock, flags); return *slot < DLN2_MAX_RX_SLOTS; } static int alloc_rx_slot(struct dln2_dev *dln2, u16 handle) { int ret; int slot; /* * No need to timeout here, the wait is bounded by the timeout in * _dln2_transfer. */ ret = wait_event_interruptible(dln2->mod_rx_slots[handle].wq, find_free_slot(dln2, handle, &slot)); if (ret < 0) return ret; return slot; } static void free_rx_slot(struct dln2_dev *dln2, u16 handle, int slot) { struct dln2_mod_rx_slots *rxs; struct urb *urb = NULL; unsigned long flags; struct dln2_rx_context *rxc; rxs = &dln2->mod_rx_slots[handle]; spin_lock_irqsave(&rxs->lock, flags); clear_bit(slot, rxs->bmap); rxc = &rxs->slots[slot]; rxc->in_use = false; urb = rxc->urb; rxc->urb = NULL; reinit_completion(&rxc->done); spin_unlock_irqrestore(&rxs->lock, flags); if (urb) { int err; struct device *dev = &dln2->interface->dev; err = usb_submit_urb(urb, GFP_KERNEL); if (err < 0) dev_err(dev, "failed to resubmit RX URB: %d\n", err); } wake_up_interruptible(&rxs->wq); } static int _dln2_transfer(struct dln2_dev *dln2, u16 handle, u16 cmd, const void *obuf, unsigned obuf_len, void *ibuf, unsigned *ibuf_len) { int ret = 0; int rx_slot; struct dln2_response *rsp; struct dln2_rx_context *rxc; struct device *dev = &dln2->interface->dev; const unsigned long timeout = msecs_to_jiffies(DLN2_USB_TIMEOUT); struct dln2_mod_rx_slots *rxs = &dln2->mod_rx_slots[handle]; int size; spin_lock(&dln2->disconnect_lock); if (!dln2->disconnect) dln2->active_transfers++; else ret = -ENODEV; spin_unlock(&dln2->disconnect_lock); if (ret) return ret; rx_slot = alloc_rx_slot(dln2, handle); if (rx_slot < 0) { ret = rx_slot; goto out_decr; } ret = dln2_send_wait(dln2, handle, cmd, rx_slot, obuf, obuf_len); if (ret < 0) { dev_err(dev, "USB write failed: %d\n", ret); goto out_free_rx_slot; } rxc = &rxs->slots[rx_slot]; ret = wait_for_completion_interruptible_timeout(&rxc->done, timeout); if (ret <= 0) { if (!ret) ret = -ETIMEDOUT; goto out_free_rx_slot; } else { ret = 0; } if (dln2->disconnect) { ret = -ENODEV; goto out_free_rx_slot; } /* if we got here we know that the response header has been checked */ rsp = rxc->urb->transfer_buffer; size = le16_to_cpu(rsp->hdr.size); if (size < sizeof(*rsp)) { ret = -EPROTO; goto out_free_rx_slot; } if (le16_to_cpu(rsp->result) > 0x80) { dev_dbg(dev, "%d received response with error %d\n", handle, le16_to_cpu(rsp->result)); ret = -EREMOTEIO; goto out_free_rx_slot; } if (!ibuf) goto out_free_rx_slot; if (*ibuf_len > size - sizeof(*rsp)) *ibuf_len = size - sizeof(*rsp); memcpy(ibuf, rsp + 1, *ibuf_len); out_free_rx_slot: free_rx_slot(dln2, handle, rx_slot); out_decr: spin_lock(&dln2->disconnect_lock); dln2->active_transfers--; spin_unlock(&dln2->disconnect_lock); if (dln2->disconnect) wake_up(&dln2->disconnect_wq); return ret; } int dln2_transfer(struct platform_device *pdev, u16 cmd, const void *obuf, unsigned obuf_len, void *ibuf, unsigned *ibuf_len) { struct dln2_platform_data *dln2_pdata; struct dln2_dev *dln2; u16 handle; dln2 = dev_get_drvdata(pdev->dev.parent); dln2_pdata = dev_get_platdata(&pdev->dev); handle = dln2_pdata->handle; return _dln2_transfer(dln2, handle, cmd, obuf, obuf_len, ibuf, ibuf_len); } EXPORT_SYMBOL(dln2_transfer); static int dln2_check_hw(struct dln2_dev *dln2) { int ret; __le32 hw_type; int len = sizeof(hw_type); ret = _dln2_transfer(dln2, DLN2_HANDLE_CTRL, CMD_GET_DEVICE_VER, NULL, 0, &hw_type, &len); if (ret < 0) return ret; if (len < sizeof(hw_type)) return -EREMOTEIO; if (le32_to_cpu(hw_type) != DLN2_HW_ID) { dev_err(&dln2->interface->dev, "Device ID 0x%x not supported\n", le32_to_cpu(hw_type)); return -ENODEV; } return 0; } static int dln2_print_serialno(struct dln2_dev *dln2) { int ret; __le32 serial_no; int len = sizeof(serial_no); struct device *dev = &dln2->interface->dev; ret = _dln2_transfer(dln2, DLN2_HANDLE_CTRL, CMD_GET_DEVICE_SN, NULL, 0, &serial_no, &len); if (ret < 0) return ret; if (len < sizeof(serial_no)) return -EREMOTEIO; dev_info(dev, "Diolan DLN2 serial %u\n", le32_to_cpu(serial_no)); return 0; } static int dln2_hw_init(struct dln2_dev *dln2) { int ret; ret = dln2_check_hw(dln2); if (ret < 0) return ret; return dln2_print_serialno(dln2); } static void dln2_free_rx_urbs(struct dln2_dev *dln2) { int i; for (i = 0; i < DLN2_MAX_URBS; i++) { usb_free_urb(dln2->rx_urb[i]); kfree(dln2->rx_buf[i]); } } static void dln2_stop_rx_urbs(struct dln2_dev *dln2) { int i; for (i = 0; i < DLN2_MAX_URBS; i++) usb_kill_urb(dln2->rx_urb[i]); } static void dln2_free(struct dln2_dev *dln2) { dln2_free_rx_urbs(dln2); usb_put_dev(dln2->usb_dev); kfree(dln2); } static int dln2_setup_rx_urbs(struct dln2_dev *dln2, struct usb_host_interface *hostif) { int i; const int rx_max_size = DLN2_RX_BUF_SIZE; for (i = 0; i < DLN2_MAX_URBS; i++) { dln2->rx_buf[i] = kmalloc(rx_max_size, GFP_KERNEL); if (!dln2->rx_buf[i]) return -ENOMEM; dln2->rx_urb[i] = usb_alloc_urb(0, GFP_KERNEL); if (!dln2->rx_urb[i]) return -ENOMEM; usb_fill_bulk_urb(dln2->rx_urb[i], dln2->usb_dev, usb_rcvbulkpipe(dln2->usb_dev, dln2->ep_in), dln2->rx_buf[i], rx_max_size, dln2_rx, dln2); } return 0; } static int dln2_start_rx_urbs(struct dln2_dev *dln2, gfp_t gfp) { struct device *dev = &dln2->interface->dev; int ret; int i; for (i = 0; i < DLN2_MAX_URBS; i++) { ret = usb_submit_urb(dln2->rx_urb[i], gfp); if (ret < 0) { dev_err(dev, "failed to submit RX URB: %d\n", ret); return ret; } } return 0; } enum { DLN2_ACPI_MATCH_GPIO = 0, DLN2_ACPI_MATCH_I2C = 1, DLN2_ACPI_MATCH_SPI = 2, DLN2_ACPI_MATCH_ADC = 3, }; static struct dln2_platform_data dln2_pdata_gpio = { .handle = DLN2_HANDLE_GPIO, }; static struct mfd_cell_acpi_match dln2_acpi_match_gpio = { .adr = DLN2_ACPI_MATCH_GPIO, }; /* Only one I2C port seems to be supported on current hardware */ static struct dln2_platform_data dln2_pdata_i2c = { .handle = DLN2_HANDLE_I2C, .port = 0, }; static struct mfd_cell_acpi_match dln2_acpi_match_i2c = { .adr = DLN2_ACPI_MATCH_I2C, }; /* Only one SPI port supported */ static struct dln2_platform_data dln2_pdata_spi = { .handle = DLN2_HANDLE_SPI, .port = 0, }; static struct mfd_cell_acpi_match dln2_acpi_match_spi = { .adr = DLN2_ACPI_MATCH_SPI, }; /* Only one ADC port supported */ static struct dln2_platform_data dln2_pdata_adc = { .handle = DLN2_HANDLE_ADC, .port = 0, }; static struct mfd_cell_acpi_match dln2_acpi_match_adc = { .adr = DLN2_ACPI_MATCH_ADC, }; static const struct mfd_cell dln2_devs[] = { { .name = "dln2-gpio", .acpi_match = &dln2_acpi_match_gpio, .platform_data = &dln2_pdata_gpio, .pdata_size = sizeof(struct dln2_platform_data), }, { .name = "dln2-i2c", .acpi_match = &dln2_acpi_match_i2c, .platform_data = &dln2_pdata_i2c, .pdata_size = sizeof(struct dln2_platform_data), }, { .name = "dln2-spi", .acpi_match = &dln2_acpi_match_spi, .platform_data = &dln2_pdata_spi, .pdata_size = sizeof(struct dln2_platform_data), }, { .name = "dln2-adc", .acpi_match = &dln2_acpi_match_adc, .platform_data = &dln2_pdata_adc, .pdata_size = sizeof(struct dln2_platform_data), }, }; static void dln2_stop(struct dln2_dev *dln2) { int i, j; /* don't allow starting new transfers */ spin_lock(&dln2->disconnect_lock); dln2->disconnect = true; spin_unlock(&dln2->disconnect_lock); /* cancel in progress transfers */ for (i = 0; i < DLN2_HANDLES; i++) { struct dln2_mod_rx_slots *rxs = &dln2->mod_rx_slots[i]; unsigned long flags; spin_lock_irqsave(&rxs->lock, flags); /* cancel all response waiters */ for (j = 0; j < DLN2_MAX_RX_SLOTS; j++) { struct dln2_rx_context *rxc = &rxs->slots[j]; if (rxc->in_use) complete(&rxc->done); } spin_unlock_irqrestore(&rxs->lock, flags); } /* wait for transfers to end */ wait_event(dln2->disconnect_wq, !dln2->active_transfers); dln2_stop_rx_urbs(dln2); } static void dln2_disconnect(struct usb_interface *interface) { struct dln2_dev *dln2 = usb_get_intfdata(interface); dln2_stop(dln2); mfd_remove_devices(&interface->dev); dln2_free(dln2); } static int dln2_probe(struct usb_interface *interface, const struct usb_device_id *usb_id) { struct usb_host_interface *hostif = interface->cur_altsetting; struct usb_endpoint_descriptor *epin; struct usb_endpoint_descriptor *epout; struct device *dev = &interface->dev; struct dln2_dev *dln2; int ret; int i, j; if (hostif->desc.bInterfaceNumber != 0) return -ENODEV; ret = usb_find_common_endpoints(hostif, &epin, &epout, NULL, NULL); if (ret) return ret; dln2 = kzalloc(sizeof(*dln2), GFP_KERNEL); if (!dln2) return -ENOMEM; dln2->ep_out = epout->bEndpointAddress; dln2->ep_in = epin->bEndpointAddress; dln2->usb_dev = usb_get_dev(interface_to_usbdev(interface)); dln2->interface = interface; usb_set_intfdata(interface, dln2); init_waitqueue_head(&dln2->disconnect_wq); for (i = 0; i < DLN2_HANDLES; i++) { init_waitqueue_head(&dln2->mod_rx_slots[i].wq); spin_lock_init(&dln2->mod_rx_slots[i].lock); for (j = 0; j < DLN2_MAX_RX_SLOTS; j++) init_completion(&dln2->mod_rx_slots[i].slots[j].done); } spin_lock_init(&dln2->event_cb_lock); spin_lock_init(&dln2->disconnect_lock); INIT_LIST_HEAD(&dln2->event_cb_list); ret = dln2_setup_rx_urbs(dln2, hostif); if (ret) goto out_free; ret = dln2_start_rx_urbs(dln2, GFP_KERNEL); if (ret) goto out_stop_rx; ret = dln2_hw_init(dln2); if (ret < 0) { dev_err(dev, "failed to initialize hardware\n"); goto out_stop_rx; } ret = mfd_add_hotplug_devices(dev, dln2_devs, ARRAY_SIZE(dln2_devs)); if (ret != 0) { dev_err(dev, "failed to add mfd devices to core\n"); goto out_stop_rx; } return 0; out_stop_rx: dln2_stop_rx_urbs(dln2); out_free: dln2_free(dln2); return ret; } static int dln2_suspend(struct usb_interface *iface, pm_message_t message) { struct dln2_dev *dln2 = usb_get_intfdata(iface); dln2_stop(dln2); return 0; } static int dln2_resume(struct usb_interface *iface) { struct dln2_dev *dln2 = usb_get_intfdata(iface); dln2->disconnect = false; return dln2_start_rx_urbs(dln2, GFP_NOIO); } static const struct usb_device_id dln2_table[] = { { USB_DEVICE(0xa257, 0x2013) }, { } }; MODULE_DEVICE_TABLE(usb, dln2_table); static struct usb_driver dln2_driver = { .name = "dln2", .probe = dln2_probe, .disconnect = dln2_disconnect, .id_table = dln2_table, .suspend = dln2_suspend, .resume = dln2_resume, }; module_usb_driver(dln2_driver); MODULE_AUTHOR("Octavian Purdila <octavian.purdila@intel.com>"); MODULE_DESCRIPTION("Core driver for the Diolan DLN2 interface adapter"); MODULE_LICENSE("GPL v2"); |
| 9 9 1 1 2 1 1 1 7 7 6 6 7 5 5 5 5 5 16 16 1 1 1 13 7 3 1 2 9 9 10 1 1 1 4 3 2 3 3 3 2 1 1 3 3 1 1 1 1 1 5 1 4 20 20 1 19 1 17 14 1 15 1 2 13 1 1 2 1 16 7 5 3 2 9 12 12 6 7 10 2 12 2 11 4 1 3 5 12 12 4 3 7 3 3 1 1 1 8 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 | // SPDX-License-Identifier: GPL-2.0-or-later /* L2TPv3 IP encapsulation support for IPv6 * * Copyright (c) 2012 Katalix Systems Ltd */ #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt #include <linux/icmp.h> #include <linux/module.h> #include <linux/skbuff.h> #include <linux/random.h> #include <linux/socket.h> #include <linux/l2tp.h> #include <linux/in.h> #include <linux/in6.h> #include <net/sock.h> #include <net/ip.h> #include <net/icmp.h> #include <net/udp.h> #include <net/inet_common.h> #include <net/tcp_states.h> #include <net/protocol.h> #include <net/xfrm.h> #include <net/net_namespace.h> #include <net/netns/generic.h> #include <net/transp_v6.h> #include <net/addrconf.h> #include <net/ip6_route.h> #include "l2tp_core.h" /* per-net private data for this module */ static unsigned int l2tp_ip6_net_id; struct l2tp_ip6_net { rwlock_t l2tp_ip6_lock; struct hlist_head l2tp_ip6_table; struct hlist_head l2tp_ip6_bind_table; }; struct l2tp_ip6_sock { /* inet_sock has to be the first member of l2tp_ip6_sock */ struct inet_sock inet; u32 conn_id; u32 peer_conn_id; struct ipv6_pinfo inet6; }; static struct l2tp_ip6_sock *l2tp_ip6_sk(const struct sock *sk) { return (struct l2tp_ip6_sock *)sk; } static struct l2tp_ip6_net *l2tp_ip6_pernet(const struct net *net) { return net_generic(net, l2tp_ip6_net_id); } static struct sock *__l2tp_ip6_bind_lookup(const struct net *net, const struct in6_addr *laddr, const struct in6_addr *raddr, int dif, u32 tunnel_id) { struct l2tp_ip6_net *pn = l2tp_ip6_pernet(net); struct sock *sk; sk_for_each_bound(sk, &pn->l2tp_ip6_bind_table) { const struct in6_addr *sk_laddr = inet6_rcv_saddr(sk); const struct in6_addr *sk_raddr = &sk->sk_v6_daddr; const struct l2tp_ip6_sock *l2tp = l2tp_ip6_sk(sk); int bound_dev_if; if (!net_eq(sock_net(sk), net)) continue; bound_dev_if = READ_ONCE(sk->sk_bound_dev_if); if (bound_dev_if && dif && bound_dev_if != dif) continue; if (sk_laddr && !ipv6_addr_any(sk_laddr) && !ipv6_addr_any(laddr) && !ipv6_addr_equal(sk_laddr, laddr)) continue; if (!ipv6_addr_any(sk_raddr) && raddr && !ipv6_addr_any(raddr) && !ipv6_addr_equal(sk_raddr, raddr)) continue; if (l2tp->conn_id != tunnel_id) continue; goto found; } sk = NULL; found: return sk; } /* When processing receive frames, there are two cases to * consider. Data frames consist of a non-zero session-id and an * optional cookie. Control frames consist of a regular L2TP header * preceded by 32-bits of zeros. * * L2TPv3 Session Header Over IP * * 0 1 2 3 * 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ * | Session ID | * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ * | Cookie (optional, maximum 64 bits)... * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ * | * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ * * L2TPv3 Control Message Header Over IP * * 0 1 2 3 * 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ * | (32 bits of zeros) | * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ * |T|L|x|x|S|x|x|x|x|x|x|x| Ver | Length | * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ * | Control Connection ID | * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ * | Ns | Nr | * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ * * All control frames are passed to userspace. */ static int l2tp_ip6_recv(struct sk_buff *skb) { struct net *net = dev_net(skb->dev); struct l2tp_ip6_net *pn; struct sock *sk; u32 session_id; u32 tunnel_id; unsigned char *ptr, *optr; struct l2tp_session *session; struct l2tp_tunnel *tunnel = NULL; struct ipv6hdr *iph; pn = l2tp_ip6_pernet(net); if (!pskb_may_pull(skb, 4)) goto discard; /* Point to L2TP header */ optr = skb->data; ptr = skb->data; session_id = ntohl(*((__be32 *)ptr)); ptr += 4; /* RFC3931: L2TP/IP packets have the first 4 bytes containing * the session_id. If it is 0, the packet is a L2TP control * frame and the session_id value can be discarded. */ if (session_id == 0) { __skb_pull(skb, 4); goto pass_up; } /* Ok, this is a data packet. Lookup the session. */ session = l2tp_v3_session_get(net, NULL, session_id); if (!session) goto discard; tunnel = session->tunnel; if (!tunnel) goto discard_sess; if (l2tp_v3_ensure_opt_in_linear(session, skb, &ptr, &optr)) goto discard_sess; l2tp_recv_common(session, skb, ptr, optr, 0, skb->len); l2tp_session_put(session); return 0; pass_up: /* Get the tunnel_id from the L2TP header */ if (!pskb_may_pull(skb, 12)) goto discard; if ((skb->data[0] & 0xc0) != 0xc0) goto discard; tunnel_id = ntohl(*(__be32 *)&skb->data[4]); iph = ipv6_hdr(skb); read_lock_bh(&pn->l2tp_ip6_lock); sk = __l2tp_ip6_bind_lookup(net, &iph->daddr, &iph->saddr, inet6_iif(skb), tunnel_id); if (!sk) { read_unlock_bh(&pn->l2tp_ip6_lock); goto discard; } sock_hold(sk); read_unlock_bh(&pn->l2tp_ip6_lock); if (!xfrm6_policy_check(sk, XFRM_POLICY_IN, skb)) goto discard_put; nf_reset_ct(skb); return sk_receive_skb(sk, skb, 1); discard_sess: l2tp_session_put(session); goto discard; discard_put: sock_put(sk); discard: kfree_skb(skb); return 0; } static int l2tp_ip6_hash(struct sock *sk) { struct l2tp_ip6_net *pn = l2tp_ip6_pernet(sock_net(sk)); if (sk_unhashed(sk)) { write_lock_bh(&pn->l2tp_ip6_lock); sk_add_node(sk, &pn->l2tp_ip6_table); write_unlock_bh(&pn->l2tp_ip6_lock); } return 0; } static void l2tp_ip6_unhash(struct sock *sk) { struct l2tp_ip6_net *pn = l2tp_ip6_pernet(sock_net(sk)); if (sk_unhashed(sk)) return; write_lock_bh(&pn->l2tp_ip6_lock); sk_del_node_init(sk); write_unlock_bh(&pn->l2tp_ip6_lock); } static int l2tp_ip6_open(struct sock *sk) { /* Prevent autobind. We don't have ports. */ inet_sk(sk)->inet_num = IPPROTO_L2TP; l2tp_ip6_hash(sk); return 0; } static void l2tp_ip6_close(struct sock *sk, long timeout) { struct l2tp_ip6_net *pn = l2tp_ip6_pernet(sock_net(sk)); write_lock_bh(&pn->l2tp_ip6_lock); hlist_del_init(&sk->sk_bind_node); sk_del_node_init(sk); write_unlock_bh(&pn->l2tp_ip6_lock); sk_common_release(sk); } static void l2tp_ip6_destroy_sock(struct sock *sk) { struct l2tp_tunnel *tunnel; lock_sock(sk); ip6_flush_pending_frames(sk); release_sock(sk); tunnel = l2tp_sk_to_tunnel(sk); if (tunnel) { l2tp_tunnel_delete(tunnel); l2tp_tunnel_put(tunnel); } } static int l2tp_ip6_bind(struct sock *sk, struct sockaddr *uaddr, int addr_len) { struct inet_sock *inet = inet_sk(sk); struct ipv6_pinfo *np = inet6_sk(sk); struct sockaddr_l2tpip6 *addr = (struct sockaddr_l2tpip6 *)uaddr; struct net *net = sock_net(sk); struct l2tp_ip6_net *pn; __be32 v4addr = 0; int bound_dev_if; int addr_type; int err; pn = l2tp_ip6_pernet(net); if (addr->l2tp_family != AF_INET6) return -EINVAL; if (addr_len < sizeof(*addr)) return -EINVAL; addr_type = ipv6_addr_type(&addr->l2tp_addr); /* l2tp_ip6 sockets are IPv6 only */ if (addr_type == IPV6_ADDR_MAPPED) return -EADDRNOTAVAIL; /* L2TP is point-point, not multicast */ if (addr_type & IPV6_ADDR_MULTICAST) return -EADDRNOTAVAIL; lock_sock(sk); err = -EINVAL; if (!sock_flag(sk, SOCK_ZAPPED)) goto out_unlock; if (sk->sk_state != TCP_CLOSE) goto out_unlock; bound_dev_if = sk->sk_bound_dev_if; /* Check if the address belongs to the host. */ rcu_read_lock(); if (addr_type != IPV6_ADDR_ANY) { struct net_device *dev = NULL; if (addr_type & IPV6_ADDR_LINKLOCAL) { if (addr->l2tp_scope_id) bound_dev_if = addr->l2tp_scope_id; /* Binding to link-local address requires an * interface. */ if (!bound_dev_if) goto out_unlock_rcu; err = -ENODEV; dev = dev_get_by_index_rcu(sock_net(sk), bound_dev_if); if (!dev) goto out_unlock_rcu; } /* ipv4 addr of the socket is invalid. Only the * unspecified and mapped address have a v4 equivalent. */ v4addr = LOOPBACK4_IPV6; err = -EADDRNOTAVAIL; if (!ipv6_chk_addr(sock_net(sk), &addr->l2tp_addr, dev, 0)) goto out_unlock_rcu; } rcu_read_unlock(); write_lock_bh(&pn->l2tp_ip6_lock); if (__l2tp_ip6_bind_lookup(net, &addr->l2tp_addr, NULL, bound_dev_if, addr->l2tp_conn_id)) { write_unlock_bh(&pn->l2tp_ip6_lock); err = -EADDRINUSE; goto out_unlock; } inet->inet_saddr = v4addr; inet->inet_rcv_saddr = v4addr; sk->sk_bound_dev_if = bound_dev_if; sk->sk_v6_rcv_saddr = addr->l2tp_addr; np->saddr = addr->l2tp_addr; l2tp_ip6_sk(sk)->conn_id = addr->l2tp_conn_id; sk_add_bind_node(sk, &pn->l2tp_ip6_bind_table); sk_del_node_init(sk); write_unlock_bh(&pn->l2tp_ip6_lock); sock_reset_flag(sk, SOCK_ZAPPED); release_sock(sk); return 0; out_unlock_rcu: rcu_read_unlock(); out_unlock: release_sock(sk); return err; } static int l2tp_ip6_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len) { struct sockaddr_l2tpip6 *lsa = (struct sockaddr_l2tpip6 *)uaddr; struct sockaddr_in6 *usin = (struct sockaddr_in6 *)uaddr; struct in6_addr *daddr; int addr_type; int rc; struct l2tp_ip6_net *pn; if (addr_len < sizeof(*lsa)) return -EINVAL; if (usin->sin6_family != AF_INET6) return -EINVAL; addr_type = ipv6_addr_type(&usin->sin6_addr); if (addr_type & IPV6_ADDR_MULTICAST) return -EINVAL; if (addr_type & IPV6_ADDR_MAPPED) { daddr = &usin->sin6_addr; if (ipv4_is_multicast(daddr->s6_addr32[3])) return -EINVAL; } lock_sock(sk); /* Must bind first - autobinding does not work */ if (sock_flag(sk, SOCK_ZAPPED)) { rc = -EINVAL; goto out_sk; } rc = __ip6_datagram_connect(sk, uaddr, addr_len); if (rc < 0) goto out_sk; l2tp_ip6_sk(sk)->peer_conn_id = lsa->l2tp_conn_id; pn = l2tp_ip6_pernet(sock_net(sk)); write_lock_bh(&pn->l2tp_ip6_lock); hlist_del_init(&sk->sk_bind_node); sk_add_bind_node(sk, &pn->l2tp_ip6_bind_table); write_unlock_bh(&pn->l2tp_ip6_lock); out_sk: release_sock(sk); return rc; } static int l2tp_ip6_disconnect(struct sock *sk, int flags) { if (sock_flag(sk, SOCK_ZAPPED)) return 0; return __udp_disconnect(sk, flags); } static int l2tp_ip6_getname(struct socket *sock, struct sockaddr *uaddr, int peer) { struct sockaddr_l2tpip6 *lsa = (struct sockaddr_l2tpip6 *)uaddr; struct sock *sk = sock->sk; struct ipv6_pinfo *np = inet6_sk(sk); struct l2tp_ip6_sock *lsk = l2tp_ip6_sk(sk); lsa->l2tp_family = AF_INET6; lsa->l2tp_flowinfo = 0; lsa->l2tp_scope_id = 0; lsa->l2tp_unused = 0; if (peer) { if (!lsk->peer_conn_id) return -ENOTCONN; lsa->l2tp_conn_id = lsk->peer_conn_id; lsa->l2tp_addr = sk->sk_v6_daddr; if (inet6_test_bit(SNDFLOW, sk)) lsa->l2tp_flowinfo = np->flow_label; } else { if (ipv6_addr_any(&sk->sk_v6_rcv_saddr)) lsa->l2tp_addr = np->saddr; else lsa->l2tp_addr = sk->sk_v6_rcv_saddr; lsa->l2tp_conn_id = lsk->conn_id; } if (ipv6_addr_type(&lsa->l2tp_addr) & IPV6_ADDR_LINKLOCAL) lsa->l2tp_scope_id = READ_ONCE(sk->sk_bound_dev_if); return sizeof(*lsa); } static int l2tp_ip6_backlog_recv(struct sock *sk, struct sk_buff *skb) { int rc; /* Charge it to the socket, dropping if the queue is full. */ rc = sock_queue_rcv_skb(sk, skb); if (rc < 0) goto drop; return 0; drop: IP_INC_STATS(sock_net(sk), IPSTATS_MIB_INDISCARDS); kfree_skb(skb); return -1; } static int l2tp_ip6_push_pending_frames(struct sock *sk) { struct sk_buff *skb; __be32 *transhdr = NULL; int err = 0; skb = skb_peek(&sk->sk_write_queue); if (!skb) goto out; transhdr = (__be32 *)skb_transport_header(skb); *transhdr = 0; err = ip6_push_pending_frames(sk); out: return err; } /* Userspace will call sendmsg() on the tunnel socket to send L2TP * control frames. */ static int l2tp_ip6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len) { struct ipv6_txoptions opt_space; DECLARE_SOCKADDR(struct sockaddr_l2tpip6 *, lsa, msg->msg_name); struct in6_addr *daddr, *final_p, final; struct ipv6_pinfo *np = inet6_sk(sk); struct ipv6_txoptions *opt_to_free = NULL; struct ipv6_txoptions *opt = NULL; struct ip6_flowlabel *flowlabel = NULL; struct dst_entry *dst = NULL; struct flowi6 fl6; struct ipcm6_cookie ipc6; int addr_len = msg->msg_namelen; int transhdrlen = 4; /* zero session-id */ int ulen; int err; /* Rough check on arithmetic overflow, * better check is made in ip6_append_data(). */ if (len > INT_MAX - transhdrlen) return -EMSGSIZE; /* Mirror BSD error message compatibility */ if (msg->msg_flags & MSG_OOB) return -EOPNOTSUPP; /* Get and verify the address */ memset(&fl6, 0, sizeof(fl6)); fl6.flowi6_mark = READ_ONCE(sk->sk_mark); fl6.flowi6_uid = sk->sk_uid; ipcm6_init_sk(&ipc6, sk); if (lsa) { if (addr_len < SIN6_LEN_RFC2133) return -EINVAL; if (lsa->l2tp_family && lsa->l2tp_family != AF_INET6) return -EAFNOSUPPORT; daddr = &lsa->l2tp_addr; if (inet6_test_bit(SNDFLOW, sk)) { fl6.flowlabel = lsa->l2tp_flowinfo & IPV6_FLOWINFO_MASK; if (fl6.flowlabel & IPV6_FLOWLABEL_MASK) { flowlabel = fl6_sock_lookup(sk, fl6.flowlabel); if (IS_ERR(flowlabel)) return -EINVAL; } } /* Otherwise it will be difficult to maintain * sk->sk_dst_cache. */ if (sk->sk_state == TCP_ESTABLISHED && ipv6_addr_equal(daddr, &sk->sk_v6_daddr)) daddr = &sk->sk_v6_daddr; if (addr_len >= sizeof(struct sockaddr_in6) && lsa->l2tp_scope_id && ipv6_addr_type(daddr) & IPV6_ADDR_LINKLOCAL) fl6.flowi6_oif = lsa->l2tp_scope_id; } else { if (sk->sk_state != TCP_ESTABLISHED) return -EDESTADDRREQ; daddr = &sk->sk_v6_daddr; fl6.flowlabel = np->flow_label; } if (fl6.flowi6_oif == 0) fl6.flowi6_oif = READ_ONCE(sk->sk_bound_dev_if); if (msg->msg_controllen) { opt = &opt_space; memset(opt, 0, sizeof(struct ipv6_txoptions)); opt->tot_len = sizeof(struct ipv6_txoptions); ipc6.opt = opt; err = ip6_datagram_send_ctl(sock_net(sk), sk, msg, &fl6, &ipc6); if (err < 0) { fl6_sock_release(flowlabel); return err; } if ((fl6.flowlabel & IPV6_FLOWLABEL_MASK) && !flowlabel) { flowlabel = fl6_sock_lookup(sk, fl6.flowlabel); if (IS_ERR(flowlabel)) return -EINVAL; } if (!(opt->opt_nflen | opt->opt_flen)) opt = NULL; } if (!opt) { opt = txopt_get(np); opt_to_free = opt; } if (flowlabel) opt = fl6_merge_options(&opt_space, flowlabel, opt); opt = ipv6_fixup_options(&opt_space, opt); ipc6.opt = opt; fl6.flowi6_proto = sk->sk_protocol; if (!ipv6_addr_any(daddr)) fl6.daddr = *daddr; else fl6.daddr.s6_addr[15] = 0x1; /* :: means loopback (BSD'ism) */ if (ipv6_addr_any(&fl6.saddr) && !ipv6_addr_any(&np->saddr)) fl6.saddr = np->saddr; final_p = fl6_update_dst(&fl6, opt, &final); if (!fl6.flowi6_oif && ipv6_addr_is_multicast(&fl6.daddr)) fl6.flowi6_oif = READ_ONCE(np->mcast_oif); else if (!fl6.flowi6_oif) fl6.flowi6_oif = READ_ONCE(np->ucast_oif); security_sk_classify_flow(sk, flowi6_to_flowi_common(&fl6)); fl6.flowlabel = ip6_make_flowinfo(ipc6.tclass, fl6.flowlabel); dst = ip6_dst_lookup_flow(sock_net(sk), sk, &fl6, final_p); if (IS_ERR(dst)) { err = PTR_ERR(dst); goto out; } if (ipc6.hlimit < 0) ipc6.hlimit = ip6_sk_dst_hoplimit(np, &fl6, dst); if (msg->msg_flags & MSG_CONFIRM) goto do_confirm; back_from_confirm: lock_sock(sk); ulen = len + (skb_queue_empty(&sk->sk_write_queue) ? transhdrlen : 0); err = ip6_append_data(sk, ip_generic_getfrag, msg, ulen, transhdrlen, &ipc6, &fl6, dst_rt6_info(dst), msg->msg_flags); if (err) ip6_flush_pending_frames(sk); else if (!(msg->msg_flags & MSG_MORE)) err = l2tp_ip6_push_pending_frames(sk); release_sock(sk); done: dst_release(dst); out: fl6_sock_release(flowlabel); txopt_put(opt_to_free); return err < 0 ? err : len; do_confirm: if (msg->msg_flags & MSG_PROBE) dst_confirm_neigh(dst, &fl6.daddr); if (!(msg->msg_flags & MSG_PROBE) || len) goto back_from_confirm; err = 0; goto done; } static int l2tp_ip6_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, int flags, int *addr_len) { struct ipv6_pinfo *np = inet6_sk(sk); DECLARE_SOCKADDR(struct sockaddr_l2tpip6 *, lsa, msg->msg_name); size_t copied = 0; int err = -EOPNOTSUPP; struct sk_buff *skb; if (flags & MSG_OOB) goto out; if (flags & MSG_ERRQUEUE) return ipv6_recv_error(sk, msg, len, addr_len); skb = skb_recv_datagram(sk, flags, &err); if (!skb) goto out; copied = skb->len; if (len < copied) { msg->msg_flags |= MSG_TRUNC; copied = len; } err = skb_copy_datagram_msg(skb, 0, msg, copied); if (err) goto done; sock_recv_timestamp(msg, sk, skb); /* Copy the address. */ if (lsa) { lsa->l2tp_family = AF_INET6; lsa->l2tp_unused = 0; lsa->l2tp_addr = ipv6_hdr(skb)->saddr; lsa->l2tp_flowinfo = 0; lsa->l2tp_scope_id = 0; lsa->l2tp_conn_id = 0; if (ipv6_addr_type(&lsa->l2tp_addr) & IPV6_ADDR_LINKLOCAL) lsa->l2tp_scope_id = inet6_iif(skb); *addr_len = sizeof(*lsa); } if (np->rxopt.all) ip6_datagram_recv_ctl(sk, msg, skb); if (flags & MSG_TRUNC) copied = skb->len; done: skb_free_datagram(sk, skb); out: return err ? err : copied; } static struct proto l2tp_ip6_prot = { .name = "L2TP/IPv6", .owner = THIS_MODULE, .init = l2tp_ip6_open, .close = l2tp_ip6_close, .bind = l2tp_ip6_bind, .connect = l2tp_ip6_connect, .disconnect = l2tp_ip6_disconnect, .ioctl = l2tp_ioctl, .destroy = l2tp_ip6_destroy_sock, .setsockopt = ipv6_setsockopt, .getsockopt = ipv6_getsockopt, .sendmsg = l2tp_ip6_sendmsg, .recvmsg = l2tp_ip6_recvmsg, .backlog_rcv = l2tp_ip6_backlog_recv, .hash = l2tp_ip6_hash, .unhash = l2tp_ip6_unhash, .obj_size = sizeof(struct l2tp_ip6_sock), .ipv6_pinfo_offset = offsetof(struct l2tp_ip6_sock, inet6), }; static const struct proto_ops l2tp_ip6_ops = { .family = PF_INET6, .owner = THIS_MODULE, .release = inet6_release, .bind = inet6_bind, .connect = inet_dgram_connect, .socketpair = sock_no_socketpair, .accept = sock_no_accept, .getname = l2tp_ip6_getname, .poll = datagram_poll, .ioctl = inet6_ioctl, .gettstamp = sock_gettstamp, .listen = sock_no_listen, .shutdown = inet_shutdown, .setsockopt = sock_common_setsockopt, .getsockopt = sock_common_getsockopt, .sendmsg = inet_sendmsg, .recvmsg = sock_common_recvmsg, .mmap = sock_no_mmap, #ifdef CONFIG_COMPAT .compat_ioctl = inet6_compat_ioctl, #endif }; static struct inet_protosw l2tp_ip6_protosw = { .type = SOCK_DGRAM, .protocol = IPPROTO_L2TP, .prot = &l2tp_ip6_prot, .ops = &l2tp_ip6_ops, }; static struct inet6_protocol l2tp_ip6_protocol __read_mostly = { .handler = l2tp_ip6_recv, }; static __net_init int l2tp_ip6_init_net(struct net *net) { struct l2tp_ip6_net *pn = net_generic(net, l2tp_ip6_net_id); rwlock_init(&pn->l2tp_ip6_lock); INIT_HLIST_HEAD(&pn->l2tp_ip6_table); INIT_HLIST_HEAD(&pn->l2tp_ip6_bind_table); return 0; } static __net_exit void l2tp_ip6_exit_net(struct net *net) { struct l2tp_ip6_net *pn = l2tp_ip6_pernet(net); write_lock_bh(&pn->l2tp_ip6_lock); WARN_ON_ONCE(hlist_count_nodes(&pn->l2tp_ip6_table) != 0); WARN_ON_ONCE(hlist_count_nodes(&pn->l2tp_ip6_bind_table) != 0); write_unlock_bh(&pn->l2tp_ip6_lock); } static struct pernet_operations l2tp_ip6_net_ops = { .init = l2tp_ip6_init_net, .exit = l2tp_ip6_exit_net, .id = &l2tp_ip6_net_id, .size = sizeof(struct l2tp_ip6_net), }; static int __init l2tp_ip6_init(void) { int err; pr_info("L2TP IP encapsulation support for IPv6 (L2TPv3)\n"); err = register_pernet_device(&l2tp_ip6_net_ops); if (err) goto out; err = proto_register(&l2tp_ip6_prot, 1); if (err != 0) goto out1; err = inet6_add_protocol(&l2tp_ip6_protocol, IPPROTO_L2TP); if (err) goto out2; inet6_register_protosw(&l2tp_ip6_protosw); return 0; out2: proto_unregister(&l2tp_ip6_prot); out1: unregister_pernet_device(&l2tp_ip6_net_ops); out: return err; } static void __exit l2tp_ip6_exit(void) { inet6_unregister_protosw(&l2tp_ip6_protosw); inet6_del_protocol(&l2tp_ip6_protocol, IPPROTO_L2TP); proto_unregister(&l2tp_ip6_prot); unregister_pernet_device(&l2tp_ip6_net_ops); } module_init(l2tp_ip6_init); module_exit(l2tp_ip6_exit); MODULE_LICENSE("GPL"); MODULE_AUTHOR("Chris Elston <celston@katalix.com>"); MODULE_DESCRIPTION("L2TP IP encapsulation for IPv6"); MODULE_VERSION("1.0"); /* Use the values of SOCK_DGRAM (2) as type and IPPROTO_L2TP (115) as protocol, * because __stringify doesn't like enums */ MODULE_ALIAS_NET_PF_PROTO_TYPE(PF_INET6, 115, 2); MODULE_ALIAS_NET_PF_PROTO(PF_INET6, 115); |
| 1 1 1 1 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 | // SPDX-License-Identifier: GPL-2.0-only #include "netlink.h" #include "common.h" #include "bitset.h" struct eee_req_info { struct ethnl_req_info base; }; struct eee_reply_data { struct ethnl_reply_data base; struct ethtool_keee eee; }; #define EEE_REPDATA(__reply_base) \ container_of(__reply_base, struct eee_reply_data, base) const struct nla_policy ethnl_eee_get_policy[] = { [ETHTOOL_A_EEE_HEADER] = NLA_POLICY_NESTED(ethnl_header_policy), }; static int eee_prepare_data(const struct ethnl_req_info *req_base, struct ethnl_reply_data *reply_base, const struct genl_info *info) { struct eee_reply_data *data = EEE_REPDATA(reply_base); struct net_device *dev = reply_base->dev; struct ethtool_keee *eee = &data->eee; int ret; if (!dev->ethtool_ops->get_eee) return -EOPNOTSUPP; ret = ethnl_ops_begin(dev); if (ret < 0) return ret; ret = dev->ethtool_ops->get_eee(dev, eee); ethnl_ops_complete(dev); return ret; } static int eee_reply_size(const struct ethnl_req_info *req_base, const struct ethnl_reply_data *reply_base) { bool compact = req_base->flags & ETHTOOL_FLAG_COMPACT_BITSETS; const struct eee_reply_data *data = EEE_REPDATA(reply_base); const struct ethtool_keee *eee = &data->eee; int len = 0; int ret; /* MODES_OURS */ ret = ethnl_bitset_size(eee->advertised, eee->supported, __ETHTOOL_LINK_MODE_MASK_NBITS, link_mode_names, compact); if (ret < 0) return ret; len += ret; /* MODES_PEERS */ ret = ethnl_bitset_size(eee->lp_advertised, NULL, __ETHTOOL_LINK_MODE_MASK_NBITS, link_mode_names, compact); if (ret < 0) return ret; len += ret; len += nla_total_size(sizeof(u8)) + /* _EEE_ACTIVE */ nla_total_size(sizeof(u8)) + /* _EEE_ENABLED */ nla_total_size(sizeof(u8)) + /* _EEE_TX_LPI_ENABLED */ nla_total_size(sizeof(u32)); /* _EEE_TX_LPI_TIMER */ return len; } static int eee_fill_reply(struct sk_buff *skb, const struct ethnl_req_info *req_base, const struct ethnl_reply_data *reply_base) { bool compact = req_base->flags & ETHTOOL_FLAG_COMPACT_BITSETS; const struct eee_reply_data *data = EEE_REPDATA(reply_base); const struct ethtool_keee *eee = &data->eee; int ret; ret = ethnl_put_bitset(skb, ETHTOOL_A_EEE_MODES_OURS, eee->advertised, eee->supported, __ETHTOOL_LINK_MODE_MASK_NBITS, link_mode_names, compact); if (ret < 0) return ret; ret = ethnl_put_bitset(skb, ETHTOOL_A_EEE_MODES_PEER, eee->lp_advertised, NULL, __ETHTOOL_LINK_MODE_MASK_NBITS, link_mode_names, compact); if (ret < 0) return ret; if (nla_put_u8(skb, ETHTOOL_A_EEE_ACTIVE, eee->eee_active) || nla_put_u8(skb, ETHTOOL_A_EEE_ENABLED, eee->eee_enabled) || nla_put_u8(skb, ETHTOOL_A_EEE_TX_LPI_ENABLED, eee->tx_lpi_enabled) || nla_put_u32(skb, ETHTOOL_A_EEE_TX_LPI_TIMER, eee->tx_lpi_timer)) return -EMSGSIZE; return 0; } /* EEE_SET */ const struct nla_policy ethnl_eee_set_policy[] = { [ETHTOOL_A_EEE_HEADER] = NLA_POLICY_NESTED(ethnl_header_policy), [ETHTOOL_A_EEE_MODES_OURS] = { .type = NLA_NESTED }, [ETHTOOL_A_EEE_ENABLED] = { .type = NLA_U8 }, [ETHTOOL_A_EEE_TX_LPI_ENABLED] = { .type = NLA_U8 }, [ETHTOOL_A_EEE_TX_LPI_TIMER] = { .type = NLA_U32 }, }; static int ethnl_set_eee_validate(struct ethnl_req_info *req_info, struct genl_info *info) { const struct ethtool_ops *ops = req_info->dev->ethtool_ops; return ops->get_eee && ops->set_eee ? 1 : -EOPNOTSUPP; } static int ethnl_set_eee(struct ethnl_req_info *req_info, struct genl_info *info) { struct net_device *dev = req_info->dev; struct nlattr **tb = info->attrs; struct ethtool_keee eee = {}; bool mod = false; int ret; ret = dev->ethtool_ops->get_eee(dev, &eee); if (ret < 0) return ret; ret = ethnl_update_bitset(eee.advertised, __ETHTOOL_LINK_MODE_MASK_NBITS, tb[ETHTOOL_A_EEE_MODES_OURS], link_mode_names, info->extack, &mod); if (ret < 0) return ret; ethnl_update_bool(&eee.eee_enabled, tb[ETHTOOL_A_EEE_ENABLED], &mod); ethnl_update_bool(&eee.tx_lpi_enabled, tb[ETHTOOL_A_EEE_TX_LPI_ENABLED], &mod); ethnl_update_u32(&eee.tx_lpi_timer, tb[ETHTOOL_A_EEE_TX_LPI_TIMER], &mod); if (!mod) return 0; ret = dev->ethtool_ops->set_eee(dev, &eee); return ret < 0 ? ret : 1; } const struct ethnl_request_ops ethnl_eee_request_ops = { .request_cmd = ETHTOOL_MSG_EEE_GET, .reply_cmd = ETHTOOL_MSG_EEE_GET_REPLY, .hdr_attr = ETHTOOL_A_EEE_HEADER, .req_info_size = sizeof(struct eee_req_info), .reply_data_size = sizeof(struct eee_reply_data), .prepare_data = eee_prepare_data, .reply_size = eee_reply_size, .fill_reply = eee_fill_reply, .set_validate = ethnl_set_eee_validate, .set = ethnl_set_eee, .set_ntf_cmd = ETHTOOL_MSG_EEE_NTF, }; |
| 9 9 9 8 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 | // SPDX-License-Identifier: GPL-2.0 #include <linux/quotaops.h> #include <linux/uuid.h> #include "ext4.h" #include "xattr.h" #include "ext4_jbd2.h" static void ext4_fname_from_fscrypt_name(struct ext4_filename *dst, const struct fscrypt_name *src) { memset(dst, 0, sizeof(*dst)); dst->usr_fname = src->usr_fname; dst->disk_name = src->disk_name; dst->hinfo.hash = src->hash; dst->hinfo.minor_hash = src->minor_hash; dst->crypto_buf = src->crypto_buf; } int ext4_fname_setup_filename(struct inode *dir, const struct qstr *iname, int lookup, struct ext4_filename *fname) { struct fscrypt_name name; int err; err = fscrypt_setup_filename(dir, iname, lookup, &name); if (err) return err; ext4_fname_from_fscrypt_name(fname, &name); err = ext4_fname_setup_ci_filename(dir, iname, fname); if (err) ext4_fname_free_filename(fname); return err; } int ext4_fname_prepare_lookup(struct inode *dir, struct dentry *dentry, struct ext4_filename *fname) { struct fscrypt_name name; int err; err = fscrypt_prepare_lookup(dir, dentry, &name); if (err) return err; ext4_fname_from_fscrypt_name(fname, &name); err = ext4_fname_setup_ci_filename(dir, &dentry->d_name, fname); if (err) ext4_fname_free_filename(fname); return err; } void ext4_fname_free_filename(struct ext4_filename *fname) { struct fscrypt_name name; name.crypto_buf = fname->crypto_buf; fscrypt_free_filename(&name); fname->crypto_buf.name = NULL; fname->usr_fname = NULL; fname->disk_name.name = NULL; ext4_fname_free_ci_filename(fname); } static bool uuid_is_zero(__u8 u[16]) { int i; for (i = 0; i < 16; i++) if (u[i]) return false; return true; } int ext4_ioctl_get_encryption_pwsalt(struct file *filp, void __user *arg) { struct super_block *sb = file_inode(filp)->i_sb; struct ext4_sb_info *sbi = EXT4_SB(sb); int err, err2; handle_t *handle; if (!ext4_has_feature_encrypt(sb)) return -EOPNOTSUPP; if (uuid_is_zero(sbi->s_es->s_encrypt_pw_salt)) { err = mnt_want_write_file(filp); if (err) return err; handle = ext4_journal_start_sb(sb, EXT4_HT_MISC, 1); if (IS_ERR(handle)) { err = PTR_ERR(handle); goto pwsalt_err_exit; } err = ext4_journal_get_write_access(handle, sb, sbi->s_sbh, EXT4_JTR_NONE); if (err) goto pwsalt_err_journal; lock_buffer(sbi->s_sbh); generate_random_uuid(sbi->s_es->s_encrypt_pw_salt); ext4_superblock_csum_set(sb); unlock_buffer(sbi->s_sbh); err = ext4_handle_dirty_metadata(handle, NULL, sbi->s_sbh); pwsalt_err_journal: err2 = ext4_journal_stop(handle); if (err2 && !err) err = err2; pwsalt_err_exit: mnt_drop_write_file(filp); if (err) return err; } if (copy_to_user(arg, sbi->s_es->s_encrypt_pw_salt, 16)) return -EFAULT; return 0; } static int ext4_get_context(struct inode *inode, void *ctx, size_t len) { return ext4_xattr_get(inode, EXT4_XATTR_INDEX_ENCRYPTION, EXT4_XATTR_NAME_ENCRYPTION_CONTEXT, ctx, len); } static int ext4_set_context(struct inode *inode, const void *ctx, size_t len, void *fs_data) { handle_t *handle = fs_data; int res, res2, credits, retries = 0; /* * Encrypting the root directory is not allowed because e2fsck expects * lost+found to exist and be unencrypted, and encrypting the root * directory would imply encrypting the lost+found directory as well as * the filename "lost+found" itself. */ if (inode->i_ino == EXT4_ROOT_INO) return -EPERM; if (WARN_ON_ONCE(IS_DAX(inode) && i_size_read(inode))) return -EINVAL; if (ext4_test_inode_flag(inode, EXT4_INODE_DAX)) return -EOPNOTSUPP; res = ext4_convert_inline_data(inode); if (res) return res; /* * If a journal handle was specified, then the encryption context is * being set on a new inode via inheritance and is part of a larger * transaction to create the inode. Otherwise the encryption context is * being set on an existing inode in its own transaction. Only in the * latter case should the "retry on ENOSPC" logic be used. */ if (handle) { res = ext4_xattr_set_handle(handle, inode, EXT4_XATTR_INDEX_ENCRYPTION, EXT4_XATTR_NAME_ENCRYPTION_CONTEXT, ctx, len, 0); if (!res) { ext4_set_inode_flag(inode, EXT4_INODE_ENCRYPT); ext4_clear_inode_state(inode, EXT4_STATE_MAY_INLINE_DATA); /* * Update inode->i_flags - S_ENCRYPTED will be enabled, * S_DAX may be disabled */ ext4_set_inode_flags(inode, false); } return res; } res = dquot_initialize(inode); if (res) return res; retry: res = ext4_xattr_set_credits(inode, len, false /* is_create */, &credits); if (res) return res; handle = ext4_journal_start(inode, EXT4_HT_MISC, credits); if (IS_ERR(handle)) return PTR_ERR(handle); res = ext4_xattr_set_handle(handle, inode, EXT4_XATTR_INDEX_ENCRYPTION, EXT4_XATTR_NAME_ENCRYPTION_CONTEXT, ctx, len, 0); if (!res) { ext4_set_inode_flag(inode, EXT4_INODE_ENCRYPT); /* * Update inode->i_flags - S_ENCRYPTED will be enabled, * S_DAX may be disabled */ ext4_set_inode_flags(inode, false); res = ext4_mark_inode_dirty(handle, inode); if (res) EXT4_ERROR_INODE(inode, "Failed to mark inode dirty"); } res2 = ext4_journal_stop(handle); if (res == -ENOSPC && ext4_should_retry_alloc(inode->i_sb, &retries)) goto retry; if (!res) res = res2; return res; } static const union fscrypt_policy *ext4_get_dummy_policy(struct super_block *sb) { return EXT4_SB(sb)->s_dummy_enc_policy.policy; } static bool ext4_has_stable_inodes(struct super_block *sb) { return ext4_has_feature_stable_inodes(sb); } const struct fscrypt_operations ext4_cryptops = { .needs_bounce_pages = 1, .has_32bit_inodes = 1, .supports_subblock_data_units = 1, .legacy_key_prefix = "ext4:", .get_context = ext4_get_context, .set_context = ext4_set_context, .get_dummy_policy = ext4_get_dummy_policy, .empty_dir = ext4_empty_dir, .has_stable_inodes = ext4_has_stable_inodes, }; |
| 6 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 | // SPDX-License-Identifier: GPL-2.0-only /* * Copyright (C) 2016 Parav Pandit <pandit.parav@gmail.com> */ #include "core_priv.h" /** * ib_device_register_rdmacg - register with rdma cgroup. * @device: device to register to participate in resource * accounting by rdma cgroup. * * Register with the rdma cgroup. Should be called before * exposing rdma device to user space applications to avoid * resource accounting leak. */ void ib_device_register_rdmacg(struct ib_device *device) { device->cg_device.name = device->name; rdmacg_register_device(&device->cg_device); } /** * ib_device_unregister_rdmacg - unregister with rdma cgroup. * @device: device to unregister. * * Unregister with the rdma cgroup. Should be called after * all the resources are deallocated, and after a stage when any * other resource allocation by user application cannot be done * for this device to avoid any leak in accounting. */ void ib_device_unregister_rdmacg(struct ib_device *device) { rdmacg_unregister_device(&device->cg_device); } int ib_rdmacg_try_charge(struct ib_rdmacg_object *cg_obj, struct ib_device *device, enum rdmacg_resource_type resource_index) { return rdmacg_try_charge(&cg_obj->cg, &device->cg_device, resource_index); } EXPORT_SYMBOL(ib_rdmacg_try_charge); void ib_rdmacg_uncharge(struct ib_rdmacg_object *cg_obj, struct ib_device *device, enum rdmacg_resource_type resource_index) { rdmacg_uncharge(cg_obj->cg, &device->cg_device, resource_index); } EXPORT_SYMBOL(ib_rdmacg_uncharge); |
| 10 11 19 19 1 11 1 10 11 10 11 5 1 4 3 1 2 5 6 5 1 1 7 7 3 6 5 4 1 2 5 5 2 1 1 1 1 2 3 3 1 1 1 1 18 4 1 1 3 2 2 3 3 3 5 3 10 10 10 8 2 10 8 1 7 11 1 1 9 14 13 10 7 7 7 1 6 7 7 3 3 1 2 7 7 7 7 7 7 3 3 3 3 3 3 1 2 3 3 1 2 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080 1081 1082 1083 1084 1085 1086 1087 1088 1089 1090 1091 1092 1093 1094 1095 1096 1097 1098 1099 1100 1101 1102 1103 1104 1105 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125 1126 1127 1128 1129 1130 1131 1132 1133 1134 1135 1136 1137 1138 1139 1140 1141 1142 1143 1144 1145 1146 1147 1148 1149 1150 1151 1152 1153 1154 1155 1156 1157 1158 1159 1160 1161 1162 1163 1164 1165 1166 1167 1168 1169 1170 1171 1172 1173 1174 1175 1176 1177 1178 1179 1180 1181 1182 1183 1184 1185 1186 1187 1188 1189 1190 1191 1192 1193 1194 1195 1196 1197 1198 1199 1200 1201 1202 1203 1204 1205 1206 1207 1208 1209 1210 1211 1212 1213 1214 1215 1216 1217 1218 1219 1220 1221 1222 1223 1224 1225 1226 1227 1228 1229 1230 1231 1232 1233 1234 1235 1236 1237 1238 1239 1240 1241 1242 1243 1244 1245 1246 1247 1248 1249 1250 1251 1252 1253 1254 1255 1256 1257 1258 1259 1260 1261 1262 1263 1264 1265 1266 1267 1268 1269 1270 1271 1272 1273 1274 1275 1276 1277 1278 1279 1280 1281 1282 1283 1284 1285 1286 1287 1288 1289 1290 1291 1292 1293 1294 1295 1296 1297 1298 1299 1300 1301 1302 1303 1304 1305 1306 1307 1308 1309 1310 1311 1312 1313 1314 1315 1316 1317 1318 1319 1320 1321 1322 1323 1324 1325 1326 1327 1328 1329 1330 1331 1332 1333 1334 1335 1336 1337 1338 1339 1340 1341 1342 1343 1344 1345 1346 1347 1348 1349 1350 1351 1352 1353 1354 1355 1356 1357 1358 1359 1360 1361 1362 1363 1364 1365 1366 1367 1368 1369 1370 1371 1372 1373 1374 1375 1376 1377 1378 1379 1380 1381 1382 1383 1384 1385 1386 1387 1388 1389 1390 1391 1392 1393 1394 1395 1396 1397 1398 1399 1400 1401 1402 1403 1404 1405 1406 1407 1408 1409 1410 1411 1412 1413 1414 1415 1416 1417 1418 1419 1420 1421 1422 1423 1424 1425 1426 1427 1428 1429 1430 1431 1432 1433 1434 1435 1436 1437 1438 1439 1440 1441 1442 1443 1444 1445 1446 1447 1448 1449 1450 1451 1452 1453 1454 1455 1456 1457 1458 1459 1460 1461 1462 1463 1464 1465 1466 1467 1468 1469 1470 1471 1472 1473 1474 1475 1476 1477 1478 1479 1480 1481 1482 1483 1484 1485 1486 1487 1488 1489 1490 1491 1492 1493 1494 1495 1496 1497 1498 1499 1500 1501 1502 1503 1504 1505 1506 1507 1508 1509 1510 1511 1512 1513 1514 1515 1516 1517 1518 1519 1520 1521 1522 1523 1524 1525 1526 1527 1528 1529 1530 1531 1532 1533 1534 1535 1536 1537 1538 1539 1540 1541 1542 1543 1544 1545 1546 1547 1548 1549 1550 1551 1552 1553 1554 1555 1556 1557 1558 1559 1560 1561 1562 1563 1564 1565 1566 1567 1568 1569 1570 1571 1572 1573 1574 1575 1576 1577 1578 1579 1580 1581 1582 1583 1584 1585 1586 1587 1588 1589 1590 1591 1592 1593 1594 1595 1596 1597 1598 1599 1600 1601 1602 1603 1604 1605 1606 1607 1608 1609 1610 1611 1612 1613 1614 1615 1616 1617 1618 1619 1620 1621 1622 1623 1624 1625 1626 1627 1628 1629 1630 1631 1632 1633 1634 1635 1636 1637 1638 1639 1640 1641 1642 1643 1644 1645 1646 1647 1648 1649 1650 1651 1652 1653 1654 1655 1656 1657 1658 1659 1660 1661 1662 1663 1664 1665 1666 1667 1668 1669 1670 1671 1672 1673 1674 1675 1676 1677 1678 1679 1680 1681 1682 1683 1684 1685 1686 1687 1688 1689 1690 1691 1692 1693 1694 1695 1696 1697 1698 1699 1700 1701 1702 1703 1704 1705 1706 1707 1708 1709 1710 1711 1712 1713 1714 1715 1716 1717 1718 1719 1720 1721 1722 1723 1724 1725 1726 1727 1728 1729 1730 1731 1732 1733 1734 1735 1736 1737 1738 1739 1740 1741 1742 | // SPDX-License-Identifier: GPL-2.0-only /* * Framework for buffer objects that can be shared across devices/subsystems. * * Copyright(C) 2011 Linaro Limited. All rights reserved. * Author: Sumit Semwal <sumit.semwal@ti.com> * * Many thanks to linaro-mm-sig list, and specially * Arnd Bergmann <arnd@arndb.de>, Rob Clark <rob@ti.com> and * Daniel Vetter <daniel@ffwll.ch> for their support in creation and * refining of this idea. */ #include <linux/fs.h> #include <linux/slab.h> #include <linux/dma-buf.h> #include <linux/dma-fence.h> #include <linux/dma-fence-unwrap.h> #include <linux/anon_inodes.h> #include <linux/export.h> #include <linux/debugfs.h> #include <linux/list.h> #include <linux/module.h> #include <linux/mutex.h> #include <linux/seq_file.h> #include <linux/sync_file.h> #include <linux/poll.h> #include <linux/dma-resv.h> #include <linux/mm.h> #include <linux/mount.h> #include <linux/pseudo_fs.h> #include <uapi/linux/dma-buf.h> #include <uapi/linux/magic.h> #include "dma-buf-sysfs-stats.h" static inline int is_dma_buf_file(struct file *); static DEFINE_MUTEX(dmabuf_list_mutex); static LIST_HEAD(dmabuf_list); static void __dma_buf_list_add(struct dma_buf *dmabuf) { mutex_lock(&dmabuf_list_mutex); list_add(&dmabuf->list_node, &dmabuf_list); mutex_unlock(&dmabuf_list_mutex); } static void __dma_buf_list_del(struct dma_buf *dmabuf) { if (!dmabuf) return; mutex_lock(&dmabuf_list_mutex); list_del(&dmabuf->list_node); mutex_unlock(&dmabuf_list_mutex); } /** * dma_buf_iter_begin - begin iteration through global list of all DMA buffers * * Returns the first buffer in the global list of DMA-bufs that's not in the * process of being destroyed. Increments that buffer's reference count to * prevent buffer destruction. Callers must release the reference, either by * continuing iteration with dma_buf_iter_next(), or with dma_buf_put(). * * Return: * * First buffer from global list, with refcount elevated * * NULL if no active buffers are present */ struct dma_buf *dma_buf_iter_begin(void) { struct dma_buf *ret = NULL, *dmabuf; /* * The list mutex does not protect a dmabuf's refcount, so it can be * zeroed while we are iterating. We cannot call get_dma_buf() since the * caller may not already own a reference to the buffer. */ mutex_lock(&dmabuf_list_mutex); list_for_each_entry(dmabuf, &dmabuf_list, list_node) { if (file_ref_get(&dmabuf->file->f_ref)) { ret = dmabuf; break; } } mutex_unlock(&dmabuf_list_mutex); return ret; } /** * dma_buf_iter_next - continue iteration through global list of all DMA buffers * @dmabuf: [in] pointer to dma_buf * * Decrements the reference count on the provided buffer. Returns the next * buffer from the remainder of the global list of DMA-bufs with its reference * count incremented. Callers must release the reference, either by continuing * iteration with dma_buf_iter_next(), or with dma_buf_put(). * * Return: * * Next buffer from global list, with refcount elevated * * NULL if no additional active buffers are present */ struct dma_buf *dma_buf_iter_next(struct dma_buf *dmabuf) { struct dma_buf *ret = NULL; /* * The list mutex does not protect a dmabuf's refcount, so it can be * zeroed while we are iterating. We cannot call get_dma_buf() since the * caller may not already own a reference to the buffer. */ mutex_lock(&dmabuf_list_mutex); dma_buf_put(dmabuf); list_for_each_entry_continue(dmabuf, &dmabuf_list, list_node) { if (file_ref_get(&dmabuf->file->f_ref)) { ret = dmabuf; break; } } mutex_unlock(&dmabuf_list_mutex); return ret; } static char *dmabuffs_dname(struct dentry *dentry, char *buffer, int buflen) { struct dma_buf *dmabuf; char name[DMA_BUF_NAME_LEN]; ssize_t ret = 0; dmabuf = dentry->d_fsdata; spin_lock(&dmabuf->name_lock); if (dmabuf->name) ret = strscpy(name, dmabuf->name, sizeof(name)); spin_unlock(&dmabuf->name_lock); return dynamic_dname(buffer, buflen, "/%s:%s", dentry->d_name.name, ret > 0 ? name : ""); } static void dma_buf_release(struct dentry *dentry) { struct dma_buf *dmabuf; dmabuf = dentry->d_fsdata; if (unlikely(!dmabuf)) return; BUG_ON(dmabuf->vmapping_counter); /* * If you hit this BUG() it could mean: * * There's a file reference imbalance in dma_buf_poll / dma_buf_poll_cb or somewhere else * * dmabuf->cb_in/out.active are non-0 despite no pending fence callback */ BUG_ON(dmabuf->cb_in.active || dmabuf->cb_out.active); dma_buf_stats_teardown(dmabuf); dmabuf->ops->release(dmabuf); if (dmabuf->resv == (struct dma_resv *)&dmabuf[1]) dma_resv_fini(dmabuf->resv); WARN_ON(!list_empty(&dmabuf->attachments)); module_put(dmabuf->owner); kfree(dmabuf->name); kfree(dmabuf); } static int dma_buf_file_release(struct inode *inode, struct file *file) { if (!is_dma_buf_file(file)) return -EINVAL; __dma_buf_list_del(file->private_data); return 0; } static const struct dentry_operations dma_buf_dentry_ops = { .d_dname = dmabuffs_dname, .d_release = dma_buf_release, }; static struct vfsmount *dma_buf_mnt; static int dma_buf_fs_init_context(struct fs_context *fc) { struct pseudo_fs_context *ctx; ctx = init_pseudo(fc, DMA_BUF_MAGIC); if (!ctx) return -ENOMEM; ctx->dops = &dma_buf_dentry_ops; return 0; } static struct file_system_type dma_buf_fs_type = { .name = "dmabuf", .init_fs_context = dma_buf_fs_init_context, .kill_sb = kill_anon_super, }; static int dma_buf_mmap_internal(struct file *file, struct vm_area_struct *vma) { struct dma_buf *dmabuf; if (!is_dma_buf_file(file)) return -EINVAL; dmabuf = file->private_data; /* check if buffer supports mmap */ if (!dmabuf->ops->mmap) return -EINVAL; /* check for overflowing the buffer's size */ if (vma->vm_pgoff + vma_pages(vma) > dmabuf->size >> PAGE_SHIFT) return -EINVAL; return dmabuf->ops->mmap(dmabuf, vma); } static loff_t dma_buf_llseek(struct file *file, loff_t offset, int whence) { struct dma_buf *dmabuf; loff_t base; if (!is_dma_buf_file(file)) return -EBADF; dmabuf = file->private_data; /* only support discovering the end of the buffer, * but also allow SEEK_SET to maintain the idiomatic * SEEK_END(0), SEEK_CUR(0) pattern. */ if (whence == SEEK_END) base = dmabuf->size; else if (whence == SEEK_SET) base = 0; else return -EINVAL; if (offset != 0) return -EINVAL; return base + offset; } /** * DOC: implicit fence polling * * To support cross-device and cross-driver synchronization of buffer access * implicit fences (represented internally in the kernel with &struct dma_fence) * can be attached to a &dma_buf. The glue for that and a few related things are * provided in the &dma_resv structure. * * Userspace can query the state of these implicitly tracked fences using poll() * and related system calls: * * - Checking for EPOLLIN, i.e. read access, can be use to query the state of the * most recent write or exclusive fence. * * - Checking for EPOLLOUT, i.e. write access, can be used to query the state of * all attached fences, shared and exclusive ones. * * Note that this only signals the completion of the respective fences, i.e. the * DMA transfers are complete. Cache flushing and any other necessary * preparations before CPU access can begin still need to happen. * * As an alternative to poll(), the set of fences on DMA buffer can be * exported as a &sync_file using &dma_buf_sync_file_export. */ static void dma_buf_poll_cb(struct dma_fence *fence, struct dma_fence_cb *cb) { struct dma_buf_poll_cb_t *dcb = (struct dma_buf_poll_cb_t *)cb; struct dma_buf *dmabuf = container_of(dcb->poll, struct dma_buf, poll); unsigned long flags; spin_lock_irqsave(&dcb->poll->lock, flags); wake_up_locked_poll(dcb->poll, dcb->active); dcb->active = 0; spin_unlock_irqrestore(&dcb->poll->lock, flags); dma_fence_put(fence); /* Paired with get_file in dma_buf_poll */ fput(dmabuf->file); } static bool dma_buf_poll_add_cb(struct dma_resv *resv, bool write, struct dma_buf_poll_cb_t *dcb) { struct dma_resv_iter cursor; struct dma_fence *fence; int r; dma_resv_for_each_fence(&cursor, resv, dma_resv_usage_rw(write), fence) { dma_fence_get(fence); r = dma_fence_add_callback(fence, &dcb->cb, dma_buf_poll_cb); if (!r) return true; dma_fence_put(fence); } return false; } static __poll_t dma_buf_poll(struct file *file, poll_table *poll) { struct dma_buf *dmabuf; struct dma_resv *resv; __poll_t events; dmabuf = file->private_data; if (!dmabuf || !dmabuf->resv) return EPOLLERR; resv = dmabuf->resv; poll_wait(file, &dmabuf->poll, poll); events = poll_requested_events(poll) & (EPOLLIN | EPOLLOUT); if (!events) return 0; dma_resv_lock(resv, NULL); if (events & EPOLLOUT) { struct dma_buf_poll_cb_t *dcb = &dmabuf->cb_out; /* Check that callback isn't busy */ spin_lock_irq(&dmabuf->poll.lock); if (dcb->active) events &= ~EPOLLOUT; else dcb->active = EPOLLOUT; spin_unlock_irq(&dmabuf->poll.lock); if (events & EPOLLOUT) { /* Paired with fput in dma_buf_poll_cb */ get_file(dmabuf->file); if (!dma_buf_poll_add_cb(resv, true, dcb)) /* No callback queued, wake up any other waiters */ dma_buf_poll_cb(NULL, &dcb->cb); else events &= ~EPOLLOUT; } } if (events & EPOLLIN) { struct dma_buf_poll_cb_t *dcb = &dmabuf->cb_in; /* Check that callback isn't busy */ spin_lock_irq(&dmabuf->poll.lock); if (dcb->active) events &= ~EPOLLIN; else dcb->active = EPOLLIN; spin_unlock_irq(&dmabuf->poll.lock); if (events & EPOLLIN) { /* Paired with fput in dma_buf_poll_cb */ get_file(dmabuf->file); if (!dma_buf_poll_add_cb(resv, false, dcb)) /* No callback queued, wake up any other waiters */ dma_buf_poll_cb(NULL, &dcb->cb); else events &= ~EPOLLIN; } } dma_resv_unlock(resv); return events; } /** * dma_buf_set_name - Set a name to a specific dma_buf to track the usage. * It could support changing the name of the dma-buf if the same * piece of memory is used for multiple purpose between different devices. * * @dmabuf: [in] dmabuf buffer that will be renamed. * @buf: [in] A piece of userspace memory that contains the name of * the dma-buf. * * Returns 0 on success. If the dma-buf buffer is already attached to * devices, return -EBUSY. * */ static long dma_buf_set_name(struct dma_buf *dmabuf, const char __user *buf) { char *name = strndup_user(buf, DMA_BUF_NAME_LEN); if (IS_ERR(name)) return PTR_ERR(name); spin_lock(&dmabuf->name_lock); kfree(dmabuf->name); dmabuf->name = name; spin_unlock(&dmabuf->name_lock); return 0; } #if IS_ENABLED(CONFIG_SYNC_FILE) static long dma_buf_export_sync_file(struct dma_buf *dmabuf, void __user *user_data) { struct dma_buf_export_sync_file arg; enum dma_resv_usage usage; struct dma_fence *fence = NULL; struct sync_file *sync_file; int fd, ret; if (copy_from_user(&arg, user_data, sizeof(arg))) return -EFAULT; if (arg.flags & ~DMA_BUF_SYNC_RW) return -EINVAL; if ((arg.flags & DMA_BUF_SYNC_RW) == 0) return -EINVAL; fd = get_unused_fd_flags(O_CLOEXEC); if (fd < 0) return fd; usage = dma_resv_usage_rw(arg.flags & DMA_BUF_SYNC_WRITE); ret = dma_resv_get_singleton(dmabuf->resv, usage, &fence); if (ret) goto err_put_fd; if (!fence) fence = dma_fence_get_stub(); sync_file = sync_file_create(fence); dma_fence_put(fence); if (!sync_file) { ret = -ENOMEM; goto err_put_fd; } arg.fd = fd; if (copy_to_user(user_data, &arg, sizeof(arg))) { ret = -EFAULT; goto err_put_file; } fd_install(fd, sync_file->file); return 0; err_put_file: fput(sync_file->file); err_put_fd: put_unused_fd(fd); return ret; } static long dma_buf_import_sync_file(struct dma_buf *dmabuf, const void __user *user_data) { struct dma_buf_import_sync_file arg; struct dma_fence *fence, *f; enum dma_resv_usage usage; struct dma_fence_unwrap iter; unsigned int num_fences; int ret = 0; if (copy_from_user(&arg, user_data, sizeof(arg))) return -EFAULT; if (arg.flags & ~DMA_BUF_SYNC_RW) return -EINVAL; if ((arg.flags & DMA_BUF_SYNC_RW) == 0) return -EINVAL; fence = sync_file_get_fence(arg.fd); if (!fence) return -EINVAL; usage = (arg.flags & DMA_BUF_SYNC_WRITE) ? DMA_RESV_USAGE_WRITE : DMA_RESV_USAGE_READ; num_fences = 0; dma_fence_unwrap_for_each(f, &iter, fence) ++num_fences; if (num_fences > 0) { dma_resv_lock(dmabuf->resv, NULL); ret = dma_resv_reserve_fences(dmabuf->resv, num_fences); if (!ret) { dma_fence_unwrap_for_each(f, &iter, fence) dma_resv_add_fence(dmabuf->resv, f, usage); } dma_resv_unlock(dmabuf->resv); } dma_fence_put(fence); return ret; } #endif static long dma_buf_ioctl(struct file *file, unsigned int cmd, unsigned long arg) { struct dma_buf *dmabuf; struct dma_buf_sync sync; enum dma_data_direction direction; int ret; dmabuf = file->private_data; switch (cmd) { case DMA_BUF_IOCTL_SYNC: if (copy_from_user(&sync, (void __user *) arg, sizeof(sync))) return -EFAULT; if (sync.flags & ~DMA_BUF_SYNC_VALID_FLAGS_MASK) return -EINVAL; switch (sync.flags & DMA_BUF_SYNC_RW) { case DMA_BUF_SYNC_READ: direction = DMA_FROM_DEVICE; break; case DMA_BUF_SYNC_WRITE: direction = DMA_TO_DEVICE; break; case DMA_BUF_SYNC_RW: direction = DMA_BIDIRECTIONAL; break; default: return -EINVAL; } if (sync.flags & DMA_BUF_SYNC_END) ret = dma_buf_end_cpu_access(dmabuf, direction); else ret = dma_buf_begin_cpu_access(dmabuf, direction); return ret; case DMA_BUF_SET_NAME_A: case DMA_BUF_SET_NAME_B: return dma_buf_set_name(dmabuf, (const char __user *)arg); #if IS_ENABLED(CONFIG_SYNC_FILE) case DMA_BUF_IOCTL_EXPORT_SYNC_FILE: return dma_buf_export_sync_file(dmabuf, (void __user *)arg); case DMA_BUF_IOCTL_IMPORT_SYNC_FILE: return dma_buf_import_sync_file(dmabuf, (const void __user *)arg); #endif default: return -ENOTTY; } } static void dma_buf_show_fdinfo(struct seq_file *m, struct file *file) { struct dma_buf *dmabuf = file->private_data; seq_printf(m, "size:\t%zu\n", dmabuf->size); /* Don't count the temporary reference taken inside procfs seq_show */ seq_printf(m, "count:\t%ld\n", file_count(dmabuf->file) - 1); seq_printf(m, "exp_name:\t%s\n", dmabuf->exp_name); spin_lock(&dmabuf->name_lock); if (dmabuf->name) seq_printf(m, "name:\t%s\n", dmabuf->name); spin_unlock(&dmabuf->name_lock); } static const struct file_operations dma_buf_fops = { .release = dma_buf_file_release, .mmap = dma_buf_mmap_internal, .llseek = dma_buf_llseek, .poll = dma_buf_poll, .unlocked_ioctl = dma_buf_ioctl, .compat_ioctl = compat_ptr_ioctl, .show_fdinfo = dma_buf_show_fdinfo, }; /* * is_dma_buf_file - Check if struct file* is associated with dma_buf */ static inline int is_dma_buf_file(struct file *file) { return file->f_op == &dma_buf_fops; } static struct file *dma_buf_getfile(size_t size, int flags) { static atomic64_t dmabuf_inode = ATOMIC64_INIT(0); struct inode *inode = alloc_anon_inode(dma_buf_mnt->mnt_sb); struct file *file; if (IS_ERR(inode)) return ERR_CAST(inode); inode->i_size = size; inode_set_bytes(inode, size); /* * The ->i_ino acquired from get_next_ino() is not unique thus * not suitable for using it as dentry name by dmabuf stats. * Override ->i_ino with the unique and dmabuffs specific * value. */ inode->i_ino = atomic64_inc_return(&dmabuf_inode); flags &= O_ACCMODE | O_NONBLOCK; file = alloc_file_pseudo(inode, dma_buf_mnt, "dmabuf", flags, &dma_buf_fops); if (IS_ERR(file)) goto err_alloc_file; return file; err_alloc_file: iput(inode); return file; } /** * DOC: dma buf device access * * For device DMA access to a shared DMA buffer the usual sequence of operations * is fairly simple: * * 1. The exporter defines his exporter instance using * DEFINE_DMA_BUF_EXPORT_INFO() and calls dma_buf_export() to wrap a private * buffer object into a &dma_buf. It then exports that &dma_buf to userspace * as a file descriptor by calling dma_buf_fd(). * * 2. Userspace passes this file-descriptors to all drivers it wants this buffer * to share with: First the file descriptor is converted to a &dma_buf using * dma_buf_get(). Then the buffer is attached to the device using * dma_buf_attach(). * * Up to this stage the exporter is still free to migrate or reallocate the * backing storage. * * 3. Once the buffer is attached to all devices userspace can initiate DMA * access to the shared buffer. In the kernel this is done by calling * dma_buf_map_attachment() and dma_buf_unmap_attachment(). * * 4. Once a driver is done with a shared buffer it needs to call * dma_buf_detach() (after cleaning up any mappings) and then release the * reference acquired with dma_buf_get() by calling dma_buf_put(). * * For the detailed semantics exporters are expected to implement see * &dma_buf_ops. */ /** * dma_buf_export - Creates a new dma_buf, and associates an anon file * with this buffer, so it can be exported. * Also connect the allocator specific data and ops to the buffer. * Additionally, provide a name string for exporter; useful in debugging. * * @exp_info: [in] holds all the export related information provided * by the exporter. see &struct dma_buf_export_info * for further details. * * Returns, on success, a newly created struct dma_buf object, which wraps the * supplied private data and operations for struct dma_buf_ops. On either * missing ops, or error in allocating struct dma_buf, will return negative * error. * * For most cases the easiest way to create @exp_info is through the * %DEFINE_DMA_BUF_EXPORT_INFO macro. */ struct dma_buf *dma_buf_export(const struct dma_buf_export_info *exp_info) { struct dma_buf *dmabuf; struct dma_resv *resv = exp_info->resv; struct file *file; size_t alloc_size = sizeof(struct dma_buf); int ret; if (WARN_ON(!exp_info->priv || !exp_info->ops || !exp_info->ops->map_dma_buf || !exp_info->ops->unmap_dma_buf || !exp_info->ops->release)) return ERR_PTR(-EINVAL); if (WARN_ON(!exp_info->ops->pin != !exp_info->ops->unpin)) return ERR_PTR(-EINVAL); if (!try_module_get(exp_info->owner)) return ERR_PTR(-ENOENT); file = dma_buf_getfile(exp_info->size, exp_info->flags); if (IS_ERR(file)) { ret = PTR_ERR(file); goto err_module; } if (!exp_info->resv) alloc_size += sizeof(struct dma_resv); else /* prevent &dma_buf[1] == dma_buf->resv */ alloc_size += 1; dmabuf = kzalloc(alloc_size, GFP_KERNEL); if (!dmabuf) { ret = -ENOMEM; goto err_file; } dmabuf->priv = exp_info->priv; dmabuf->ops = exp_info->ops; dmabuf->size = exp_info->size; dmabuf->exp_name = exp_info->exp_name; dmabuf->owner = exp_info->owner; spin_lock_init(&dmabuf->name_lock); init_waitqueue_head(&dmabuf->poll); dmabuf->cb_in.poll = dmabuf->cb_out.poll = &dmabuf->poll; dmabuf->cb_in.active = dmabuf->cb_out.active = 0; INIT_LIST_HEAD(&dmabuf->attachments); if (!resv) { dmabuf->resv = (struct dma_resv *)&dmabuf[1]; dma_resv_init(dmabuf->resv); } else { dmabuf->resv = resv; } ret = dma_buf_stats_setup(dmabuf, file); if (ret) goto err_dmabuf; file->private_data = dmabuf; file->f_path.dentry->d_fsdata = dmabuf; dmabuf->file = file; __dma_buf_list_add(dmabuf); return dmabuf; err_dmabuf: if (!resv) dma_resv_fini(dmabuf->resv); kfree(dmabuf); err_file: fput(file); err_module: module_put(exp_info->owner); return ERR_PTR(ret); } EXPORT_SYMBOL_NS_GPL(dma_buf_export, "DMA_BUF"); /** * dma_buf_fd - returns a file descriptor for the given struct dma_buf * @dmabuf: [in] pointer to dma_buf for which fd is required. * @flags: [in] flags to give to fd * * On success, returns an associated 'fd'. Else, returns error. */ int dma_buf_fd(struct dma_buf *dmabuf, int flags) { int fd; if (!dmabuf || !dmabuf->file) return -EINVAL; fd = get_unused_fd_flags(flags); if (fd < 0) return fd; fd_install(fd, dmabuf->file); return fd; } EXPORT_SYMBOL_NS_GPL(dma_buf_fd, "DMA_BUF"); /** * dma_buf_get - returns the struct dma_buf related to an fd * @fd: [in] fd associated with the struct dma_buf to be returned * * On success, returns the struct dma_buf associated with an fd; uses * file's refcounting done by fget to increase refcount. returns ERR_PTR * otherwise. */ struct dma_buf *dma_buf_get(int fd) { struct file *file; file = fget(fd); if (!file) return ERR_PTR(-EBADF); if (!is_dma_buf_file(file)) { fput(file); return ERR_PTR(-EINVAL); } return file->private_data; } EXPORT_SYMBOL_NS_GPL(dma_buf_get, "DMA_BUF"); /** * dma_buf_put - decreases refcount of the buffer * @dmabuf: [in] buffer to reduce refcount of * * Uses file's refcounting done implicitly by fput(). * * If, as a result of this call, the refcount becomes 0, the 'release' file * operation related to this fd is called. It calls &dma_buf_ops.release vfunc * in turn, and frees the memory allocated for dmabuf when exported. */ void dma_buf_put(struct dma_buf *dmabuf) { if (WARN_ON(!dmabuf || !dmabuf->file)) return; fput(dmabuf->file); } EXPORT_SYMBOL_NS_GPL(dma_buf_put, "DMA_BUF"); static void mangle_sg_table(struct sg_table *sg_table) { #ifdef CONFIG_DMABUF_DEBUG int i; struct scatterlist *sg; /* To catch abuse of the underlying struct page by importers mix * up the bits, but take care to preserve the low SG_ bits to * not corrupt the sgt. The mixing is undone on unmap * before passing the sgt back to the exporter. */ for_each_sgtable_sg(sg_table, sg, i) sg->page_link ^= ~0xffUL; #endif } static inline bool dma_buf_attachment_is_dynamic(struct dma_buf_attachment *attach) { return !!attach->importer_ops; } static bool dma_buf_pin_on_map(struct dma_buf_attachment *attach) { return attach->dmabuf->ops->pin && (!dma_buf_attachment_is_dynamic(attach) || !IS_ENABLED(CONFIG_DMABUF_MOVE_NOTIFY)); } /** * DOC: locking convention * * In order to avoid deadlock situations between dma-buf exports and importers, * all dma-buf API users must follow the common dma-buf locking convention. * * Convention for importers * * 1. Importers must hold the dma-buf reservation lock when calling these * functions: * * - dma_buf_pin() * - dma_buf_unpin() * - dma_buf_map_attachment() * - dma_buf_unmap_attachment() * - dma_buf_vmap() * - dma_buf_vunmap() * * 2. Importers must not hold the dma-buf reservation lock when calling these * functions: * * - dma_buf_attach() * - dma_buf_dynamic_attach() * - dma_buf_detach() * - dma_buf_export() * - dma_buf_fd() * - dma_buf_get() * - dma_buf_put() * - dma_buf_mmap() * - dma_buf_begin_cpu_access() * - dma_buf_end_cpu_access() * - dma_buf_map_attachment_unlocked() * - dma_buf_unmap_attachment_unlocked() * - dma_buf_vmap_unlocked() * - dma_buf_vunmap_unlocked() * * Convention for exporters * * 1. These &dma_buf_ops callbacks are invoked with unlocked dma-buf * reservation and exporter can take the lock: * * - &dma_buf_ops.attach() * - &dma_buf_ops.detach() * - &dma_buf_ops.release() * - &dma_buf_ops.begin_cpu_access() * - &dma_buf_ops.end_cpu_access() * - &dma_buf_ops.mmap() * * 2. These &dma_buf_ops callbacks are invoked with locked dma-buf * reservation and exporter can't take the lock: * * - &dma_buf_ops.pin() * - &dma_buf_ops.unpin() * - &dma_buf_ops.map_dma_buf() * - &dma_buf_ops.unmap_dma_buf() * - &dma_buf_ops.vmap() * - &dma_buf_ops.vunmap() * * 3. Exporters must hold the dma-buf reservation lock when calling these * functions: * * - dma_buf_move_notify() */ /** * dma_buf_dynamic_attach - Add the device to dma_buf's attachments list * @dmabuf: [in] buffer to attach device to. * @dev: [in] device to be attached. * @importer_ops: [in] importer operations for the attachment * @importer_priv: [in] importer private pointer for the attachment * * Returns struct dma_buf_attachment pointer for this attachment. Attachments * must be cleaned up by calling dma_buf_detach(). * * Optionally this calls &dma_buf_ops.attach to allow device-specific attach * functionality. * * Returns: * * A pointer to newly created &dma_buf_attachment on success, or a negative * error code wrapped into a pointer on failure. * * Note that this can fail if the backing storage of @dmabuf is in a place not * accessible to @dev, and cannot be moved to a more suitable place. This is * indicated with the error code -EBUSY. */ struct dma_buf_attachment * dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev, const struct dma_buf_attach_ops *importer_ops, void *importer_priv) { struct dma_buf_attachment *attach; int ret; if (WARN_ON(!dmabuf || !dev)) return ERR_PTR(-EINVAL); if (WARN_ON(importer_ops && !importer_ops->move_notify)) return ERR_PTR(-EINVAL); attach = kzalloc(sizeof(*attach), GFP_KERNEL); if (!attach) return ERR_PTR(-ENOMEM); attach->dev = dev; attach->dmabuf = dmabuf; if (importer_ops) attach->peer2peer = importer_ops->allow_peer2peer; attach->importer_ops = importer_ops; attach->importer_priv = importer_priv; if (dmabuf->ops->attach) { ret = dmabuf->ops->attach(dmabuf, attach); if (ret) goto err_attach; } dma_resv_lock(dmabuf->resv, NULL); list_add(&attach->node, &dmabuf->attachments); dma_resv_unlock(dmabuf->resv); return attach; err_attach: kfree(attach); return ERR_PTR(ret); } EXPORT_SYMBOL_NS_GPL(dma_buf_dynamic_attach, "DMA_BUF"); /** * dma_buf_attach - Wrapper for dma_buf_dynamic_attach * @dmabuf: [in] buffer to attach device to. * @dev: [in] device to be attached. * * Wrapper to call dma_buf_dynamic_attach() for drivers which still use a static * mapping. */ struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf, struct device *dev) { return dma_buf_dynamic_attach(dmabuf, dev, NULL, NULL); } EXPORT_SYMBOL_NS_GPL(dma_buf_attach, "DMA_BUF"); /** * dma_buf_detach - Remove the given attachment from dmabuf's attachments list * @dmabuf: [in] buffer to detach from. * @attach: [in] attachment to be detached; is free'd after this call. * * Clean up a device attachment obtained by calling dma_buf_attach(). * * Optionally this calls &dma_buf_ops.detach for device-specific detach. */ void dma_buf_detach(struct dma_buf *dmabuf, struct dma_buf_attachment *attach) { if (WARN_ON(!dmabuf || !attach || dmabuf != attach->dmabuf)) return; dma_resv_lock(dmabuf->resv, NULL); list_del(&attach->node); dma_resv_unlock(dmabuf->resv); if (dmabuf->ops->detach) dmabuf->ops->detach(dmabuf, attach); kfree(attach); } EXPORT_SYMBOL_NS_GPL(dma_buf_detach, "DMA_BUF"); /** * dma_buf_pin - Lock down the DMA-buf * @attach: [in] attachment which should be pinned * * Only dynamic importers (who set up @attach with dma_buf_dynamic_attach()) may * call this, and only for limited use cases like scanout and not for temporary * pin operations. It is not permitted to allow userspace to pin arbitrary * amounts of buffers through this interface. * * Buffers must be unpinned by calling dma_buf_unpin(). * * Returns: * 0 on success, negative error code on failure. */ int dma_buf_pin(struct dma_buf_attachment *attach) { struct dma_buf *dmabuf = attach->dmabuf; int ret = 0; WARN_ON(!attach->importer_ops); dma_resv_assert_held(dmabuf->resv); if (dmabuf->ops->pin) ret = dmabuf->ops->pin(attach); return ret; } EXPORT_SYMBOL_NS_GPL(dma_buf_pin, "DMA_BUF"); /** * dma_buf_unpin - Unpin a DMA-buf * @attach: [in] attachment which should be unpinned * * This unpins a buffer pinned by dma_buf_pin() and allows the exporter to move * any mapping of @attach again and inform the importer through * &dma_buf_attach_ops.move_notify. */ void dma_buf_unpin(struct dma_buf_attachment *attach) { struct dma_buf *dmabuf = attach->dmabuf; WARN_ON(!attach->importer_ops); dma_resv_assert_held(dmabuf->resv); if (dmabuf->ops->unpin) dmabuf->ops->unpin(attach); } EXPORT_SYMBOL_NS_GPL(dma_buf_unpin, "DMA_BUF"); /** * dma_buf_map_attachment - Returns the scatterlist table of the attachment; * mapped into _device_ address space. Is a wrapper for map_dma_buf() of the * dma_buf_ops. * @attach: [in] attachment whose scatterlist is to be returned * @direction: [in] direction of DMA transfer * * Returns sg_table containing the scatterlist to be returned; returns ERR_PTR * on error. May return -EINTR if it is interrupted by a signal. * * On success, the DMA addresses and lengths in the returned scatterlist are * PAGE_SIZE aligned. * * A mapping must be unmapped by using dma_buf_unmap_attachment(). Note that * the underlying backing storage is pinned for as long as a mapping exists, * therefore users/importers should not hold onto a mapping for undue amounts of * time. * * Important: Dynamic importers must wait for the exclusive fence of the struct * dma_resv attached to the DMA-BUF first. */ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *attach, enum dma_data_direction direction) { struct sg_table *sg_table; signed long ret; might_sleep(); if (WARN_ON(!attach || !attach->dmabuf)) return ERR_PTR(-EINVAL); dma_resv_assert_held(attach->dmabuf->resv); if (dma_buf_pin_on_map(attach)) { ret = attach->dmabuf->ops->pin(attach); /* * Catch exporters making buffers inaccessible even when * attachments preventing that exist. */ WARN_ON_ONCE(ret == -EBUSY); if (ret) return ERR_PTR(ret); } sg_table = attach->dmabuf->ops->map_dma_buf(attach, direction); if (!sg_table) sg_table = ERR_PTR(-ENOMEM); if (IS_ERR(sg_table)) goto error_unpin; /* * Importers with static attachments don't wait for fences. */ if (!dma_buf_attachment_is_dynamic(attach)) { ret = dma_resv_wait_timeout(attach->dmabuf->resv, DMA_RESV_USAGE_KERNEL, true, MAX_SCHEDULE_TIMEOUT); if (ret < 0) goto error_unmap; } mangle_sg_table(sg_table); #ifdef CONFIG_DMA_API_DEBUG { struct scatterlist *sg; u64 addr; int len; int i; for_each_sgtable_dma_sg(sg_table, sg, i) { addr = sg_dma_address(sg); len = sg_dma_len(sg); if (!PAGE_ALIGNED(addr) || !PAGE_ALIGNED(len)) { pr_debug("%s: addr %llx or len %x is not page aligned!\n", __func__, addr, len); } } } #endif /* CONFIG_DMA_API_DEBUG */ return sg_table; error_unmap: attach->dmabuf->ops->unmap_dma_buf(attach, sg_table, direction); sg_table = ERR_PTR(ret); error_unpin: if (dma_buf_pin_on_map(attach)) attach->dmabuf->ops->unpin(attach); return sg_table; } EXPORT_SYMBOL_NS_GPL(dma_buf_map_attachment, "DMA_BUF"); /** * dma_buf_map_attachment_unlocked - Returns the scatterlist table of the attachment; * mapped into _device_ address space. Is a wrapper for map_dma_buf() of the * dma_buf_ops. * @attach: [in] attachment whose scatterlist is to be returned * @direction: [in] direction of DMA transfer * * Unlocked variant of dma_buf_map_attachment(). */ struct sg_table * dma_buf_map_attachment_unlocked(struct dma_buf_attachment *attach, enum dma_data_direction direction) { struct sg_table *sg_table; might_sleep(); if (WARN_ON(!attach || !attach->dmabuf)) return ERR_PTR(-EINVAL); dma_resv_lock(attach->dmabuf->resv, NULL); sg_table = dma_buf_map_attachment(attach, direction); dma_resv_unlock(attach->dmabuf->resv); return sg_table; } EXPORT_SYMBOL_NS_GPL(dma_buf_map_attachment_unlocked, "DMA_BUF"); /** * dma_buf_unmap_attachment - unmaps and decreases usecount of the buffer;might * deallocate the scatterlist associated. Is a wrapper for unmap_dma_buf() of * dma_buf_ops. * @attach: [in] attachment to unmap buffer from * @sg_table: [in] scatterlist info of the buffer to unmap * @direction: [in] direction of DMA transfer * * This unmaps a DMA mapping for @attached obtained by dma_buf_map_attachment(). */ void dma_buf_unmap_attachment(struct dma_buf_attachment *attach, struct sg_table *sg_table, enum dma_data_direction direction) { might_sleep(); if (WARN_ON(!attach || !attach->dmabuf || !sg_table)) return; dma_resv_assert_held(attach->dmabuf->resv); mangle_sg_table(sg_table); attach->dmabuf->ops->unmap_dma_buf(attach, sg_table, direction); if (dma_buf_pin_on_map(attach)) attach->dmabuf->ops->unpin(attach); } EXPORT_SYMBOL_NS_GPL(dma_buf_unmap_attachment, "DMA_BUF"); /** * dma_buf_unmap_attachment_unlocked - unmaps and decreases usecount of the buffer;might * deallocate the scatterlist associated. Is a wrapper for unmap_dma_buf() of * dma_buf_ops. * @attach: [in] attachment to unmap buffer from * @sg_table: [in] scatterlist info of the buffer to unmap * @direction: [in] direction of DMA transfer * * Unlocked variant of dma_buf_unmap_attachment(). */ void dma_buf_unmap_attachment_unlocked(struct dma_buf_attachment *attach, struct sg_table *sg_table, enum dma_data_direction direction) { might_sleep(); if (WARN_ON(!attach || !attach->dmabuf || !sg_table)) return; dma_resv_lock(attach->dmabuf->resv, NULL); dma_buf_unmap_attachment(attach, sg_table, direction); dma_resv_unlock(attach->dmabuf->resv); } EXPORT_SYMBOL_NS_GPL(dma_buf_unmap_attachment_unlocked, "DMA_BUF"); /** * dma_buf_move_notify - notify attachments that DMA-buf is moving * * @dmabuf: [in] buffer which is moving * * Informs all attachments that they need to destroy and recreate all their * mappings. */ void dma_buf_move_notify(struct dma_buf *dmabuf) { struct dma_buf_attachment *attach; dma_resv_assert_held(dmabuf->resv); list_for_each_entry(attach, &dmabuf->attachments, node) if (attach->importer_ops) attach->importer_ops->move_notify(attach); } EXPORT_SYMBOL_NS_GPL(dma_buf_move_notify, "DMA_BUF"); /** * DOC: cpu access * * There are multiple reasons for supporting CPU access to a dma buffer object: * * - Fallback operations in the kernel, for example when a device is connected * over USB and the kernel needs to shuffle the data around first before * sending it away. Cache coherency is handled by bracketing any transactions * with calls to dma_buf_begin_cpu_access() and dma_buf_end_cpu_access() * access. * * Since for most kernel internal dma-buf accesses need the entire buffer, a * vmap interface is introduced. Note that on very old 32-bit architectures * vmalloc space might be limited and result in vmap calls failing. * * Interfaces: * * .. code-block:: c * * void *dma_buf_vmap(struct dma_buf *dmabuf, struct iosys_map *map) * void dma_buf_vunmap(struct dma_buf *dmabuf, struct iosys_map *map) * * The vmap call can fail if there is no vmap support in the exporter, or if * it runs out of vmalloc space. Note that the dma-buf layer keeps a reference * count for all vmap access and calls down into the exporter's vmap function * only when no vmapping exists, and only unmaps it once. Protection against * concurrent vmap/vunmap calls is provided by taking the &dma_buf.lock mutex. * * - For full compatibility on the importer side with existing userspace * interfaces, which might already support mmap'ing buffers. This is needed in * many processing pipelines (e.g. feeding a software rendered image into a * hardware pipeline, thumbnail creation, snapshots, ...). Also, Android's ION * framework already supported this and for DMA buffer file descriptors to * replace ION buffers mmap support was needed. * * There is no special interfaces, userspace simply calls mmap on the dma-buf * fd. But like for CPU access there's a need to bracket the actual access, * which is handled by the ioctl (DMA_BUF_IOCTL_SYNC). Note that * DMA_BUF_IOCTL_SYNC can fail with -EAGAIN or -EINTR, in which case it must * be restarted. * * Some systems might need some sort of cache coherency management e.g. when * CPU and GPU domains are being accessed through dma-buf at the same time. * To circumvent this problem there are begin/end coherency markers, that * forward directly to existing dma-buf device drivers vfunc hooks. Userspace * can make use of those markers through the DMA_BUF_IOCTL_SYNC ioctl. The * sequence would be used like following: * * - mmap dma-buf fd * - for each drawing/upload cycle in CPU 1. SYNC_START ioctl, 2. read/write * to mmap area 3. SYNC_END ioctl. This can be repeated as often as you * want (with the new data being consumed by say the GPU or the scanout * device) * - munmap once you don't need the buffer any more * * For correctness and optimal performance, it is always required to use * SYNC_START and SYNC_END before and after, respectively, when accessing the * mapped address. Userspace cannot rely on coherent access, even when there * are systems where it just works without calling these ioctls. * * - And as a CPU fallback in userspace processing pipelines. * * Similar to the motivation for kernel cpu access it is again important that * the userspace code of a given importing subsystem can use the same * interfaces with a imported dma-buf buffer object as with a native buffer * object. This is especially important for drm where the userspace part of * contemporary OpenGL, X, and other drivers is huge, and reworking them to * use a different way to mmap a buffer rather invasive. * * The assumption in the current dma-buf interfaces is that redirecting the * initial mmap is all that's needed. A survey of some of the existing * subsystems shows that no driver seems to do any nefarious thing like * syncing up with outstanding asynchronous processing on the device or * allocating special resources at fault time. So hopefully this is good * enough, since adding interfaces to intercept pagefaults and allow pte * shootdowns would increase the complexity quite a bit. * * Interface: * * .. code-block:: c * * int dma_buf_mmap(struct dma_buf *, struct vm_area_struct *, unsigned long); * * If the importing subsystem simply provides a special-purpose mmap call to * set up a mapping in userspace, calling do_mmap with &dma_buf.file will * equally achieve that for a dma-buf object. */ static int __dma_buf_begin_cpu_access(struct dma_buf *dmabuf, enum dma_data_direction direction) { bool write = (direction == DMA_BIDIRECTIONAL || direction == DMA_TO_DEVICE); struct dma_resv *resv = dmabuf->resv; long ret; /* Wait on any implicit rendering fences */ ret = dma_resv_wait_timeout(resv, dma_resv_usage_rw(write), true, MAX_SCHEDULE_TIMEOUT); if (ret < 0) return ret; return 0; } /** * dma_buf_begin_cpu_access - Must be called before accessing a dma_buf from the * cpu in the kernel context. Calls begin_cpu_access to allow exporter-specific * preparations. Coherency is only guaranteed in the specified range for the * specified access direction. * @dmabuf: [in] buffer to prepare cpu access for. * @direction: [in] direction of access. * * After the cpu access is complete the caller should call * dma_buf_end_cpu_access(). Only when cpu access is bracketed by both calls is * it guaranteed to be coherent with other DMA access. * * This function will also wait for any DMA transactions tracked through * implicit synchronization in &dma_buf.resv. For DMA transactions with explicit * synchronization this function will only ensure cache coherency, callers must * ensure synchronization with such DMA transactions on their own. * * Can return negative error values, returns 0 on success. */ int dma_buf_begin_cpu_access(struct dma_buf *dmabuf, enum dma_data_direction direction) { int ret = 0; if (WARN_ON(!dmabuf)) return -EINVAL; might_lock(&dmabuf->resv->lock.base); if (dmabuf->ops->begin_cpu_access) ret = dmabuf->ops->begin_cpu_access(dmabuf, direction); /* Ensure that all fences are waited upon - but we first allow * the native handler the chance to do so more efficiently if it * chooses. A double invocation here will be reasonably cheap no-op. */ if (ret == 0) ret = __dma_buf_begin_cpu_access(dmabuf, direction); return ret; } EXPORT_SYMBOL_NS_GPL(dma_buf_begin_cpu_access, "DMA_BUF"); /** * dma_buf_end_cpu_access - Must be called after accessing a dma_buf from the * cpu in the kernel context. Calls end_cpu_access to allow exporter-specific * actions. Coherency is only guaranteed in the specified range for the * specified access direction. * @dmabuf: [in] buffer to complete cpu access for. * @direction: [in] direction of access. * * This terminates CPU access started with dma_buf_begin_cpu_access(). * * Can return negative error values, returns 0 on success. */ int dma_buf_end_cpu_access(struct dma_buf *dmabuf, enum dma_data_direction direction) { int ret = 0; WARN_ON(!dmabuf); might_lock(&dmabuf->resv->lock.base); if (dmabuf->ops->end_cpu_access) ret = dmabuf->ops->end_cpu_access(dmabuf, direction); return ret; } EXPORT_SYMBOL_NS_GPL(dma_buf_end_cpu_access, "DMA_BUF"); /** * dma_buf_mmap - Setup up a userspace mmap with the given vma * @dmabuf: [in] buffer that should back the vma * @vma: [in] vma for the mmap * @pgoff: [in] offset in pages where this mmap should start within the * dma-buf buffer. * * This function adjusts the passed in vma so that it points at the file of the * dma_buf operation. It also adjusts the starting pgoff and does bounds * checking on the size of the vma. Then it calls the exporters mmap function to * set up the mapping. * * Can return negative error values, returns 0 on success. */ int dma_buf_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma, unsigned long pgoff) { if (WARN_ON(!dmabuf || !vma)) return -EINVAL; /* check if buffer supports mmap */ if (!dmabuf->ops->mmap) return -EINVAL; /* check for offset overflow */ if (pgoff + vma_pages(vma) < pgoff) return -EOVERFLOW; /* check for overflowing the buffer's size */ if (pgoff + vma_pages(vma) > dmabuf->size >> PAGE_SHIFT) return -EINVAL; /* readjust the vma */ vma_set_file(vma, dmabuf->file); vma->vm_pgoff = pgoff; return dmabuf->ops->mmap(dmabuf, vma); } EXPORT_SYMBOL_NS_GPL(dma_buf_mmap, "DMA_BUF"); /** * dma_buf_vmap - Create virtual mapping for the buffer object into kernel * address space. Same restrictions as for vmap and friends apply. * @dmabuf: [in] buffer to vmap * @map: [out] returns the vmap pointer * * This call may fail due to lack of virtual mapping address space. * These calls are optional in drivers. The intended use for them * is for mapping objects linear in kernel space for high use objects. * * To ensure coherency users must call dma_buf_begin_cpu_access() and * dma_buf_end_cpu_access() around any cpu access performed through this * mapping. * * Returns 0 on success, or a negative errno code otherwise. */ int dma_buf_vmap(struct dma_buf *dmabuf, struct iosys_map *map) { struct iosys_map ptr; int ret; iosys_map_clear(map); if (WARN_ON(!dmabuf)) return -EINVAL; dma_resv_assert_held(dmabuf->resv); if (!dmabuf->ops->vmap) return -EINVAL; if (dmabuf->vmapping_counter) { dmabuf->vmapping_counter++; BUG_ON(iosys_map_is_null(&dmabuf->vmap_ptr)); *map = dmabuf->vmap_ptr; return 0; } BUG_ON(iosys_map_is_set(&dmabuf->vmap_ptr)); ret = dmabuf->ops->vmap(dmabuf, &ptr); if (WARN_ON_ONCE(ret)) return ret; dmabuf->vmap_ptr = ptr; dmabuf->vmapping_counter = 1; *map = dmabuf->vmap_ptr; return 0; } EXPORT_SYMBOL_NS_GPL(dma_buf_vmap, "DMA_BUF"); /** * dma_buf_vmap_unlocked - Create virtual mapping for the buffer object into kernel * address space. Same restrictions as for vmap and friends apply. * @dmabuf: [in] buffer to vmap * @map: [out] returns the vmap pointer * * Unlocked version of dma_buf_vmap() * * Returns 0 on success, or a negative errno code otherwise. */ int dma_buf_vmap_unlocked(struct dma_buf *dmabuf, struct iosys_map *map) { int ret; iosys_map_clear(map); if (WARN_ON(!dmabuf)) return -EINVAL; dma_resv_lock(dmabuf->resv, NULL); ret = dma_buf_vmap(dmabuf, map); dma_resv_unlock(dmabuf->resv); return ret; } EXPORT_SYMBOL_NS_GPL(dma_buf_vmap_unlocked, "DMA_BUF"); /** * dma_buf_vunmap - Unmap a vmap obtained by dma_buf_vmap. * @dmabuf: [in] buffer to vunmap * @map: [in] vmap pointer to vunmap */ void dma_buf_vunmap(struct dma_buf *dmabuf, struct iosys_map *map) { if (WARN_ON(!dmabuf)) return; dma_resv_assert_held(dmabuf->resv); BUG_ON(iosys_map_is_null(&dmabuf->vmap_ptr)); BUG_ON(dmabuf->vmapping_counter == 0); BUG_ON(!iosys_map_is_equal(&dmabuf->vmap_ptr, map)); if (--dmabuf->vmapping_counter == 0) { if (dmabuf->ops->vunmap) dmabuf->ops->vunmap(dmabuf, map); iosys_map_clear(&dmabuf->vmap_ptr); } } EXPORT_SYMBOL_NS_GPL(dma_buf_vunmap, "DMA_BUF"); /** * dma_buf_vunmap_unlocked - Unmap a vmap obtained by dma_buf_vmap. * @dmabuf: [in] buffer to vunmap * @map: [in] vmap pointer to vunmap */ void dma_buf_vunmap_unlocked(struct dma_buf *dmabuf, struct iosys_map *map) { if (WARN_ON(!dmabuf)) return; dma_resv_lock(dmabuf->resv, NULL); dma_buf_vunmap(dmabuf, map); dma_resv_unlock(dmabuf->resv); } EXPORT_SYMBOL_NS_GPL(dma_buf_vunmap_unlocked, "DMA_BUF"); #ifdef CONFIG_DEBUG_FS static int dma_buf_debug_show(struct seq_file *s, void *unused) { struct dma_buf *buf_obj; struct dma_buf_attachment *attach_obj; int count = 0, attach_count; size_t size = 0; int ret; ret = mutex_lock_interruptible(&dmabuf_list_mutex); if (ret) return ret; seq_puts(s, "\nDma-buf Objects:\n"); seq_printf(s, "%-8s\t%-8s\t%-8s\t%-8s\texp_name\t%-8s\tname\n", "size", "flags", "mode", "count", "ino"); list_for_each_entry(buf_obj, &dmabuf_list, list_node) { ret = dma_resv_lock_interruptible(buf_obj->resv, NULL); if (ret) goto error_unlock; spin_lock(&buf_obj->name_lock); seq_printf(s, "%08zu\t%08x\t%08x\t%08ld\t%s\t%08lu\t%s\n", buf_obj->size, buf_obj->file->f_flags, buf_obj->file->f_mode, file_count(buf_obj->file), buf_obj->exp_name, file_inode(buf_obj->file)->i_ino, buf_obj->name ?: "<none>"); spin_unlock(&buf_obj->name_lock); dma_resv_describe(buf_obj->resv, s); seq_puts(s, "\tAttached Devices:\n"); attach_count = 0; list_for_each_entry(attach_obj, &buf_obj->attachments, node) { seq_printf(s, "\t%s\n", dev_name(attach_obj->dev)); attach_count++; } dma_resv_unlock(buf_obj->resv); seq_printf(s, "Total %d devices attached\n\n", attach_count); count++; size += buf_obj->size; } seq_printf(s, "\nTotal %d objects, %zu bytes\n", count, size); mutex_unlock(&dmabuf_list_mutex); return 0; error_unlock: mutex_unlock(&dmabuf_list_mutex); return ret; } DEFINE_SHOW_ATTRIBUTE(dma_buf_debug); static struct dentry *dma_buf_debugfs_dir; static int dma_buf_init_debugfs(void) { struct dentry *d; int err = 0; d = debugfs_create_dir("dma_buf", NULL); if (IS_ERR(d)) return PTR_ERR(d); dma_buf_debugfs_dir = d; d = debugfs_create_file("bufinfo", 0444, dma_buf_debugfs_dir, NULL, &dma_buf_debug_fops); if (IS_ERR(d)) { pr_debug("dma_buf: debugfs: failed to create node bufinfo\n"); debugfs_remove_recursive(dma_buf_debugfs_dir); dma_buf_debugfs_dir = NULL; err = PTR_ERR(d); } return err; } static void dma_buf_uninit_debugfs(void) { debugfs_remove_recursive(dma_buf_debugfs_dir); } #else static inline int dma_buf_init_debugfs(void) { return 0; } static inline void dma_buf_uninit_debugfs(void) { } #endif static int __init dma_buf_init(void) { int ret; ret = dma_buf_init_sysfs_statistics(); if (ret) return ret; dma_buf_mnt = kern_mount(&dma_buf_fs_type); if (IS_ERR(dma_buf_mnt)) return PTR_ERR(dma_buf_mnt); dma_buf_init_debugfs(); return 0; } subsys_initcall(dma_buf_init); static void __exit dma_buf_deinit(void) { dma_buf_uninit_debugfs(); kern_unmount(dma_buf_mnt); dma_buf_uninit_sysfs_statistics(); } __exitcall(dma_buf_deinit); |
| 1 1 1 1 1 1 1 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080 1081 1082 1083 1084 1085 1086 1087 1088 1089 1090 1091 1092 1093 1094 1095 1096 1097 1098 1099 1100 1101 1102 1103 1104 1105 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125 1126 1127 1128 1129 1130 1131 1132 1133 1134 1135 1136 1137 1138 1139 1140 1141 1142 1143 1144 1145 1146 1147 1148 1149 1150 1151 1152 1153 1154 1155 1156 1157 1158 1159 1160 1161 1162 1163 1164 1165 1166 1167 1168 1169 1170 1171 1172 1173 1174 1175 1176 1177 1178 1179 1180 1181 1182 1183 1184 1185 1186 1187 1188 1189 1190 1191 1192 1193 1194 1195 1196 1197 1198 1199 1200 1201 1202 1203 1204 1205 1206 1207 1208 1209 1210 1211 1212 1213 1214 1215 1216 1217 1218 1219 1220 1221 1222 1223 1224 1225 1226 1227 1228 1229 1230 1231 1232 1233 1234 1235 1236 1237 1238 1239 1240 1241 1242 1243 1244 1245 1246 1247 1248 1249 1250 1251 1252 1253 1254 1255 1256 1257 1258 1259 1260 1261 1262 1263 1264 1265 1266 1267 1268 1269 1270 1271 1272 1273 1274 1275 1276 1277 1278 1279 1280 1281 1282 1283 1284 1285 1286 1287 1288 1289 1290 1291 1292 1293 1294 1295 1296 1297 1298 1299 1300 1301 1302 1303 1304 1305 1306 1307 1308 1309 1310 1311 1312 1313 1314 1315 1316 1317 1318 1319 1320 1321 1322 1323 1324 1325 1326 1327 1328 1329 1330 1331 1332 1333 1334 1335 1336 1337 1338 1339 1340 1341 1342 1343 1344 1345 1346 1347 1348 1349 1350 1351 1352 1353 1354 1355 1356 1357 1358 1359 1360 1361 1362 1363 1364 1365 1366 1367 1368 1369 1370 1371 1372 1373 1374 1375 1376 1377 1378 1379 1380 1381 1382 1383 1384 1385 1386 1387 1388 1389 1390 1391 1392 1393 1394 1395 1396 1397 1398 1399 1400 1401 1402 1403 1404 1405 1406 1407 1408 1409 1410 1411 1412 1413 1414 1415 1416 1417 1418 1419 1420 1421 1422 1423 1424 1425 1426 1427 1428 1429 1430 1431 1432 1433 1434 1435 1436 1437 1438 1439 1440 1441 1442 1443 1444 1445 1446 1447 1448 1449 1450 1451 1452 1453 1454 1455 1456 1457 1458 1459 1460 1461 1462 1463 1464 1465 1466 1467 1468 1469 1470 1471 1472 1473 1474 1475 1476 1477 1478 1479 1480 1481 1482 1483 1484 1485 1486 1487 1488 1489 1490 1491 1492 1493 1494 1495 1496 1497 1498 1499 1500 1501 1502 1503 1504 1505 1506 1507 1508 1509 1510 1511 1512 1513 1514 1515 1516 1517 1518 1519 1520 1521 1522 1523 1524 1525 1526 1527 1528 1529 1530 1531 1532 1533 1534 1535 1536 1537 1538 1539 1540 1541 1542 1543 1544 1545 1546 1547 1548 1549 1550 1551 1552 1553 1554 1555 1556 1557 1558 1559 1560 1561 1562 1563 1564 1565 1566 1567 1568 1569 1570 1571 1572 1573 1574 1575 1576 1577 1578 1579 1580 1581 1582 1583 1584 1585 1586 1587 1588 1589 1590 1591 1592 1593 1594 1595 1596 1597 1598 1599 1600 1601 1602 1603 1604 1605 1606 1607 1608 1609 1610 1611 1612 1613 1614 1615 1616 1617 1618 1619 1620 1621 1622 1623 1624 1625 1626 1627 1628 1629 1630 1631 1632 1633 1634 1635 1636 1637 1638 1639 1640 1641 1642 1643 1644 1645 1646 1647 1648 1649 1650 1651 1652 1653 1654 1655 1656 1657 1658 1659 1660 1661 1662 1663 1664 1665 1666 1667 1668 1669 1670 1671 1672 1673 1674 1675 1676 1677 1678 1679 1680 1681 1682 1683 1684 1685 1686 1687 1688 1689 1690 1691 1692 1693 1694 1695 1696 1697 1698 1699 1700 1701 1702 1703 1704 1705 1706 1707 1708 1709 1710 1711 1712 1713 1714 1715 1716 1717 1718 1719 1720 1721 1722 1723 1724 1725 1726 1727 1728 1729 1730 1731 1732 1733 1734 1735 1736 1737 1738 1739 1740 1741 1742 1743 1744 1745 1746 1747 1748 1749 1750 1751 1752 1753 1754 1755 1756 1757 1758 1759 1760 1761 1762 1763 1764 1765 1766 1767 1768 1769 1770 1771 1772 1773 1774 1775 1776 1777 1778 1779 1780 1781 1782 1783 1784 1785 1786 1787 1788 1789 1790 1791 1792 1793 1794 1795 1796 1797 1798 1799 1800 1801 1802 1803 1804 1805 1806 1807 1808 1809 1810 1811 1812 1813 1814 1815 1816 1817 1818 1819 1820 1821 1822 1823 1824 1825 1826 1827 1828 1829 1830 1831 1832 1833 1834 1835 1836 1837 1838 1839 1840 1841 1842 1843 1844 1845 1846 1847 1848 1849 1850 1851 1852 1853 1854 1855 1856 1857 1858 1859 1860 1861 1862 1863 1864 1865 1866 1867 1868 1869 1870 1871 1872 1873 1874 1875 1876 1877 1878 1879 1880 1881 1882 1883 1884 1885 1886 1887 1888 1889 1890 1891 1892 1893 1894 1895 1896 1897 1898 1899 1900 1901 1902 1903 1904 1905 1906 1907 1908 1909 1910 1911 1912 1913 1914 1915 1916 1917 1918 1919 1920 1921 1922 1923 1924 1925 1926 1927 1928 1929 1930 1931 1932 1933 1934 1935 1936 1937 1938 1939 1940 1941 1942 1943 1944 1945 1946 1947 1948 1949 1950 1951 1952 1953 1954 1955 1956 1957 1958 1959 1960 1961 1962 1963 1964 1965 1966 1967 1968 1969 1970 1971 1972 1973 1974 1975 1976 1977 1978 1979 1980 1981 1982 1983 1984 1985 1986 1987 1988 1989 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020 2021 2022 2023 2024 2025 2026 2027 2028 2029 2030 2031 2032 2033 2034 2035 2036 2037 2038 2039 2040 2041 2042 2043 2044 2045 2046 2047 2048 2049 2050 2051 2052 2053 2054 2055 2056 2057 2058 2059 2060 2061 2062 2063 2064 2065 2066 2067 2068 2069 2070 2071 2072 2073 2074 2075 2076 2077 2078 2079 2080 2081 2082 2083 2084 2085 2086 2087 2088 2089 2090 2091 2092 2093 2094 2095 2096 2097 2098 2099 2100 2101 2102 2103 2104 2105 2106 2107 2108 2109 2110 2111 2112 2113 2114 2115 2116 2117 2118 2119 2120 2121 2122 2123 2124 2125 2126 2127 2128 2129 2130 2131 2132 2133 2134 2135 2136 2137 2138 2139 2140 2141 2142 2143 2144 2145 2146 2147 2148 2149 2150 2151 2152 2153 2154 2155 2156 2157 2158 2159 2160 2161 2162 2163 2164 2165 2166 2167 2168 2169 2170 2171 2172 2173 2174 2175 2176 2177 2178 2179 2180 2181 2182 2183 2184 2185 2186 2187 2188 2189 2190 2191 2192 2193 2194 2195 2196 2197 2198 2199 2200 2201 2202 2203 2204 2205 2206 2207 2208 2209 2210 2211 2212 2213 2214 2215 2216 2217 2218 2219 2220 2221 2222 2223 2224 2225 2226 2227 2228 2229 2230 2231 2232 2233 2234 2235 2236 2237 2238 2239 2240 2241 2242 2243 2244 2245 2246 2247 2248 2249 2250 2251 2252 2253 2254 2255 2256 2257 2258 2259 2260 2261 2262 2263 2264 2265 2266 2267 2268 2269 2270 2271 2272 2273 2274 2275 2276 2277 2278 2279 2280 2281 2282 2283 2284 2285 2286 2287 2288 2289 2290 2291 2292 2293 2294 2295 2296 2297 2298 2299 2300 2301 2302 2303 2304 2305 2306 2307 2308 2309 2310 2311 2312 2313 2314 2315 2316 2317 2318 2319 2320 2321 2322 2323 2324 2325 2326 2327 2328 2329 2330 2331 2332 2333 2334 2335 2336 2337 2338 2339 2340 2341 2342 2343 2344 2345 2346 2347 2348 2349 2350 2351 2352 2353 2354 2355 2356 2357 2358 2359 2360 2361 2362 2363 2364 2365 2366 2367 2368 2369 2370 2371 2372 2373 2374 2375 2376 2377 2378 2379 2380 2381 2382 2383 2384 2385 2386 2387 2388 2389 2390 2391 2392 2393 2394 2395 2396 2397 2398 2399 2400 2401 2402 2403 2404 2405 2406 2407 2408 2409 2410 2411 2412 2413 2414 2415 2416 2417 2418 2419 2420 2421 2422 2423 2424 2425 2426 2427 2428 2429 2430 2431 2432 2433 2434 2435 2436 2437 2438 2439 2440 2441 2442 2443 2444 2445 2446 2447 2448 2449 2450 2451 2452 2453 2454 2455 2456 2457 2458 2459 2460 2461 2462 2463 2464 2465 2466 2467 2468 2469 2470 2471 2472 2473 2474 2475 2476 2477 2478 2479 2480 2481 2482 2483 2484 2485 2486 2487 2488 2489 2490 2491 2492 2493 2494 2495 2496 2497 2498 2499 2500 2501 2502 2503 2504 2505 2506 2507 2508 2509 2510 2511 2512 2513 2514 2515 2516 2517 2518 2519 2520 2521 2522 2523 2524 2525 2526 2527 | // SPDX-License-Identifier: GPL-2.0 /* * BlueZ - Bluetooth protocol stack for Linux * * Copyright (C) 2022 Intel Corporation * Copyright 2023-2024 NXP */ #include <linux/module.h> #include <linux/debugfs.h> #include <linux/seq_file.h> #include <linux/sched/signal.h> #include <net/bluetooth/bluetooth.h> #include <net/bluetooth/hci_core.h> #include <net/bluetooth/iso.h> #include "eir.h" static const struct proto_ops iso_sock_ops; static struct bt_sock_list iso_sk_list = { .lock = __RW_LOCK_UNLOCKED(iso_sk_list.lock) }; /* ---- ISO connections ---- */ struct iso_conn { struct hci_conn *hcon; /* @lock: spinlock protecting changes to iso_conn fields */ spinlock_t lock; struct sock *sk; struct delayed_work timeout_work; struct sk_buff *rx_skb; __u32 rx_len; __u16 tx_sn; struct kref ref; }; #define iso_conn_lock(c) spin_lock(&(c)->lock) #define iso_conn_unlock(c) spin_unlock(&(c)->lock) static void iso_sock_close(struct sock *sk); static void iso_sock_kill(struct sock *sk); /* ----- ISO socket info ----- */ #define iso_pi(sk) ((struct iso_pinfo *)sk) #define EIR_SERVICE_DATA_LENGTH 4 #define BASE_MAX_LENGTH (HCI_MAX_PER_AD_LENGTH - EIR_SERVICE_DATA_LENGTH) #define EIR_BAA_SERVICE_UUID 0x1851 /* iso_pinfo flags values */ enum { BT_SK_BIG_SYNC, BT_SK_PA_SYNC, }; struct iso_pinfo { struct bt_sock bt; bdaddr_t src; __u8 src_type; bdaddr_t dst; __u8 dst_type; __u8 bc_sid; __u8 bc_num_bis; __u8 bc_bis[ISO_MAX_NUM_BIS]; __u16 sync_handle; unsigned long flags; struct bt_iso_qos qos; bool qos_user_set; __u8 base_len; __u8 base[BASE_MAX_LENGTH]; struct iso_conn *conn; }; static struct bt_iso_qos default_qos; static bool check_ucast_qos(struct bt_iso_qos *qos); static bool check_bcast_qos(struct bt_iso_qos *qos); static bool iso_match_sid(struct sock *sk, void *data); static bool iso_match_sync_handle(struct sock *sk, void *data); static bool iso_match_sync_handle_pa_report(struct sock *sk, void *data); static void iso_sock_disconn(struct sock *sk); typedef bool (*iso_sock_match_t)(struct sock *sk, void *data); static struct sock *iso_get_sock(bdaddr_t *src, bdaddr_t *dst, enum bt_sock_state state, iso_sock_match_t match, void *data); /* ---- ISO timers ---- */ #define ISO_CONN_TIMEOUT (HZ * 40) #define ISO_DISCONN_TIMEOUT (HZ * 2) static void iso_conn_free(struct kref *ref) { struct iso_conn *conn = container_of(ref, struct iso_conn, ref); BT_DBG("conn %p", conn); if (conn->sk) iso_pi(conn->sk)->conn = NULL; if (conn->hcon) { conn->hcon->iso_data = NULL; hci_conn_drop(conn->hcon); } /* Ensure no more work items will run since hci_conn has been dropped */ disable_delayed_work_sync(&conn->timeout_work); kfree(conn); } static void iso_conn_put(struct iso_conn *conn) { if (!conn) return; BT_DBG("conn %p refcnt %d", conn, kref_read(&conn->ref)); kref_put(&conn->ref, iso_conn_free); } static struct iso_conn *iso_conn_hold_unless_zero(struct iso_conn *conn) { if (!conn) return NULL; BT_DBG("conn %p refcnt %u", conn, kref_read(&conn->ref)); if (!kref_get_unless_zero(&conn->ref)) return NULL; return conn; } static struct sock *iso_sock_hold(struct iso_conn *conn) { if (!conn || !bt_sock_linked(&iso_sk_list, conn->sk)) return NULL; sock_hold(conn->sk); return conn->sk; } static void iso_sock_timeout(struct work_struct *work) { struct iso_conn *conn = container_of(work, struct iso_conn, timeout_work.work); struct sock *sk; conn = iso_conn_hold_unless_zero(conn); if (!conn) return; iso_conn_lock(conn); sk = iso_sock_hold(conn); iso_conn_unlock(conn); iso_conn_put(conn); if (!sk) return; BT_DBG("sock %p state %d", sk, sk->sk_state); lock_sock(sk); sk->sk_err = ETIMEDOUT; sk->sk_state_change(sk); release_sock(sk); sock_put(sk); } static void iso_sock_set_timer(struct sock *sk, long timeout) { if (!iso_pi(sk)->conn) return; BT_DBG("sock %p state %d timeout %ld", sk, sk->sk_state, timeout); cancel_delayed_work(&iso_pi(sk)->conn->timeout_work); schedule_delayed_work(&iso_pi(sk)->conn->timeout_work, timeout); } static void iso_sock_clear_timer(struct sock *sk) { if (!iso_pi(sk)->conn) return; BT_DBG("sock %p state %d", sk, sk->sk_state); cancel_delayed_work(&iso_pi(sk)->conn->timeout_work); } /* ---- ISO connections ---- */ static struct iso_conn *iso_conn_add(struct hci_conn *hcon) { struct iso_conn *conn = hcon->iso_data; conn = iso_conn_hold_unless_zero(conn); if (conn) { if (!conn->hcon) { iso_conn_lock(conn); conn->hcon = hcon; iso_conn_unlock(conn); } iso_conn_put(conn); return conn; } conn = kzalloc(sizeof(*conn), GFP_KERNEL); if (!conn) return NULL; kref_init(&conn->ref); spin_lock_init(&conn->lock); INIT_DELAYED_WORK(&conn->timeout_work, iso_sock_timeout); hcon->iso_data = conn; conn->hcon = hcon; conn->tx_sn = 0; BT_DBG("hcon %p conn %p", hcon, conn); return conn; } /* Delete channel. Must be called on the locked socket. */ static void iso_chan_del(struct sock *sk, int err) { struct iso_conn *conn; struct sock *parent; conn = iso_pi(sk)->conn; iso_pi(sk)->conn = NULL; BT_DBG("sk %p, conn %p, err %d", sk, conn, err); if (conn) { iso_conn_lock(conn); conn->sk = NULL; iso_conn_unlock(conn); iso_conn_put(conn); } sk->sk_state = BT_CLOSED; sk->sk_err = err; parent = bt_sk(sk)->parent; if (parent) { bt_accept_unlink(sk); parent->sk_data_ready(parent); } else { sk->sk_state_change(sk); } sock_set_flag(sk, SOCK_ZAPPED); } static void iso_conn_del(struct hci_conn *hcon, int err) { struct iso_conn *conn = hcon->iso_data; struct sock *sk; conn = iso_conn_hold_unless_zero(conn); if (!conn) return; BT_DBG("hcon %p conn %p, err %d", hcon, conn, err); /* Kill socket */ iso_conn_lock(conn); sk = iso_sock_hold(conn); iso_conn_unlock(conn); iso_conn_put(conn); if (!sk) { iso_conn_put(conn); return; } lock_sock(sk); iso_sock_clear_timer(sk); iso_chan_del(sk, err); release_sock(sk); sock_put(sk); } static int __iso_chan_add(struct iso_conn *conn, struct sock *sk, struct sock *parent) { BT_DBG("conn %p", conn); if (iso_pi(sk)->conn == conn && conn->sk == sk) return 0; if (conn->sk) { BT_ERR("conn->sk already set"); return -EBUSY; } iso_pi(sk)->conn = conn; conn->sk = sk; if (parent) bt_accept_enqueue(parent, sk, true); return 0; } static int iso_chan_add(struct iso_conn *conn, struct sock *sk, struct sock *parent) { int err; iso_conn_lock(conn); err = __iso_chan_add(conn, sk, parent); iso_conn_unlock(conn); return err; } static inline u8 le_addr_type(u8 bdaddr_type) { if (bdaddr_type == BDADDR_LE_PUBLIC) return ADDR_LE_DEV_PUBLIC; else return ADDR_LE_DEV_RANDOM; } static int iso_connect_bis(struct sock *sk) { struct iso_conn *conn; struct hci_conn *hcon; struct hci_dev *hdev; int err; BT_DBG("%pMR (SID 0x%2.2x)", &iso_pi(sk)->src, iso_pi(sk)->bc_sid); hdev = hci_get_route(&iso_pi(sk)->dst, &iso_pi(sk)->src, iso_pi(sk)->src_type); if (!hdev) return -EHOSTUNREACH; hci_dev_lock(hdev); if (!bis_capable(hdev)) { err = -EOPNOTSUPP; goto unlock; } /* Fail if user set invalid QoS */ if (iso_pi(sk)->qos_user_set && !check_bcast_qos(&iso_pi(sk)->qos)) { iso_pi(sk)->qos = default_qos; err = -EINVAL; goto unlock; } /* Fail if out PHYs are marked as disabled */ if (!iso_pi(sk)->qos.bcast.out.phy) { err = -EINVAL; goto unlock; } /* Just bind if DEFER_SETUP has been set */ if (test_bit(BT_SK_DEFER_SETUP, &bt_sk(sk)->flags)) { hcon = hci_bind_bis(hdev, &iso_pi(sk)->dst, iso_pi(sk)->bc_sid, &iso_pi(sk)->qos, iso_pi(sk)->base_len, iso_pi(sk)->base); if (IS_ERR(hcon)) { err = PTR_ERR(hcon); goto unlock; } } else { hcon = hci_connect_bis(hdev, &iso_pi(sk)->dst, le_addr_type(iso_pi(sk)->dst_type), iso_pi(sk)->bc_sid, &iso_pi(sk)->qos, iso_pi(sk)->base_len, iso_pi(sk)->base); if (IS_ERR(hcon)) { err = PTR_ERR(hcon); goto unlock; } /* Update SID if it was not set */ if (iso_pi(sk)->bc_sid == HCI_SID_INVALID) iso_pi(sk)->bc_sid = hcon->sid; } conn = iso_conn_add(hcon); if (!conn) { hci_conn_drop(hcon); err = -ENOMEM; goto unlock; } lock_sock(sk); err = iso_chan_add(conn, sk, NULL); if (err) { release_sock(sk); goto unlock; } /* Update source addr of the socket */ bacpy(&iso_pi(sk)->src, &hcon->src); if (hcon->state == BT_CONNECTED) { iso_sock_clear_timer(sk); sk->sk_state = BT_CONNECTED; } else if (test_bit(BT_SK_DEFER_SETUP, &bt_sk(sk)->flags)) { iso_sock_clear_timer(sk); sk->sk_state = BT_CONNECT; } else { sk->sk_state = BT_CONNECT; iso_sock_set_timer(sk, sk->sk_sndtimeo); } release_sock(sk); unlock: hci_dev_unlock(hdev); hci_dev_put(hdev); return err; } static int iso_connect_cis(struct sock *sk) { struct iso_conn *conn; struct hci_conn *hcon; struct hci_dev *hdev; int err; BT_DBG("%pMR -> %pMR", &iso_pi(sk)->src, &iso_pi(sk)->dst); hdev = hci_get_route(&iso_pi(sk)->dst, &iso_pi(sk)->src, iso_pi(sk)->src_type); if (!hdev) return -EHOSTUNREACH; hci_dev_lock(hdev); if (!cis_central_capable(hdev)) { err = -EOPNOTSUPP; goto unlock; } /* Fail if user set invalid QoS */ if (iso_pi(sk)->qos_user_set && !check_ucast_qos(&iso_pi(sk)->qos)) { iso_pi(sk)->qos = default_qos; err = -EINVAL; goto unlock; } /* Fail if either PHYs are marked as disabled */ if (!iso_pi(sk)->qos.ucast.in.phy && !iso_pi(sk)->qos.ucast.out.phy) { err = -EINVAL; goto unlock; } /* Just bind if DEFER_SETUP has been set */ if (test_bit(BT_SK_DEFER_SETUP, &bt_sk(sk)->flags)) { hcon = hci_bind_cis(hdev, &iso_pi(sk)->dst, le_addr_type(iso_pi(sk)->dst_type), &iso_pi(sk)->qos); if (IS_ERR(hcon)) { err = PTR_ERR(hcon); goto unlock; } } else { hcon = hci_connect_cis(hdev, &iso_pi(sk)->dst, le_addr_type(iso_pi(sk)->dst_type), &iso_pi(sk)->qos); if (IS_ERR(hcon)) { err = PTR_ERR(hcon); goto unlock; } } conn = iso_conn_add(hcon); if (!conn) { hci_conn_drop(hcon); err = -ENOMEM; goto unlock; } lock_sock(sk); err = iso_chan_add(conn, sk, NULL); if (err) { release_sock(sk); goto unlock; } /* Update source addr of the socket */ bacpy(&iso_pi(sk)->src, &hcon->src); if (hcon->state == BT_CONNECTED) { iso_sock_clear_timer(sk); sk->sk_state = BT_CONNECTED; } else if (test_bit(BT_SK_DEFER_SETUP, &bt_sk(sk)->flags)) { iso_sock_clear_timer(sk); sk->sk_state = BT_CONNECT; } else { sk->sk_state = BT_CONNECT; iso_sock_set_timer(sk, sk->sk_sndtimeo); } release_sock(sk); unlock: hci_dev_unlock(hdev); hci_dev_put(hdev); return err; } static struct bt_iso_qos *iso_sock_get_qos(struct sock *sk) { if (sk->sk_state == BT_CONNECTED || sk->sk_state == BT_CONNECT2) return &iso_pi(sk)->conn->hcon->iso_qos; return &iso_pi(sk)->qos; } static int iso_send_frame(struct sock *sk, struct sk_buff *skb, const struct sockcm_cookie *sockc) { struct iso_conn *conn = iso_pi(sk)->conn; struct bt_iso_qos *qos = iso_sock_get_qos(sk); struct hci_iso_data_hdr *hdr; int len = 0; BT_DBG("sk %p len %d", sk, skb->len); if (skb->len > qos->ucast.out.sdu) return -EMSGSIZE; len = skb->len; /* Push ISO data header */ hdr = skb_push(skb, HCI_ISO_DATA_HDR_SIZE); hdr->sn = cpu_to_le16(conn->tx_sn++); hdr->slen = cpu_to_le16(hci_iso_data_len_pack(len, HCI_ISO_STATUS_VALID)); if (sk->sk_state == BT_CONNECTED) { hci_setup_tx_timestamp(skb, 1, sockc); hci_send_iso(conn->hcon, skb); } else { len = -ENOTCONN; } return len; } static void iso_recv_frame(struct iso_conn *conn, struct sk_buff *skb) { struct sock *sk; iso_conn_lock(conn); sk = conn->sk; iso_conn_unlock(conn); if (!sk) goto drop; BT_DBG("sk %p len %d", sk, skb->len); if (sk->sk_state != BT_CONNECTED) goto drop; if (!sock_queue_rcv_skb(sk, skb)) return; drop: kfree_skb(skb); } /* -------- Socket interface ---------- */ static struct sock *__iso_get_sock_listen_by_addr(bdaddr_t *src, bdaddr_t *dst) { struct sock *sk; sk_for_each(sk, &iso_sk_list.head) { if (sk->sk_state != BT_LISTEN) continue; if (bacmp(&iso_pi(sk)->dst, dst)) continue; if (!bacmp(&iso_pi(sk)->src, src)) return sk; } return NULL; } static struct sock *__iso_get_sock_listen_by_sid(bdaddr_t *ba, bdaddr_t *bc, __u8 sid) { struct sock *sk; sk_for_each(sk, &iso_sk_list.head) { if (sk->sk_state != BT_LISTEN) continue; if (bacmp(&iso_pi(sk)->src, ba)) continue; if (bacmp(&iso_pi(sk)->dst, bc)) continue; if (iso_pi(sk)->bc_sid == sid) return sk; } return NULL; } /* Find socket in given state: * source bdaddr (Unicast) * destination bdaddr (Broadcast only) * match func - pass NULL to ignore * match func data - pass -1 to ignore * Returns closest match. */ static struct sock *iso_get_sock(bdaddr_t *src, bdaddr_t *dst, enum bt_sock_state state, iso_sock_match_t match, void *data) { struct sock *sk = NULL, *sk1 = NULL; read_lock(&iso_sk_list.lock); sk_for_each(sk, &iso_sk_list.head) { if (sk->sk_state != state) continue; /* Match Broadcast destination */ if (bacmp(dst, BDADDR_ANY) && bacmp(&iso_pi(sk)->dst, dst)) continue; /* Use Match function if provided */ if (match && !match(sk, data)) continue; /* Exact match. */ if (!bacmp(&iso_pi(sk)->src, src)) { sock_hold(sk); break; } /* Closest match */ if (!bacmp(&iso_pi(sk)->src, BDADDR_ANY)) { if (sk1) sock_put(sk1); sk1 = sk; sock_hold(sk1); } } if (sk && sk1) sock_put(sk1); read_unlock(&iso_sk_list.lock); return sk ? sk : sk1; } static struct sock *iso_get_sock_big(struct sock *match_sk, bdaddr_t *src, bdaddr_t *dst, uint8_t big) { struct sock *sk = NULL; read_lock(&iso_sk_list.lock); sk_for_each(sk, &iso_sk_list.head) { if (match_sk == sk) continue; /* Look for sockets that have already been * connected to the BIG */ if (sk->sk_state != BT_CONNECTED && sk->sk_state != BT_CONNECT) continue; /* Match Broadcast destination */ if (bacmp(&iso_pi(sk)->dst, dst)) continue; /* Match BIG handle */ if (iso_pi(sk)->qos.bcast.big != big) continue; /* Match source address */ if (bacmp(&iso_pi(sk)->src, src)) continue; sock_hold(sk); break; } read_unlock(&iso_sk_list.lock); return sk; } static void iso_sock_destruct(struct sock *sk) { BT_DBG("sk %p", sk); iso_conn_put(iso_pi(sk)->conn); skb_queue_purge(&sk->sk_receive_queue); skb_queue_purge(&sk->sk_write_queue); } static void iso_sock_cleanup_listen(struct sock *parent) { struct sock *sk; BT_DBG("parent %p", parent); /* Close not yet accepted channels */ while ((sk = bt_accept_dequeue(parent, NULL))) { iso_sock_close(sk); iso_sock_kill(sk); } /* If listening socket has a hcon, properly disconnect it */ if (iso_pi(parent)->conn && iso_pi(parent)->conn->hcon) { iso_sock_disconn(parent); return; } parent->sk_state = BT_CLOSED; sock_set_flag(parent, SOCK_ZAPPED); } /* Kill socket (only if zapped and orphan) * Must be called on unlocked socket. */ static void iso_sock_kill(struct sock *sk) { if (!sock_flag(sk, SOCK_ZAPPED) || sk->sk_socket || sock_flag(sk, SOCK_DEAD)) return; BT_DBG("sk %p state %d", sk, sk->sk_state); /* Kill poor orphan */ bt_sock_unlink(&iso_sk_list, sk); sock_set_flag(sk, SOCK_DEAD); sock_put(sk); } static void iso_sock_disconn(struct sock *sk) { struct sock *bis_sk; struct hci_conn *hcon = iso_pi(sk)->conn->hcon; if (test_bit(HCI_CONN_BIG_CREATED, &hcon->flags)) { bis_sk = iso_get_sock_big(sk, &iso_pi(sk)->src, &iso_pi(sk)->dst, iso_pi(sk)->qos.bcast.big); /* If there are any other connected sockets for the * same BIG, just delete the sk and leave the bis * hcon active, in case later rebinding is needed. */ if (bis_sk) { hcon->state = BT_OPEN; hcon->iso_data = NULL; iso_pi(sk)->conn->hcon = NULL; iso_sock_clear_timer(sk); iso_chan_del(sk, bt_to_errno(hcon->abort_reason)); sock_put(bis_sk); return; } } sk->sk_state = BT_DISCONN; iso_conn_lock(iso_pi(sk)->conn); hci_conn_drop(iso_pi(sk)->conn->hcon); iso_pi(sk)->conn->hcon = NULL; iso_conn_unlock(iso_pi(sk)->conn); } static void __iso_sock_close(struct sock *sk) { BT_DBG("sk %p state %d socket %p", sk, sk->sk_state, sk->sk_socket); switch (sk->sk_state) { case BT_LISTEN: iso_sock_cleanup_listen(sk); break; case BT_CONNECT: case BT_CONNECTED: case BT_CONFIG: if (iso_pi(sk)->conn->hcon) iso_sock_disconn(sk); else iso_chan_del(sk, ECONNRESET); break; case BT_CONNECT2: if (iso_pi(sk)->conn->hcon && (test_bit(HCI_CONN_PA_SYNC, &iso_pi(sk)->conn->hcon->flags) || test_bit(HCI_CONN_PA_SYNC_FAILED, &iso_pi(sk)->conn->hcon->flags))) iso_sock_disconn(sk); else iso_chan_del(sk, ECONNRESET); break; case BT_DISCONN: iso_chan_del(sk, ECONNRESET); break; default: sock_set_flag(sk, SOCK_ZAPPED); break; } } /* Must be called on unlocked socket. */ static void iso_sock_close(struct sock *sk) { iso_sock_clear_timer(sk); lock_sock(sk); __iso_sock_close(sk); release_sock(sk); iso_sock_kill(sk); } static void iso_sock_init(struct sock *sk, struct sock *parent) { BT_DBG("sk %p", sk); if (parent) { sk->sk_type = parent->sk_type; bt_sk(sk)->flags = bt_sk(parent)->flags; security_sk_clone(parent, sk); } } static struct proto iso_proto = { .name = "ISO", .owner = THIS_MODULE, .obj_size = sizeof(struct iso_pinfo) }; #define DEFAULT_IO_QOS \ { \ .interval = 10000u, \ .latency = 10u, \ .sdu = 40u, \ .phy = BT_ISO_PHY_2M, \ .rtn = 2u, \ } static struct bt_iso_qos default_qos = { .bcast = { .big = BT_ISO_QOS_BIG_UNSET, .bis = BT_ISO_QOS_BIS_UNSET, .sync_factor = 0x01, .packing = 0x00, .framing = 0x00, .in = DEFAULT_IO_QOS, .out = DEFAULT_IO_QOS, .encryption = 0x00, .bcode = {0x00}, .options = 0x00, .skip = 0x0000, .sync_timeout = BT_ISO_SYNC_TIMEOUT, .sync_cte_type = 0x00, .mse = 0x00, .timeout = BT_ISO_SYNC_TIMEOUT, }, }; static struct sock *iso_sock_alloc(struct net *net, struct socket *sock, int proto, gfp_t prio, int kern) { struct sock *sk; sk = bt_sock_alloc(net, sock, &iso_proto, proto, prio, kern); if (!sk) return NULL; sk->sk_destruct = iso_sock_destruct; sk->sk_sndtimeo = ISO_CONN_TIMEOUT; /* Set address type as public as default src address is BDADDR_ANY */ iso_pi(sk)->src_type = BDADDR_LE_PUBLIC; iso_pi(sk)->qos = default_qos; iso_pi(sk)->sync_handle = -1; bt_sock_link(&iso_sk_list, sk); return sk; } static int iso_sock_create(struct net *net, struct socket *sock, int protocol, int kern) { struct sock *sk; BT_DBG("sock %p", sock); sock->state = SS_UNCONNECTED; if (sock->type != SOCK_SEQPACKET) return -ESOCKTNOSUPPORT; sock->ops = &iso_sock_ops; sk = iso_sock_alloc(net, sock, protocol, GFP_ATOMIC, kern); if (!sk) return -ENOMEM; iso_sock_init(sk, NULL); return 0; } static int iso_sock_bind_bc(struct socket *sock, struct sockaddr *addr, int addr_len) { struct sockaddr_iso *sa = (struct sockaddr_iso *)addr; struct sock *sk = sock->sk; int i; BT_DBG("sk %p bc_sid %u bc_num_bis %u", sk, sa->iso_bc->bc_sid, sa->iso_bc->bc_num_bis); if (addr_len != sizeof(*sa) + sizeof(*sa->iso_bc)) return -EINVAL; bacpy(&iso_pi(sk)->dst, &sa->iso_bc->bc_bdaddr); /* Check if the address type is of LE type */ if (!bdaddr_type_is_le(sa->iso_bc->bc_bdaddr_type)) return -EINVAL; iso_pi(sk)->dst_type = sa->iso_bc->bc_bdaddr_type; if (sa->iso_bc->bc_sid > 0x0f && sa->iso_bc->bc_sid != HCI_SID_INVALID) return -EINVAL; iso_pi(sk)->bc_sid = sa->iso_bc->bc_sid; if (sa->iso_bc->bc_num_bis > ISO_MAX_NUM_BIS) return -EINVAL; iso_pi(sk)->bc_num_bis = sa->iso_bc->bc_num_bis; for (i = 0; i < iso_pi(sk)->bc_num_bis; i++) if (sa->iso_bc->bc_bis[i] < 0x01 || sa->iso_bc->bc_bis[i] > 0x1f) return -EINVAL; memcpy(iso_pi(sk)->bc_bis, sa->iso_bc->bc_bis, iso_pi(sk)->bc_num_bis); return 0; } static int iso_sock_bind_pa_sk(struct sock *sk, struct sockaddr_iso *sa, int addr_len) { int err = 0; if (sk->sk_type != SOCK_SEQPACKET) { err = -EINVAL; goto done; } if (addr_len != sizeof(*sa) + sizeof(*sa->iso_bc)) { err = -EINVAL; goto done; } if (sa->iso_bc->bc_num_bis > ISO_MAX_NUM_BIS) { err = -EINVAL; goto done; } iso_pi(sk)->bc_num_bis = sa->iso_bc->bc_num_bis; for (int i = 0; i < iso_pi(sk)->bc_num_bis; i++) if (sa->iso_bc->bc_bis[i] < 0x01 || sa->iso_bc->bc_bis[i] > 0x1f) { err = -EINVAL; goto done; } memcpy(iso_pi(sk)->bc_bis, sa->iso_bc->bc_bis, iso_pi(sk)->bc_num_bis); done: return err; } static int iso_sock_bind(struct socket *sock, struct sockaddr *addr, int addr_len) { struct sockaddr_iso *sa = (struct sockaddr_iso *)addr; struct sock *sk = sock->sk; int err = 0; BT_DBG("sk %p %pMR type %u", sk, &sa->iso_bdaddr, sa->iso_bdaddr_type); if (!addr || addr_len < sizeof(struct sockaddr_iso) || addr->sa_family != AF_BLUETOOTH) return -EINVAL; lock_sock(sk); /* Allow the user to bind a PA sync socket to a number * of BISes to sync to. */ if ((sk->sk_state == BT_CONNECT2 || sk->sk_state == BT_CONNECTED) && test_bit(BT_SK_PA_SYNC, &iso_pi(sk)->flags)) { err = iso_sock_bind_pa_sk(sk, sa, addr_len); goto done; } if (sk->sk_state != BT_OPEN) { err = -EBADFD; goto done; } if (sk->sk_type != SOCK_SEQPACKET) { err = -EINVAL; goto done; } /* Check if the address type is of LE type */ if (!bdaddr_type_is_le(sa->iso_bdaddr_type)) { err = -EINVAL; goto done; } bacpy(&iso_pi(sk)->src, &sa->iso_bdaddr); iso_pi(sk)->src_type = sa->iso_bdaddr_type; /* Check for Broadcast address */ if (addr_len > sizeof(*sa)) { err = iso_sock_bind_bc(sock, addr, addr_len); if (err) goto done; } sk->sk_state = BT_BOUND; done: release_sock(sk); return err; } static int iso_sock_connect(struct socket *sock, struct sockaddr *addr, int alen, int flags) { struct sockaddr_iso *sa = (struct sockaddr_iso *)addr; struct sock *sk = sock->sk; int err; BT_DBG("sk %p", sk); if (alen < sizeof(struct sockaddr_iso) || addr->sa_family != AF_BLUETOOTH) return -EINVAL; if (sk->sk_state != BT_OPEN && sk->sk_state != BT_BOUND) return -EBADFD; if (sk->sk_type != SOCK_SEQPACKET) return -EINVAL; /* Check if the address type is of LE type */ if (!bdaddr_type_is_le(sa->iso_bdaddr_type)) return -EINVAL; lock_sock(sk); bacpy(&iso_pi(sk)->dst, &sa->iso_bdaddr); iso_pi(sk)->dst_type = sa->iso_bdaddr_type; release_sock(sk); if (bacmp(&iso_pi(sk)->dst, BDADDR_ANY)) err = iso_connect_cis(sk); else err = iso_connect_bis(sk); if (err) return err; lock_sock(sk); if (!test_bit(BT_SK_DEFER_SETUP, &bt_sk(sk)->flags)) { err = bt_sock_wait_state(sk, BT_CONNECTED, sock_sndtimeo(sk, flags & O_NONBLOCK)); } release_sock(sk); return err; } static int iso_listen_bis(struct sock *sk) { struct hci_dev *hdev; int err = 0; struct iso_conn *conn; struct hci_conn *hcon; BT_DBG("%pMR -> %pMR (SID 0x%2.2x)", &iso_pi(sk)->src, &iso_pi(sk)->dst, iso_pi(sk)->bc_sid); write_lock(&iso_sk_list.lock); if (__iso_get_sock_listen_by_sid(&iso_pi(sk)->src, &iso_pi(sk)->dst, iso_pi(sk)->bc_sid)) err = -EADDRINUSE; write_unlock(&iso_sk_list.lock); if (err) return err; hdev = hci_get_route(&iso_pi(sk)->dst, &iso_pi(sk)->src, iso_pi(sk)->src_type); if (!hdev) return -EHOSTUNREACH; hci_dev_lock(hdev); lock_sock(sk); /* Fail if user set invalid QoS */ if (iso_pi(sk)->qos_user_set && !check_bcast_qos(&iso_pi(sk)->qos)) { iso_pi(sk)->qos = default_qos; err = -EINVAL; goto unlock; } hcon = hci_pa_create_sync(hdev, &iso_pi(sk)->dst, le_addr_type(iso_pi(sk)->dst_type), iso_pi(sk)->bc_sid, &iso_pi(sk)->qos); if (IS_ERR(hcon)) { err = PTR_ERR(hcon); goto unlock; } conn = iso_conn_add(hcon); if (!conn) { hci_conn_drop(hcon); err = -ENOMEM; goto unlock; } err = iso_chan_add(conn, sk, NULL); if (err) { hci_conn_drop(hcon); goto unlock; } unlock: release_sock(sk); hci_dev_unlock(hdev); hci_dev_put(hdev); return err; } static int iso_listen_cis(struct sock *sk) { int err = 0; BT_DBG("%pMR", &iso_pi(sk)->src); write_lock(&iso_sk_list.lock); if (__iso_get_sock_listen_by_addr(&iso_pi(sk)->src, &iso_pi(sk)->dst)) err = -EADDRINUSE; write_unlock(&iso_sk_list.lock); return err; } static int iso_sock_listen(struct socket *sock, int backlog) { struct sock *sk = sock->sk; int err = 0; BT_DBG("sk %p backlog %d", sk, backlog); sock_hold(sk); lock_sock(sk); if (sk->sk_state != BT_BOUND) { err = -EBADFD; goto done; } if (sk->sk_type != SOCK_SEQPACKET) { err = -EINVAL; goto done; } if (!bacmp(&iso_pi(sk)->dst, BDADDR_ANY)) { err = iso_listen_cis(sk); } else { /* Drop sock lock to avoid potential * deadlock with the hdev lock. */ release_sock(sk); err = iso_listen_bis(sk); lock_sock(sk); } if (err) goto done; sk->sk_max_ack_backlog = backlog; sk->sk_ack_backlog = 0; sk->sk_state = BT_LISTEN; done: release_sock(sk); sock_put(sk); return err; } static int iso_sock_accept(struct socket *sock, struct socket *newsock, struct proto_accept_arg *arg) { DEFINE_WAIT_FUNC(wait, woken_wake_function); struct sock *sk = sock->sk, *ch; long timeo; int err = 0; /* Use explicit nested locking to avoid lockdep warnings generated * because the parent socket and the child socket are locked on the * same thread. */ lock_sock_nested(sk, SINGLE_DEPTH_NESTING); timeo = sock_rcvtimeo(sk, arg->flags & O_NONBLOCK); BT_DBG("sk %p timeo %ld", sk, timeo); /* Wait for an incoming connection. (wake-one). */ add_wait_queue_exclusive(sk_sleep(sk), &wait); while (1) { if (sk->sk_state != BT_LISTEN) { err = -EBADFD; break; } ch = bt_accept_dequeue(sk, newsock); if (ch) break; if (!timeo) { err = -EAGAIN; break; } if (signal_pending(current)) { err = sock_intr_errno(timeo); break; } release_sock(sk); timeo = wait_woken(&wait, TASK_INTERRUPTIBLE, timeo); lock_sock_nested(sk, SINGLE_DEPTH_NESTING); } remove_wait_queue(sk_sleep(sk), &wait); if (err) goto done; newsock->state = SS_CONNECTED; BT_DBG("new socket %p", ch); /* A Broadcast Sink might require BIG sync to be terminated * and re-established multiple times, while keeping the same * PA sync handle active. To allow this, once all BIS * connections have been accepted on a PA sync parent socket, * "reset" socket state, to allow future BIG re-sync procedures. */ if (test_bit(BT_SK_PA_SYNC, &iso_pi(sk)->flags)) { /* Iterate through the list of bound BIS indices * and clear each BIS as they are accepted by the * user space, one by one. */ for (int i = 0; i < iso_pi(sk)->bc_num_bis; i++) { if (iso_pi(sk)->bc_bis[i] > 0) { iso_pi(sk)->bc_bis[i] = 0; iso_pi(sk)->bc_num_bis--; break; } } if (iso_pi(sk)->bc_num_bis == 0) { /* Once the last BIS was accepted, reset parent * socket parameters to mark that the listening * process for BIS connections has been completed: * * 1. Reset the DEFER setup flag on the parent sk. * 2. Clear the flag marking that the BIG create * sync command is pending. * 3. Transition socket state from BT_LISTEN to * BT_CONNECTED. */ set_bit(BT_SK_DEFER_SETUP, &bt_sk(sk)->flags); clear_bit(BT_SK_BIG_SYNC, &iso_pi(sk)->flags); sk->sk_state = BT_CONNECTED; } } done: release_sock(sk); return err; } static int iso_sock_getname(struct socket *sock, struct sockaddr *addr, int peer) { struct sockaddr_iso *sa = (struct sockaddr_iso *)addr; struct sock *sk = sock->sk; int len = sizeof(struct sockaddr_iso); BT_DBG("sock %p, sk %p", sock, sk); addr->sa_family = AF_BLUETOOTH; if (peer) { struct hci_conn *hcon = iso_pi(sk)->conn ? iso_pi(sk)->conn->hcon : NULL; bacpy(&sa->iso_bdaddr, &iso_pi(sk)->dst); sa->iso_bdaddr_type = iso_pi(sk)->dst_type; if (hcon && hcon->type == BIS_LINK) { sa->iso_bc->bc_sid = iso_pi(sk)->bc_sid; sa->iso_bc->bc_num_bis = iso_pi(sk)->bc_num_bis; memcpy(sa->iso_bc->bc_bis, iso_pi(sk)->bc_bis, ISO_MAX_NUM_BIS); len += sizeof(struct sockaddr_iso_bc); } } else { bacpy(&sa->iso_bdaddr, &iso_pi(sk)->src); sa->iso_bdaddr_type = iso_pi(sk)->src_type; } return len; } static int iso_sock_sendmsg(struct socket *sock, struct msghdr *msg, size_t len) { struct sock *sk = sock->sk; struct sk_buff *skb, **frag; struct sockcm_cookie sockc; size_t mtu; int err; BT_DBG("sock %p, sk %p", sock, sk); err = sock_error(sk); if (err) return err; if (msg->msg_flags & MSG_OOB) return -EOPNOTSUPP; hci_sockcm_init(&sockc, sk); if (msg->msg_controllen) { err = sock_cmsg_send(sk, msg, &sockc); if (err) return err; } lock_sock(sk); if (sk->sk_state != BT_CONNECTED) { release_sock(sk); return -ENOTCONN; } mtu = iso_pi(sk)->conn->hcon->mtu; release_sock(sk); skb = bt_skb_sendmsg(sk, msg, len, mtu, HCI_ISO_DATA_HDR_SIZE, 0); if (IS_ERR(skb)) return PTR_ERR(skb); len -= skb->len; BT_DBG("skb %p len %d", sk, skb->len); /* Continuation fragments */ frag = &skb_shinfo(skb)->frag_list; while (len) { struct sk_buff *tmp; tmp = bt_skb_sendmsg(sk, msg, len, mtu, 0, 0); if (IS_ERR(tmp)) { kfree_skb(skb); return PTR_ERR(tmp); } *frag = tmp; len -= tmp->len; skb->len += tmp->len; skb->data_len += tmp->len; BT_DBG("frag %p len %d", *frag, tmp->len); frag = &(*frag)->next; } lock_sock(sk); if (sk->sk_state == BT_CONNECTED) err = iso_send_frame(sk, skb, &sockc); else err = -ENOTCONN; release_sock(sk); if (err < 0) kfree_skb(skb); return err; } static void iso_conn_defer_accept(struct hci_conn *conn) { struct hci_cp_le_accept_cis cp; struct hci_dev *hdev = conn->hdev; BT_DBG("conn %p", conn); conn->state = BT_CONFIG; cp.handle = cpu_to_le16(conn->handle); hci_send_cmd(hdev, HCI_OP_LE_ACCEPT_CIS, sizeof(cp), &cp); } static void iso_conn_big_sync(struct sock *sk) { int err; struct hci_dev *hdev; hdev = hci_get_route(&iso_pi(sk)->dst, &iso_pi(sk)->src, iso_pi(sk)->src_type); if (!hdev) return; /* hci_le_big_create_sync requires hdev lock to be held, since * it enqueues the HCI LE BIG Create Sync command via * hci_cmd_sync_queue_once, which checks hdev flags that might * change. */ hci_dev_lock(hdev); lock_sock(sk); if (!test_and_set_bit(BT_SK_BIG_SYNC, &iso_pi(sk)->flags)) { err = hci_conn_big_create_sync(hdev, iso_pi(sk)->conn->hcon, &iso_pi(sk)->qos, iso_pi(sk)->sync_handle, iso_pi(sk)->bc_num_bis, iso_pi(sk)->bc_bis); if (err) bt_dev_err(hdev, "hci_big_create_sync: %d", err); } release_sock(sk); hci_dev_unlock(hdev); } static int iso_sock_recvmsg(struct socket *sock, struct msghdr *msg, size_t len, int flags) { struct sock *sk = sock->sk; struct iso_pinfo *pi = iso_pi(sk); bool early_ret = false; int err = 0; BT_DBG("sk %p", sk); if (unlikely(flags & MSG_ERRQUEUE)) return sock_recv_errqueue(sk, msg, len, SOL_BLUETOOTH, BT_SCM_ERROR); if (test_and_clear_bit(BT_SK_DEFER_SETUP, &bt_sk(sk)->flags)) { sock_hold(sk); lock_sock(sk); switch (sk->sk_state) { case BT_CONNECT2: if (test_bit(BT_SK_PA_SYNC, &pi->flags)) { release_sock(sk); iso_conn_big_sync(sk); lock_sock(sk); sk->sk_state = BT_LISTEN; } else { iso_conn_defer_accept(pi->conn->hcon); sk->sk_state = BT_CONFIG; } early_ret = true; break; case BT_CONNECTED: if (test_bit(BT_SK_PA_SYNC, &iso_pi(sk)->flags)) { release_sock(sk); iso_conn_big_sync(sk); lock_sock(sk); sk->sk_state = BT_LISTEN; early_ret = true; } break; case BT_CONNECT: release_sock(sk); err = iso_connect_cis(sk); lock_sock(sk); early_ret = true; break; default: break; } release_sock(sk); sock_put(sk); if (early_ret) return err; } return bt_sock_recvmsg(sock, msg, len, flags); } static bool check_io_qos(struct bt_iso_io_qos *qos) { /* If no PHY is enable SDU must be 0 */ if (!qos->phy && qos->sdu) return false; if (qos->interval && (qos->interval < 0xff || qos->interval > 0xfffff)) return false; if (qos->latency && (qos->latency < 0x05 || qos->latency > 0xfa0)) return false; if (qos->phy > BT_ISO_PHY_ANY) return false; return true; } static bool check_ucast_qos(struct bt_iso_qos *qos) { if (qos->ucast.cig > 0xef && qos->ucast.cig != BT_ISO_QOS_CIG_UNSET) return false; if (qos->ucast.cis > 0xef && qos->ucast.cis != BT_ISO_QOS_CIS_UNSET) return false; if (qos->ucast.sca > 0x07) return false; if (qos->ucast.packing > 0x01) return false; if (qos->ucast.framing > 0x01) return false; if (!check_io_qos(&qos->ucast.in)) return false; if (!check_io_qos(&qos->ucast.out)) return false; return true; } static bool check_bcast_qos(struct bt_iso_qos *qos) { if (!qos->bcast.sync_factor) qos->bcast.sync_factor = 0x01; if (qos->bcast.packing > 0x01) return false; if (qos->bcast.framing > 0x01) return false; if (!check_io_qos(&qos->bcast.in)) return false; if (!check_io_qos(&qos->bcast.out)) return false; if (qos->bcast.encryption > 0x01) return false; if (qos->bcast.options > 0x07) return false; if (qos->bcast.skip > 0x01f3) return false; if (!qos->bcast.sync_timeout) qos->bcast.sync_timeout = BT_ISO_SYNC_TIMEOUT; if (qos->bcast.sync_timeout < 0x000a || qos->bcast.sync_timeout > 0x4000) return false; if (qos->bcast.sync_cte_type > 0x1f) return false; if (qos->bcast.mse > 0x1f) return false; if (!qos->bcast.timeout) qos->bcast.sync_timeout = BT_ISO_SYNC_TIMEOUT; if (qos->bcast.timeout < 0x000a || qos->bcast.timeout > 0x4000) return false; return true; } static int iso_sock_setsockopt(struct socket *sock, int level, int optname, sockptr_t optval, unsigned int optlen) { struct sock *sk = sock->sk; int err = 0; struct bt_iso_qos qos = default_qos; u32 opt; BT_DBG("sk %p", sk); lock_sock(sk); switch (optname) { case BT_DEFER_SETUP: if (sk->sk_state != BT_BOUND && sk->sk_state != BT_LISTEN) { err = -EINVAL; break; } err = copy_safe_from_sockptr(&opt, sizeof(opt), optval, optlen); if (err) break; if (opt) set_bit(BT_SK_DEFER_SETUP, &bt_sk(sk)->flags); else clear_bit(BT_SK_DEFER_SETUP, &bt_sk(sk)->flags); break; case BT_PKT_STATUS: err = copy_safe_from_sockptr(&opt, sizeof(opt), optval, optlen); if (err) break; if (opt) set_bit(BT_SK_PKT_STATUS, &bt_sk(sk)->flags); else clear_bit(BT_SK_PKT_STATUS, &bt_sk(sk)->flags); break; case BT_ISO_QOS: if (sk->sk_state != BT_OPEN && sk->sk_state != BT_BOUND && sk->sk_state != BT_CONNECT2 && (!test_bit(BT_SK_PA_SYNC, &iso_pi(sk)->flags) || sk->sk_state != BT_CONNECTED)) { err = -EINVAL; break; } err = copy_safe_from_sockptr(&qos, sizeof(qos), optval, optlen); if (err) break; iso_pi(sk)->qos = qos; iso_pi(sk)->qos_user_set = true; break; case BT_ISO_BASE: if (sk->sk_state != BT_OPEN && sk->sk_state != BT_BOUND && sk->sk_state != BT_CONNECT2) { err = -EINVAL; break; } if (optlen > sizeof(iso_pi(sk)->base)) { err = -EINVAL; break; } err = copy_safe_from_sockptr(iso_pi(sk)->base, optlen, optval, optlen); if (err) break; iso_pi(sk)->base_len = optlen; break; default: err = -ENOPROTOOPT; break; } release_sock(sk); return err; } static int iso_sock_getsockopt(struct socket *sock, int level, int optname, char __user *optval, int __user *optlen) { struct sock *sk = sock->sk; int len, err = 0; struct bt_iso_qos *qos; u8 base_len; u8 *base; BT_DBG("sk %p", sk); if (get_user(len, optlen)) return -EFAULT; lock_sock(sk); switch (optname) { case BT_DEFER_SETUP: if (sk->sk_state == BT_CONNECTED) { err = -EINVAL; break; } if (put_user(test_bit(BT_SK_DEFER_SETUP, &bt_sk(sk)->flags), (u32 __user *)optval)) err = -EFAULT; break; case BT_PKT_STATUS: if (put_user(test_bit(BT_SK_PKT_STATUS, &bt_sk(sk)->flags), (int __user *)optval)) err = -EFAULT; break; case BT_ISO_QOS: qos = iso_sock_get_qos(sk); len = min_t(unsigned int, len, sizeof(*qos)); if (copy_to_user(optval, qos, len)) err = -EFAULT; break; case BT_ISO_BASE: if (sk->sk_state == BT_CONNECTED && !bacmp(&iso_pi(sk)->dst, BDADDR_ANY)) { base_len = iso_pi(sk)->conn->hcon->le_per_adv_data_len; base = iso_pi(sk)->conn->hcon->le_per_adv_data; } else { base_len = iso_pi(sk)->base_len; base = iso_pi(sk)->base; } len = min_t(unsigned int, len, base_len); if (copy_to_user(optval, base, len)) err = -EFAULT; if (put_user(len, optlen)) err = -EFAULT; break; default: err = -ENOPROTOOPT; break; } release_sock(sk); return err; } static int iso_sock_shutdown(struct socket *sock, int how) { struct sock *sk = sock->sk; int err = 0; BT_DBG("sock %p, sk %p, how %d", sock, sk, how); if (!sk) return 0; sock_hold(sk); lock_sock(sk); switch (how) { case SHUT_RD: if (sk->sk_shutdown & RCV_SHUTDOWN) goto unlock; sk->sk_shutdown |= RCV_SHUTDOWN; break; case SHUT_WR: if (sk->sk_shutdown & SEND_SHUTDOWN) goto unlock; sk->sk_shutdown |= SEND_SHUTDOWN; break; case SHUT_RDWR: if (sk->sk_shutdown & SHUTDOWN_MASK) goto unlock; sk->sk_shutdown |= SHUTDOWN_MASK; break; } iso_sock_clear_timer(sk); __iso_sock_close(sk); if (sock_flag(sk, SOCK_LINGER) && sk->sk_lingertime && !(current->flags & PF_EXITING)) err = bt_sock_wait_state(sk, BT_CLOSED, sk->sk_lingertime); unlock: release_sock(sk); sock_put(sk); return err; } static int iso_sock_release(struct socket *sock) { struct sock *sk = sock->sk; int err = 0; BT_DBG("sock %p, sk %p", sock, sk); if (!sk) return 0; iso_sock_close(sk); if (sock_flag(sk, SOCK_LINGER) && READ_ONCE(sk->sk_lingertime) && !(current->flags & PF_EXITING)) { lock_sock(sk); err = bt_sock_wait_state(sk, BT_CLOSED, sk->sk_lingertime); release_sock(sk); } sock_orphan(sk); iso_sock_kill(sk); return err; } static void iso_sock_ready(struct sock *sk) { BT_DBG("sk %p", sk); if (!sk) return; lock_sock(sk); iso_sock_clear_timer(sk); sk->sk_state = BT_CONNECTED; sk->sk_state_change(sk); release_sock(sk); } static bool iso_match_big(struct sock *sk, void *data) { struct hci_evt_le_big_sync_estabilished *ev = data; return ev->handle == iso_pi(sk)->qos.bcast.big; } static bool iso_match_big_hcon(struct sock *sk, void *data) { struct hci_conn *hcon = data; return hcon->iso_qos.bcast.big == iso_pi(sk)->qos.bcast.big; } static bool iso_match_pa_sync_flag(struct sock *sk, void *data) { return test_bit(BT_SK_PA_SYNC, &iso_pi(sk)->flags); } static void iso_conn_ready(struct iso_conn *conn) { struct sock *parent = NULL; struct sock *sk = conn->sk; struct hci_ev_le_big_sync_estabilished *ev = NULL; struct hci_ev_le_pa_sync_established *ev2 = NULL; struct hci_ev_le_per_adv_report *ev3 = NULL; struct hci_conn *hcon; BT_DBG("conn %p", conn); if (sk) { iso_sock_ready(conn->sk); } else { hcon = conn->hcon; if (!hcon) return; if (test_bit(HCI_CONN_BIG_SYNC, &hcon->flags)) { /* A BIS slave hcon is notified to the ISO layer * after the Command Complete for the LE Setup * ISO Data Path command is received. Get the * parent socket that matches the hcon BIG handle. */ parent = iso_get_sock(&hcon->src, &hcon->dst, BT_LISTEN, iso_match_big_hcon, hcon); } else if (test_bit(HCI_CONN_BIG_SYNC_FAILED, &hcon->flags)) { ev = hci_recv_event_data(hcon->hdev, HCI_EVT_LE_BIG_SYNC_ESTABLISHED); /* Get reference to PA sync parent socket, if it exists */ parent = iso_get_sock(&hcon->src, &hcon->dst, BT_LISTEN, iso_match_pa_sync_flag, NULL); if (!parent && ev) parent = iso_get_sock(&hcon->src, &hcon->dst, BT_LISTEN, iso_match_big, ev); } else if (test_bit(HCI_CONN_PA_SYNC_FAILED, &hcon->flags)) { ev2 = hci_recv_event_data(hcon->hdev, HCI_EV_LE_PA_SYNC_ESTABLISHED); if (ev2) parent = iso_get_sock(&hcon->src, &hcon->dst, BT_LISTEN, iso_match_sid, ev2); } else if (test_bit(HCI_CONN_PA_SYNC, &hcon->flags)) { ev3 = hci_recv_event_data(hcon->hdev, HCI_EV_LE_PER_ADV_REPORT); if (ev3) parent = iso_get_sock(&hcon->src, &hcon->dst, BT_LISTEN, iso_match_sync_handle_pa_report, ev3); } if (!parent) parent = iso_get_sock(&hcon->src, BDADDR_ANY, BT_LISTEN, NULL, NULL); if (!parent) return; lock_sock(parent); sk = iso_sock_alloc(sock_net(parent), NULL, BTPROTO_ISO, GFP_ATOMIC, 0); if (!sk) { release_sock(parent); return; } iso_sock_init(sk, parent); bacpy(&iso_pi(sk)->src, &hcon->src); /* Convert from HCI to three-value type */ if (hcon->src_type == ADDR_LE_DEV_PUBLIC) iso_pi(sk)->src_type = BDADDR_LE_PUBLIC; else iso_pi(sk)->src_type = BDADDR_LE_RANDOM; /* If hcon has no destination address (BDADDR_ANY) it means it * was created by HCI_EV_LE_BIG_SYNC_ESTABILISHED or * HCI_EV_LE_PA_SYNC_ESTABLISHED so we need to initialize using * the parent socket destination address. */ if (!bacmp(&hcon->dst, BDADDR_ANY)) { bacpy(&hcon->dst, &iso_pi(parent)->dst); hcon->dst_type = iso_pi(parent)->dst_type; } if (test_bit(HCI_CONN_PA_SYNC, &hcon->flags)) { iso_pi(sk)->qos = iso_pi(parent)->qos; hcon->iso_qos = iso_pi(sk)->qos; iso_pi(sk)->bc_sid = iso_pi(parent)->bc_sid; iso_pi(sk)->bc_num_bis = iso_pi(parent)->bc_num_bis; memcpy(iso_pi(sk)->bc_bis, iso_pi(parent)->bc_bis, ISO_MAX_NUM_BIS); set_bit(BT_SK_PA_SYNC, &iso_pi(sk)->flags); } bacpy(&iso_pi(sk)->dst, &hcon->dst); iso_pi(sk)->dst_type = hcon->dst_type; iso_pi(sk)->sync_handle = iso_pi(parent)->sync_handle; memcpy(iso_pi(sk)->base, iso_pi(parent)->base, iso_pi(parent)->base_len); iso_pi(sk)->base_len = iso_pi(parent)->base_len; hci_conn_hold(hcon); iso_chan_add(conn, sk, parent); if ((ev && ((struct hci_evt_le_big_sync_estabilished *)ev)->status) || (ev2 && ev2->status)) { /* Trigger error signal on child socket */ sk->sk_err = ECONNREFUSED; sk->sk_error_report(sk); } if (test_bit(BT_SK_DEFER_SETUP, &bt_sk(parent)->flags)) sk->sk_state = BT_CONNECT2; else sk->sk_state = BT_CONNECTED; /* Wake up parent */ parent->sk_data_ready(parent); release_sock(parent); sock_put(parent); } } static bool iso_match_sid(struct sock *sk, void *data) { struct hci_ev_le_pa_sync_established *ev = data; if (iso_pi(sk)->bc_sid == HCI_SID_INVALID) return true; return ev->sid == iso_pi(sk)->bc_sid; } static bool iso_match_sync_handle(struct sock *sk, void *data) { struct hci_evt_le_big_info_adv_report *ev = data; return le16_to_cpu(ev->sync_handle) == iso_pi(sk)->sync_handle; } static bool iso_match_sync_handle_pa_report(struct sock *sk, void *data) { struct hci_ev_le_per_adv_report *ev = data; return le16_to_cpu(ev->sync_handle) == iso_pi(sk)->sync_handle; } /* ----- ISO interface with lower layer (HCI) ----- */ int iso_connect_ind(struct hci_dev *hdev, bdaddr_t *bdaddr, __u8 *flags) { struct hci_ev_le_pa_sync_established *ev1; struct hci_evt_le_big_info_adv_report *ev2; struct hci_ev_le_per_adv_report *ev3; struct sock *sk; bt_dev_dbg(hdev, "bdaddr %pMR", bdaddr); /* Broadcast receiver requires handling of some events before it can * proceed to establishing a BIG sync: * * 1. HCI_EV_LE_PA_SYNC_ESTABLISHED: The socket may specify a specific * SID to listen to and once sync is estabilished its handle needs to * be stored in iso_pi(sk)->sync_handle so it can be matched once * receiving the BIG Info. * 2. HCI_EVT_LE_BIG_INFO_ADV_REPORT: When connect_ind is triggered by a * a BIG Info it attempts to check if there any listening socket with * the same sync_handle and if it does then attempt to create a sync. * 3. HCI_EV_LE_PER_ADV_REPORT: When a PA report is received, it is stored * in iso_pi(sk)->base so it can be passed up to user, in the case of a * broadcast sink. */ ev1 = hci_recv_event_data(hdev, HCI_EV_LE_PA_SYNC_ESTABLISHED); if (ev1) { sk = iso_get_sock(&hdev->bdaddr, bdaddr, BT_LISTEN, iso_match_sid, ev1); if (sk && !ev1->status) { iso_pi(sk)->sync_handle = le16_to_cpu(ev1->handle); iso_pi(sk)->bc_sid = ev1->sid; } goto done; } ev2 = hci_recv_event_data(hdev, HCI_EVT_LE_BIG_INFO_ADV_REPORT); if (ev2) { /* Check if BIGInfo report has already been handled */ sk = iso_get_sock(&hdev->bdaddr, bdaddr, BT_CONNECTED, iso_match_sync_handle, ev2); if (sk) { sock_put(sk); sk = NULL; goto done; } /* Try to get PA sync socket, if it exists */ sk = iso_get_sock(&hdev->bdaddr, bdaddr, BT_CONNECT2, iso_match_sync_handle, ev2); if (!sk) sk = iso_get_sock(&hdev->bdaddr, bdaddr, BT_LISTEN, iso_match_sync_handle, ev2); if (sk) { int err; struct hci_conn *hcon = iso_pi(sk)->conn->hcon; iso_pi(sk)->qos.bcast.encryption = ev2->encryption; if (ev2->num_bis < iso_pi(sk)->bc_num_bis) iso_pi(sk)->bc_num_bis = ev2->num_bis; if (!test_bit(BT_SK_DEFER_SETUP, &bt_sk(sk)->flags) && !test_and_set_bit(BT_SK_BIG_SYNC, &iso_pi(sk)->flags)) { err = hci_conn_big_create_sync(hdev, hcon, &iso_pi(sk)->qos, iso_pi(sk)->sync_handle, iso_pi(sk)->bc_num_bis, iso_pi(sk)->bc_bis); if (err) { bt_dev_err(hdev, "hci_le_big_create_sync: %d", err); sock_put(sk); sk = NULL; } } } goto done; } ev3 = hci_recv_event_data(hdev, HCI_EV_LE_PER_ADV_REPORT); if (ev3) { size_t base_len = 0; u8 *base; struct hci_conn *hcon; sk = iso_get_sock(&hdev->bdaddr, bdaddr, BT_LISTEN, iso_match_sync_handle_pa_report, ev3); if (!sk) goto done; hcon = iso_pi(sk)->conn->hcon; if (!hcon) goto done; if (ev3->data_status == LE_PA_DATA_TRUNCATED) { /* The controller was unable to retrieve PA data. */ memset(hcon->le_per_adv_data, 0, HCI_MAX_PER_AD_TOT_LEN); hcon->le_per_adv_data_len = 0; hcon->le_per_adv_data_offset = 0; goto done; } if (hcon->le_per_adv_data_offset + ev3->length > HCI_MAX_PER_AD_TOT_LEN) goto done; memcpy(hcon->le_per_adv_data + hcon->le_per_adv_data_offset, ev3->data, ev3->length); hcon->le_per_adv_data_offset += ev3->length; if (ev3->data_status == LE_PA_DATA_COMPLETE) { /* All PA data has been received. */ hcon->le_per_adv_data_len = hcon->le_per_adv_data_offset; hcon->le_per_adv_data_offset = 0; /* Extract BASE */ base = eir_get_service_data(hcon->le_per_adv_data, hcon->le_per_adv_data_len, EIR_BAA_SERVICE_UUID, &base_len); if (!base || base_len > BASE_MAX_LENGTH) goto done; memcpy(iso_pi(sk)->base, base, base_len); iso_pi(sk)->base_len = base_len; } else { /* This is a PA data fragment. Keep pa_data_len set to 0 * until all data has been reassembled. */ hcon->le_per_adv_data_len = 0; } } else { sk = iso_get_sock(&hdev->bdaddr, BDADDR_ANY, BT_LISTEN, NULL, NULL); } done: if (!sk) return 0; if (test_bit(BT_SK_DEFER_SETUP, &bt_sk(sk)->flags)) *flags |= HCI_PROTO_DEFER; sock_put(sk); return HCI_LM_ACCEPT; } static void iso_connect_cfm(struct hci_conn *hcon, __u8 status) { if (hcon->type != CIS_LINK && hcon->type != BIS_LINK) { if (hcon->type != LE_LINK) return; /* Check if LE link has failed */ if (status) { struct hci_link *link, *t; list_for_each_entry_safe(link, t, &hcon->link_list, list) iso_conn_del(link->conn, bt_to_errno(status)); return; } /* Create CIS if pending */ hci_le_create_cis_pending(hcon->hdev); return; } BT_DBG("hcon %p bdaddr %pMR status %d", hcon, &hcon->dst, status); /* Similar to the success case, if HCI_CONN_BIG_SYNC_FAILED or * HCI_CONN_PA_SYNC_FAILED is set, queue the failed connection * into the accept queue of the listening socket and wake up * userspace, to inform the user about the event. */ if (!status || test_bit(HCI_CONN_BIG_SYNC_FAILED, &hcon->flags) || test_bit(HCI_CONN_PA_SYNC_FAILED, &hcon->flags)) { struct iso_conn *conn; conn = iso_conn_add(hcon); if (conn) iso_conn_ready(conn); } else { iso_conn_del(hcon, bt_to_errno(status)); } } static void iso_disconn_cfm(struct hci_conn *hcon, __u8 reason) { if (hcon->type != CIS_LINK && hcon->type != BIS_LINK) return; BT_DBG("hcon %p reason %d", hcon, reason); iso_conn_del(hcon, bt_to_errno(reason)); } void iso_recv(struct hci_conn *hcon, struct sk_buff *skb, u16 flags) { struct iso_conn *conn = hcon->iso_data; __u16 pb, ts, len; if (!conn) goto drop; pb = hci_iso_flags_pb(flags); ts = hci_iso_flags_ts(flags); BT_DBG("conn %p len %d pb 0x%x ts 0x%x", conn, skb->len, pb, ts); switch (pb) { case ISO_START: case ISO_SINGLE: if (conn->rx_len) { BT_ERR("Unexpected start frame (len %d)", skb->len); kfree_skb(conn->rx_skb); conn->rx_skb = NULL; conn->rx_len = 0; } if (ts) { struct hci_iso_ts_data_hdr *hdr; /* TODO: add timestamp to the packet? */ hdr = skb_pull_data(skb, HCI_ISO_TS_DATA_HDR_SIZE); if (!hdr) { BT_ERR("Frame is too short (len %d)", skb->len); goto drop; } len = __le16_to_cpu(hdr->slen); } else { struct hci_iso_data_hdr *hdr; hdr = skb_pull_data(skb, HCI_ISO_DATA_HDR_SIZE); if (!hdr) { BT_ERR("Frame is too short (len %d)", skb->len); goto drop; } len = __le16_to_cpu(hdr->slen); } flags = hci_iso_data_flags(len); len = hci_iso_data_len(len); BT_DBG("Start: total len %d, frag len %d flags 0x%4.4x", len, skb->len, flags); if (len == skb->len) { /* Complete frame received */ hci_skb_pkt_status(skb) = flags & 0x03; iso_recv_frame(conn, skb); return; } if (pb == ISO_SINGLE) { BT_ERR("Frame malformed (len %d, expected len %d)", skb->len, len); goto drop; } if (skb->len > len) { BT_ERR("Frame is too long (len %d, expected len %d)", skb->len, len); goto drop; } /* Allocate skb for the complete frame (with header) */ conn->rx_skb = bt_skb_alloc(len, GFP_KERNEL); if (!conn->rx_skb) goto drop; hci_skb_pkt_status(conn->rx_skb) = flags & 0x03; skb_copy_from_linear_data(skb, skb_put(conn->rx_skb, skb->len), skb->len); conn->rx_len = len - skb->len; break; case ISO_CONT: BT_DBG("Cont: frag len %d (expecting %d)", skb->len, conn->rx_len); if (!conn->rx_len) { BT_ERR("Unexpected continuation frame (len %d)", skb->len); goto drop; } if (skb->len > conn->rx_len) { BT_ERR("Fragment is too long (len %d, expected %d)", skb->len, conn->rx_len); kfree_skb(conn->rx_skb); conn->rx_skb = NULL; conn->rx_len = 0; goto drop; } skb_copy_from_linear_data(skb, skb_put(conn->rx_skb, skb->len), skb->len); conn->rx_len -= skb->len; return; case ISO_END: skb_copy_from_linear_data(skb, skb_put(conn->rx_skb, skb->len), skb->len); conn->rx_len -= skb->len; if (!conn->rx_len) { struct sk_buff *rx_skb = conn->rx_skb; /* Complete frame received. iso_recv_frame * takes ownership of the skb so set the global * rx_skb pointer to NULL first. */ conn->rx_skb = NULL; iso_recv_frame(conn, rx_skb); } break; } drop: kfree_skb(skb); } static struct hci_cb iso_cb = { .name = "ISO", .connect_cfm = iso_connect_cfm, .disconn_cfm = iso_disconn_cfm, }; static int iso_debugfs_show(struct seq_file *f, void *p) { struct sock *sk; read_lock(&iso_sk_list.lock); sk_for_each(sk, &iso_sk_list.head) { seq_printf(f, "%pMR %pMR %d\n", &iso_pi(sk)->src, &iso_pi(sk)->dst, sk->sk_state); } read_unlock(&iso_sk_list.lock); return 0; } DEFINE_SHOW_ATTRIBUTE(iso_debugfs); static struct dentry *iso_debugfs; static const struct proto_ops iso_sock_ops = { .family = PF_BLUETOOTH, .owner = THIS_MODULE, .release = iso_sock_release, .bind = iso_sock_bind, .connect = iso_sock_connect, .listen = iso_sock_listen, .accept = iso_sock_accept, .getname = iso_sock_getname, .sendmsg = iso_sock_sendmsg, .recvmsg = iso_sock_recvmsg, .poll = bt_sock_poll, .ioctl = bt_sock_ioctl, .mmap = sock_no_mmap, .socketpair = sock_no_socketpair, .shutdown = iso_sock_shutdown, .setsockopt = iso_sock_setsockopt, .getsockopt = iso_sock_getsockopt }; static const struct net_proto_family iso_sock_family_ops = { .family = PF_BLUETOOTH, .owner = THIS_MODULE, .create = iso_sock_create, }; static bool iso_inited; bool iso_enabled(void) { return iso_inited; } int iso_init(void) { int err; BUILD_BUG_ON(sizeof(struct sockaddr_iso) > sizeof(struct sockaddr)); if (iso_inited) return -EALREADY; err = proto_register(&iso_proto, 0); if (err < 0) return err; err = bt_sock_register(BTPROTO_ISO, &iso_sock_family_ops); if (err < 0) { BT_ERR("ISO socket registration failed"); goto error; } err = bt_procfs_init(&init_net, "iso", &iso_sk_list, NULL); if (err < 0) { BT_ERR("Failed to create ISO proc file"); bt_sock_unregister(BTPROTO_ISO); goto error; } BT_INFO("ISO socket layer initialized"); hci_register_cb(&iso_cb); if (!IS_ERR_OR_NULL(bt_debugfs)) iso_debugfs = debugfs_create_file("iso", 0444, bt_debugfs, NULL, &iso_debugfs_fops); iso_inited = true; return 0; error: proto_unregister(&iso_proto); return err; } int iso_exit(void) { if (!iso_inited) return -EALREADY; bt_procfs_cleanup(&init_net, "iso"); debugfs_remove(iso_debugfs); iso_debugfs = NULL; hci_unregister_cb(&iso_cb); bt_sock_unregister(BTPROTO_ISO); proto_unregister(&iso_proto); iso_inited = false; return 0; } |
| 5 5 6 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 | /* SPDX-License-Identifier: GPL-2.0-or-later */ #ifndef _LINUX_RSTREASON_H #define _LINUX_RSTREASON_H #include <net/dropreason-core.h> #include <uapi/linux/mptcp.h> #define DEFINE_RST_REASON(FN, FNe) \ FN(NOT_SPECIFIED) \ FN(NO_SOCKET) \ FN(TCP_INVALID_ACK_SEQUENCE) \ FN(TCP_RFC7323_PAWS) \ FN(TCP_TOO_OLD_ACK) \ FN(TCP_ACK_UNSENT_DATA) \ FN(TCP_FLAGS) \ FN(TCP_OLD_ACK) \ FN(TCP_ABORT_ON_DATA) \ FN(TCP_TIMEWAIT_SOCKET) \ FN(INVALID_SYN) \ FN(TCP_ABORT_ON_CLOSE) \ FN(TCP_ABORT_ON_LINGER) \ FN(TCP_ABORT_ON_MEMORY) \ FN(TCP_STATE) \ FN(TCP_KEEPALIVE_TIMEOUT) \ FN(TCP_DISCONNECT_WITH_DATA) \ FN(MPTCP_RST_EUNSPEC) \ FN(MPTCP_RST_EMPTCP) \ FN(MPTCP_RST_ERESOURCE) \ FN(MPTCP_RST_EPROHIBIT) \ FN(MPTCP_RST_EWQ2BIG) \ FN(MPTCP_RST_EBADPERF) \ FN(MPTCP_RST_EMIDDLEBOX) \ FN(ERROR) \ FNe(MAX) /** * enum sk_rst_reason - the reasons of socket reset * * The reasons of sk reset, which are used in TCP/MPTCP protocols. * * There are three parts in order: * 1) skb drop reasons: relying on drop reasons for such as passive reset * 2) independent reset reasons: such as active reset reasons * 3) reset reasons in MPTCP: only for MPTCP use */ enum sk_rst_reason { /* Refer to include/net/dropreason-core.h * Rely on skb drop reasons because it indicates exactly why RST * could happen. */ /** @SK_RST_REASON_NOT_SPECIFIED: reset reason is not specified */ SK_RST_REASON_NOT_SPECIFIED, /** @SK_RST_REASON_NO_SOCKET: no valid socket that can be used */ SK_RST_REASON_NO_SOCKET, /** * @SK_RST_REASON_TCP_INVALID_ACK_SEQUENCE: Not acceptable ACK SEQ * field because ack sequence is not in the window between snd_una * and snd_nxt */ SK_RST_REASON_TCP_INVALID_ACK_SEQUENCE, /** * @SK_RST_REASON_TCP_RFC7323_PAWS: PAWS check, corresponding to * LINUX_MIB_PAWSESTABREJECTED, LINUX_MIB_PAWSACTIVEREJECTED */ SK_RST_REASON_TCP_RFC7323_PAWS, /** @SK_RST_REASON_TCP_TOO_OLD_ACK: TCP ACK is too old */ SK_RST_REASON_TCP_TOO_OLD_ACK, /** * @SK_RST_REASON_TCP_ACK_UNSENT_DATA: TCP ACK for data we haven't * sent yet */ SK_RST_REASON_TCP_ACK_UNSENT_DATA, /** @SK_RST_REASON_TCP_FLAGS: TCP flags invalid */ SK_RST_REASON_TCP_FLAGS, /** @SK_RST_REASON_TCP_OLD_ACK: TCP ACK is old, but in window */ SK_RST_REASON_TCP_OLD_ACK, /** * @SK_RST_REASON_TCP_ABORT_ON_DATA: abort on data * corresponding to LINUX_MIB_TCPABORTONDATA */ SK_RST_REASON_TCP_ABORT_ON_DATA, /* Here start with the independent reasons */ /** @SK_RST_REASON_TCP_TIMEWAIT_SOCKET: happen on the timewait socket */ SK_RST_REASON_TCP_TIMEWAIT_SOCKET, /** * @SK_RST_REASON_INVALID_SYN: receive bad syn packet * RFC 793 says if the state is not CLOSED/LISTEN/SYN-SENT then * "fourth, check the SYN bit,...If the SYN is in the window it is * an error, send a reset" */ SK_RST_REASON_INVALID_SYN, /** * @SK_RST_REASON_TCP_ABORT_ON_CLOSE: abort on close * corresponding to LINUX_MIB_TCPABORTONCLOSE */ SK_RST_REASON_TCP_ABORT_ON_CLOSE, /** * @SK_RST_REASON_TCP_ABORT_ON_LINGER: abort on linger * corresponding to LINUX_MIB_TCPABORTONLINGER */ SK_RST_REASON_TCP_ABORT_ON_LINGER, /** * @SK_RST_REASON_TCP_ABORT_ON_MEMORY: abort on memory * corresponding to LINUX_MIB_TCPABORTONMEMORY */ SK_RST_REASON_TCP_ABORT_ON_MEMORY, /** * @SK_RST_REASON_TCP_STATE: abort on tcp state * Please see RFC 9293 for all possible reset conditions */ SK_RST_REASON_TCP_STATE, /** * @SK_RST_REASON_TCP_KEEPALIVE_TIMEOUT: time to timeout * When we have already run out of all the chances, which means * keepalive timeout, we have to reset the connection */ SK_RST_REASON_TCP_KEEPALIVE_TIMEOUT, /** * @SK_RST_REASON_TCP_DISCONNECT_WITH_DATA: disconnect when write * queue is not empty * It means user has written data into the write queue when doing * disconnecting, so we have to send an RST. */ SK_RST_REASON_TCP_DISCONNECT_WITH_DATA, /* Copy from include/uapi/linux/mptcp.h. * These reset fields will not be changed since they adhere to * RFC 8684. So do not touch them. I'm going to list each definition * of them respectively. */ /** * @SK_RST_REASON_MPTCP_RST_EUNSPEC: Unspecified error. * This is the default error; it implies that the subflow is no * longer available. The presence of this option shows that the * RST was generated by an MPTCP-aware device. */ SK_RST_REASON_MPTCP_RST_EUNSPEC, /** * @SK_RST_REASON_MPTCP_RST_EMPTCP: MPTCP-specific error. * An error has been detected in the processing of MPTCP options. * This is the usual reason code to return in the cases where a RST * is being sent to close a subflow because of an invalid response. */ SK_RST_REASON_MPTCP_RST_EMPTCP, /** * @SK_RST_REASON_MPTCP_RST_ERESOURCE: Lack of resources. * This code indicates that the sending host does not have enough * resources to support the terminated subflow. */ SK_RST_REASON_MPTCP_RST_ERESOURCE, /** * @SK_RST_REASON_MPTCP_RST_EPROHIBIT: Administratively prohibited. * This code indicates that the requested subflow is prohibited by * the policies of the sending host. */ SK_RST_REASON_MPTCP_RST_EPROHIBIT, /** * @SK_RST_REASON_MPTCP_RST_EWQ2BIG: Too much outstanding data. * This code indicates that there is an excessive amount of data * that needs to be transmitted over the terminated subflow while * having already been acknowledged over one or more other subflows. * This may occur if a path has been unavailable for a short period * and it is more efficient to reset and start again than it is to * retransmit the queued data. */ SK_RST_REASON_MPTCP_RST_EWQ2BIG, /** * @SK_RST_REASON_MPTCP_RST_EBADPERF: Unacceptable performance. * This code indicates that the performance of this subflow was * too low compared to the other subflows of this Multipath TCP * connection. */ SK_RST_REASON_MPTCP_RST_EBADPERF, /** * @SK_RST_REASON_MPTCP_RST_EMIDDLEBOX: Middlebox interference. * Middlebox interference has been detected over this subflow, * making MPTCP signaling invalid. For example, this may be sent * if the checksum does not validate. */ SK_RST_REASON_MPTCP_RST_EMIDDLEBOX, /** @SK_RST_REASON_ERROR: unexpected error happens */ SK_RST_REASON_ERROR, /** * @SK_RST_REASON_MAX: Maximum of socket reset reasons. * It shouldn't be used as a real 'reason'. */ SK_RST_REASON_MAX, }; /* Convert skb drop reasons to enum sk_rst_reason type */ static inline enum sk_rst_reason sk_rst_convert_drop_reason(enum skb_drop_reason reason) { switch (reason) { case SKB_DROP_REASON_NOT_SPECIFIED: return SK_RST_REASON_NOT_SPECIFIED; case SKB_DROP_REASON_NO_SOCKET: return SK_RST_REASON_NO_SOCKET; case SKB_DROP_REASON_TCP_INVALID_ACK_SEQUENCE: return SK_RST_REASON_TCP_INVALID_ACK_SEQUENCE; case SKB_DROP_REASON_TCP_RFC7323_PAWS: return SK_RST_REASON_TCP_RFC7323_PAWS; case SKB_DROP_REASON_TCP_TOO_OLD_ACK: return SK_RST_REASON_TCP_TOO_OLD_ACK; case SKB_DROP_REASON_TCP_ACK_UNSENT_DATA: return SK_RST_REASON_TCP_ACK_UNSENT_DATA; case SKB_DROP_REASON_TCP_FLAGS: return SK_RST_REASON_TCP_FLAGS; case SKB_DROP_REASON_TCP_OLD_ACK: return SK_RST_REASON_TCP_OLD_ACK; case SKB_DROP_REASON_TCP_ABORT_ON_DATA: return SK_RST_REASON_TCP_ABORT_ON_DATA; default: /* If we don't have our own corresponding reason */ return SK_RST_REASON_NOT_SPECIFIED; } } #endif |
| 2 2 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 | // SPDX-License-Identifier: GPL-2.0-or-later /* * * Copyright (C) Jonathan Naylor G4KLX (g4klx@g4klx.demon.co.uk) */ #include <linux/errno.h> #include <linux/types.h> #include <linux/socket.h> #include <linux/in.h> #include <linux/kernel.h> #include <linux/jiffies.h> #include <linux/timer.h> #include <linux/string.h> #include <linux/sockios.h> #include <linux/net.h> #include <linux/slab.h> #include <net/ax25.h> #include <linux/inet.h> #include <linux/netdevice.h> #include <linux/skbuff.h> #include <net/sock.h> #include <linux/fcntl.h> #include <linux/mm.h> #include <linux/interrupt.h> #include <net/rose.h> static void rose_ftimer_expiry(struct timer_list *); static void rose_t0timer_expiry(struct timer_list *); static void rose_transmit_restart_confirmation(struct rose_neigh *neigh); static void rose_transmit_restart_request(struct rose_neigh *neigh); void rose_start_ftimer(struct rose_neigh *neigh) { timer_delete(&neigh->ftimer); neigh->ftimer.function = rose_ftimer_expiry; neigh->ftimer.expires = jiffies + msecs_to_jiffies(sysctl_rose_link_fail_timeout); add_timer(&neigh->ftimer); } static void rose_start_t0timer(struct rose_neigh *neigh) { timer_delete(&neigh->t0timer); neigh->t0timer.function = rose_t0timer_expiry; neigh->t0timer.expires = jiffies + msecs_to_jiffies(sysctl_rose_restart_request_timeout); add_timer(&neigh->t0timer); } void rose_stop_ftimer(struct rose_neigh *neigh) { timer_delete(&neigh->ftimer); } void rose_stop_t0timer(struct rose_neigh *neigh) { timer_delete(&neigh->t0timer); } int rose_ftimer_running(struct rose_neigh *neigh) { return timer_pending(&neigh->ftimer); } static int rose_t0timer_running(struct rose_neigh *neigh) { return timer_pending(&neigh->t0timer); } static void rose_ftimer_expiry(struct timer_list *t) { } static void rose_t0timer_expiry(struct timer_list *t) { struct rose_neigh *neigh = timer_container_of(neigh, t, t0timer); rose_transmit_restart_request(neigh); neigh->dce_mode = 0; rose_start_t0timer(neigh); } /* * Interface to ax25_send_frame. Changes my level 2 callsign depending * on whether we have a global ROSE callsign or use the default port * callsign. */ static int rose_send_frame(struct sk_buff *skb, struct rose_neigh *neigh) { const ax25_address *rose_call; ax25_cb *ax25s; if (ax25cmp(&rose_callsign, &null_ax25_address) == 0) rose_call = (const ax25_address *)neigh->dev->dev_addr; else rose_call = &rose_callsign; ax25s = neigh->ax25; neigh->ax25 = ax25_send_frame(skb, 260, rose_call, &neigh->callsign, neigh->digipeat, neigh->dev); if (ax25s) ax25_cb_put(ax25s); return neigh->ax25 != NULL; } /* * Interface to ax25_link_up. Changes my level 2 callsign depending * on whether we have a global ROSE callsign or use the default port * callsign. */ static int rose_link_up(struct rose_neigh *neigh) { const ax25_address *rose_call; ax25_cb *ax25s; if (ax25cmp(&rose_callsign, &null_ax25_address) == 0) rose_call = (const ax25_address *)neigh->dev->dev_addr; else rose_call = &rose_callsign; ax25s = neigh->ax25; neigh->ax25 = ax25_find_cb(rose_call, &neigh->callsign, neigh->digipeat, neigh->dev); if (ax25s) ax25_cb_put(ax25s); return neigh->ax25 != NULL; } /* * This handles all restart and diagnostic frames. */ void rose_link_rx_restart(struct sk_buff *skb, struct rose_neigh *neigh, unsigned short frametype) { struct sk_buff *skbn; switch (frametype) { case ROSE_RESTART_REQUEST: rose_stop_t0timer(neigh); neigh->restarted = 1; neigh->dce_mode = (skb->data[3] == ROSE_DTE_ORIGINATED); rose_transmit_restart_confirmation(neigh); break; case ROSE_RESTART_CONFIRMATION: rose_stop_t0timer(neigh); neigh->restarted = 1; break; case ROSE_DIAGNOSTIC: pr_warn("ROSE: received diagnostic #%d - %3ph\n", skb->data[3], skb->data + 4); break; default: printk(KERN_WARNING "ROSE: received unknown %02X with LCI 000\n", frametype); break; } if (neigh->restarted) { while ((skbn = skb_dequeue(&neigh->queue)) != NULL) if (!rose_send_frame(skbn, neigh)) kfree_skb(skbn); } } /* * This routine is called when a Restart Request is needed */ static void rose_transmit_restart_request(struct rose_neigh *neigh) { struct sk_buff *skb; unsigned char *dptr; int len; len = AX25_BPQ_HEADER_LEN + AX25_MAX_HEADER_LEN + ROSE_MIN_LEN + 3; if ((skb = alloc_skb(len, GFP_ATOMIC)) == NULL) return; skb_reserve(skb, AX25_BPQ_HEADER_LEN + AX25_MAX_HEADER_LEN); dptr = skb_put(skb, ROSE_MIN_LEN + 3); *dptr++ = AX25_P_ROSE; *dptr++ = ROSE_GFI; *dptr++ = 0x00; *dptr++ = ROSE_RESTART_REQUEST; *dptr++ = ROSE_DTE_ORIGINATED; *dptr++ = 0; if (!rose_send_frame(skb, neigh)) kfree_skb(skb); } /* * This routine is called when a Restart Confirmation is needed */ static void rose_transmit_restart_confirmation(struct rose_neigh *neigh) { struct sk_buff *skb; unsigned char *dptr; int len; len = AX25_BPQ_HEADER_LEN + AX25_MAX_HEADER_LEN + ROSE_MIN_LEN + 1; if ((skb = alloc_skb(len, GFP_ATOMIC)) == NULL) return; skb_reserve(skb, AX25_BPQ_HEADER_LEN + AX25_MAX_HEADER_LEN); dptr = skb_put(skb, ROSE_MIN_LEN + 1); *dptr++ = AX25_P_ROSE; *dptr++ = ROSE_GFI; *dptr++ = 0x00; *dptr++ = ROSE_RESTART_CONFIRMATION; if (!rose_send_frame(skb, neigh)) kfree_skb(skb); } /* * This routine is called when a Clear Request is needed outside of the context * of a connected socket. */ void rose_transmit_clear_request(struct rose_neigh *neigh, unsigned int lci, unsigned char cause, unsigned char diagnostic) { struct sk_buff *skb; unsigned char *dptr; int len; if (!neigh->dev) return; len = AX25_BPQ_HEADER_LEN + AX25_MAX_HEADER_LEN + ROSE_MIN_LEN + 3; if ((skb = alloc_skb(len, GFP_ATOMIC)) == NULL) return; skb_reserve(skb, AX25_BPQ_HEADER_LEN + AX25_MAX_HEADER_LEN); dptr = skb_put(skb, ROSE_MIN_LEN + 3); *dptr++ = AX25_P_ROSE; *dptr++ = ((lci >> 8) & 0x0F) | ROSE_GFI; *dptr++ = ((lci >> 0) & 0xFF); *dptr++ = ROSE_CLEAR_REQUEST; *dptr++ = cause; *dptr++ = diagnostic; if (!rose_send_frame(skb, neigh)) kfree_skb(skb); } void rose_transmit_link(struct sk_buff *skb, struct rose_neigh *neigh) { unsigned char *dptr; if (neigh->loopback) { rose_loopback_queue(skb, neigh); return; } if (!rose_link_up(neigh)) neigh->restarted = 0; dptr = skb_push(skb, 1); *dptr++ = AX25_P_ROSE; if (neigh->restarted) { if (!rose_send_frame(skb, neigh)) kfree_skb(skb); } else { skb_queue_tail(&neigh->queue, skb); if (!rose_t0timer_running(neigh)) { rose_transmit_restart_request(neigh); neigh->dce_mode = 0; rose_start_t0timer(neigh); } } } |
| 162 121 4 46 48 232 165 73 234 155 82 22 22 20 1 252 252 251 1 253 20 222 13 74 81 2 7 7 14 2 35 36 1 1 7 7 52 1 14 36 12 19 19 1 18 31 5 2 18 11 19 10 8 8 1 6 12 1 3 8 8 1 9 6 7 11 1 7 6 8 8 1 6 21 20 8 12 6 292 290 290 2 46 44 45 2 10 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 980 | // SPDX-License-Identifier: GPL-2.0 /* * linux/fs/stat.c * * Copyright (C) 1991, 1992 Linus Torvalds */ #include <linux/blkdev.h> #include <linux/export.h> #include <linux/mm.h> #include <linux/errno.h> #include <linux/file.h> #include <linux/highuid.h> #include <linux/fs.h> #include <linux/namei.h> #include <linux/security.h> #include <linux/cred.h> #include <linux/syscalls.h> #include <linux/pagemap.h> #include <linux/compat.h> #include <linux/iversion.h> #include <linux/uaccess.h> #include <asm/unistd.h> #include <trace/events/timestamp.h> #include "internal.h" #include "mount.h" /** * fill_mg_cmtime - Fill in the mtime and ctime and flag ctime as QUERIED * @stat: where to store the resulting values * @request_mask: STATX_* values requested * @inode: inode from which to grab the c/mtime * * Given @inode, grab the ctime and mtime out if it and store the result * in @stat. When fetching the value, flag it as QUERIED (if not already) * so the next write will record a distinct timestamp. * * NB: The QUERIED flag is tracked in the ctime, but we set it there even * if only the mtime was requested, as that ensures that the next mtime * change will be distinct. */ void fill_mg_cmtime(struct kstat *stat, u32 request_mask, struct inode *inode) { atomic_t *pcn = (atomic_t *)&inode->i_ctime_nsec; /* If neither time was requested, then don't report them */ if (!(request_mask & (STATX_CTIME|STATX_MTIME))) { stat->result_mask &= ~(STATX_CTIME|STATX_MTIME); return; } stat->mtime = inode_get_mtime(inode); stat->ctime.tv_sec = inode->i_ctime_sec; stat->ctime.tv_nsec = (u32)atomic_read(pcn); if (!(stat->ctime.tv_nsec & I_CTIME_QUERIED)) stat->ctime.tv_nsec = ((u32)atomic_fetch_or(I_CTIME_QUERIED, pcn)); stat->ctime.tv_nsec &= ~I_CTIME_QUERIED; trace_fill_mg_cmtime(inode, &stat->ctime, &stat->mtime); } EXPORT_SYMBOL(fill_mg_cmtime); /** * generic_fillattr - Fill in the basic attributes from the inode struct * @idmap: idmap of the mount the inode was found from * @request_mask: statx request_mask * @inode: Inode to use as the source * @stat: Where to fill in the attributes * * Fill in the basic attributes in the kstat structure from data that's to be * found on the VFS inode structure. This is the default if no getattr inode * operation is supplied. * * If the inode has been found through an idmapped mount the idmap of * the vfsmount must be passed through @idmap. This function will then * take care to map the inode according to @idmap before filling in the * uid and gid filds. On non-idmapped mounts or if permission checking is to be * performed on the raw inode simply pass @nop_mnt_idmap. */ void generic_fillattr(struct mnt_idmap *idmap, u32 request_mask, struct inode *inode, struct kstat *stat) { vfsuid_t vfsuid = i_uid_into_vfsuid(idmap, inode); vfsgid_t vfsgid = i_gid_into_vfsgid(idmap, inode); stat->dev = inode->i_sb->s_dev; stat->ino = inode->i_ino; stat->mode = inode->i_mode; stat->nlink = inode->i_nlink; stat->uid = vfsuid_into_kuid(vfsuid); stat->gid = vfsgid_into_kgid(vfsgid); stat->rdev = inode->i_rdev; stat->size = i_size_read(inode); stat->atime = inode_get_atime(inode); if (is_mgtime(inode)) { fill_mg_cmtime(stat, request_mask, inode); } else { stat->ctime = inode_get_ctime(inode); stat->mtime = inode_get_mtime(inode); } stat->blksize = i_blocksize(inode); stat->blocks = inode->i_blocks; if ((request_mask & STATX_CHANGE_COOKIE) && IS_I_VERSION(inode)) { stat->result_mask |= STATX_CHANGE_COOKIE; stat->change_cookie = inode_query_iversion(inode); } } EXPORT_SYMBOL(generic_fillattr); /** * generic_fill_statx_attr - Fill in the statx attributes from the inode flags * @inode: Inode to use as the source * @stat: Where to fill in the attribute flags * * Fill in the STATX_ATTR_* flags in the kstat structure for properties of the * inode that are published on i_flags and enforced by the VFS. */ void generic_fill_statx_attr(struct inode *inode, struct kstat *stat) { if (inode->i_flags & S_IMMUTABLE) stat->attributes |= STATX_ATTR_IMMUTABLE; if (inode->i_flags & S_APPEND) stat->attributes |= STATX_ATTR_APPEND; stat->attributes_mask |= KSTAT_ATTR_VFS_FLAGS; } EXPORT_SYMBOL(generic_fill_statx_attr); /** * generic_fill_statx_atomic_writes - Fill in atomic writes statx attributes * @stat: Where to fill in the attribute flags * @unit_min: Minimum supported atomic write length in bytes * @unit_max: Maximum supported atomic write length in bytes * @unit_max_opt: Optimised maximum supported atomic write length in bytes * * Fill in the STATX{_ATTR}_WRITE_ATOMIC flags in the kstat structure from * atomic write unit_min and unit_max values. */ void generic_fill_statx_atomic_writes(struct kstat *stat, unsigned int unit_min, unsigned int unit_max, unsigned int unit_max_opt) { /* Confirm that the request type is known */ stat->result_mask |= STATX_WRITE_ATOMIC; /* Confirm that the file attribute type is known */ stat->attributes_mask |= STATX_ATTR_WRITE_ATOMIC; if (unit_min) { stat->atomic_write_unit_min = unit_min; stat->atomic_write_unit_max = unit_max; stat->atomic_write_unit_max_opt = unit_max_opt; /* Initially only allow 1x segment */ stat->atomic_write_segments_max = 1; /* Confirm atomic writes are actually supported */ stat->attributes |= STATX_ATTR_WRITE_ATOMIC; } } EXPORT_SYMBOL_GPL(generic_fill_statx_atomic_writes); /** * vfs_getattr_nosec - getattr without security checks * @path: file to get attributes from * @stat: structure to return attributes in * @request_mask: STATX_xxx flags indicating what the caller wants * @query_flags: Query mode (AT_STATX_SYNC_TYPE) * * Get attributes without calling security_inode_getattr. * * Currently the only caller other than vfs_getattr is internal to the * filehandle lookup code, which uses only the inode number and returns no * attributes to any user. Any other code probably wants vfs_getattr. */ int vfs_getattr_nosec(const struct path *path, struct kstat *stat, u32 request_mask, unsigned int query_flags) { struct mnt_idmap *idmap; struct inode *inode = d_backing_inode(path->dentry); memset(stat, 0, sizeof(*stat)); stat->result_mask |= STATX_BASIC_STATS; query_flags &= AT_STATX_SYNC_TYPE; /* allow the fs to override these if it really wants to */ /* SB_NOATIME means filesystem supplies dummy atime value */ if (inode->i_sb->s_flags & SB_NOATIME) stat->result_mask &= ~STATX_ATIME; /* * Note: If you add another clause to set an attribute flag, please * update attributes_mask below. */ if (IS_AUTOMOUNT(inode)) stat->attributes |= STATX_ATTR_AUTOMOUNT; if (IS_DAX(inode)) stat->attributes |= STATX_ATTR_DAX; stat->attributes_mask |= (STATX_ATTR_AUTOMOUNT | STATX_ATTR_DAX); idmap = mnt_idmap(path->mnt); if (inode->i_op->getattr) { int ret; ret = inode->i_op->getattr(idmap, path, stat, request_mask, query_flags); if (ret) return ret; } else { generic_fillattr(idmap, request_mask, inode, stat); } /* * If this is a block device inode, override the filesystem attributes * with the block device specific parameters that need to be obtained * from the bdev backing inode. */ if (S_ISBLK(stat->mode)) bdev_statx(path, stat, request_mask); return 0; } EXPORT_SYMBOL(vfs_getattr_nosec); /* * vfs_getattr - Get the enhanced basic attributes of a file * @path: The file of interest * @stat: Where to return the statistics * @request_mask: STATX_xxx flags indicating what the caller wants * @query_flags: Query mode (AT_STATX_SYNC_TYPE) * * Ask the filesystem for a file's attributes. The caller must indicate in * request_mask and query_flags to indicate what they want. * * If the file is remote, the filesystem can be forced to update the attributes * from the backing store by passing AT_STATX_FORCE_SYNC in query_flags or can * suppress the update by passing AT_STATX_DONT_SYNC. * * Bits must have been set in request_mask to indicate which attributes the * caller wants retrieving. Any such attribute not requested may be returned * anyway, but the value may be approximate, and, if remote, may not have been * synchronised with the server. * * 0 will be returned on success, and a -ve error code if unsuccessful. */ int vfs_getattr(const struct path *path, struct kstat *stat, u32 request_mask, unsigned int query_flags) { int retval; retval = security_inode_getattr(path); if (unlikely(retval)) return retval; return vfs_getattr_nosec(path, stat, request_mask, query_flags); } EXPORT_SYMBOL(vfs_getattr); /** * vfs_fstat - Get the basic attributes by file descriptor * @fd: The file descriptor referring to the file of interest * @stat: The result structure to fill in. * * This function is a wrapper around vfs_getattr(). The main difference is * that it uses a file descriptor to determine the file location. * * 0 will be returned on success, and a -ve error code if unsuccessful. */ int vfs_fstat(int fd, struct kstat *stat) { CLASS(fd_raw, f)(fd); if (fd_empty(f)) return -EBADF; return vfs_getattr(&fd_file(f)->f_path, stat, STATX_BASIC_STATS, 0); } static int statx_lookup_flags(int flags) { int lookup_flags = 0; if (!(flags & AT_SYMLINK_NOFOLLOW)) lookup_flags |= LOOKUP_FOLLOW; if (!(flags & AT_NO_AUTOMOUNT)) lookup_flags |= LOOKUP_AUTOMOUNT; return lookup_flags; } static int vfs_statx_path(struct path *path, int flags, struct kstat *stat, u32 request_mask) { int error = vfs_getattr(path, stat, request_mask, flags); if (error) return error; if (request_mask & STATX_MNT_ID_UNIQUE) { stat->mnt_id = real_mount(path->mnt)->mnt_id_unique; stat->result_mask |= STATX_MNT_ID_UNIQUE; } else { stat->mnt_id = real_mount(path->mnt)->mnt_id; stat->result_mask |= STATX_MNT_ID; } if (path_mounted(path)) stat->attributes |= STATX_ATTR_MOUNT_ROOT; stat->attributes_mask |= STATX_ATTR_MOUNT_ROOT; return 0; } static int vfs_statx_fd(int fd, int flags, struct kstat *stat, u32 request_mask) { CLASS(fd_raw, f)(fd); if (fd_empty(f)) return -EBADF; return vfs_statx_path(&fd_file(f)->f_path, flags, stat, request_mask); } /** * vfs_statx - Get basic and extra attributes by filename * @dfd: A file descriptor representing the base dir for a relative filename * @filename: The name of the file of interest * @flags: Flags to control the query * @stat: The result structure to fill in. * @request_mask: STATX_xxx flags indicating what the caller wants * * This function is a wrapper around vfs_getattr(). The main difference is * that it uses a filename and base directory to determine the file location. * Additionally, the use of AT_SYMLINK_NOFOLLOW in flags will prevent a symlink * at the given name from being referenced. * * 0 will be returned on success, and a -ve error code if unsuccessful. */ static int vfs_statx(int dfd, struct filename *filename, int flags, struct kstat *stat, u32 request_mask) { struct path path; unsigned int lookup_flags = statx_lookup_flags(flags); int error; if (flags & ~(AT_SYMLINK_NOFOLLOW | AT_NO_AUTOMOUNT | AT_EMPTY_PATH | AT_STATX_SYNC_TYPE)) return -EINVAL; retry: error = filename_lookup(dfd, filename, lookup_flags, &path, NULL); if (error) return error; error = vfs_statx_path(&path, flags, stat, request_mask); path_put(&path); if (retry_estale(error, lookup_flags)) { lookup_flags |= LOOKUP_REVAL; goto retry; } return error; } int vfs_fstatat(int dfd, const char __user *filename, struct kstat *stat, int flags) { int ret; int statx_flags = flags | AT_NO_AUTOMOUNT; struct filename *name = getname_maybe_null(filename, flags); if (!name && dfd >= 0) return vfs_fstat(dfd, stat); ret = vfs_statx(dfd, name, statx_flags, stat, STATX_BASIC_STATS); putname(name); return ret; } #ifdef __ARCH_WANT_OLD_STAT /* * For backward compatibility? Maybe this should be moved * into arch/i386 instead? */ static int cp_old_stat(struct kstat *stat, struct __old_kernel_stat __user * statbuf) { static int warncount = 5; struct __old_kernel_stat tmp; if (warncount > 0) { warncount--; printk(KERN_WARNING "VFS: Warning: %s using old stat() call. Recompile your binary.\n", current->comm); } else if (warncount < 0) { /* it's laughable, but... */ warncount = 0; } memset(&tmp, 0, sizeof(struct __old_kernel_stat)); tmp.st_dev = old_encode_dev(stat->dev); tmp.st_ino = stat->ino; if (sizeof(tmp.st_ino) < sizeof(stat->ino) && tmp.st_ino != stat->ino) return -EOVERFLOW; tmp.st_mode = stat->mode; tmp.st_nlink = stat->nlink; if (tmp.st_nlink != stat->nlink) return -EOVERFLOW; SET_UID(tmp.st_uid, from_kuid_munged(current_user_ns(), stat->uid)); SET_GID(tmp.st_gid, from_kgid_munged(current_user_ns(), stat->gid)); tmp.st_rdev = old_encode_dev(stat->rdev); #if BITS_PER_LONG == 32 if (stat->size > MAX_NON_LFS) return -EOVERFLOW; #endif tmp.st_size = stat->size; tmp.st_atime = stat->atime.tv_sec; tmp.st_mtime = stat->mtime.tv_sec; tmp.st_ctime = stat->ctime.tv_sec; return copy_to_user(statbuf,&tmp,sizeof(tmp)) ? -EFAULT : 0; } SYSCALL_DEFINE2(stat, const char __user *, filename, struct __old_kernel_stat __user *, statbuf) { struct kstat stat; int error; error = vfs_stat(filename, &stat); if (unlikely(error)) return error; return cp_old_stat(&stat, statbuf); } SYSCALL_DEFINE2(lstat, const char __user *, filename, struct __old_kernel_stat __user *, statbuf) { struct kstat stat; int error; error = vfs_lstat(filename, &stat); if (unlikely(error)) return error; return cp_old_stat(&stat, statbuf); } SYSCALL_DEFINE2(fstat, unsigned int, fd, struct __old_kernel_stat __user *, statbuf) { struct kstat stat; int error; error = vfs_fstat(fd, &stat); if (unlikely(error)) return error; return cp_old_stat(&stat, statbuf); } #endif /* __ARCH_WANT_OLD_STAT */ #ifdef __ARCH_WANT_NEW_STAT #ifndef INIT_STRUCT_STAT_PADDING # define INIT_STRUCT_STAT_PADDING(st) memset(&st, 0, sizeof(st)) #endif static int cp_new_stat(struct kstat *stat, struct stat __user *statbuf) { struct stat tmp; if (sizeof(tmp.st_dev) < 4 && !old_valid_dev(stat->dev)) return -EOVERFLOW; if (sizeof(tmp.st_rdev) < 4 && !old_valid_dev(stat->rdev)) return -EOVERFLOW; #if BITS_PER_LONG == 32 if (stat->size > MAX_NON_LFS) return -EOVERFLOW; #endif INIT_STRUCT_STAT_PADDING(tmp); tmp.st_dev = new_encode_dev(stat->dev); tmp.st_ino = stat->ino; if (sizeof(tmp.st_ino) < sizeof(stat->ino) && tmp.st_ino != stat->ino) return -EOVERFLOW; tmp.st_mode = stat->mode; tmp.st_nlink = stat->nlink; if (tmp.st_nlink != stat->nlink) return -EOVERFLOW; SET_UID(tmp.st_uid, from_kuid_munged(current_user_ns(), stat->uid)); SET_GID(tmp.st_gid, from_kgid_munged(current_user_ns(), stat->gid)); tmp.st_rdev = new_encode_dev(stat->rdev); tmp.st_size = stat->size; tmp.st_atime = stat->atime.tv_sec; tmp.st_mtime = stat->mtime.tv_sec; tmp.st_ctime = stat->ctime.tv_sec; #ifdef STAT_HAVE_NSEC tmp.st_atime_nsec = stat->atime.tv_nsec; tmp.st_mtime_nsec = stat->mtime.tv_nsec; tmp.st_ctime_nsec = stat->ctime.tv_nsec; #endif tmp.st_blocks = stat->blocks; tmp.st_blksize = stat->blksize; return copy_to_user(statbuf,&tmp,sizeof(tmp)) ? -EFAULT : 0; } SYSCALL_DEFINE2(newstat, const char __user *, filename, struct stat __user *, statbuf) { struct kstat stat; int error; error = vfs_stat(filename, &stat); if (unlikely(error)) return error; return cp_new_stat(&stat, statbuf); } SYSCALL_DEFINE2(newlstat, const char __user *, filename, struct stat __user *, statbuf) { struct kstat stat; int error; error = vfs_lstat(filename, &stat); if (unlikely(error)) return error; return cp_new_stat(&stat, statbuf); } #if !defined(__ARCH_WANT_STAT64) || defined(__ARCH_WANT_SYS_NEWFSTATAT) SYSCALL_DEFINE4(newfstatat, int, dfd, const char __user *, filename, struct stat __user *, statbuf, int, flag) { struct kstat stat; int error; error = vfs_fstatat(dfd, filename, &stat, flag); if (unlikely(error)) return error; return cp_new_stat(&stat, statbuf); } #endif SYSCALL_DEFINE2(newfstat, unsigned int, fd, struct stat __user *, statbuf) { struct kstat stat; int error; error = vfs_fstat(fd, &stat); if (unlikely(error)) return error; return cp_new_stat(&stat, statbuf); } #endif static int do_readlinkat(int dfd, const char __user *pathname, char __user *buf, int bufsiz) { struct path path; struct filename *name; int error; unsigned int lookup_flags = LOOKUP_EMPTY; if (bufsiz <= 0) return -EINVAL; retry: name = getname_flags(pathname, lookup_flags); error = filename_lookup(dfd, name, lookup_flags, &path, NULL); if (unlikely(error)) { putname(name); return error; } /* * AFS mountpoints allow readlink(2) but are not symlinks */ if (d_is_symlink(path.dentry) || d_backing_inode(path.dentry)->i_op->readlink) { error = security_inode_readlink(path.dentry); if (!error) { touch_atime(&path); error = vfs_readlink(path.dentry, buf, bufsiz); } } else { error = (name->name[0] == '\0') ? -ENOENT : -EINVAL; } path_put(&path); putname(name); if (retry_estale(error, lookup_flags)) { lookup_flags |= LOOKUP_REVAL; goto retry; } return error; } SYSCALL_DEFINE4(readlinkat, int, dfd, const char __user *, pathname, char __user *, buf, int, bufsiz) { return do_readlinkat(dfd, pathname, buf, bufsiz); } SYSCALL_DEFINE3(readlink, const char __user *, path, char __user *, buf, int, bufsiz) { return do_readlinkat(AT_FDCWD, path, buf, bufsiz); } /* ---------- LFS-64 ----------- */ #if defined(__ARCH_WANT_STAT64) || defined(__ARCH_WANT_COMPAT_STAT64) #ifndef INIT_STRUCT_STAT64_PADDING # define INIT_STRUCT_STAT64_PADDING(st) memset(&st, 0, sizeof(st)) #endif static long cp_new_stat64(struct kstat *stat, struct stat64 __user *statbuf) { struct stat64 tmp; INIT_STRUCT_STAT64_PADDING(tmp); #ifdef CONFIG_MIPS /* mips has weird padding, so we don't get 64 bits there */ tmp.st_dev = new_encode_dev(stat->dev); tmp.st_rdev = new_encode_dev(stat->rdev); #else tmp.st_dev = huge_encode_dev(stat->dev); tmp.st_rdev = huge_encode_dev(stat->rdev); #endif tmp.st_ino = stat->ino; if (sizeof(tmp.st_ino) < sizeof(stat->ino) && tmp.st_ino != stat->ino) return -EOVERFLOW; #ifdef STAT64_HAS_BROKEN_ST_INO tmp.__st_ino = stat->ino; #endif tmp.st_mode = stat->mode; tmp.st_nlink = stat->nlink; tmp.st_uid = from_kuid_munged(current_user_ns(), stat->uid); tmp.st_gid = from_kgid_munged(current_user_ns(), stat->gid); tmp.st_atime = stat->atime.tv_sec; tmp.st_atime_nsec = stat->atime.tv_nsec; tmp.st_mtime = stat->mtime.tv_sec; tmp.st_mtime_nsec = stat->mtime.tv_nsec; tmp.st_ctime = stat->ctime.tv_sec; tmp.st_ctime_nsec = stat->ctime.tv_nsec; tmp.st_size = stat->size; tmp.st_blocks = stat->blocks; tmp.st_blksize = stat->blksize; return copy_to_user(statbuf,&tmp,sizeof(tmp)) ? -EFAULT : 0; } SYSCALL_DEFINE2(stat64, const char __user *, filename, struct stat64 __user *, statbuf) { struct kstat stat; int error = vfs_stat(filename, &stat); if (!error) error = cp_new_stat64(&stat, statbuf); return error; } SYSCALL_DEFINE2(lstat64, const char __user *, filename, struct stat64 __user *, statbuf) { struct kstat stat; int error = vfs_lstat(filename, &stat); if (!error) error = cp_new_stat64(&stat, statbuf); return error; } SYSCALL_DEFINE2(fstat64, unsigned long, fd, struct stat64 __user *, statbuf) { struct kstat stat; int error = vfs_fstat(fd, &stat); if (!error) error = cp_new_stat64(&stat, statbuf); return error; } SYSCALL_DEFINE4(fstatat64, int, dfd, const char __user *, filename, struct stat64 __user *, statbuf, int, flag) { struct kstat stat; int error; error = vfs_fstatat(dfd, filename, &stat, flag); if (error) return error; return cp_new_stat64(&stat, statbuf); } #endif /* __ARCH_WANT_STAT64 || __ARCH_WANT_COMPAT_STAT64 */ static noinline_for_stack int cp_statx(const struct kstat *stat, struct statx __user *buffer) { struct statx tmp; memset(&tmp, 0, sizeof(tmp)); /* STATX_CHANGE_COOKIE is kernel-only for now */ tmp.stx_mask = stat->result_mask & ~STATX_CHANGE_COOKIE; tmp.stx_blksize = stat->blksize; /* STATX_ATTR_CHANGE_MONOTONIC is kernel-only for now */ tmp.stx_attributes = stat->attributes & ~STATX_ATTR_CHANGE_MONOTONIC; tmp.stx_nlink = stat->nlink; tmp.stx_uid = from_kuid_munged(current_user_ns(), stat->uid); tmp.stx_gid = from_kgid_munged(current_user_ns(), stat->gid); tmp.stx_mode = stat->mode; tmp.stx_ino = stat->ino; tmp.stx_size = stat->size; tmp.stx_blocks = stat->blocks; tmp.stx_attributes_mask = stat->attributes_mask; tmp.stx_atime.tv_sec = stat->atime.tv_sec; tmp.stx_atime.tv_nsec = stat->atime.tv_nsec; tmp.stx_btime.tv_sec = stat->btime.tv_sec; tmp.stx_btime.tv_nsec = stat->btime.tv_nsec; tmp.stx_ctime.tv_sec = stat->ctime.tv_sec; tmp.stx_ctime.tv_nsec = stat->ctime.tv_nsec; tmp.stx_mtime.tv_sec = stat->mtime.tv_sec; tmp.stx_mtime.tv_nsec = stat->mtime.tv_nsec; tmp.stx_rdev_major = MAJOR(stat->rdev); tmp.stx_rdev_minor = MINOR(stat->rdev); tmp.stx_dev_major = MAJOR(stat->dev); tmp.stx_dev_minor = MINOR(stat->dev); tmp.stx_mnt_id = stat->mnt_id; tmp.stx_dio_mem_align = stat->dio_mem_align; tmp.stx_dio_offset_align = stat->dio_offset_align; tmp.stx_dio_read_offset_align = stat->dio_read_offset_align; tmp.stx_subvol = stat->subvol; tmp.stx_atomic_write_unit_min = stat->atomic_write_unit_min; tmp.stx_atomic_write_unit_max = stat->atomic_write_unit_max; tmp.stx_atomic_write_segments_max = stat->atomic_write_segments_max; tmp.stx_atomic_write_unit_max_opt = stat->atomic_write_unit_max_opt; return copy_to_user(buffer, &tmp, sizeof(tmp)) ? -EFAULT : 0; } int do_statx(int dfd, struct filename *filename, unsigned int flags, unsigned int mask, struct statx __user *buffer) { struct kstat stat; int error; if (mask & STATX__RESERVED) return -EINVAL; if ((flags & AT_STATX_SYNC_TYPE) == AT_STATX_SYNC_TYPE) return -EINVAL; /* * STATX_CHANGE_COOKIE is kernel-only for now. Ignore requests * from userland. */ mask &= ~STATX_CHANGE_COOKIE; error = vfs_statx(dfd, filename, flags, &stat, mask); if (error) return error; return cp_statx(&stat, buffer); } int do_statx_fd(int fd, unsigned int flags, unsigned int mask, struct statx __user *buffer) { struct kstat stat; int error; if (mask & STATX__RESERVED) return -EINVAL; if ((flags & AT_STATX_SYNC_TYPE) == AT_STATX_SYNC_TYPE) return -EINVAL; /* * STATX_CHANGE_COOKIE is kernel-only for now. Ignore requests * from userland. */ mask &= ~STATX_CHANGE_COOKIE; error = vfs_statx_fd(fd, flags, &stat, mask); if (error) return error; return cp_statx(&stat, buffer); } /** * sys_statx - System call to get enhanced stats * @dfd: Base directory to pathwalk from *or* fd to stat. * @filename: File to stat or either NULL or "" with AT_EMPTY_PATH * @flags: AT_* flags to control pathwalk. * @mask: Parts of statx struct actually required. * @buffer: Result buffer. * * Note that fstat() can be emulated by setting dfd to the fd of interest, * supplying "" (or preferably NULL) as the filename and setting AT_EMPTY_PATH * in the flags. */ SYSCALL_DEFINE5(statx, int, dfd, const char __user *, filename, unsigned, flags, unsigned int, mask, struct statx __user *, buffer) { int ret; struct filename *name = getname_maybe_null(filename, flags); if (!name && dfd >= 0) return do_statx_fd(dfd, flags & ~AT_NO_AUTOMOUNT, mask, buffer); ret = do_statx(dfd, name, flags, mask, buffer); putname(name); return ret; } #if defined(CONFIG_COMPAT) && defined(__ARCH_WANT_COMPAT_STAT) static int cp_compat_stat(struct kstat *stat, struct compat_stat __user *ubuf) { struct compat_stat tmp; if (sizeof(tmp.st_dev) < 4 && !old_valid_dev(stat->dev)) return -EOVERFLOW; if (sizeof(tmp.st_rdev) < 4 && !old_valid_dev(stat->rdev)) return -EOVERFLOW; memset(&tmp, 0, sizeof(tmp)); tmp.st_dev = new_encode_dev(stat->dev); tmp.st_ino = stat->ino; if (sizeof(tmp.st_ino) < sizeof(stat->ino) && tmp.st_ino != stat->ino) return -EOVERFLOW; tmp.st_mode = stat->mode; tmp.st_nlink = stat->nlink; if (tmp.st_nlink != stat->nlink) return -EOVERFLOW; SET_UID(tmp.st_uid, from_kuid_munged(current_user_ns(), stat->uid)); SET_GID(tmp.st_gid, from_kgid_munged(current_user_ns(), stat->gid)); tmp.st_rdev = new_encode_dev(stat->rdev); if ((u64) stat->size > MAX_NON_LFS) return -EOVERFLOW; tmp.st_size = stat->size; tmp.st_atime = stat->atime.tv_sec; tmp.st_atime_nsec = stat->atime.tv_nsec; tmp.st_mtime = stat->mtime.tv_sec; tmp.st_mtime_nsec = stat->mtime.tv_nsec; tmp.st_ctime = stat->ctime.tv_sec; tmp.st_ctime_nsec = stat->ctime.tv_nsec; tmp.st_blocks = stat->blocks; tmp.st_blksize = stat->blksize; return copy_to_user(ubuf, &tmp, sizeof(tmp)) ? -EFAULT : 0; } COMPAT_SYSCALL_DEFINE2(newstat, const char __user *, filename, struct compat_stat __user *, statbuf) { struct kstat stat; int error; error = vfs_stat(filename, &stat); if (error) return error; return cp_compat_stat(&stat, statbuf); } COMPAT_SYSCALL_DEFINE2(newlstat, const char __user *, filename, struct compat_stat __user *, statbuf) { struct kstat stat; int error; error = vfs_lstat(filename, &stat); if (error) return error; return cp_compat_stat(&stat, statbuf); } #ifndef __ARCH_WANT_STAT64 COMPAT_SYSCALL_DEFINE4(newfstatat, unsigned int, dfd, const char __user *, filename, struct compat_stat __user *, statbuf, int, flag) { struct kstat stat; int error; error = vfs_fstatat(dfd, filename, &stat, flag); if (error) return error; return cp_compat_stat(&stat, statbuf); } #endif COMPAT_SYSCALL_DEFINE2(newfstat, unsigned int, fd, struct compat_stat __user *, statbuf) { struct kstat stat; int error = vfs_fstat(fd, &stat); if (!error) error = cp_compat_stat(&stat, statbuf); return error; } #endif /* Caller is here responsible for sufficient locking (ie. inode->i_lock) */ void __inode_add_bytes(struct inode *inode, loff_t bytes) { inode->i_blocks += bytes >> 9; bytes &= 511; inode->i_bytes += bytes; if (inode->i_bytes >= 512) { inode->i_blocks++; inode->i_bytes -= 512; } } EXPORT_SYMBOL(__inode_add_bytes); void inode_add_bytes(struct inode *inode, loff_t bytes) { spin_lock(&inode->i_lock); __inode_add_bytes(inode, bytes); spin_unlock(&inode->i_lock); } EXPORT_SYMBOL(inode_add_bytes); void __inode_sub_bytes(struct inode *inode, loff_t bytes) { inode->i_blocks -= bytes >> 9; bytes &= 511; if (inode->i_bytes < bytes) { inode->i_blocks--; inode->i_bytes += 512; } inode->i_bytes -= bytes; } EXPORT_SYMBOL(__inode_sub_bytes); void inode_sub_bytes(struct inode *inode, loff_t bytes) { spin_lock(&inode->i_lock); __inode_sub_bytes(inode, bytes); spin_unlock(&inode->i_lock); } EXPORT_SYMBOL(inode_sub_bytes); loff_t inode_get_bytes(struct inode *inode) { loff_t ret; spin_lock(&inode->i_lock); ret = __inode_get_bytes(inode); spin_unlock(&inode->i_lock); return ret; } EXPORT_SYMBOL(inode_get_bytes); void inode_set_bytes(struct inode *inode, loff_t bytes) { /* Caller is here responsible for sufficient locking * (ie. inode->i_lock) */ inode->i_blocks = bytes >> 9; inode->i_bytes = bytes & 511; } EXPORT_SYMBOL(inode_set_bytes); |
| 109 105 111 111 38 143 101 76 125 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080 1081 1082 1083 1084 1085 1086 1087 1088 1089 1090 1091 1092 1093 1094 1095 1096 1097 1098 1099 1100 1101 1102 1103 1104 1105 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125 1126 1127 1128 1129 1130 1131 1132 1133 1134 1135 1136 1137 1138 1139 1140 1141 1142 1143 1144 1145 1146 1147 1148 1149 1150 1151 1152 1153 1154 1155 1156 1157 1158 1159 1160 1161 1162 1163 1164 1165 1166 1167 1168 1169 1170 1171 1172 1173 1174 1175 1176 1177 1178 1179 1180 1181 1182 1183 1184 1185 1186 1187 1188 1189 1190 1191 1192 1193 1194 1195 1196 1197 1198 1199 1200 1201 1202 1203 1204 1205 1206 1207 1208 1209 1210 1211 1212 1213 1214 1215 1216 1217 1218 1219 1220 1221 1222 1223 1224 1225 1226 1227 1228 1229 1230 1231 1232 1233 1234 1235 1236 1237 1238 1239 1240 1241 1242 1243 1244 1245 1246 1247 1248 1249 1250 1251 | /* * Copyright (C) 2014 Red Hat * Copyright (C) 2014 Intel Corp. * * Permission is hereby granted, free of charge, to any person obtaining a * copy of this software and associated documentation files (the "Software"), * to deal in the Software without restriction, including without limitation * the rights to use, copy, modify, merge, publish, distribute, sublicense, * and/or sell copies of the Software, and to permit persons to whom the * Software is furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR * OTHER DEALINGS IN THE SOFTWARE. * * Authors: * Rob Clark <robdclark@gmail.com> * Daniel Vetter <daniel.vetter@ffwll.ch> */ #ifndef DRM_ATOMIC_H_ #define DRM_ATOMIC_H_ #include <drm/drm_crtc.h> #include <drm/drm_util.h> /** * struct drm_crtc_commit - track modeset commits on a CRTC * * This structure is used to track pending modeset changes and atomic commit on * a per-CRTC basis. Since updating the list should never block, this structure * is reference counted to allow waiters to safely wait on an event to complete, * without holding any locks. * * It has 3 different events in total to allow a fine-grained synchronization * between outstanding updates:: * * atomic commit thread hardware * * write new state into hardware ----> ... * signal hw_done * switch to new state on next * ... v/hblank * * wait for buffers to show up ... * * ... send completion irq * irq handler signals flip_done * cleanup old buffers * * signal cleanup_done * * wait for flip_done <---- * clean up atomic state * * The important bit to know is that &cleanup_done is the terminal event, but the * ordering between &flip_done and &hw_done is entirely up to the specific driver * and modeset state change. * * For an implementation of how to use this look at * drm_atomic_helper_setup_commit() from the atomic helper library. * * See also drm_crtc_commit_wait(). */ struct drm_crtc_commit { /** * @crtc: * * DRM CRTC for this commit. */ struct drm_crtc *crtc; /** * @ref: * * Reference count for this structure. Needed to allow blocking on * completions without the risk of the completion disappearing * meanwhile. */ struct kref ref; /** * @flip_done: * * Will be signaled when the hardware has flipped to the new set of * buffers. Signals at the same time as when the drm event for this * commit is sent to userspace, or when an out-fence is singalled. Note * that for most hardware, in most cases this happens after @hw_done is * signalled. * * Completion of this stage is signalled implicitly by calling * drm_crtc_send_vblank_event() on &drm_crtc_state.event. */ struct completion flip_done; /** * @hw_done: * * Will be signalled when all hw register changes for this commit have * been written out. Especially when disabling a pipe this can be much * later than @flip_done, since that can signal already when the * screen goes black, whereas to fully shut down a pipe more register * I/O is required. * * Note that this does not need to include separately reference-counted * resources like backing storage buffer pinning, or runtime pm * management. * * Drivers should call drm_atomic_helper_commit_hw_done() to signal * completion of this stage. */ struct completion hw_done; /** * @cleanup_done: * * Will be signalled after old buffers have been cleaned up by calling * drm_atomic_helper_cleanup_planes(). Since this can only happen after * a vblank wait completed it might be a bit later. This completion is * useful to throttle updates and avoid hardware updates getting ahead * of the buffer cleanup too much. * * Drivers should call drm_atomic_helper_commit_cleanup_done() to signal * completion of this stage. */ struct completion cleanup_done; /** * @commit_entry: * * Entry on the per-CRTC &drm_crtc.commit_list. Protected by * $drm_crtc.commit_lock. */ struct list_head commit_entry; /** * @event: * * &drm_pending_vblank_event pointer to clean up private events. */ struct drm_pending_vblank_event *event; /** * @abort_completion: * * A flag that's set after drm_atomic_helper_setup_commit() takes a * second reference for the completion of $drm_crtc_state.event. It's * used by the free code to remove the second reference if commit fails. */ bool abort_completion; }; struct __drm_planes_state { struct drm_plane *ptr; struct drm_plane_state *state, *old_state, *new_state; }; struct __drm_crtcs_state { struct drm_crtc *ptr; struct drm_crtc_state *state, *old_state, *new_state; /** * @commit: * * A reference to the CRTC commit object that is kept for use by * drm_atomic_helper_wait_for_flip_done() after * drm_atomic_helper_commit_hw_done() is called. This ensures that a * concurrent commit won't free a commit object that is still in use. */ struct drm_crtc_commit *commit; s32 __user *out_fence_ptr; u64 last_vblank_count; }; struct __drm_connnectors_state { struct drm_connector *ptr; struct drm_connector_state *state, *old_state, *new_state; /** * @out_fence_ptr: * * User-provided pointer which the kernel uses to return a sync_file * file descriptor. Used by writeback connectors to signal completion of * the writeback. */ s32 __user *out_fence_ptr; }; struct drm_private_obj; struct drm_private_state; /** * struct drm_private_state_funcs - atomic state functions for private objects * * These hooks are used by atomic helpers to create, swap and destroy states of * private objects. The structure itself is used as a vtable to identify the * associated private object type. Each private object type that needs to be * added to the atomic states is expected to have an implementation of these * hooks and pass a pointer to its drm_private_state_funcs struct to * drm_atomic_get_private_obj_state(). */ struct drm_private_state_funcs { /** * @atomic_duplicate_state: * * Duplicate the current state of the private object and return it. It * is an error to call this before obj->state has been initialized. * * RETURNS: * * Duplicated atomic state or NULL when obj->state is not * initialized or allocation failed. */ struct drm_private_state *(*atomic_duplicate_state)(struct drm_private_obj *obj); /** * @atomic_destroy_state: * * Frees the private object state created with @atomic_duplicate_state. */ void (*atomic_destroy_state)(struct drm_private_obj *obj, struct drm_private_state *state); /** * @atomic_print_state: * * If driver subclasses &struct drm_private_state, it should implement * this optional hook for printing additional driver specific state. * * Do not call this directly, use drm_atomic_private_obj_print_state() * instead. */ void (*atomic_print_state)(struct drm_printer *p, const struct drm_private_state *state); }; /** * struct drm_private_obj - base struct for driver private atomic object * * A driver private object is initialized by calling * drm_atomic_private_obj_init() and cleaned up by calling * drm_atomic_private_obj_fini(). * * Currently only tracks the state update functions and the opaque driver * private state itself, but in the future might also track which * &drm_modeset_lock is required to duplicate and update this object's state. * * All private objects must be initialized before the DRM device they are * attached to is registered to the DRM subsystem (call to drm_dev_register()) * and should stay around until this DRM device is unregistered (call to * drm_dev_unregister()). In other words, private objects lifetime is tied * to the DRM device lifetime. This implies that: * * 1/ all calls to drm_atomic_private_obj_init() must be done before calling * drm_dev_register() * 2/ all calls to drm_atomic_private_obj_fini() must be done after calling * drm_dev_unregister() * * If that private object is used to store a state shared by multiple * CRTCs, proper care must be taken to ensure that non-blocking commits are * properly ordered to avoid a use-after-free issue. * * Indeed, assuming a sequence of two non-blocking &drm_atomic_commit on two * different &drm_crtc using different &drm_plane and &drm_connector, so with no * resources shared, there's no guarantee on which commit is going to happen * first. However, the second &drm_atomic_commit will consider the first * &drm_private_obj its old state, and will be in charge of freeing it whenever * the second &drm_atomic_commit is done. * * If the first &drm_atomic_commit happens after it, it will consider its * &drm_private_obj the new state and will be likely to access it, resulting in * an access to a freed memory region. Drivers should store (and get a reference * to) the &drm_crtc_commit structure in our private state in * &drm_mode_config_helper_funcs.atomic_commit_setup, and then wait for that * commit to complete as the first step of * &drm_mode_config_helper_funcs.atomic_commit_tail, similar to * drm_atomic_helper_wait_for_dependencies(). */ struct drm_private_obj { /** * @head: List entry used to attach a private object to a &drm_device * (queued to &drm_mode_config.privobj_list). */ struct list_head head; /** * @lock: Modeset lock to protect the state object. */ struct drm_modeset_lock lock; /** * @state: Current atomic state for this driver private object. */ struct drm_private_state *state; /** * @funcs: * * Functions to manipulate the state of this driver private object, see * &drm_private_state_funcs. */ const struct drm_private_state_funcs *funcs; }; /** * drm_for_each_privobj() - private object iterator * * @privobj: pointer to the current private object. Updated after each * iteration * @dev: the DRM device we want get private objects from * * Allows one to iterate over all private objects attached to @dev */ #define drm_for_each_privobj(privobj, dev) \ list_for_each_entry(privobj, &(dev)->mode_config.privobj_list, head) /** * struct drm_private_state - base struct for driver private object state * * Currently only contains a backpointer to the overall atomic update, * and the relevant private object but in the future also might hold * synchronization information similar to e.g. &drm_crtc.commit. */ struct drm_private_state { /** * @state: backpointer to global drm_atomic_state */ struct drm_atomic_state *state; /** * @obj: backpointer to the private object */ struct drm_private_obj *obj; }; struct __drm_private_objs_state { struct drm_private_obj *ptr; struct drm_private_state *state, *old_state, *new_state; }; /** * struct drm_atomic_state - Atomic commit structure * * This structure is the kernel counterpart of @drm_mode_atomic and represents * an atomic commit that transitions from an old to a new display state. It * contains all the objects affected by the atomic commit and both the new * state structures and pointers to the old state structures for * these. * * States are added to an atomic update by calling drm_atomic_get_crtc_state(), * drm_atomic_get_plane_state(), drm_atomic_get_connector_state(), or for * private state structures, drm_atomic_get_private_obj_state(). * * NOTE: struct drm_atomic_state first started as a single collection of * entities state pointers (drm_plane_state, drm_crtc_state, etc.). * * At atomic_check time, you could get the state about to be committed * from drm_atomic_state, and the one currently running from the * entities state pointer (drm_crtc.state, for example). After the call * to drm_atomic_helper_swap_state(), the entities state pointer would * contain the state previously checked, and the drm_atomic_state * structure the old state. * * Over time, and in order to avoid confusion, drm_atomic_state has * grown to have both the old state (ie, the state we replace) and the * new state (ie, the state we want to apply). Those names are stable * during the commit process, which makes it easier to reason about. * * You can still find some traces of that evolution through some hooks * or callbacks taking a drm_atomic_state parameter called names like * "old_state". This doesn't necessarily mean that the previous * drm_atomic_state is passed, but rather that this used to be the state * collection we were replacing after drm_atomic_helper_swap_state(), * but the variable name was never updated. * * Some atomic operations implementations followed a similar process. We * first started to pass the entity state only. However, it was pretty * cumbersome for drivers, and especially CRTCs, to retrieve the states * of other components. Thus, we switched to passing the whole * drm_atomic_state as a parameter to those operations. Similarly, the * transition isn't complete yet, and one might still find atomic * operations taking a drm_atomic_state pointer, or a component state * pointer. The former is the preferred form. */ struct drm_atomic_state { /** * @ref: * * Count of all references to this update (will not be freed until zero). */ struct kref ref; /** * @dev: Parent DRM Device. */ struct drm_device *dev; /** * @allow_modeset: * * Allow full modeset. This is used by the ATOMIC IOCTL handler to * implement the DRM_MODE_ATOMIC_ALLOW_MODESET flag. Drivers should * generally not consult this flag, but instead look at the output of * drm_atomic_crtc_needs_modeset(). The detailed rules are: * * - Drivers must not consult @allow_modeset in the atomic commit path. * Use drm_atomic_crtc_needs_modeset() instead. * * - Drivers must consult @allow_modeset before adding unrelated struct * drm_crtc_state to this commit by calling * drm_atomic_get_crtc_state(). See also the warning in the * documentation for that function. * * - Drivers must never change this flag, it is under the exclusive * control of userspace. * * - Drivers may consult @allow_modeset in the atomic check path, if * they have the choice between an optimal hardware configuration * which requires a modeset, and a less optimal configuration which * can be committed without a modeset. An example would be suboptimal * scanout FIFO allocation resulting in increased idle power * consumption. This allows userspace to avoid flickering and delays * for the normal composition loop at reasonable cost. */ bool allow_modeset : 1; /** * @legacy_cursor_update: * * Hint to enforce legacy cursor IOCTL semantics. * * WARNING: This is thoroughly broken and pretty much impossible to * implement correctly. Drivers must ignore this and should instead * implement &drm_plane_helper_funcs.atomic_async_check and * &drm_plane_helper_funcs.atomic_async_commit hooks. New users of this * flag are not allowed. */ bool legacy_cursor_update : 1; /** * @async_update: hint for asynchronous plane update */ bool async_update : 1; /** * @duplicated: * * Indicates whether or not this atomic state was duplicated using * drm_atomic_helper_duplicate_state(). Drivers and atomic helpers * should use this to fixup normal inconsistencies in duplicated * states. */ bool duplicated : 1; /** * @planes: * * Pointer to array of @drm_plane and @drm_plane_state part of this * update. */ struct __drm_planes_state *planes; /** * @crtcs: * * Pointer to array of @drm_crtc and @drm_crtc_state part of this * update. */ struct __drm_crtcs_state *crtcs; /** * @num_connector: size of the @connectors array */ int num_connector; /** * @connectors: * * Pointer to array of @drm_connector and @drm_connector_state part of * this update. */ struct __drm_connnectors_state *connectors; /** * @num_private_objs: size of the @private_objs array */ int num_private_objs; /** * @private_objs: * * Pointer to array of @drm_private_obj and @drm_private_obj_state part * of this update. */ struct __drm_private_objs_state *private_objs; /** * @acquire_ctx: acquire context for this atomic modeset state update */ struct drm_modeset_acquire_ctx *acquire_ctx; /** * @fake_commit: * * Used for signaling unbound planes/connectors. * When a connector or plane is not bound to any CRTC, it's still important * to preserve linearity to prevent the atomic states from being freed too early. * * This commit (if set) is not bound to any CRTC, but will be completed when * drm_atomic_helper_commit_hw_done() is called. */ struct drm_crtc_commit *fake_commit; /** * @commit_work: * * Work item which can be used by the driver or helpers to execute the * commit without blocking. */ struct work_struct commit_work; }; void __drm_crtc_commit_free(struct kref *kref); /** * drm_crtc_commit_get - acquire a reference to the CRTC commit * @commit: CRTC commit * * Increases the reference of @commit. * * Returns: * The pointer to @commit, with reference increased. */ static inline struct drm_crtc_commit *drm_crtc_commit_get(struct drm_crtc_commit *commit) { kref_get(&commit->ref); return commit; } /** * drm_crtc_commit_put - release a reference to the CRTC commmit * @commit: CRTC commit * * This releases a reference to @commit which is freed after removing the * final reference. No locking required and callable from any context. */ static inline void drm_crtc_commit_put(struct drm_crtc_commit *commit) { kref_put(&commit->ref, __drm_crtc_commit_free); } int drm_crtc_commit_wait(struct drm_crtc_commit *commit); struct drm_atomic_state * __must_check drm_atomic_state_alloc(struct drm_device *dev); void drm_atomic_state_clear(struct drm_atomic_state *state); /** * drm_atomic_state_get - acquire a reference to the atomic state * @state: The atomic state * * Returns a new reference to the @state */ static inline struct drm_atomic_state * drm_atomic_state_get(struct drm_atomic_state *state) { kref_get(&state->ref); return state; } void __drm_atomic_state_free(struct kref *ref); /** * drm_atomic_state_put - release a reference to the atomic state * @state: The atomic state * * This releases a reference to @state which is freed after removing the * final reference. No locking required and callable from any context. */ static inline void drm_atomic_state_put(struct drm_atomic_state *state) { kref_put(&state->ref, __drm_atomic_state_free); } int __must_check drm_atomic_state_init(struct drm_device *dev, struct drm_atomic_state *state); void drm_atomic_state_default_clear(struct drm_atomic_state *state); void drm_atomic_state_default_release(struct drm_atomic_state *state); struct drm_crtc_state * __must_check drm_atomic_get_crtc_state(struct drm_atomic_state *state, struct drm_crtc *crtc); struct drm_plane_state * __must_check drm_atomic_get_plane_state(struct drm_atomic_state *state, struct drm_plane *plane); struct drm_connector_state * __must_check drm_atomic_get_connector_state(struct drm_atomic_state *state, struct drm_connector *connector); void drm_atomic_private_obj_init(struct drm_device *dev, struct drm_private_obj *obj, struct drm_private_state *state, const struct drm_private_state_funcs *funcs); void drm_atomic_private_obj_fini(struct drm_private_obj *obj); struct drm_private_state * __must_check drm_atomic_get_private_obj_state(struct drm_atomic_state *state, struct drm_private_obj *obj); struct drm_private_state * drm_atomic_get_old_private_obj_state(const struct drm_atomic_state *state, struct drm_private_obj *obj); struct drm_private_state * drm_atomic_get_new_private_obj_state(const struct drm_atomic_state *state, struct drm_private_obj *obj); struct drm_connector * drm_atomic_get_old_connector_for_encoder(const struct drm_atomic_state *state, struct drm_encoder *encoder); struct drm_connector * drm_atomic_get_new_connector_for_encoder(const struct drm_atomic_state *state, struct drm_encoder *encoder); struct drm_connector * drm_atomic_get_connector_for_encoder(const struct drm_encoder *encoder, struct drm_modeset_acquire_ctx *ctx); struct drm_crtc * drm_atomic_get_old_crtc_for_encoder(struct drm_atomic_state *state, struct drm_encoder *encoder); struct drm_crtc * drm_atomic_get_new_crtc_for_encoder(struct drm_atomic_state *state, struct drm_encoder *encoder); /** * drm_atomic_get_existing_crtc_state - get CRTC state, if it exists * @state: global atomic state object * @crtc: CRTC to grab * * This function returns the CRTC state for the given CRTC, or NULL * if the CRTC is not part of the global atomic state. * * This function is deprecated, @drm_atomic_get_old_crtc_state or * @drm_atomic_get_new_crtc_state should be used instead. */ static inline struct drm_crtc_state * drm_atomic_get_existing_crtc_state(const struct drm_atomic_state *state, struct drm_crtc *crtc) { return state->crtcs[drm_crtc_index(crtc)].state; } /** * drm_atomic_get_old_crtc_state - get old CRTC state, if it exists * @state: global atomic state object * @crtc: CRTC to grab * * This function returns the old CRTC state for the given CRTC, or * NULL if the CRTC is not part of the global atomic state. */ static inline struct drm_crtc_state * drm_atomic_get_old_crtc_state(const struct drm_atomic_state *state, struct drm_crtc *crtc) { return state->crtcs[drm_crtc_index(crtc)].old_state; } /** * drm_atomic_get_new_crtc_state - get new CRTC state, if it exists * @state: global atomic state object * @crtc: CRTC to grab * * This function returns the new CRTC state for the given CRTC, or * NULL if the CRTC is not part of the global atomic state. */ static inline struct drm_crtc_state * drm_atomic_get_new_crtc_state(const struct drm_atomic_state *state, struct drm_crtc *crtc) { return state->crtcs[drm_crtc_index(crtc)].new_state; } /** * drm_atomic_get_existing_plane_state - get plane state, if it exists * @state: global atomic state object * @plane: plane to grab * * This function returns the plane state for the given plane, or NULL * if the plane is not part of the global atomic state. * * This function is deprecated, @drm_atomic_get_old_plane_state or * @drm_atomic_get_new_plane_state should be used instead. */ static inline struct drm_plane_state * drm_atomic_get_existing_plane_state(const struct drm_atomic_state *state, struct drm_plane *plane) { return state->planes[drm_plane_index(plane)].state; } /** * drm_atomic_get_old_plane_state - get plane state, if it exists * @state: global atomic state object * @plane: plane to grab * * This function returns the old plane state for the given plane, or * NULL if the plane is not part of the global atomic state. */ static inline struct drm_plane_state * drm_atomic_get_old_plane_state(const struct drm_atomic_state *state, struct drm_plane *plane) { return state->planes[drm_plane_index(plane)].old_state; } /** * drm_atomic_get_new_plane_state - get plane state, if it exists * @state: global atomic state object * @plane: plane to grab * * This function returns the new plane state for the given plane, or * NULL if the plane is not part of the global atomic state. */ static inline struct drm_plane_state * drm_atomic_get_new_plane_state(const struct drm_atomic_state *state, struct drm_plane *plane) { return state->planes[drm_plane_index(plane)].new_state; } /** * drm_atomic_get_existing_connector_state - get connector state, if it exists * @state: global atomic state object * @connector: connector to grab * * This function returns the connector state for the given connector, * or NULL if the connector is not part of the global atomic state. * * This function is deprecated, @drm_atomic_get_old_connector_state or * @drm_atomic_get_new_connector_state should be used instead. */ static inline struct drm_connector_state * drm_atomic_get_existing_connector_state(const struct drm_atomic_state *state, struct drm_connector *connector) { int index = drm_connector_index(connector); if (index >= state->num_connector) return NULL; return state->connectors[index].state; } /** * drm_atomic_get_old_connector_state - get connector state, if it exists * @state: global atomic state object * @connector: connector to grab * * This function returns the old connector state for the given connector, * or NULL if the connector is not part of the global atomic state. */ static inline struct drm_connector_state * drm_atomic_get_old_connector_state(const struct drm_atomic_state *state, struct drm_connector *connector) { int index = drm_connector_index(connector); if (index >= state->num_connector) return NULL; return state->connectors[index].old_state; } /** * drm_atomic_get_new_connector_state - get connector state, if it exists * @state: global atomic state object * @connector: connector to grab * * This function returns the new connector state for the given connector, * or NULL if the connector is not part of the global atomic state. */ static inline struct drm_connector_state * drm_atomic_get_new_connector_state(const struct drm_atomic_state *state, struct drm_connector *connector) { int index = drm_connector_index(connector); if (index >= state->num_connector) return NULL; return state->connectors[index].new_state; } /** * __drm_atomic_get_current_plane_state - get current plane state * @state: global atomic state object * @plane: plane to grab * * This function returns the plane state for the given plane, either from * @state, or if the plane isn't part of the atomic state update, from @plane. * This is useful in atomic check callbacks, when drivers need to peek at, but * not change, state of other planes, since it avoids threading an error code * back up the call chain. * * WARNING: * * Note that this function is in general unsafe since it doesn't check for the * required locking for access state structures. Drivers must ensure that it is * safe to access the returned state structure through other means. One common * example is when planes are fixed to a single CRTC, and the driver knows that * the CRTC lock is held already. In that case holding the CRTC lock gives a * read-lock on all planes connected to that CRTC. But if planes can be * reassigned things get more tricky. In that case it's better to use * drm_atomic_get_plane_state and wire up full error handling. * * Returns: * * Read-only pointer to the current plane state. */ static inline const struct drm_plane_state * __drm_atomic_get_current_plane_state(const struct drm_atomic_state *state, struct drm_plane *plane) { if (state->planes[drm_plane_index(plane)].state) return state->planes[drm_plane_index(plane)].state; return plane->state; } int __must_check drm_atomic_add_encoder_bridges(struct drm_atomic_state *state, struct drm_encoder *encoder); int __must_check drm_atomic_add_affected_connectors(struct drm_atomic_state *state, struct drm_crtc *crtc); int __must_check drm_atomic_add_affected_planes(struct drm_atomic_state *state, struct drm_crtc *crtc); int __must_check drm_atomic_check_only(struct drm_atomic_state *state); int __must_check drm_atomic_commit(struct drm_atomic_state *state); int __must_check drm_atomic_nonblocking_commit(struct drm_atomic_state *state); void drm_state_dump(struct drm_device *dev, struct drm_printer *p); /** * for_each_oldnew_connector_in_state - iterate over all connectors in an atomic update * @__state: &struct drm_atomic_state pointer * @connector: &struct drm_connector iteration cursor * @old_connector_state: &struct drm_connector_state iteration cursor for the * old state * @new_connector_state: &struct drm_connector_state iteration cursor for the * new state * @__i: int iteration cursor, for macro-internal use * * This iterates over all connectors in an atomic update, tracking both old and * new state. This is useful in places where the state delta needs to be * considered, for example in atomic check functions. */ #define for_each_oldnew_connector_in_state(__state, connector, old_connector_state, new_connector_state, __i) \ for ((__i) = 0; \ (__i) < (__state)->num_connector; \ (__i)++) \ for_each_if ((__state)->connectors[__i].ptr && \ ((connector) = (__state)->connectors[__i].ptr, \ (void)(connector) /* Only to avoid unused-but-set-variable warning */, \ (old_connector_state) = (__state)->connectors[__i].old_state, \ (new_connector_state) = (__state)->connectors[__i].new_state, 1)) /** * for_each_old_connector_in_state - iterate over all connectors in an atomic update * @__state: &struct drm_atomic_state pointer * @connector: &struct drm_connector iteration cursor * @old_connector_state: &struct drm_connector_state iteration cursor for the * old state * @__i: int iteration cursor, for macro-internal use * * This iterates over all connectors in an atomic update, tracking only the old * state. This is useful in disable functions, where we need the old state the * hardware is still in. */ #define for_each_old_connector_in_state(__state, connector, old_connector_state, __i) \ for ((__i) = 0; \ (__i) < (__state)->num_connector; \ (__i)++) \ for_each_if ((__state)->connectors[__i].ptr && \ ((connector) = (__state)->connectors[__i].ptr, \ (void)(connector) /* Only to avoid unused-but-set-variable warning */, \ (old_connector_state) = (__state)->connectors[__i].old_state, 1)) /** * for_each_new_connector_in_state - iterate over all connectors in an atomic update * @__state: &struct drm_atomic_state pointer * @connector: &struct drm_connector iteration cursor * @new_connector_state: &struct drm_connector_state iteration cursor for the * new state * @__i: int iteration cursor, for macro-internal use * * This iterates over all connectors in an atomic update, tracking only the new * state. This is useful in enable functions, where we need the new state the * hardware should be in when the atomic commit operation has completed. */ #define for_each_new_connector_in_state(__state, connector, new_connector_state, __i) \ for ((__i) = 0; \ (__i) < (__state)->num_connector; \ (__i)++) \ for_each_if ((__state)->connectors[__i].ptr && \ ((connector) = (__state)->connectors[__i].ptr, \ (void)(connector) /* Only to avoid unused-but-set-variable warning */, \ (new_connector_state) = (__state)->connectors[__i].new_state, \ (void)(new_connector_state) /* Only to avoid unused-but-set-variable warning */, 1)) /** * for_each_oldnew_crtc_in_state - iterate over all CRTCs in an atomic update * @__state: &struct drm_atomic_state pointer * @crtc: &struct drm_crtc iteration cursor * @old_crtc_state: &struct drm_crtc_state iteration cursor for the old state * @new_crtc_state: &struct drm_crtc_state iteration cursor for the new state * @__i: int iteration cursor, for macro-internal use * * This iterates over all CRTCs in an atomic update, tracking both old and * new state. This is useful in places where the state delta needs to be * considered, for example in atomic check functions. */ #define for_each_oldnew_crtc_in_state(__state, crtc, old_crtc_state, new_crtc_state, __i) \ for ((__i) = 0; \ (__i) < (__state)->dev->mode_config.num_crtc; \ (__i)++) \ for_each_if ((__state)->crtcs[__i].ptr && \ ((crtc) = (__state)->crtcs[__i].ptr, \ (void)(crtc) /* Only to avoid unused-but-set-variable warning */, \ (old_crtc_state) = (__state)->crtcs[__i].old_state, \ (void)(old_crtc_state) /* Only to avoid unused-but-set-variable warning */, \ (new_crtc_state) = (__state)->crtcs[__i].new_state, \ (void)(new_crtc_state) /* Only to avoid unused-but-set-variable warning */, 1)) /** * for_each_old_crtc_in_state - iterate over all CRTCs in an atomic update * @__state: &struct drm_atomic_state pointer * @crtc: &struct drm_crtc iteration cursor * @old_crtc_state: &struct drm_crtc_state iteration cursor for the old state * @__i: int iteration cursor, for macro-internal use * * This iterates over all CRTCs in an atomic update, tracking only the old * state. This is useful in disable functions, where we need the old state the * hardware is still in. */ #define for_each_old_crtc_in_state(__state, crtc, old_crtc_state, __i) \ for ((__i) = 0; \ (__i) < (__state)->dev->mode_config.num_crtc; \ (__i)++) \ for_each_if ((__state)->crtcs[__i].ptr && \ ((crtc) = (__state)->crtcs[__i].ptr, \ (void)(crtc) /* Only to avoid unused-but-set-variable warning */, \ (old_crtc_state) = (__state)->crtcs[__i].old_state, 1)) /** * for_each_new_crtc_in_state - iterate over all CRTCs in an atomic update * @__state: &struct drm_atomic_state pointer * @crtc: &struct drm_crtc iteration cursor * @new_crtc_state: &struct drm_crtc_state iteration cursor for the new state * @__i: int iteration cursor, for macro-internal use * * This iterates over all CRTCs in an atomic update, tracking only the new * state. This is useful in enable functions, where we need the new state the * hardware should be in when the atomic commit operation has completed. */ #define for_each_new_crtc_in_state(__state, crtc, new_crtc_state, __i) \ for ((__i) = 0; \ (__i) < (__state)->dev->mode_config.num_crtc; \ (__i)++) \ for_each_if ((__state)->crtcs[__i].ptr && \ ((crtc) = (__state)->crtcs[__i].ptr, \ (void)(crtc) /* Only to avoid unused-but-set-variable warning */, \ (new_crtc_state) = (__state)->crtcs[__i].new_state, \ (void)(new_crtc_state) /* Only to avoid unused-but-set-variable warning */, 1)) /** * for_each_oldnew_plane_in_state - iterate over all planes in an atomic update * @__state: &struct drm_atomic_state pointer * @plane: &struct drm_plane iteration cursor * @old_plane_state: &struct drm_plane_state iteration cursor for the old state * @new_plane_state: &struct drm_plane_state iteration cursor for the new state * @__i: int iteration cursor, for macro-internal use * * This iterates over all planes in an atomic update, tracking both old and * new state. This is useful in places where the state delta needs to be * considered, for example in atomic check functions. */ #define for_each_oldnew_plane_in_state(__state, plane, old_plane_state, new_plane_state, __i) \ for ((__i) = 0; \ (__i) < (__state)->dev->mode_config.num_total_plane; \ (__i)++) \ for_each_if ((__state)->planes[__i].ptr && \ ((plane) = (__state)->planes[__i].ptr, \ (void)(plane) /* Only to avoid unused-but-set-variable warning */, \ (old_plane_state) = (__state)->planes[__i].old_state,\ (new_plane_state) = (__state)->planes[__i].new_state, 1)) /** * for_each_oldnew_plane_in_state_reverse - iterate over all planes in an atomic * update in reverse order * @__state: &struct drm_atomic_state pointer * @plane: &struct drm_plane iteration cursor * @old_plane_state: &struct drm_plane_state iteration cursor for the old state * @new_plane_state: &struct drm_plane_state iteration cursor for the new state * @__i: int iteration cursor, for macro-internal use * * This iterates over all planes in an atomic update in reverse order, * tracking both old and new state. This is useful in places where the * state delta needs to be considered, for example in atomic check functions. */ #define for_each_oldnew_plane_in_state_reverse(__state, plane, old_plane_state, new_plane_state, __i) \ for ((__i) = ((__state)->dev->mode_config.num_total_plane - 1); \ (__i) >= 0; \ (__i)--) \ for_each_if ((__state)->planes[__i].ptr && \ ((plane) = (__state)->planes[__i].ptr, \ (old_plane_state) = (__state)->planes[__i].old_state,\ (new_plane_state) = (__state)->planes[__i].new_state, 1)) /** * for_each_new_plane_in_state_reverse - other than only tracking new state, * it's the same as for_each_oldnew_plane_in_state_reverse * @__state: &struct drm_atomic_state pointer * @plane: &struct drm_plane iteration cursor * @new_plane_state: &struct drm_plane_state iteration cursor for the new state * @__i: int iteration cursor, for macro-internal use */ #define for_each_new_plane_in_state_reverse(__state, plane, new_plane_state, __i) \ for ((__i) = ((__state)->dev->mode_config.num_total_plane - 1); \ (__i) >= 0; \ (__i)--) \ for_each_if ((__state)->planes[__i].ptr && \ ((plane) = (__state)->planes[__i].ptr, \ (new_plane_state) = (__state)->planes[__i].new_state, 1)) /** * for_each_old_plane_in_state - iterate over all planes in an atomic update * @__state: &struct drm_atomic_state pointer * @plane: &struct drm_plane iteration cursor * @old_plane_state: &struct drm_plane_state iteration cursor for the old state * @__i: int iteration cursor, for macro-internal use * * This iterates over all planes in an atomic update, tracking only the old * state. This is useful in disable functions, where we need the old state the * hardware is still in. */ #define for_each_old_plane_in_state(__state, plane, old_plane_state, __i) \ for ((__i) = 0; \ (__i) < (__state)->dev->mode_config.num_total_plane; \ (__i)++) \ for_each_if ((__state)->planes[__i].ptr && \ ((plane) = (__state)->planes[__i].ptr, \ (old_plane_state) = (__state)->planes[__i].old_state, 1)) /** * for_each_new_plane_in_state - iterate over all planes in an atomic update * @__state: &struct drm_atomic_state pointer * @plane: &struct drm_plane iteration cursor * @new_plane_state: &struct drm_plane_state iteration cursor for the new state * @__i: int iteration cursor, for macro-internal use * * This iterates over all planes in an atomic update, tracking only the new * state. This is useful in enable functions, where we need the new state the * hardware should be in when the atomic commit operation has completed. */ #define for_each_new_plane_in_state(__state, plane, new_plane_state, __i) \ for ((__i) = 0; \ (__i) < (__state)->dev->mode_config.num_total_plane; \ (__i)++) \ for_each_if ((__state)->planes[__i].ptr && \ ((plane) = (__state)->planes[__i].ptr, \ (void)(plane) /* Only to avoid unused-but-set-variable warning */, \ (new_plane_state) = (__state)->planes[__i].new_state, \ (void)(new_plane_state) /* Only to avoid unused-but-set-variable warning */, 1)) /** * for_each_oldnew_private_obj_in_state - iterate over all private objects in an atomic update * @__state: &struct drm_atomic_state pointer * @obj: &struct drm_private_obj iteration cursor * @old_obj_state: &struct drm_private_state iteration cursor for the old state * @new_obj_state: &struct drm_private_state iteration cursor for the new state * @__i: int iteration cursor, for macro-internal use * * This iterates over all private objects in an atomic update, tracking both * old and new state. This is useful in places where the state delta needs * to be considered, for example in atomic check functions. */ #define for_each_oldnew_private_obj_in_state(__state, obj, old_obj_state, new_obj_state, __i) \ for ((__i) = 0; \ (__i) < (__state)->num_private_objs && \ ((obj) = (__state)->private_objs[__i].ptr, \ (old_obj_state) = (__state)->private_objs[__i].old_state, \ (new_obj_state) = (__state)->private_objs[__i].new_state, 1); \ (__i)++) /** * for_each_old_private_obj_in_state - iterate over all private objects in an atomic update * @__state: &struct drm_atomic_state pointer * @obj: &struct drm_private_obj iteration cursor * @old_obj_state: &struct drm_private_state iteration cursor for the old state * @__i: int iteration cursor, for macro-internal use * * This iterates over all private objects in an atomic update, tracking only * the old state. This is useful in disable functions, where we need the old * state the hardware is still in. */ #define for_each_old_private_obj_in_state(__state, obj, old_obj_state, __i) \ for ((__i) = 0; \ (__i) < (__state)->num_private_objs && \ ((obj) = (__state)->private_objs[__i].ptr, \ (old_obj_state) = (__state)->private_objs[__i].old_state, 1); \ (__i)++) /** * for_each_new_private_obj_in_state - iterate over all private objects in an atomic update * @__state: &struct drm_atomic_state pointer * @obj: &struct drm_private_obj iteration cursor * @new_obj_state: &struct drm_private_state iteration cursor for the new state * @__i: int iteration cursor, for macro-internal use * * This iterates over all private objects in an atomic update, tracking only * the new state. This is useful in enable functions, where we need the new state the * hardware should be in when the atomic commit operation has completed. */ #define for_each_new_private_obj_in_state(__state, obj, new_obj_state, __i) \ for ((__i) = 0; \ (__i) < (__state)->num_private_objs && \ ((obj) = (__state)->private_objs[__i].ptr, \ (void)(obj) /* Only to avoid unused-but-set-variable warning */, \ (new_obj_state) = (__state)->private_objs[__i].new_state, 1); \ (__i)++) /** * drm_atomic_crtc_needs_modeset - compute combined modeset need * @state: &drm_crtc_state for the CRTC * * To give drivers flexibility &struct drm_crtc_state has 3 booleans to track * whether the state CRTC changed enough to need a full modeset cycle: * mode_changed, active_changed and connectors_changed. This helper simply * combines these three to compute the overall need for a modeset for @state. * * The atomic helper code sets these booleans, but drivers can and should * change them appropriately to accurately represent whether a modeset is * really needed. In general, drivers should avoid full modesets whenever * possible. * * For example if the CRTC mode has changed, and the hardware is able to enact * the requested mode change without going through a full modeset, the driver * should clear mode_changed in its &drm_mode_config_funcs.atomic_check * implementation. */ static inline bool drm_atomic_crtc_needs_modeset(const struct drm_crtc_state *state) { return state->mode_changed || state->active_changed || state->connectors_changed; } /** * drm_atomic_crtc_effectively_active - compute whether CRTC is actually active * @state: &drm_crtc_state for the CRTC * * When in self refresh mode, the crtc_state->active value will be false, since * the CRTC is off. However in some cases we're interested in whether the CRTC * is active, or effectively active (ie: it's connected to an active display). * In these cases, use this function instead of just checking active. */ static inline bool drm_atomic_crtc_effectively_active(const struct drm_crtc_state *state) { return state->active || state->self_refresh_active; } /** * struct drm_bus_cfg - bus configuration * * This structure stores the configuration of a physical bus between two * components in an output pipeline, usually between two bridges, an encoder * and a bridge, or a bridge and a connector. * * The bus configuration is stored in &drm_bridge_state separately for the * input and output buses, as seen from the point of view of each bridge. The * bus configuration of a bridge output is usually identical to the * configuration of the next bridge's input, but may differ if the signals are * modified between the two bridges, for instance by an inverter on the board. * The input and output configurations of a bridge may differ if the bridge * modifies the signals internally, for instance by performing format * conversion, or modifying signals polarities. */ struct drm_bus_cfg { /** * @format: format used on this bus (one of the MEDIA_BUS_FMT_* format) * * This field should not be directly modified by drivers * (drm_atomic_bridge_chain_select_bus_fmts() takes care of the bus * format negotiation). */ u32 format; /** * @flags: DRM_BUS_* flags used on this bus */ u32 flags; }; /** * struct drm_bridge_state - Atomic bridge state object */ struct drm_bridge_state { /** * @base: inherit from &drm_private_state */ struct drm_private_state base; /** * @bridge: the bridge this state refers to */ struct drm_bridge *bridge; /** * @input_bus_cfg: input bus configuration */ struct drm_bus_cfg input_bus_cfg; /** * @output_bus_cfg: output bus configuration */ struct drm_bus_cfg output_bus_cfg; }; static inline struct drm_bridge_state * drm_priv_to_bridge_state(struct drm_private_state *priv) { return container_of(priv, struct drm_bridge_state, base); } struct drm_bridge_state * drm_atomic_get_bridge_state(struct drm_atomic_state *state, struct drm_bridge *bridge); struct drm_bridge_state * drm_atomic_get_old_bridge_state(const struct drm_atomic_state *state, struct drm_bridge *bridge); struct drm_bridge_state * drm_atomic_get_new_bridge_state(const struct drm_atomic_state *state, struct drm_bridge *bridge); #endif /* DRM_ATOMIC_H_ */ |
| 28 28 28 28 28 28 28 108 35 108 108 22 6 16 136 132 118 109 22 135 78 71 56 17 71 110 107 56 9 9 7 1 1 9 9 8 9 8 1 8 9 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 | // SPDX-License-Identifier: GPL-2.0-or-later /* SCTP kernel implementation * Copyright (c) 1999-2000 Cisco, Inc. * Copyright (c) 1999-2001 Motorola, Inc. * Copyright (c) 2001-2002 International Business Machines, Corp. * Copyright (c) 2001 Intel Corp. * Copyright (c) 2001 Nokia, Inc. * Copyright (c) 2001 La Monte H.P. Yarroll * * This file is part of the SCTP kernel implementation * * This abstraction represents an SCTP endpoint. * * Please send any bug reports or fixes you make to the * email address(es): * lksctp developers <linux-sctp@vger.kernel.org> * * Written or modified by: * La Monte H.P. Yarroll <piggy@acm.org> * Karl Knutson <karl@athena.chicago.il.us> * Jon Grimm <jgrimm@austin.ibm.com> * Daisy Chang <daisyc@us.ibm.com> * Dajiang Zhang <dajiang.zhang@nokia.com> */ #include <linux/types.h> #include <linux/slab.h> #include <linux/in.h> #include <linux/random.h> /* get_random_bytes() */ #include <net/sock.h> #include <net/ipv6.h> #include <net/sctp/sctp.h> #include <net/sctp/sm.h> /* Forward declarations for internal helpers. */ static void sctp_endpoint_bh_rcv(struct work_struct *work); /* * Initialize the base fields of the endpoint structure. */ static struct sctp_endpoint *sctp_endpoint_init(struct sctp_endpoint *ep, struct sock *sk, gfp_t gfp) { struct net *net = sock_net(sk); struct sctp_shared_key *null_key; ep->digest = kzalloc(SCTP_SIGNATURE_SIZE, gfp); if (!ep->digest) return NULL; ep->asconf_enable = net->sctp.addip_enable; ep->auth_enable = net->sctp.auth_enable; if (ep->auth_enable) { if (sctp_auth_init(ep, gfp)) goto nomem; if (ep->asconf_enable) { sctp_auth_ep_add_chunkid(ep, SCTP_CID_ASCONF); sctp_auth_ep_add_chunkid(ep, SCTP_CID_ASCONF_ACK); } } /* Initialize the base structure. */ /* What type of endpoint are we? */ ep->base.type = SCTP_EP_TYPE_SOCKET; /* Initialize the basic object fields. */ refcount_set(&ep->base.refcnt, 1); ep->base.dead = false; /* Create an input queue. */ sctp_inq_init(&ep->base.inqueue); /* Set its top-half handler */ sctp_inq_set_th_handler(&ep->base.inqueue, sctp_endpoint_bh_rcv); /* Initialize the bind addr area */ sctp_bind_addr_init(&ep->base.bind_addr, 0); /* Create the lists of associations. */ INIT_LIST_HEAD(&ep->asocs); /* Use SCTP specific send buffer space queues. */ ep->sndbuf_policy = net->sctp.sndbuf_policy; sk->sk_data_ready = sctp_data_ready; sk->sk_write_space = sctp_write_space; sock_set_flag(sk, SOCK_USE_WRITE_QUEUE); /* Get the receive buffer policy for this endpoint */ ep->rcvbuf_policy = net->sctp.rcvbuf_policy; /* Initialize the secret key used with cookie. */ get_random_bytes(ep->secret_key, sizeof(ep->secret_key)); /* SCTP-AUTH extensions*/ INIT_LIST_HEAD(&ep->endpoint_shared_keys); null_key = sctp_auth_shkey_create(0, gfp); if (!null_key) goto nomem_shkey; list_add(&null_key->key_list, &ep->endpoint_shared_keys); /* Add the null key to the endpoint shared keys list and * set the hmcas and chunks pointers. */ ep->prsctp_enable = net->sctp.prsctp_enable; ep->reconf_enable = net->sctp.reconf_enable; ep->ecn_enable = net->sctp.ecn_enable; /* Remember who we are attached to. */ ep->base.sk = sk; ep->base.net = sock_net(sk); sock_hold(ep->base.sk); return ep; nomem_shkey: sctp_auth_free(ep); nomem: kfree(ep->digest); return NULL; } /* Create a sctp_endpoint with all that boring stuff initialized. * Returns NULL if there isn't enough memory. */ struct sctp_endpoint *sctp_endpoint_new(struct sock *sk, gfp_t gfp) { struct sctp_endpoint *ep; /* Build a local endpoint. */ ep = kzalloc(sizeof(*ep), gfp); if (!ep) goto fail; if (!sctp_endpoint_init(ep, sk, gfp)) goto fail_init; SCTP_DBG_OBJCNT_INC(ep); return ep; fail_init: kfree(ep); fail: return NULL; } /* Add an association to an endpoint. */ void sctp_endpoint_add_asoc(struct sctp_endpoint *ep, struct sctp_association *asoc) { struct sock *sk = ep->base.sk; /* If this is a temporary association, don't bother * since we'll be removing it shortly and don't * want anyone to find it anyway. */ if (asoc->temp) return; /* Now just add it to our list of asocs */ list_add_tail(&asoc->asocs, &ep->asocs); /* Increment the backlog value for a TCP-style listening socket. */ if (sctp_style(sk, TCP) && sctp_sstate(sk, LISTENING)) sk_acceptq_added(sk); } /* Free the endpoint structure. Delay cleanup until * all users have released their reference count on this structure. */ void sctp_endpoint_free(struct sctp_endpoint *ep) { ep->base.dead = true; inet_sk_set_state(ep->base.sk, SCTP_SS_CLOSED); /* Unlink this endpoint, so we can't find it again! */ sctp_unhash_endpoint(ep); sctp_endpoint_put(ep); } /* Final destructor for endpoint. */ static void sctp_endpoint_destroy_rcu(struct rcu_head *head) { struct sctp_endpoint *ep = container_of(head, struct sctp_endpoint, rcu); struct sock *sk = ep->base.sk; sctp_sk(sk)->ep = NULL; sock_put(sk); kfree(ep); SCTP_DBG_OBJCNT_DEC(ep); } static void sctp_endpoint_destroy(struct sctp_endpoint *ep) { struct sock *sk; if (unlikely(!ep->base.dead)) { WARN(1, "Attempt to destroy undead endpoint %p!\n", ep); return; } /* Free the digest buffer */ kfree(ep->digest); /* SCTP-AUTH: Free up AUTH releated data such as shared keys * chunks and hmacs arrays that were allocated */ sctp_auth_destroy_keys(&ep->endpoint_shared_keys); sctp_auth_free(ep); /* Cleanup. */ sctp_inq_free(&ep->base.inqueue); sctp_bind_addr_free(&ep->base.bind_addr); memset(ep->secret_key, 0, sizeof(ep->secret_key)); sk = ep->base.sk; /* Remove and free the port */ if (sctp_sk(sk)->bind_hash) sctp_put_port(sk); call_rcu(&ep->rcu, sctp_endpoint_destroy_rcu); } /* Hold a reference to an endpoint. */ int sctp_endpoint_hold(struct sctp_endpoint *ep) { return refcount_inc_not_zero(&ep->base.refcnt); } /* Release a reference to an endpoint and clean up if there are * no more references. */ void sctp_endpoint_put(struct sctp_endpoint *ep) { if (refcount_dec_and_test(&ep->base.refcnt)) sctp_endpoint_destroy(ep); } /* Is this the endpoint we are looking for? */ struct sctp_endpoint *sctp_endpoint_is_match(struct sctp_endpoint *ep, struct net *net, const union sctp_addr *laddr, int dif, int sdif) { int bound_dev_if = READ_ONCE(ep->base.sk->sk_bound_dev_if); struct sctp_endpoint *retval = NULL; if (net_eq(ep->base.net, net) && sctp_sk_bound_dev_eq(net, bound_dev_if, dif, sdif) && (htons(ep->base.bind_addr.port) == laddr->v4.sin_port)) { if (sctp_bind_addr_match(&ep->base.bind_addr, laddr, sctp_sk(ep->base.sk))) retval = ep; } return retval; } /* Find the association that goes with this chunk. * We lookup the transport from hashtable at first, then get association * through t->assoc. */ struct sctp_association *sctp_endpoint_lookup_assoc( const struct sctp_endpoint *ep, const union sctp_addr *paddr, struct sctp_transport **transport) { struct sctp_association *asoc = NULL; struct sctp_transport *t; *transport = NULL; /* If the local port is not set, there can't be any associations * on this endpoint. */ if (!ep->base.bind_addr.port) return NULL; rcu_read_lock(); t = sctp_epaddr_lookup_transport(ep, paddr); if (!t) goto out; *transport = t; asoc = t->asoc; out: rcu_read_unlock(); return asoc; } /* Look for any peeled off association from the endpoint that matches the * given peer address. */ bool sctp_endpoint_is_peeled_off(struct sctp_endpoint *ep, const union sctp_addr *paddr) { int bound_dev_if = READ_ONCE(ep->base.sk->sk_bound_dev_if); struct sctp_sockaddr_entry *addr; struct net *net = ep->base.net; struct sctp_bind_addr *bp; bp = &ep->base.bind_addr; /* This function is called with the socket lock held, * so the address_list can not change. */ list_for_each_entry(addr, &bp->address_list, list) { if (sctp_has_association(net, &addr->a, paddr, bound_dev_if, bound_dev_if)) return true; } return false; } /* Do delayed input processing. This is scheduled by sctp_rcv(). * This may be called on BH or task time. */ static void sctp_endpoint_bh_rcv(struct work_struct *work) { struct sctp_endpoint *ep = container_of(work, struct sctp_endpoint, base.inqueue.immediate); struct sctp_association *asoc; struct sock *sk; struct net *net; struct sctp_transport *transport; struct sctp_chunk *chunk; struct sctp_inq *inqueue; union sctp_subtype subtype; enum sctp_state state; int error = 0; int first_time = 1; /* is this the first time through the loop */ if (ep->base.dead) return; asoc = NULL; inqueue = &ep->base.inqueue; sk = ep->base.sk; net = sock_net(sk); while (NULL != (chunk = sctp_inq_pop(inqueue))) { subtype = SCTP_ST_CHUNK(chunk->chunk_hdr->type); /* If the first chunk in the packet is AUTH, do special * processing specified in Section 6.3 of SCTP-AUTH spec */ if (first_time && (subtype.chunk == SCTP_CID_AUTH)) { struct sctp_chunkhdr *next_hdr; next_hdr = sctp_inq_peek(inqueue); if (!next_hdr) goto normal; /* If the next chunk is COOKIE-ECHO, skip the AUTH * chunk while saving a pointer to it so we can do * Authentication later (during cookie-echo * processing). */ if (next_hdr->type == SCTP_CID_COOKIE_ECHO) { chunk->auth_chunk = skb_clone(chunk->skb, GFP_ATOMIC); chunk->auth = 1; continue; } } normal: /* We might have grown an association since last we * looked, so try again. * * This happens when we've just processed our * COOKIE-ECHO chunk. */ if (NULL == chunk->asoc) { asoc = sctp_endpoint_lookup_assoc(ep, sctp_source(chunk), &transport); chunk->asoc = asoc; chunk->transport = transport; } state = asoc ? asoc->state : SCTP_STATE_CLOSED; if (sctp_auth_recv_cid(subtype.chunk, asoc) && !chunk->auth) continue; /* Remember where the last DATA chunk came from so we * know where to send the SACK. */ if (asoc && sctp_chunk_is_data(chunk)) asoc->peer.last_data_from = chunk->transport; else { SCTP_INC_STATS(ep->base.net, SCTP_MIB_INCTRLCHUNKS); if (asoc) asoc->stats.ictrlchunks++; } if (chunk->transport) chunk->transport->last_time_heard = ktime_get(); error = sctp_do_sm(net, SCTP_EVENT_T_CHUNK, subtype, state, ep, asoc, chunk, GFP_ATOMIC); if (error && chunk) chunk->pdiscard = 1; /* Check to see if the endpoint is freed in response to * the incoming chunk. If so, get out of the while loop. */ if (!sctp_sk(sk)->ep) break; if (first_time) first_time = 0; } } |
| 2 2 1 1 1 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 | // SPDX-License-Identifier: GPL-2.0-or-later /* * authencesn.c - AEAD wrapper for IPsec with extended sequence numbers, * derived from authenc.c * * Copyright (C) 2010 secunet Security Networks AG * Copyright (C) 2010 Steffen Klassert <steffen.klassert@secunet.com> * Copyright (c) 2015 Herbert Xu <herbert@gondor.apana.org.au> */ #include <crypto/internal/aead.h> #include <crypto/internal/hash.h> #include <crypto/internal/skcipher.h> #include <crypto/authenc.h> #include <crypto/scatterwalk.h> #include <linux/err.h> #include <linux/init.h> #include <linux/kernel.h> #include <linux/module.h> #include <linux/rtnetlink.h> #include <linux/slab.h> #include <linux/spinlock.h> struct authenc_esn_instance_ctx { struct crypto_ahash_spawn auth; struct crypto_skcipher_spawn enc; }; struct crypto_authenc_esn_ctx { unsigned int reqoff; struct crypto_ahash *auth; struct crypto_skcipher *enc; }; struct authenc_esn_request_ctx { struct scatterlist src[2]; struct scatterlist dst[2]; char tail[]; }; static void authenc_esn_request_complete(struct aead_request *req, int err) { if (err != -EINPROGRESS) aead_request_complete(req, err); } static int crypto_authenc_esn_setauthsize(struct crypto_aead *authenc_esn, unsigned int authsize) { if (authsize > 0 && authsize < 4) return -EINVAL; return 0; } static int crypto_authenc_esn_setkey(struct crypto_aead *authenc_esn, const u8 *key, unsigned int keylen) { struct crypto_authenc_esn_ctx *ctx = crypto_aead_ctx(authenc_esn); struct crypto_ahash *auth = ctx->auth; struct crypto_skcipher *enc = ctx->enc; struct crypto_authenc_keys keys; int err = -EINVAL; if (crypto_authenc_extractkeys(&keys, key, keylen) != 0) goto out; crypto_ahash_clear_flags(auth, CRYPTO_TFM_REQ_MASK); crypto_ahash_set_flags(auth, crypto_aead_get_flags(authenc_esn) & CRYPTO_TFM_REQ_MASK); err = crypto_ahash_setkey(auth, keys.authkey, keys.authkeylen); if (err) goto out; crypto_skcipher_clear_flags(enc, CRYPTO_TFM_REQ_MASK); crypto_skcipher_set_flags(enc, crypto_aead_get_flags(authenc_esn) & CRYPTO_TFM_REQ_MASK); err = crypto_skcipher_setkey(enc, keys.enckey, keys.enckeylen); out: memzero_explicit(&keys, sizeof(keys)); return err; } static int crypto_authenc_esn_genicv_tail(struct aead_request *req, unsigned int flags) { struct crypto_aead *authenc_esn = crypto_aead_reqtfm(req); struct authenc_esn_request_ctx *areq_ctx = aead_request_ctx(req); u8 *hash = areq_ctx->tail; unsigned int authsize = crypto_aead_authsize(authenc_esn); unsigned int assoclen = req->assoclen; unsigned int cryptlen = req->cryptlen; struct scatterlist *dst = req->dst; u32 tmp[2]; /* Move high-order bits of sequence number back. */ scatterwalk_map_and_copy(tmp, dst, 4, 4, 0); scatterwalk_map_and_copy(tmp + 1, dst, assoclen + cryptlen, 4, 0); scatterwalk_map_and_copy(tmp, dst, 0, 8, 1); scatterwalk_map_and_copy(hash, dst, assoclen + cryptlen, authsize, 1); return 0; } static void authenc_esn_geniv_ahash_done(void *data, int err) { struct aead_request *req = data; err = err ?: crypto_authenc_esn_genicv_tail(req, 0); aead_request_complete(req, err); } static int crypto_authenc_esn_genicv(struct aead_request *req, unsigned int flags) { struct crypto_aead *authenc_esn = crypto_aead_reqtfm(req); struct authenc_esn_request_ctx *areq_ctx = aead_request_ctx(req); struct crypto_authenc_esn_ctx *ctx = crypto_aead_ctx(authenc_esn); struct crypto_ahash *auth = ctx->auth; u8 *hash = areq_ctx->tail; struct ahash_request *ahreq = (void *)(areq_ctx->tail + ctx->reqoff); unsigned int authsize = crypto_aead_authsize(authenc_esn); unsigned int assoclen = req->assoclen; unsigned int cryptlen = req->cryptlen; struct scatterlist *dst = req->dst; u32 tmp[2]; if (!authsize) return 0; /* Move high-order bits of sequence number to the end. */ scatterwalk_map_and_copy(tmp, dst, 0, 8, 0); scatterwalk_map_and_copy(tmp, dst, 4, 4, 1); scatterwalk_map_and_copy(tmp + 1, dst, assoclen + cryptlen, 4, 1); sg_init_table(areq_ctx->dst, 2); dst = scatterwalk_ffwd(areq_ctx->dst, dst, 4); ahash_request_set_tfm(ahreq, auth); ahash_request_set_crypt(ahreq, dst, hash, assoclen + cryptlen); ahash_request_set_callback(ahreq, flags, authenc_esn_geniv_ahash_done, req); return crypto_ahash_digest(ahreq) ?: crypto_authenc_esn_genicv_tail(req, aead_request_flags(req)); } static void crypto_authenc_esn_encrypt_done(void *data, int err) { struct aead_request *areq = data; if (!err) err = crypto_authenc_esn_genicv(areq, 0); authenc_esn_request_complete(areq, err); } static int crypto_authenc_esn_encrypt(struct aead_request *req) { struct crypto_aead *authenc_esn = crypto_aead_reqtfm(req); struct authenc_esn_request_ctx *areq_ctx = aead_request_ctx(req); struct crypto_authenc_esn_ctx *ctx = crypto_aead_ctx(authenc_esn); struct skcipher_request *skreq = (void *)(areq_ctx->tail + ctx->reqoff); struct crypto_skcipher *enc = ctx->enc; unsigned int assoclen = req->assoclen; unsigned int cryptlen = req->cryptlen; struct scatterlist *src, *dst; int err; sg_init_table(areq_ctx->src, 2); src = scatterwalk_ffwd(areq_ctx->src, req->src, assoclen); dst = src; if (req->src != req->dst) { memcpy_sglist(req->dst, req->src, assoclen); sg_init_table(areq_ctx->dst, 2); dst = scatterwalk_ffwd(areq_ctx->dst, req->dst, assoclen); } skcipher_request_set_tfm(skreq, enc); skcipher_request_set_callback(skreq, aead_request_flags(req), crypto_authenc_esn_encrypt_done, req); skcipher_request_set_crypt(skreq, src, dst, cryptlen, req->iv); err = crypto_skcipher_encrypt(skreq); if (err) return err; return crypto_authenc_esn_genicv(req, aead_request_flags(req)); } static int crypto_authenc_esn_decrypt_tail(struct aead_request *req, unsigned int flags) { struct crypto_aead *authenc_esn = crypto_aead_reqtfm(req); unsigned int authsize = crypto_aead_authsize(authenc_esn); struct authenc_esn_request_ctx *areq_ctx = aead_request_ctx(req); struct crypto_authenc_esn_ctx *ctx = crypto_aead_ctx(authenc_esn); struct skcipher_request *skreq = (void *)(areq_ctx->tail + ctx->reqoff); struct crypto_ahash *auth = ctx->auth; u8 *ohash = areq_ctx->tail; unsigned int cryptlen = req->cryptlen - authsize; unsigned int assoclen = req->assoclen; struct scatterlist *dst = req->dst; u8 *ihash = ohash + crypto_ahash_digestsize(auth); u32 tmp[2]; if (!authsize) goto decrypt; /* Move high-order bits of sequence number back. */ scatterwalk_map_and_copy(tmp, dst, 4, 4, 0); scatterwalk_map_and_copy(tmp + 1, dst, assoclen + cryptlen, 4, 0); scatterwalk_map_and_copy(tmp, dst, 0, 8, 1); if (crypto_memneq(ihash, ohash, authsize)) return -EBADMSG; decrypt: sg_init_table(areq_ctx->dst, 2); dst = scatterwalk_ffwd(areq_ctx->dst, dst, assoclen); skcipher_request_set_tfm(skreq, ctx->enc); skcipher_request_set_callback(skreq, flags, req->base.complete, req->base.data); skcipher_request_set_crypt(skreq, dst, dst, cryptlen, req->iv); return crypto_skcipher_decrypt(skreq); } static void authenc_esn_verify_ahash_done(void *data, int err) { struct aead_request *req = data; err = err ?: crypto_authenc_esn_decrypt_tail(req, 0); authenc_esn_request_complete(req, err); } static int crypto_authenc_esn_decrypt(struct aead_request *req) { struct crypto_aead *authenc_esn = crypto_aead_reqtfm(req); struct authenc_esn_request_ctx *areq_ctx = aead_request_ctx(req); struct crypto_authenc_esn_ctx *ctx = crypto_aead_ctx(authenc_esn); struct ahash_request *ahreq = (void *)(areq_ctx->tail + ctx->reqoff); unsigned int authsize = crypto_aead_authsize(authenc_esn); struct crypto_ahash *auth = ctx->auth; u8 *ohash = areq_ctx->tail; unsigned int assoclen = req->assoclen; unsigned int cryptlen = req->cryptlen; u8 *ihash = ohash + crypto_ahash_digestsize(auth); struct scatterlist *dst = req->dst; u32 tmp[2]; int err; cryptlen -= authsize; if (req->src != dst) memcpy_sglist(dst, req->src, assoclen + cryptlen); scatterwalk_map_and_copy(ihash, req->src, assoclen + cryptlen, authsize, 0); if (!authsize) goto tail; /* Move high-order bits of sequence number to the end. */ scatterwalk_map_and_copy(tmp, dst, 0, 8, 0); scatterwalk_map_and_copy(tmp, dst, 4, 4, 1); scatterwalk_map_and_copy(tmp + 1, dst, assoclen + cryptlen, 4, 1); sg_init_table(areq_ctx->dst, 2); dst = scatterwalk_ffwd(areq_ctx->dst, dst, 4); ahash_request_set_tfm(ahreq, auth); ahash_request_set_crypt(ahreq, dst, ohash, assoclen + cryptlen); ahash_request_set_callback(ahreq, aead_request_flags(req), authenc_esn_verify_ahash_done, req); err = crypto_ahash_digest(ahreq); if (err) return err; tail: return crypto_authenc_esn_decrypt_tail(req, aead_request_flags(req)); } static int crypto_authenc_esn_init_tfm(struct crypto_aead *tfm) { struct aead_instance *inst = aead_alg_instance(tfm); struct authenc_esn_instance_ctx *ictx = aead_instance_ctx(inst); struct crypto_authenc_esn_ctx *ctx = crypto_aead_ctx(tfm); struct crypto_ahash *auth; struct crypto_skcipher *enc; int err; auth = crypto_spawn_ahash(&ictx->auth); if (IS_ERR(auth)) return PTR_ERR(auth); enc = crypto_spawn_skcipher(&ictx->enc); err = PTR_ERR(enc); if (IS_ERR(enc)) goto err_free_ahash; ctx->auth = auth; ctx->enc = enc; ctx->reqoff = 2 * crypto_ahash_digestsize(auth); crypto_aead_set_reqsize( tfm, sizeof(struct authenc_esn_request_ctx) + ctx->reqoff + max_t(unsigned int, crypto_ahash_reqsize(auth) + sizeof(struct ahash_request), sizeof(struct skcipher_request) + crypto_skcipher_reqsize(enc))); return 0; err_free_ahash: crypto_free_ahash(auth); return err; } static void crypto_authenc_esn_exit_tfm(struct crypto_aead *tfm) { struct crypto_authenc_esn_ctx *ctx = crypto_aead_ctx(tfm); crypto_free_ahash(ctx->auth); crypto_free_skcipher(ctx->enc); } static void crypto_authenc_esn_free(struct aead_instance *inst) { struct authenc_esn_instance_ctx *ctx = aead_instance_ctx(inst); crypto_drop_skcipher(&ctx->enc); crypto_drop_ahash(&ctx->auth); kfree(inst); } static int crypto_authenc_esn_create(struct crypto_template *tmpl, struct rtattr **tb) { u32 mask; struct aead_instance *inst; struct authenc_esn_instance_ctx *ctx; struct skcipher_alg_common *enc; struct hash_alg_common *auth; struct crypto_alg *auth_base; int err; err = crypto_check_attr_type(tb, CRYPTO_ALG_TYPE_AEAD, &mask); if (err) return err; inst = kzalloc(sizeof(*inst) + sizeof(*ctx), GFP_KERNEL); if (!inst) return -ENOMEM; ctx = aead_instance_ctx(inst); err = crypto_grab_ahash(&ctx->auth, aead_crypto_instance(inst), crypto_attr_alg_name(tb[1]), 0, mask); if (err) goto err_free_inst; auth = crypto_spawn_ahash_alg(&ctx->auth); auth_base = &auth->base; err = crypto_grab_skcipher(&ctx->enc, aead_crypto_instance(inst), crypto_attr_alg_name(tb[2]), 0, mask); if (err) goto err_free_inst; enc = crypto_spawn_skcipher_alg_common(&ctx->enc); err = -ENAMETOOLONG; if (snprintf(inst->alg.base.cra_name, CRYPTO_MAX_ALG_NAME, "authencesn(%s,%s)", auth_base->cra_name, enc->base.cra_name) >= CRYPTO_MAX_ALG_NAME) goto err_free_inst; if (snprintf(inst->alg.base.cra_driver_name, CRYPTO_MAX_ALG_NAME, "authencesn(%s,%s)", auth_base->cra_driver_name, enc->base.cra_driver_name) >= CRYPTO_MAX_ALG_NAME) goto err_free_inst; inst->alg.base.cra_priority = enc->base.cra_priority * 10 + auth_base->cra_priority; inst->alg.base.cra_blocksize = enc->base.cra_blocksize; inst->alg.base.cra_alignmask = enc->base.cra_alignmask; inst->alg.base.cra_ctxsize = sizeof(struct crypto_authenc_esn_ctx); inst->alg.ivsize = enc->ivsize; inst->alg.chunksize = enc->chunksize; inst->alg.maxauthsize = auth->digestsize; inst->alg.init = crypto_authenc_esn_init_tfm; inst->alg.exit = crypto_authenc_esn_exit_tfm; inst->alg.setkey = crypto_authenc_esn_setkey; inst->alg.setauthsize = crypto_authenc_esn_setauthsize; inst->alg.encrypt = crypto_authenc_esn_encrypt; inst->alg.decrypt = crypto_authenc_esn_decrypt; inst->free = crypto_authenc_esn_free; err = aead_register_instance(tmpl, inst); if (err) { err_free_inst: crypto_authenc_esn_free(inst); } return err; } static struct crypto_template crypto_authenc_esn_tmpl = { .name = "authencesn", .create = crypto_authenc_esn_create, .module = THIS_MODULE, }; static int __init crypto_authenc_esn_module_init(void) { return crypto_register_template(&crypto_authenc_esn_tmpl); } static void __exit crypto_authenc_esn_module_exit(void) { crypto_unregister_template(&crypto_authenc_esn_tmpl); } module_init(crypto_authenc_esn_module_init); module_exit(crypto_authenc_esn_module_exit); MODULE_LICENSE("GPL"); MODULE_AUTHOR("Steffen Klassert <steffen.klassert@secunet.com>"); MODULE_DESCRIPTION("AEAD wrapper for IPsec with extended sequence numbers"); MODULE_ALIAS_CRYPTO("authencesn"); |
| 5 5 5 5 5 5 5 1 4 5 1 4 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 | // SPDX-License-Identifier: GPL-2.0 /* * Parts of this file are * Copyright (C) 2022-2023 Intel Corporation */ #include <linux/ieee80211.h> #include <linux/export.h> #include <net/cfg80211.h> #include "nl80211.h" #include "core.h" #include "rdev-ops.h" static int ___cfg80211_stop_ap(struct cfg80211_registered_device *rdev, struct net_device *dev, unsigned int link_id, bool notify) { struct wireless_dev *wdev = dev->ieee80211_ptr; int err; lockdep_assert_wiphy(wdev->wiphy); if (!rdev->ops->stop_ap) return -EOPNOTSUPP; if (dev->ieee80211_ptr->iftype != NL80211_IFTYPE_AP && dev->ieee80211_ptr->iftype != NL80211_IFTYPE_P2P_GO) return -EOPNOTSUPP; if (!wdev->links[link_id].ap.beacon_interval) return -ENOENT; err = rdev_stop_ap(rdev, dev, link_id); if (!err) { wdev->conn_owner_nlportid = 0; wdev->links[link_id].ap.beacon_interval = 0; memset(&wdev->links[link_id].ap.chandef, 0, sizeof(wdev->links[link_id].ap.chandef)); wdev->u.ap.ssid_len = 0; rdev_set_qos_map(rdev, dev, NULL); if (notify) nl80211_send_ap_stopped(wdev, link_id); /* Should we apply the grace period during beaconing interface * shutdown also? */ cfg80211_sched_dfs_chan_update(rdev); } schedule_work(&cfg80211_disconnect_work); return err; } int cfg80211_stop_ap(struct cfg80211_registered_device *rdev, struct net_device *dev, int link_id, bool notify) { unsigned int link; int ret = 0; if (link_id >= 0) return ___cfg80211_stop_ap(rdev, dev, link_id, notify); for_each_valid_link(dev->ieee80211_ptr, link) { int ret1 = ___cfg80211_stop_ap(rdev, dev, link, notify); if (ret1) ret = ret1; /* try the next one also if one errored */ } return ret; } |
| 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 | /* SPDX-License-Identifier: GPL-2.0 */ #ifndef __FS_CEPH_BUFFER_H #define __FS_CEPH_BUFFER_H #include <linux/kref.h> #include <linux/mm.h> #include <linux/vmalloc.h> #include <linux/types.h> #include <linux/uio.h> /* * a simple reference counted buffer. * * use kmalloc for smaller sizes, vmalloc for larger sizes. */ struct ceph_buffer { struct kref kref; struct kvec vec; size_t alloc_len; }; extern struct ceph_buffer *ceph_buffer_new(size_t len, gfp_t gfp); extern void ceph_buffer_release(struct kref *kref); static inline struct ceph_buffer *ceph_buffer_get(struct ceph_buffer *b) { kref_get(&b->kref); return b; } static inline void ceph_buffer_put(struct ceph_buffer *b) { if (b) kref_put(&b->kref, ceph_buffer_release); } extern int ceph_decode_buffer(struct ceph_buffer **b, void **p, void *end); #endif |
| 7 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 | /* SPDX-License-Identifier: GPL-2.0-only */ #ifndef __KVM_IODEV_H__ #define __KVM_IODEV_H__ #include <linux/kvm_types.h> #include <linux/errno.h> struct kvm_io_device; struct kvm_vcpu; /** * kvm_io_device_ops are called under kvm slots_lock. * read and write handlers return 0 if the transaction has been handled, * or non-zero to have it passed to the next device. **/ struct kvm_io_device_ops { int (*read)(struct kvm_vcpu *vcpu, struct kvm_io_device *this, gpa_t addr, int len, void *val); int (*write)(struct kvm_vcpu *vcpu, struct kvm_io_device *this, gpa_t addr, int len, const void *val); void (*destructor)(struct kvm_io_device *this); }; struct kvm_io_device { const struct kvm_io_device_ops *ops; }; static inline void kvm_iodevice_init(struct kvm_io_device *dev, const struct kvm_io_device_ops *ops) { dev->ops = ops; } static inline int kvm_iodevice_read(struct kvm_vcpu *vcpu, struct kvm_io_device *dev, gpa_t addr, int l, void *v) { return dev->ops->read ? dev->ops->read(vcpu, dev, addr, l, v) : -EOPNOTSUPP; } static inline int kvm_iodevice_write(struct kvm_vcpu *vcpu, struct kvm_io_device *dev, gpa_t addr, int l, const void *v) { return dev->ops->write ? dev->ops->write(vcpu, dev, addr, l, v) : -EOPNOTSUPP; } #endif /* __KVM_IODEV_H__ */ |
| 12 12 12 1 1 1 1 1 30 30 2 2 1 1 2 2 15 15 1 14 15 15 1 1 1 15 6 6 15 5 15 6 15 14 3 14 15 3 5 8 1 2 6 10 5 5 2 3 2 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 | // SPDX-License-Identifier: GPL-2.0-only /* Miscellaneous routines. * * Copyright (C) 2023 Red Hat, Inc. All Rights Reserved. * Written by David Howells (dhowells@redhat.com) */ #include <linux/swap.h> #include "internal.h" /** * netfs_alloc_folioq_buffer - Allocate buffer space into a folio queue * @mapping: Address space to set on the folio (or NULL). * @_buffer: Pointer to the folio queue to add to (may point to a NULL; updated). * @_cur_size: Current size of the buffer (updated). * @size: Target size of the buffer. * @gfp: The allocation constraints. */ int netfs_alloc_folioq_buffer(struct address_space *mapping, struct folio_queue **_buffer, size_t *_cur_size, ssize_t size, gfp_t gfp) { struct folio_queue *tail = *_buffer, *p; size = round_up(size, PAGE_SIZE); if (*_cur_size >= size) return 0; if (tail) while (tail->next) tail = tail->next; do { struct folio *folio; int order = 0, slot; if (!tail || folioq_full(tail)) { p = netfs_folioq_alloc(0, GFP_NOFS, netfs_trace_folioq_alloc_buffer); if (!p) return -ENOMEM; if (tail) { tail->next = p; p->prev = tail; } else { *_buffer = p; } tail = p; } if (size - *_cur_size > PAGE_SIZE) order = umin(ilog2(size - *_cur_size) - PAGE_SHIFT, MAX_PAGECACHE_ORDER); folio = folio_alloc(gfp, order); if (!folio && order > 0) folio = folio_alloc(gfp, 0); if (!folio) return -ENOMEM; folio->mapping = mapping; folio->index = *_cur_size / PAGE_SIZE; trace_netfs_folio(folio, netfs_folio_trace_alloc_buffer); slot = folioq_append_mark(tail, folio); *_cur_size += folioq_folio_size(tail, slot); } while (*_cur_size < size); return 0; } EXPORT_SYMBOL(netfs_alloc_folioq_buffer); /** * netfs_free_folioq_buffer - Free a folio queue. * @fq: The start of the folio queue to free * * Free up a chain of folio_queues and, if marked, the marked folios they point * to. */ void netfs_free_folioq_buffer(struct folio_queue *fq) { struct folio_queue *next; struct folio_batch fbatch; folio_batch_init(&fbatch); for (; fq; fq = next) { for (int slot = 0; slot < folioq_count(fq); slot++) { struct folio *folio = folioq_folio(fq, slot); if (!folio || !folioq_is_marked(fq, slot)) continue; trace_netfs_folio(folio, netfs_folio_trace_put); if (folio_batch_add(&fbatch, folio)) folio_batch_release(&fbatch); } netfs_stat_d(&netfs_n_folioq); next = fq->next; kfree(fq); } folio_batch_release(&fbatch); } EXPORT_SYMBOL(netfs_free_folioq_buffer); /* * Reset the subrequest iterator to refer just to the region remaining to be * read. The iterator may or may not have been advanced by socket ops or * extraction ops to an extent that may or may not match the amount actually * read. */ void netfs_reset_iter(struct netfs_io_subrequest *subreq) { struct iov_iter *io_iter = &subreq->io_iter; size_t remain = subreq->len - subreq->transferred; if (io_iter->count > remain) iov_iter_advance(io_iter, io_iter->count - remain); else if (io_iter->count < remain) iov_iter_revert(io_iter, remain - io_iter->count); iov_iter_truncate(&subreq->io_iter, remain); } /** * netfs_dirty_folio - Mark folio dirty and pin a cache object for writeback * @mapping: The mapping the folio belongs to. * @folio: The folio being dirtied. * * Set the dirty flag on a folio and pin an in-use cache object in memory so * that writeback can later write to it. This is intended to be called from * the filesystem's ->dirty_folio() method. * * Return: true if the dirty flag was set on the folio, false otherwise. */ bool netfs_dirty_folio(struct address_space *mapping, struct folio *folio) { struct inode *inode = mapping->host; struct netfs_inode *ictx = netfs_inode(inode); struct fscache_cookie *cookie = netfs_i_cookie(ictx); bool need_use = false; _enter(""); if (!filemap_dirty_folio(mapping, folio)) return false; if (!fscache_cookie_valid(cookie)) return true; if (!(inode->i_state & I_PINNING_NETFS_WB)) { spin_lock(&inode->i_lock); if (!(inode->i_state & I_PINNING_NETFS_WB)) { inode->i_state |= I_PINNING_NETFS_WB; need_use = true; } spin_unlock(&inode->i_lock); if (need_use) fscache_use_cookie(cookie, true); } return true; } EXPORT_SYMBOL(netfs_dirty_folio); /** * netfs_unpin_writeback - Unpin writeback resources * @inode: The inode on which the cookie resides * @wbc: The writeback control * * Unpin the writeback resources pinned by netfs_dirty_folio(). This is * intended to be called as/by the netfs's ->write_inode() method. */ int netfs_unpin_writeback(struct inode *inode, struct writeback_control *wbc) { struct fscache_cookie *cookie = netfs_i_cookie(netfs_inode(inode)); if (wbc->unpinned_netfs_wb) fscache_unuse_cookie(cookie, NULL, NULL); return 0; } EXPORT_SYMBOL(netfs_unpin_writeback); /** * netfs_clear_inode_writeback - Clear writeback resources pinned by an inode * @inode: The inode to clean up * @aux: Auxiliary data to apply to the inode * * Clear any writeback resources held by an inode when the inode is evicted. * This must be called before clear_inode() is called. */ void netfs_clear_inode_writeback(struct inode *inode, const void *aux) { struct fscache_cookie *cookie = netfs_i_cookie(netfs_inode(inode)); if (inode->i_state & I_PINNING_NETFS_WB) { loff_t i_size = i_size_read(inode); fscache_unuse_cookie(cookie, aux, &i_size); } } EXPORT_SYMBOL(netfs_clear_inode_writeback); /** * netfs_invalidate_folio - Invalidate or partially invalidate a folio * @folio: Folio proposed for release * @offset: Offset of the invalidated region * @length: Length of the invalidated region * * Invalidate part or all of a folio for a network filesystem. The folio will * be removed afterwards if the invalidated region covers the entire folio. */ void netfs_invalidate_folio(struct folio *folio, size_t offset, size_t length) { struct netfs_folio *finfo; struct netfs_inode *ctx = netfs_inode(folio_inode(folio)); size_t flen = folio_size(folio); _enter("{%lx},%zx,%zx", folio->index, offset, length); if (offset == 0 && length == flen) { unsigned long long i_size = i_size_read(&ctx->inode); unsigned long long fpos = folio_pos(folio), end; end = umin(fpos + flen, i_size); if (fpos < i_size && end > ctx->zero_point) ctx->zero_point = end; } folio_wait_private_2(folio); /* [DEPRECATED] */ if (!folio_test_private(folio)) return; finfo = netfs_folio_info(folio); if (offset == 0 && length >= flen) goto erase_completely; if (finfo) { /* We have a partially uptodate page from a streaming write. */ unsigned int fstart = finfo->dirty_offset; unsigned int fend = fstart + finfo->dirty_len; unsigned int iend = offset + length; if (offset >= fend) return; if (iend <= fstart) return; /* The invalidation region overlaps the data. If the region * covers the start of the data, we either move along the start * or just erase the data entirely. */ if (offset <= fstart) { if (iend >= fend) goto erase_completely; /* Move the start of the data. */ finfo->dirty_len = fend - iend; finfo->dirty_offset = offset; return; } /* Reduce the length of the data if the invalidation region * covers the tail part. */ if (iend >= fend) { finfo->dirty_len = offset - fstart; return; } /* A partial write was split. The caller has already zeroed * it, so just absorb the hole. */ } return; erase_completely: netfs_put_group(netfs_folio_group(folio)); folio_detach_private(folio); folio_clear_uptodate(folio); kfree(finfo); return; } EXPORT_SYMBOL(netfs_invalidate_folio); /** * netfs_release_folio - Try to release a folio * @folio: Folio proposed for release * @gfp: Flags qualifying the release * * Request release of a folio and clean up its private state if it's not busy. * Returns true if the folio can now be released, false if not */ bool netfs_release_folio(struct folio *folio, gfp_t gfp) { struct netfs_inode *ctx = netfs_inode(folio_inode(folio)); unsigned long long end; if (folio_test_dirty(folio)) return false; end = umin(folio_pos(folio) + folio_size(folio), i_size_read(&ctx->inode)); if (end > ctx->zero_point) ctx->zero_point = end; if (folio_test_private(folio)) return false; if (unlikely(folio_test_private_2(folio))) { /* [DEPRECATED] */ if (current_is_kswapd() || !(gfp & __GFP_FS)) return false; folio_wait_private_2(folio); } fscache_note_page_release(netfs_i_cookie(ctx)); return true; } EXPORT_SYMBOL(netfs_release_folio); /* * Wake the collection work item. */ void netfs_wake_collector(struct netfs_io_request *rreq) { if (test_bit(NETFS_RREQ_OFFLOAD_COLLECTION, &rreq->flags) && !test_bit(NETFS_RREQ_RETRYING, &rreq->flags)) { queue_work(system_unbound_wq, &rreq->work); } else { trace_netfs_rreq(rreq, netfs_rreq_trace_wake_queue); wake_up(&rreq->waitq); } } /* * Mark a subrequest as no longer being in progress and, if need be, wake the * collector. */ void netfs_subreq_clear_in_progress(struct netfs_io_subrequest *subreq) { struct netfs_io_request *rreq = subreq->rreq; struct netfs_io_stream *stream = &rreq->io_streams[subreq->stream_nr]; clear_bit_unlock(NETFS_SREQ_IN_PROGRESS, &subreq->flags); smp_mb__after_atomic(); /* Clear IN_PROGRESS before task state */ /* If we are at the head of the queue, wake up the collector. */ if (list_is_first(&subreq->rreq_link, &stream->subrequests) || test_bit(NETFS_RREQ_RETRYING, &rreq->flags)) netfs_wake_collector(rreq); } /* * Wait for all outstanding I/O in a stream to quiesce. */ void netfs_wait_for_in_progress_stream(struct netfs_io_request *rreq, struct netfs_io_stream *stream) { struct netfs_io_subrequest *subreq; DEFINE_WAIT(myself); list_for_each_entry(subreq, &stream->subrequests, rreq_link) { if (!test_bit(NETFS_SREQ_IN_PROGRESS, &subreq->flags)) continue; trace_netfs_rreq(rreq, netfs_rreq_trace_wait_queue); for (;;) { prepare_to_wait(&rreq->waitq, &myself, TASK_UNINTERRUPTIBLE); if (!test_bit(NETFS_SREQ_IN_PROGRESS, &subreq->flags)) break; trace_netfs_sreq(subreq, netfs_sreq_trace_wait_for); schedule(); trace_netfs_rreq(rreq, netfs_rreq_trace_woke_queue); } } finish_wait(&rreq->waitq, &myself); } /* * Perform collection in app thread if not offloaded to workqueue. */ static int netfs_collect_in_app(struct netfs_io_request *rreq, bool (*collector)(struct netfs_io_request *rreq)) { bool need_collect = false, inactive = true; for (int i = 0; i < NR_IO_STREAMS; i++) { struct netfs_io_subrequest *subreq; struct netfs_io_stream *stream = &rreq->io_streams[i]; if (!stream->active) continue; inactive = false; trace_netfs_collect_stream(rreq, stream); subreq = list_first_entry_or_null(&stream->subrequests, struct netfs_io_subrequest, rreq_link); if (subreq && (!test_bit(NETFS_SREQ_IN_PROGRESS, &subreq->flags) || test_bit(NETFS_SREQ_MADE_PROGRESS, &subreq->flags))) { need_collect = true; break; } } if (!need_collect && !inactive) return 0; /* Sleep */ __set_current_state(TASK_RUNNING); if (collector(rreq)) { /* Drop the ref from the NETFS_RREQ_IN_PROGRESS flag. */ netfs_put_request(rreq, netfs_rreq_trace_put_work_ip); return 1; /* Done */ } if (inactive) { WARN(true, "Failed to collect inactive req R=%08x\n", rreq->debug_id); cond_resched(); } return 2; /* Again */ } /* * Wait for a request to complete, successfully or otherwise. */ static ssize_t netfs_wait_for_request(struct netfs_io_request *rreq, bool (*collector)(struct netfs_io_request *rreq)) { DEFINE_WAIT(myself); ssize_t ret; for (;;) { trace_netfs_rreq(rreq, netfs_rreq_trace_wait_queue); prepare_to_wait(&rreq->waitq, &myself, TASK_UNINTERRUPTIBLE); if (!test_bit(NETFS_RREQ_OFFLOAD_COLLECTION, &rreq->flags)) { switch (netfs_collect_in_app(rreq, collector)) { case 0: break; case 1: goto all_collected; case 2: continue; } } if (!test_bit(NETFS_RREQ_IN_PROGRESS, &rreq->flags)) break; schedule(); trace_netfs_rreq(rreq, netfs_rreq_trace_woke_queue); } all_collected: finish_wait(&rreq->waitq, &myself); ret = rreq->error; if (ret == 0) { ret = rreq->transferred; switch (rreq->origin) { case NETFS_DIO_READ: case NETFS_DIO_WRITE: case NETFS_READ_SINGLE: case NETFS_UNBUFFERED_READ: case NETFS_UNBUFFERED_WRITE: break; default: if (rreq->submitted < rreq->len) { trace_netfs_failure(rreq, NULL, ret, netfs_fail_short_read); ret = -EIO; } break; } } return ret; } ssize_t netfs_wait_for_read(struct netfs_io_request *rreq) { return netfs_wait_for_request(rreq, netfs_read_collection); } ssize_t netfs_wait_for_write(struct netfs_io_request *rreq) { return netfs_wait_for_request(rreq, netfs_write_collection); } /* * Wait for a paused operation to unpause or complete in some manner. */ static void netfs_wait_for_pause(struct netfs_io_request *rreq, bool (*collector)(struct netfs_io_request *rreq)) { DEFINE_WAIT(myself); trace_netfs_rreq(rreq, netfs_rreq_trace_wait_pause); for (;;) { trace_netfs_rreq(rreq, netfs_rreq_trace_wait_queue); prepare_to_wait(&rreq->waitq, &myself, TASK_UNINTERRUPTIBLE); if (!test_bit(NETFS_RREQ_OFFLOAD_COLLECTION, &rreq->flags)) { switch (netfs_collect_in_app(rreq, collector)) { case 0: break; case 1: goto all_collected; case 2: continue; } } if (!test_bit(NETFS_RREQ_IN_PROGRESS, &rreq->flags) || !test_bit(NETFS_RREQ_PAUSE, &rreq->flags)) break; schedule(); trace_netfs_rreq(rreq, netfs_rreq_trace_woke_queue); } all_collected: finish_wait(&rreq->waitq, &myself); } void netfs_wait_for_paused_read(struct netfs_io_request *rreq) { return netfs_wait_for_pause(rreq, netfs_read_collection); } void netfs_wait_for_paused_write(struct netfs_io_request *rreq) { return netfs_wait_for_pause(rreq, netfs_write_collection); } |
| 282 285 17 3 7 1 7 1 2 4 4 3 1 1 115 115 111 114 470 471 115 61 63 63 2 68 51 16 16 1 1 16 3 2 1 1 6 3 2 1 1 2 2 1 16 5 11 11 1 1 2 13 11 3 2 1 1 7 1 1 3 5 7 1 7 1 13 16 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080 1081 1082 1083 1084 1085 1086 1087 1088 1089 1090 1091 1092 1093 1094 1095 1096 1097 1098 1099 1100 1101 1102 1103 1104 1105 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125 1126 1127 1128 1129 1130 1131 1132 1133 1134 1135 1136 1137 1138 1139 1140 1141 1142 1143 1144 1145 1146 1147 1148 1149 1150 1151 1152 1153 1154 1155 1156 1157 1158 1159 1160 1161 1162 1163 1164 1165 1166 1167 1168 1169 1170 1171 1172 1173 1174 1175 1176 1177 1178 1179 1180 1181 1182 1183 1184 1185 1186 1187 1188 1189 1190 1191 1192 1193 1194 1195 1196 1197 1198 1199 1200 1201 1202 1203 1204 1205 1206 1207 1208 1209 1210 1211 1212 1213 1214 1215 1216 1217 1218 1219 1220 1221 1222 1223 1224 1225 1226 1227 1228 1229 1230 1231 1232 1233 1234 1235 1236 1237 1238 1239 1240 1241 1242 1243 1244 1245 1246 1247 1248 1249 1250 1251 1252 1253 1254 1255 1256 1257 1258 1259 1260 1261 1262 1263 1264 1265 1266 1267 1268 1269 1270 1271 1272 1273 1274 1275 1276 1277 1278 1279 1280 1281 1282 1283 1284 1285 1286 1287 1288 1289 1290 1291 1292 1293 1294 1295 1296 1297 1298 1299 1300 1301 1302 1303 1304 1305 1306 1307 1308 1309 1310 1311 1312 1313 1314 1315 1316 1317 1318 1319 1320 1321 1322 1323 1324 1325 1326 1327 1328 1329 1330 1331 1332 1333 1334 1335 1336 1337 1338 1339 1340 1341 1342 1343 1344 1345 1346 1347 1348 1349 1350 1351 1352 1353 1354 1355 1356 1357 1358 1359 1360 1361 1362 1363 1364 1365 1366 1367 1368 1369 1370 1371 1372 1373 1374 1375 1376 1377 1378 1379 1380 1381 1382 1383 1384 1385 1386 1387 1388 1389 1390 1391 1392 1393 1394 1395 1396 1397 1398 1399 1400 1401 1402 1403 1404 1405 1406 1407 1408 1409 1410 1411 1412 1413 1414 1415 1416 1417 1418 1419 1420 1421 1422 1423 1424 1425 1426 1427 1428 1429 1430 1431 1432 1433 1434 1435 1436 1437 1438 1439 1440 1441 1442 1443 1444 1445 1446 1447 1448 1449 1450 1451 1452 1453 1454 1455 1456 1457 1458 1459 1460 1461 1462 1463 1464 1465 1466 1467 1468 1469 1470 1471 1472 1473 1474 1475 1476 1477 1478 1479 1480 1481 1482 1483 1484 1485 1486 1487 1488 1489 1490 1491 1492 1493 1494 1495 1496 1497 1498 1499 1500 1501 1502 1503 1504 1505 1506 1507 1508 1509 1510 1511 1512 1513 1514 1515 1516 1517 1518 1519 1520 1521 1522 1523 1524 1525 1526 1527 1528 1529 1530 1531 1532 1533 1534 1535 1536 1537 1538 1539 1540 1541 1542 1543 1544 1545 1546 1547 1548 1549 1550 1551 1552 1553 1554 1555 1556 1557 1558 1559 1560 1561 1562 1563 1564 1565 1566 1567 1568 1569 1570 1571 1572 1573 1574 1575 1576 1577 1578 1579 1580 1581 1582 1583 1584 1585 1586 1587 1588 1589 1590 1591 1592 1593 1594 1595 1596 1597 1598 1599 1600 1601 1602 1603 1604 1605 1606 1607 1608 1609 1610 1611 1612 1613 1614 1615 1616 1617 1618 1619 1620 1621 1622 1623 1624 1625 1626 1627 1628 1629 1630 1631 1632 1633 1634 1635 1636 1637 1638 1639 1640 1641 1642 1643 1644 1645 1646 1647 1648 1649 1650 1651 1652 1653 1654 1655 1656 1657 1658 1659 1660 1661 1662 1663 1664 1665 1666 1667 1668 1669 1670 1671 1672 1673 1674 1675 1676 1677 1678 1679 1680 1681 1682 1683 1684 1685 1686 1687 1688 1689 1690 1691 1692 1693 1694 1695 1696 1697 1698 1699 1700 1701 1702 1703 1704 1705 1706 1707 1708 1709 1710 1711 1712 1713 1714 1715 1716 1717 1718 1719 1720 1721 1722 1723 1724 1725 1726 1727 1728 1729 1730 1731 1732 1733 1734 1735 1736 1737 1738 1739 1740 1741 1742 1743 1744 1745 1746 1747 1748 1749 1750 1751 1752 1753 1754 1755 1756 1757 1758 1759 1760 1761 1762 1763 1764 1765 1766 1767 1768 1769 1770 1771 1772 1773 1774 1775 1776 1777 1778 1779 1780 1781 1782 1783 1784 1785 1786 1787 1788 1789 1790 1791 1792 1793 1794 1795 1796 1797 1798 1799 1800 1801 1802 1803 1804 1805 1806 1807 1808 1809 1810 1811 1812 1813 1814 1815 1816 1817 1818 1819 1820 1821 1822 1823 1824 1825 1826 1827 1828 1829 1830 1831 1832 1833 | // SPDX-License-Identifier: GPL-2.0-only /* * This is a module which is used for queueing packets and communicating with * userspace via nfnetlink. * * (C) 2005 by Harald Welte <laforge@netfilter.org> * (C) 2007 by Patrick McHardy <kaber@trash.net> * * Based on the old ipv4-only ip_queue.c: * (C) 2000-2002 James Morris <jmorris@intercode.com.au> * (C) 2003-2005 Netfilter Core Team <coreteam@netfilter.org> */ #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt #include <linux/module.h> #include <linux/skbuff.h> #include <linux/init.h> #include <linux/spinlock.h> #include <linux/slab.h> #include <linux/notifier.h> #include <linux/netdevice.h> #include <linux/netfilter.h> #include <linux/proc_fs.h> #include <linux/netfilter_ipv4.h> #include <linux/netfilter_ipv6.h> #include <linux/netfilter_bridge.h> #include <linux/netfilter/nfnetlink.h> #include <linux/netfilter/nfnetlink_queue.h> #include <linux/netfilter/nf_conntrack_common.h> #include <linux/list.h> #include <linux/cgroup-defs.h> #include <net/gso.h> #include <net/sock.h> #include <net/tcp_states.h> #include <net/netfilter/nf_queue.h> #include <net/netns/generic.h> #include <linux/atomic.h> #if IS_ENABLED(CONFIG_BRIDGE_NETFILTER) #include "../bridge/br_private.h" #endif #if IS_ENABLED(CONFIG_NF_CONNTRACK) #include <net/netfilter/nf_conntrack.h> #endif #define NFQNL_QMAX_DEFAULT 1024 /* We're using struct nlattr which has 16bit nla_len. Note that nla_len * includes the header length. Thus, the maximum packet length that we * support is 65531 bytes. We send truncated packets if the specified length * is larger than that. Userspace can check for presence of NFQA_CAP_LEN * attribute to detect truncation. */ #define NFQNL_MAX_COPY_RANGE (0xffff - NLA_HDRLEN) struct nfqnl_instance { struct hlist_node hlist; /* global list of queues */ struct rcu_head rcu; u32 peer_portid; unsigned int queue_maxlen; unsigned int copy_range; unsigned int queue_dropped; unsigned int queue_user_dropped; u_int16_t queue_num; /* number of this queue */ u_int8_t copy_mode; u_int32_t flags; /* Set using NFQA_CFG_FLAGS */ /* * Following fields are dirtied for each queued packet, * keep them in same cache line if possible. */ spinlock_t lock ____cacheline_aligned_in_smp; unsigned int queue_total; unsigned int id_sequence; /* 'sequence' of pkt ids */ struct list_head queue_list; /* packets in queue */ }; typedef int (*nfqnl_cmpfn)(struct nf_queue_entry *, unsigned long); static unsigned int nfnl_queue_net_id __read_mostly; #define INSTANCE_BUCKETS 16 struct nfnl_queue_net { spinlock_t instances_lock; struct hlist_head instance_table[INSTANCE_BUCKETS]; }; static struct nfnl_queue_net *nfnl_queue_pernet(struct net *net) { return net_generic(net, nfnl_queue_net_id); } static inline u_int8_t instance_hashfn(u_int16_t queue_num) { return ((queue_num >> 8) ^ queue_num) % INSTANCE_BUCKETS; } static struct nfqnl_instance * instance_lookup(struct nfnl_queue_net *q, u_int16_t queue_num) { struct hlist_head *head; struct nfqnl_instance *inst; head = &q->instance_table[instance_hashfn(queue_num)]; hlist_for_each_entry_rcu(inst, head, hlist) { if (inst->queue_num == queue_num) return inst; } return NULL; } static struct nfqnl_instance * instance_create(struct nfnl_queue_net *q, u_int16_t queue_num, u32 portid) { struct nfqnl_instance *inst; unsigned int h; int err; spin_lock(&q->instances_lock); if (instance_lookup(q, queue_num)) { err = -EEXIST; goto out_unlock; } inst = kzalloc(sizeof(*inst), GFP_ATOMIC); if (!inst) { err = -ENOMEM; goto out_unlock; } inst->queue_num = queue_num; inst->peer_portid = portid; inst->queue_maxlen = NFQNL_QMAX_DEFAULT; inst->copy_range = NFQNL_MAX_COPY_RANGE; inst->copy_mode = NFQNL_COPY_NONE; spin_lock_init(&inst->lock); INIT_LIST_HEAD(&inst->queue_list); if (!try_module_get(THIS_MODULE)) { err = -EAGAIN; goto out_free; } h = instance_hashfn(queue_num); hlist_add_head_rcu(&inst->hlist, &q->instance_table[h]); spin_unlock(&q->instances_lock); return inst; out_free: kfree(inst); out_unlock: spin_unlock(&q->instances_lock); return ERR_PTR(err); } static void nfqnl_flush(struct nfqnl_instance *queue, nfqnl_cmpfn cmpfn, unsigned long data); static void instance_destroy_rcu(struct rcu_head *head) { struct nfqnl_instance *inst = container_of(head, struct nfqnl_instance, rcu); rcu_read_lock(); nfqnl_flush(inst, NULL, 0); rcu_read_unlock(); kfree(inst); module_put(THIS_MODULE); } static void __instance_destroy(struct nfqnl_instance *inst) { hlist_del_rcu(&inst->hlist); call_rcu(&inst->rcu, instance_destroy_rcu); } static void instance_destroy(struct nfnl_queue_net *q, struct nfqnl_instance *inst) { spin_lock(&q->instances_lock); __instance_destroy(inst); spin_unlock(&q->instances_lock); } static inline void __enqueue_entry(struct nfqnl_instance *queue, struct nf_queue_entry *entry) { list_add_tail(&entry->list, &queue->queue_list); queue->queue_total++; } static void __dequeue_entry(struct nfqnl_instance *queue, struct nf_queue_entry *entry) { list_del(&entry->list); queue->queue_total--; } static struct nf_queue_entry * find_dequeue_entry(struct nfqnl_instance *queue, unsigned int id) { struct nf_queue_entry *entry = NULL, *i; spin_lock_bh(&queue->lock); list_for_each_entry(i, &queue->queue_list, list) { if (i->id == id) { entry = i; break; } } if (entry) __dequeue_entry(queue, entry); spin_unlock_bh(&queue->lock); return entry; } static unsigned int nf_iterate(struct sk_buff *skb, struct nf_hook_state *state, const struct nf_hook_entries *hooks, unsigned int *index) { const struct nf_hook_entry *hook; unsigned int verdict, i = *index; while (i < hooks->num_hook_entries) { hook = &hooks->hooks[i]; repeat: verdict = nf_hook_entry_hookfn(hook, skb, state); if (verdict != NF_ACCEPT) { *index = i; if (verdict != NF_REPEAT) return verdict; goto repeat; } i++; } *index = i; return NF_ACCEPT; } static struct nf_hook_entries *nf_hook_entries_head(const struct net *net, u8 pf, u8 hooknum) { switch (pf) { #ifdef CONFIG_NETFILTER_FAMILY_BRIDGE case NFPROTO_BRIDGE: return rcu_dereference(net->nf.hooks_bridge[hooknum]); #endif case NFPROTO_IPV4: return rcu_dereference(net->nf.hooks_ipv4[hooknum]); case NFPROTO_IPV6: return rcu_dereference(net->nf.hooks_ipv6[hooknum]); default: WARN_ON_ONCE(1); return NULL; } return NULL; } static int nf_ip_reroute(struct sk_buff *skb, const struct nf_queue_entry *entry) { #ifdef CONFIG_INET const struct ip_rt_info *rt_info = nf_queue_entry_reroute(entry); if (entry->state.hook == NF_INET_LOCAL_OUT) { const struct iphdr *iph = ip_hdr(skb); if (!(iph->tos == rt_info->tos && skb->mark == rt_info->mark && iph->daddr == rt_info->daddr && iph->saddr == rt_info->saddr)) return ip_route_me_harder(entry->state.net, entry->state.sk, skb, RTN_UNSPEC); } #endif return 0; } static int nf_reroute(struct sk_buff *skb, struct nf_queue_entry *entry) { const struct nf_ipv6_ops *v6ops; int ret = 0; switch (entry->state.pf) { case AF_INET: ret = nf_ip_reroute(skb, entry); break; case AF_INET6: v6ops = rcu_dereference(nf_ipv6_ops); if (v6ops) ret = v6ops->reroute(skb, entry); break; } return ret; } /* caller must hold rcu read-side lock */ static void nf_reinject(struct nf_queue_entry *entry, unsigned int verdict) { const struct nf_hook_entry *hook_entry; const struct nf_hook_entries *hooks; struct sk_buff *skb = entry->skb; const struct net *net; unsigned int i; int err; u8 pf; net = entry->state.net; pf = entry->state.pf; hooks = nf_hook_entries_head(net, pf, entry->state.hook); i = entry->hook_index; if (!hooks || i >= hooks->num_hook_entries) { kfree_skb_reason(skb, SKB_DROP_REASON_NETFILTER_DROP); nf_queue_entry_free(entry); return; } hook_entry = &hooks->hooks[i]; /* Continue traversal iff userspace said ok... */ if (verdict == NF_REPEAT) verdict = nf_hook_entry_hookfn(hook_entry, skb, &entry->state); if (verdict == NF_ACCEPT) { if (nf_reroute(skb, entry) < 0) verdict = NF_DROP; } if (verdict == NF_ACCEPT) { next_hook: ++i; verdict = nf_iterate(skb, &entry->state, hooks, &i); } switch (verdict & NF_VERDICT_MASK) { case NF_ACCEPT: case NF_STOP: local_bh_disable(); entry->state.okfn(entry->state.net, entry->state.sk, skb); local_bh_enable(); break; case NF_QUEUE: err = nf_queue(skb, &entry->state, i, verdict); if (err == 1) goto next_hook; break; case NF_STOLEN: break; default: kfree_skb(skb); } nf_queue_entry_free(entry); } static void nfqnl_reinject(struct nf_queue_entry *entry, unsigned int verdict) { const struct nf_ct_hook *ct_hook; if (verdict == NF_ACCEPT || verdict == NF_REPEAT || verdict == NF_STOP) { unsigned int ct_verdict = verdict; rcu_read_lock(); ct_hook = rcu_dereference(nf_ct_hook); if (ct_hook) ct_verdict = ct_hook->update(entry->state.net, entry->skb); rcu_read_unlock(); switch (ct_verdict & NF_VERDICT_MASK) { case NF_ACCEPT: /* follow userspace verdict, could be REPEAT */ break; case NF_STOLEN: nf_queue_entry_free(entry); return; default: verdict = ct_verdict & NF_VERDICT_MASK; break; } } nf_reinject(entry, verdict); } static void nfqnl_flush(struct nfqnl_instance *queue, nfqnl_cmpfn cmpfn, unsigned long data) { struct nf_queue_entry *entry, *next; spin_lock_bh(&queue->lock); list_for_each_entry_safe(entry, next, &queue->queue_list, list) { if (!cmpfn || cmpfn(entry, data)) { list_del(&entry->list); queue->queue_total--; nfqnl_reinject(entry, NF_DROP); } } spin_unlock_bh(&queue->lock); } static int nfqnl_put_packet_info(struct sk_buff *nlskb, struct sk_buff *packet, bool csum_verify) { __u32 flags = 0; if (packet->ip_summed == CHECKSUM_PARTIAL) flags = NFQA_SKB_CSUMNOTREADY; else if (csum_verify) flags = NFQA_SKB_CSUM_NOTVERIFIED; if (skb_is_gso(packet)) flags |= NFQA_SKB_GSO; return flags ? nla_put_be32(nlskb, NFQA_SKB_INFO, htonl(flags)) : 0; } static int nfqnl_put_sk_uidgid(struct sk_buff *skb, struct sock *sk) { const struct cred *cred; if (!sk_fullsock(sk)) return 0; read_lock_bh(&sk->sk_callback_lock); if (sk->sk_socket && sk->sk_socket->file) { cred = sk->sk_socket->file->f_cred; if (nla_put_be32(skb, NFQA_UID, htonl(from_kuid_munged(&init_user_ns, cred->fsuid)))) goto nla_put_failure; if (nla_put_be32(skb, NFQA_GID, htonl(from_kgid_munged(&init_user_ns, cred->fsgid)))) goto nla_put_failure; } read_unlock_bh(&sk->sk_callback_lock); return 0; nla_put_failure: read_unlock_bh(&sk->sk_callback_lock); return -1; } static int nfqnl_put_sk_classid(struct sk_buff *skb, struct sock *sk) { #if IS_ENABLED(CONFIG_CGROUP_NET_CLASSID) if (sk && sk_fullsock(sk)) { u32 classid = sock_cgroup_classid(&sk->sk_cgrp_data); if (classid && nla_put_be32(skb, NFQA_CGROUP_CLASSID, htonl(classid))) return -1; } #endif return 0; } static int nfqnl_get_sk_secctx(struct sk_buff *skb, struct lsm_context *ctx) { int seclen = 0; #if IS_ENABLED(CONFIG_NETWORK_SECMARK) if (!skb || !sk_fullsock(skb->sk)) return 0; read_lock_bh(&skb->sk->sk_callback_lock); if (skb->secmark) seclen = security_secid_to_secctx(skb->secmark, ctx); read_unlock_bh(&skb->sk->sk_callback_lock); #endif return seclen; } static u32 nfqnl_get_bridge_size(struct nf_queue_entry *entry) { struct sk_buff *entskb = entry->skb; u32 nlalen = 0; if (entry->state.pf != PF_BRIDGE || !skb_mac_header_was_set(entskb)) return 0; if (skb_vlan_tag_present(entskb)) nlalen += nla_total_size(nla_total_size(sizeof(__be16)) + nla_total_size(sizeof(__be16))); if (entskb->network_header > entskb->mac_header) nlalen += nla_total_size((entskb->network_header - entskb->mac_header)); return nlalen; } static int nfqnl_put_bridge(struct nf_queue_entry *entry, struct sk_buff *skb) { struct sk_buff *entskb = entry->skb; if (entry->state.pf != PF_BRIDGE || !skb_mac_header_was_set(entskb)) return 0; if (skb_vlan_tag_present(entskb)) { struct nlattr *nest; nest = nla_nest_start(skb, NFQA_VLAN); if (!nest) goto nla_put_failure; if (nla_put_be16(skb, NFQA_VLAN_TCI, htons(entskb->vlan_tci)) || nla_put_be16(skb, NFQA_VLAN_PROTO, entskb->vlan_proto)) goto nla_put_failure; nla_nest_end(skb, nest); } if (entskb->mac_header < entskb->network_header) { int len = (int)(entskb->network_header - entskb->mac_header); if (nla_put(skb, NFQA_L2HDR, len, skb_mac_header(entskb))) goto nla_put_failure; } return 0; nla_put_failure: return -1; } static int nf_queue_checksum_help(struct sk_buff *entskb) { if (skb_csum_is_sctp(entskb)) return skb_crc32c_csum_help(entskb); return skb_checksum_help(entskb); } static struct sk_buff * nfqnl_build_packet_message(struct net *net, struct nfqnl_instance *queue, struct nf_queue_entry *entry, __be32 **packet_id_ptr) { size_t size; size_t data_len = 0, cap_len = 0; unsigned int hlen = 0; struct sk_buff *skb; struct nlattr *nla; struct nfqnl_msg_packet_hdr *pmsg; struct nlmsghdr *nlh; struct sk_buff *entskb = entry->skb; struct net_device *indev; struct net_device *outdev; struct nf_conn *ct = NULL; enum ip_conntrack_info ctinfo = 0; const struct nfnl_ct_hook *nfnl_ct; bool csum_verify; struct lsm_context ctx = { NULL, 0, 0 }; int seclen = 0; ktime_t tstamp; size = nlmsg_total_size(sizeof(struct nfgenmsg)) + nla_total_size(sizeof(struct nfqnl_msg_packet_hdr)) + nla_total_size(sizeof(u_int32_t)) /* ifindex */ + nla_total_size(sizeof(u_int32_t)) /* ifindex */ #if IS_ENABLED(CONFIG_BRIDGE_NETFILTER) + nla_total_size(sizeof(u_int32_t)) /* ifindex */ + nla_total_size(sizeof(u_int32_t)) /* ifindex */ #endif + nla_total_size(sizeof(u_int32_t)) /* mark */ + nla_total_size(sizeof(u_int32_t)) /* priority */ + nla_total_size(sizeof(struct nfqnl_msg_packet_hw)) + nla_total_size(sizeof(u_int32_t)) /* skbinfo */ #if IS_ENABLED(CONFIG_CGROUP_NET_CLASSID) + nla_total_size(sizeof(u_int32_t)) /* classid */ #endif + nla_total_size(sizeof(u_int32_t)); /* cap_len */ tstamp = skb_tstamp_cond(entskb, false); if (tstamp) size += nla_total_size(sizeof(struct nfqnl_msg_packet_timestamp)); size += nfqnl_get_bridge_size(entry); if (entry->state.hook <= NF_INET_FORWARD || (entry->state.hook == NF_INET_POST_ROUTING && entskb->sk == NULL)) csum_verify = !skb_csum_unnecessary(entskb); else csum_verify = false; outdev = entry->state.out; switch ((enum nfqnl_config_mode)READ_ONCE(queue->copy_mode)) { case NFQNL_COPY_META: case NFQNL_COPY_NONE: break; case NFQNL_COPY_PACKET: if (!(queue->flags & NFQA_CFG_F_GSO) && entskb->ip_summed == CHECKSUM_PARTIAL && nf_queue_checksum_help(entskb)) return NULL; data_len = READ_ONCE(queue->copy_range); if (data_len > entskb->len) data_len = entskb->len; hlen = skb_zerocopy_headlen(entskb); hlen = min_t(unsigned int, hlen, data_len); size += sizeof(struct nlattr) + hlen; cap_len = entskb->len; break; } nfnl_ct = rcu_dereference(nfnl_ct_hook); #if IS_ENABLED(CONFIG_NF_CONNTRACK) if (queue->flags & NFQA_CFG_F_CONNTRACK) { if (nfnl_ct != NULL) { ct = nf_ct_get(entskb, &ctinfo); if (ct != NULL) size += nfnl_ct->build_size(ct); } } #endif if (queue->flags & NFQA_CFG_F_UID_GID) { size += (nla_total_size(sizeof(u_int32_t)) /* uid */ + nla_total_size(sizeof(u_int32_t))); /* gid */ } if ((queue->flags & NFQA_CFG_F_SECCTX) && entskb->sk) { seclen = nfqnl_get_sk_secctx(entskb, &ctx); if (seclen < 0) return NULL; if (seclen) size += nla_total_size(seclen); } skb = alloc_skb(size, GFP_ATOMIC); if (!skb) { skb_tx_error(entskb); goto nlmsg_failure; } nlh = nfnl_msg_put(skb, 0, 0, nfnl_msg_type(NFNL_SUBSYS_QUEUE, NFQNL_MSG_PACKET), 0, entry->state.pf, NFNETLINK_V0, htons(queue->queue_num)); if (!nlh) { skb_tx_error(entskb); kfree_skb(skb); goto nlmsg_failure; } nla = __nla_reserve(skb, NFQA_PACKET_HDR, sizeof(*pmsg)); pmsg = nla_data(nla); pmsg->hw_protocol = entskb->protocol; pmsg->hook = entry->state.hook; *packet_id_ptr = &pmsg->packet_id; indev = entry->state.in; if (indev) { #if !IS_ENABLED(CONFIG_BRIDGE_NETFILTER) if (nla_put_be32(skb, NFQA_IFINDEX_INDEV, htonl(indev->ifindex))) goto nla_put_failure; #else if (entry->state.pf == PF_BRIDGE) { /* Case 1: indev is physical input device, we need to * look for bridge group (when called from * netfilter_bridge) */ if (nla_put_be32(skb, NFQA_IFINDEX_PHYSINDEV, htonl(indev->ifindex)) || /* this is the bridge group "brX" */ /* rcu_read_lock()ed by __nf_queue */ nla_put_be32(skb, NFQA_IFINDEX_INDEV, htonl(br_port_get_rcu(indev)->br->dev->ifindex))) goto nla_put_failure; } else { int physinif; /* Case 2: indev is bridge group, we need to look for * physical device (when called from ipv4) */ if (nla_put_be32(skb, NFQA_IFINDEX_INDEV, htonl(indev->ifindex))) goto nla_put_failure; physinif = nf_bridge_get_physinif(entskb); if (physinif && nla_put_be32(skb, NFQA_IFINDEX_PHYSINDEV, htonl(physinif))) goto nla_put_failure; } #endif } if (outdev) { #if !IS_ENABLED(CONFIG_BRIDGE_NETFILTER) if (nla_put_be32(skb, NFQA_IFINDEX_OUTDEV, htonl(outdev->ifindex))) goto nla_put_failure; #else if (entry->state.pf == PF_BRIDGE) { /* Case 1: outdev is physical output device, we need to * look for bridge group (when called from * netfilter_bridge) */ if (nla_put_be32(skb, NFQA_IFINDEX_PHYSOUTDEV, htonl(outdev->ifindex)) || /* this is the bridge group "brX" */ /* rcu_read_lock()ed by __nf_queue */ nla_put_be32(skb, NFQA_IFINDEX_OUTDEV, htonl(br_port_get_rcu(outdev)->br->dev->ifindex))) goto nla_put_failure; } else { int physoutif; /* Case 2: outdev is bridge group, we need to look for * physical output device (when called from ipv4) */ if (nla_put_be32(skb, NFQA_IFINDEX_OUTDEV, htonl(outdev->ifindex))) goto nla_put_failure; physoutif = nf_bridge_get_physoutif(entskb); if (physoutif && nla_put_be32(skb, NFQA_IFINDEX_PHYSOUTDEV, htonl(physoutif))) goto nla_put_failure; } #endif } if (entskb->mark && nla_put_be32(skb, NFQA_MARK, htonl(entskb->mark))) goto nla_put_failure; if (entskb->priority && nla_put_be32(skb, NFQA_PRIORITY, htonl(entskb->priority))) goto nla_put_failure; if (indev && entskb->dev && skb_mac_header_was_set(entskb) && skb_mac_header_len(entskb) != 0) { struct nfqnl_msg_packet_hw phw; int len; memset(&phw, 0, sizeof(phw)); len = dev_parse_header(entskb, phw.hw_addr); if (len) { phw.hw_addrlen = htons(len); if (nla_put(skb, NFQA_HWADDR, sizeof(phw), &phw)) goto nla_put_failure; } } if (nfqnl_put_bridge(entry, skb) < 0) goto nla_put_failure; if (entry->state.hook <= NF_INET_FORWARD && tstamp) { struct nfqnl_msg_packet_timestamp ts; struct timespec64 kts = ktime_to_timespec64(tstamp); ts.sec = cpu_to_be64(kts.tv_sec); ts.usec = cpu_to_be64(kts.tv_nsec / NSEC_PER_USEC); if (nla_put(skb, NFQA_TIMESTAMP, sizeof(ts), &ts)) goto nla_put_failure; } if ((queue->flags & NFQA_CFG_F_UID_GID) && entskb->sk && nfqnl_put_sk_uidgid(skb, entskb->sk) < 0) goto nla_put_failure; if (nfqnl_put_sk_classid(skb, entskb->sk) < 0) goto nla_put_failure; if (seclen > 0 && nla_put(skb, NFQA_SECCTX, ctx.len, ctx.context)) goto nla_put_failure; if (ct && nfnl_ct->build(skb, ct, ctinfo, NFQA_CT, NFQA_CT_INFO) < 0) goto nla_put_failure; if (cap_len > data_len && nla_put_be32(skb, NFQA_CAP_LEN, htonl(cap_len))) goto nla_put_failure; if (nfqnl_put_packet_info(skb, entskb, csum_verify)) goto nla_put_failure; if (data_len) { struct nlattr *nla; if (skb_tailroom(skb) < sizeof(*nla) + hlen) goto nla_put_failure; nla = skb_put(skb, sizeof(*nla)); nla->nla_type = NFQA_PAYLOAD; nla->nla_len = nla_attr_size(data_len); if (skb_zerocopy(skb, entskb, data_len, hlen)) goto nla_put_failure; } nlh->nlmsg_len = skb->len; if (seclen >= 0) security_release_secctx(&ctx); return skb; nla_put_failure: skb_tx_error(entskb); kfree_skb(skb); net_err_ratelimited("nf_queue: error creating packet message\n"); nlmsg_failure: if (seclen >= 0) security_release_secctx(&ctx); return NULL; } static bool nf_ct_drop_unconfirmed(const struct nf_queue_entry *entry) { #if IS_ENABLED(CONFIG_NF_CONNTRACK) static const unsigned long flags = IPS_CONFIRMED | IPS_DYING; struct nf_conn *ct = (void *)skb_nfct(entry->skb); unsigned long status; unsigned int use; if (!ct) return false; status = READ_ONCE(ct->status); if ((status & flags) == IPS_DYING) return true; if (status & IPS_CONFIRMED) return false; /* in some cases skb_clone() can occur after initial conntrack * pickup, but conntrack assumes exclusive skb->_nfct ownership for * unconfirmed entries. * * This happens for br_netfilter and with ip multicast routing. * We can't be solved with serialization here because one clone could * have been queued for local delivery. */ use = refcount_read(&ct->ct_general.use); if (likely(use == 1)) return false; /* Can't decrement further? Exclusive ownership. */ if (!refcount_dec_not_one(&ct->ct_general.use)) return false; skb_set_nfct(entry->skb, 0); /* No nf_ct_put(): we already decremented .use and it cannot * drop down to 0. */ return true; #endif return false; } static int __nfqnl_enqueue_packet(struct net *net, struct nfqnl_instance *queue, struct nf_queue_entry *entry) { struct sk_buff *nskb; int err = -ENOBUFS; __be32 *packet_id_ptr; int failopen = 0; nskb = nfqnl_build_packet_message(net, queue, entry, &packet_id_ptr); if (nskb == NULL) { err = -ENOMEM; goto err_out; } spin_lock_bh(&queue->lock); if (nf_ct_drop_unconfirmed(entry)) goto err_out_free_nskb; if (queue->queue_total >= queue->queue_maxlen) { if (queue->flags & NFQA_CFG_F_FAIL_OPEN) { failopen = 1; err = 0; } else { queue->queue_dropped++; net_warn_ratelimited("nf_queue: full at %d entries, dropping packets(s)\n", queue->queue_total); } goto err_out_free_nskb; } entry->id = ++queue->id_sequence; *packet_id_ptr = htonl(entry->id); /* nfnetlink_unicast will either free the nskb or add it to a socket */ err = nfnetlink_unicast(nskb, net, queue->peer_portid); if (err < 0) { if (queue->flags & NFQA_CFG_F_FAIL_OPEN) { failopen = 1; err = 0; } else { queue->queue_user_dropped++; } goto err_out_unlock; } __enqueue_entry(queue, entry); spin_unlock_bh(&queue->lock); return 0; err_out_free_nskb: kfree_skb(nskb); err_out_unlock: spin_unlock_bh(&queue->lock); if (failopen) nfqnl_reinject(entry, NF_ACCEPT); err_out: return err; } static struct nf_queue_entry * nf_queue_entry_dup(struct nf_queue_entry *e) { struct nf_queue_entry *entry = kmemdup(e, e->size, GFP_ATOMIC); if (!entry) return NULL; if (nf_queue_entry_get_refs(entry)) return entry; kfree(entry); return NULL; } #if IS_ENABLED(CONFIG_BRIDGE_NETFILTER) /* When called from bridge netfilter, skb->data must point to MAC header * before calling skb_gso_segment(). Else, original MAC header is lost * and segmented skbs will be sent to wrong destination. */ static void nf_bridge_adjust_skb_data(struct sk_buff *skb) { if (nf_bridge_info_get(skb)) __skb_push(skb, skb->network_header - skb->mac_header); } static void nf_bridge_adjust_segmented_data(struct sk_buff *skb) { if (nf_bridge_info_get(skb)) __skb_pull(skb, skb->network_header - skb->mac_header); } #else #define nf_bridge_adjust_skb_data(s) do {} while (0) #define nf_bridge_adjust_segmented_data(s) do {} while (0) #endif static int __nfqnl_enqueue_packet_gso(struct net *net, struct nfqnl_instance *queue, struct sk_buff *skb, struct nf_queue_entry *entry) { int ret = -ENOMEM; struct nf_queue_entry *entry_seg; nf_bridge_adjust_segmented_data(skb); if (skb->next == NULL) { /* last packet, no need to copy entry */ struct sk_buff *gso_skb = entry->skb; entry->skb = skb; ret = __nfqnl_enqueue_packet(net, queue, entry); if (ret) entry->skb = gso_skb; return ret; } skb_mark_not_on_list(skb); entry_seg = nf_queue_entry_dup(entry); if (entry_seg) { entry_seg->skb = skb; ret = __nfqnl_enqueue_packet(net, queue, entry_seg); if (ret) nf_queue_entry_free(entry_seg); } return ret; } static int nfqnl_enqueue_packet(struct nf_queue_entry *entry, unsigned int queuenum) { unsigned int queued; struct nfqnl_instance *queue; struct sk_buff *skb, *segs, *nskb; int err = -ENOBUFS; struct net *net = entry->state.net; struct nfnl_queue_net *q = nfnl_queue_pernet(net); /* rcu_read_lock()ed by nf_hook_thresh */ queue = instance_lookup(q, queuenum); if (!queue) return -ESRCH; if (queue->copy_mode == NFQNL_COPY_NONE) return -EINVAL; skb = entry->skb; switch (entry->state.pf) { case NFPROTO_IPV4: skb->protocol = htons(ETH_P_IP); break; case NFPROTO_IPV6: skb->protocol = htons(ETH_P_IPV6); break; } if (!skb_is_gso(skb) || ((queue->flags & NFQA_CFG_F_GSO) && !skb_is_gso_sctp(skb))) return __nfqnl_enqueue_packet(net, queue, entry); nf_bridge_adjust_skb_data(skb); segs = skb_gso_segment(skb, 0); /* Does not use PTR_ERR to limit the number of error codes that can be * returned by nf_queue. For instance, callers rely on -ESRCH to * mean 'ignore this hook'. */ if (IS_ERR_OR_NULL(segs)) goto out_err; queued = 0; err = 0; skb_list_walk_safe(segs, segs, nskb) { if (err == 0) err = __nfqnl_enqueue_packet_gso(net, queue, segs, entry); if (err == 0) queued++; else kfree_skb(segs); } if (queued) { if (err) /* some segments are already queued */ nf_queue_entry_free(entry); kfree_skb(skb); return 0; } out_err: nf_bridge_adjust_segmented_data(skb); return err; } static int nfqnl_mangle(void *data, unsigned int data_len, struct nf_queue_entry *e, int diff) { struct sk_buff *nskb; if (diff < 0) { unsigned int min_len = skb_transport_offset(e->skb); if (data_len < min_len) return -EINVAL; if (pskb_trim(e->skb, data_len)) return -ENOMEM; } else if (diff > 0) { if (data_len > 0xFFFF) return -EINVAL; if (diff > skb_tailroom(e->skb)) { nskb = skb_copy_expand(e->skb, skb_headroom(e->skb), diff, GFP_ATOMIC); if (!nskb) return -ENOMEM; kfree_skb(e->skb); e->skb = nskb; } skb_put(e->skb, diff); } if (skb_ensure_writable(e->skb, data_len)) return -ENOMEM; skb_copy_to_linear_data(e->skb, data, data_len); e->skb->ip_summed = CHECKSUM_NONE; return 0; } static int nfqnl_set_mode(struct nfqnl_instance *queue, unsigned char mode, unsigned int range) { int status = 0; spin_lock_bh(&queue->lock); switch (mode) { case NFQNL_COPY_NONE: case NFQNL_COPY_META: queue->copy_mode = mode; queue->copy_range = 0; break; case NFQNL_COPY_PACKET: queue->copy_mode = mode; if (range == 0 || range > NFQNL_MAX_COPY_RANGE) queue->copy_range = NFQNL_MAX_COPY_RANGE; else queue->copy_range = range; break; default: status = -EINVAL; } spin_unlock_bh(&queue->lock); return status; } static int dev_cmp(struct nf_queue_entry *entry, unsigned long ifindex) { #if IS_ENABLED(CONFIG_BRIDGE_NETFILTER) int physinif, physoutif; physinif = nf_bridge_get_physinif(entry->skb); physoutif = nf_bridge_get_physoutif(entry->skb); if (physinif == ifindex || physoutif == ifindex) return 1; #endif if (entry->state.in) if (entry->state.in->ifindex == ifindex) return 1; if (entry->state.out) if (entry->state.out->ifindex == ifindex) return 1; return 0; } /* drop all packets with either indev or outdev == ifindex from all queue * instances */ static void nfqnl_dev_drop(struct net *net, int ifindex) { int i; struct nfnl_queue_net *q = nfnl_queue_pernet(net); rcu_read_lock(); for (i = 0; i < INSTANCE_BUCKETS; i++) { struct nfqnl_instance *inst; struct hlist_head *head = &q->instance_table[i]; hlist_for_each_entry_rcu(inst, head, hlist) nfqnl_flush(inst, dev_cmp, ifindex); } rcu_read_unlock(); } static int nfqnl_rcv_dev_event(struct notifier_block *this, unsigned long event, void *ptr) { struct net_device *dev = netdev_notifier_info_to_dev(ptr); /* Drop any packets associated with the downed device */ if (event == NETDEV_DOWN) nfqnl_dev_drop(dev_net(dev), dev->ifindex); return NOTIFY_DONE; } static struct notifier_block nfqnl_dev_notifier = { .notifier_call = nfqnl_rcv_dev_event, }; static void nfqnl_nf_hook_drop(struct net *net) { struct nfnl_queue_net *q = nfnl_queue_pernet(net); int i; /* This function is also called on net namespace error unwind, * when pernet_ops->init() failed and ->exit() functions of the * previous pernet_ops gets called. * * This may result in a call to nfqnl_nf_hook_drop() before * struct nfnl_queue_net was allocated. */ if (!q) return; for (i = 0; i < INSTANCE_BUCKETS; i++) { struct nfqnl_instance *inst; struct hlist_head *head = &q->instance_table[i]; hlist_for_each_entry_rcu(inst, head, hlist) nfqnl_flush(inst, NULL, 0); } } static int nfqnl_rcv_nl_event(struct notifier_block *this, unsigned long event, void *ptr) { struct netlink_notify *n = ptr; struct nfnl_queue_net *q = nfnl_queue_pernet(n->net); if (event == NETLINK_URELEASE && n->protocol == NETLINK_NETFILTER) { int i; /* destroy all instances for this portid */ spin_lock(&q->instances_lock); for (i = 0; i < INSTANCE_BUCKETS; i++) { struct hlist_node *t2; struct nfqnl_instance *inst; struct hlist_head *head = &q->instance_table[i]; hlist_for_each_entry_safe(inst, t2, head, hlist) { if (n->portid == inst->peer_portid) __instance_destroy(inst); } } spin_unlock(&q->instances_lock); } return NOTIFY_DONE; } static struct notifier_block nfqnl_rtnl_notifier = { .notifier_call = nfqnl_rcv_nl_event, }; static const struct nla_policy nfqa_vlan_policy[NFQA_VLAN_MAX + 1] = { [NFQA_VLAN_TCI] = { .type = NLA_U16}, [NFQA_VLAN_PROTO] = { .type = NLA_U16}, }; static const struct nla_policy nfqa_verdict_policy[NFQA_MAX+1] = { [NFQA_VERDICT_HDR] = { .len = sizeof(struct nfqnl_msg_verdict_hdr) }, [NFQA_MARK] = { .type = NLA_U32 }, [NFQA_PAYLOAD] = { .type = NLA_UNSPEC }, [NFQA_CT] = { .type = NLA_UNSPEC }, [NFQA_EXP] = { .type = NLA_UNSPEC }, [NFQA_VLAN] = { .type = NLA_NESTED }, [NFQA_PRIORITY] = { .type = NLA_U32 }, }; static const struct nla_policy nfqa_verdict_batch_policy[NFQA_MAX+1] = { [NFQA_VERDICT_HDR] = { .len = sizeof(struct nfqnl_msg_verdict_hdr) }, [NFQA_MARK] = { .type = NLA_U32 }, [NFQA_PRIORITY] = { .type = NLA_U32 }, }; static struct nfqnl_instance * verdict_instance_lookup(struct nfnl_queue_net *q, u16 queue_num, u32 nlportid) { struct nfqnl_instance *queue; queue = instance_lookup(q, queue_num); if (!queue) return ERR_PTR(-ENODEV); if (queue->peer_portid != nlportid) return ERR_PTR(-EPERM); return queue; } static struct nfqnl_msg_verdict_hdr* verdicthdr_get(const struct nlattr * const nfqa[]) { struct nfqnl_msg_verdict_hdr *vhdr; unsigned int verdict; if (!nfqa[NFQA_VERDICT_HDR]) return NULL; vhdr = nla_data(nfqa[NFQA_VERDICT_HDR]); verdict = ntohl(vhdr->verdict) & NF_VERDICT_MASK; if (verdict > NF_MAX_VERDICT || verdict == NF_STOLEN) return NULL; return vhdr; } static int nfq_id_after(unsigned int id, unsigned int max) { return (int)(id - max) > 0; } static int nfqnl_recv_verdict_batch(struct sk_buff *skb, const struct nfnl_info *info, const struct nlattr * const nfqa[]) { struct nfnl_queue_net *q = nfnl_queue_pernet(info->net); u16 queue_num = ntohs(info->nfmsg->res_id); struct nf_queue_entry *entry, *tmp; struct nfqnl_msg_verdict_hdr *vhdr; struct nfqnl_instance *queue; unsigned int verdict, maxid; LIST_HEAD(batch_list); queue = verdict_instance_lookup(q, queue_num, NETLINK_CB(skb).portid); if (IS_ERR(queue)) return PTR_ERR(queue); vhdr = verdicthdr_get(nfqa); if (!vhdr) return -EINVAL; verdict = ntohl(vhdr->verdict); maxid = ntohl(vhdr->id); spin_lock_bh(&queue->lock); list_for_each_entry_safe(entry, tmp, &queue->queue_list, list) { if (nfq_id_after(entry->id, maxid)) break; __dequeue_entry(queue, entry); list_add_tail(&entry->list, &batch_list); } spin_unlock_bh(&queue->lock); if (list_empty(&batch_list)) return -ENOENT; list_for_each_entry_safe(entry, tmp, &batch_list, list) { if (nfqa[NFQA_MARK]) entry->skb->mark = ntohl(nla_get_be32(nfqa[NFQA_MARK])); if (nfqa[NFQA_PRIORITY]) entry->skb->priority = ntohl(nla_get_be32(nfqa[NFQA_PRIORITY])); nfqnl_reinject(entry, verdict); } return 0; } static struct nf_conn *nfqnl_ct_parse(const struct nfnl_ct_hook *nfnl_ct, const struct nlmsghdr *nlh, const struct nlattr * const nfqa[], struct nf_queue_entry *entry, enum ip_conntrack_info *ctinfo) { #if IS_ENABLED(CONFIG_NF_CONNTRACK) struct nf_conn *ct; ct = nf_ct_get(entry->skb, ctinfo); if (ct == NULL) return NULL; if (nfnl_ct->parse(nfqa[NFQA_CT], ct) < 0) return NULL; if (nfqa[NFQA_EXP]) nfnl_ct->attach_expect(nfqa[NFQA_EXP], ct, NETLINK_CB(entry->skb).portid, nlmsg_report(nlh)); return ct; #else return NULL; #endif } static int nfqa_parse_bridge(struct nf_queue_entry *entry, const struct nlattr * const nfqa[]) { if (nfqa[NFQA_VLAN]) { struct nlattr *tb[NFQA_VLAN_MAX + 1]; int err; err = nla_parse_nested_deprecated(tb, NFQA_VLAN_MAX, nfqa[NFQA_VLAN], nfqa_vlan_policy, NULL); if (err < 0) return err; if (!tb[NFQA_VLAN_TCI] || !tb[NFQA_VLAN_PROTO]) return -EINVAL; __vlan_hwaccel_put_tag(entry->skb, nla_get_be16(tb[NFQA_VLAN_PROTO]), ntohs(nla_get_be16(tb[NFQA_VLAN_TCI]))); } if (nfqa[NFQA_L2HDR]) { int mac_header_len = entry->skb->network_header - entry->skb->mac_header; if (mac_header_len != nla_len(nfqa[NFQA_L2HDR])) return -EINVAL; else if (mac_header_len > 0) memcpy(skb_mac_header(entry->skb), nla_data(nfqa[NFQA_L2HDR]), mac_header_len); } return 0; } static int nfqnl_recv_verdict(struct sk_buff *skb, const struct nfnl_info *info, const struct nlattr * const nfqa[]) { struct nfnl_queue_net *q = nfnl_queue_pernet(info->net); u_int16_t queue_num = ntohs(info->nfmsg->res_id); const struct nfnl_ct_hook *nfnl_ct; struct nfqnl_msg_verdict_hdr *vhdr; enum ip_conntrack_info ctinfo; struct nfqnl_instance *queue; struct nf_queue_entry *entry; struct nf_conn *ct = NULL; unsigned int verdict; int err; queue = verdict_instance_lookup(q, queue_num, NETLINK_CB(skb).portid); if (IS_ERR(queue)) return PTR_ERR(queue); vhdr = verdicthdr_get(nfqa); if (!vhdr) return -EINVAL; verdict = ntohl(vhdr->verdict); entry = find_dequeue_entry(queue, ntohl(vhdr->id)); if (entry == NULL) return -ENOENT; /* rcu lock already held from nfnl->call_rcu. */ nfnl_ct = rcu_dereference(nfnl_ct_hook); if (nfqa[NFQA_CT]) { if (nfnl_ct != NULL) ct = nfqnl_ct_parse(nfnl_ct, info->nlh, nfqa, entry, &ctinfo); } if (entry->state.pf == PF_BRIDGE) { err = nfqa_parse_bridge(entry, nfqa); if (err < 0) return err; } if (nfqa[NFQA_PAYLOAD]) { u16 payload_len = nla_len(nfqa[NFQA_PAYLOAD]); int diff = payload_len - entry->skb->len; if (nfqnl_mangle(nla_data(nfqa[NFQA_PAYLOAD]), payload_len, entry, diff) < 0) verdict = NF_DROP; if (ct && diff) nfnl_ct->seq_adjust(entry->skb, ct, ctinfo, diff); } if (nfqa[NFQA_MARK]) entry->skb->mark = ntohl(nla_get_be32(nfqa[NFQA_MARK])); if (nfqa[NFQA_PRIORITY]) entry->skb->priority = ntohl(nla_get_be32(nfqa[NFQA_PRIORITY])); nfqnl_reinject(entry, verdict); return 0; } static int nfqnl_recv_unsupp(struct sk_buff *skb, const struct nfnl_info *info, const struct nlattr * const cda[]) { return -ENOTSUPP; } static const struct nla_policy nfqa_cfg_policy[NFQA_CFG_MAX+1] = { [NFQA_CFG_CMD] = { .len = sizeof(struct nfqnl_msg_config_cmd) }, [NFQA_CFG_PARAMS] = { .len = sizeof(struct nfqnl_msg_config_params) }, [NFQA_CFG_QUEUE_MAXLEN] = { .type = NLA_U32 }, [NFQA_CFG_MASK] = { .type = NLA_U32 }, [NFQA_CFG_FLAGS] = { .type = NLA_U32 }, }; static const struct nf_queue_handler nfqh = { .outfn = nfqnl_enqueue_packet, .nf_hook_drop = nfqnl_nf_hook_drop, }; static int nfqnl_recv_config(struct sk_buff *skb, const struct nfnl_info *info, const struct nlattr * const nfqa[]) { struct nfnl_queue_net *q = nfnl_queue_pernet(info->net); u_int16_t queue_num = ntohs(info->nfmsg->res_id); struct nfqnl_msg_config_cmd *cmd = NULL; struct nfqnl_instance *queue; __u32 flags = 0, mask = 0; int ret = 0; if (nfqa[NFQA_CFG_CMD]) { cmd = nla_data(nfqa[NFQA_CFG_CMD]); /* Obsolete commands without queue context */ switch (cmd->command) { case NFQNL_CFG_CMD_PF_BIND: return 0; case NFQNL_CFG_CMD_PF_UNBIND: return 0; } } /* Check if we support these flags in first place, dependencies should * be there too not to break atomicity. */ if (nfqa[NFQA_CFG_FLAGS]) { if (!nfqa[NFQA_CFG_MASK]) { /* A mask is needed to specify which flags are being * changed. */ return -EINVAL; } flags = ntohl(nla_get_be32(nfqa[NFQA_CFG_FLAGS])); mask = ntohl(nla_get_be32(nfqa[NFQA_CFG_MASK])); if (flags >= NFQA_CFG_F_MAX) return -EOPNOTSUPP; #if !IS_ENABLED(CONFIG_NETWORK_SECMARK) if (flags & mask & NFQA_CFG_F_SECCTX) return -EOPNOTSUPP; #endif if ((flags & mask & NFQA_CFG_F_CONNTRACK) && !rcu_access_pointer(nfnl_ct_hook)) { #ifdef CONFIG_MODULES nfnl_unlock(NFNL_SUBSYS_QUEUE); request_module("ip_conntrack_netlink"); nfnl_lock(NFNL_SUBSYS_QUEUE); if (rcu_access_pointer(nfnl_ct_hook)) return -EAGAIN; #endif return -EOPNOTSUPP; } } rcu_read_lock(); queue = instance_lookup(q, queue_num); if (queue && queue->peer_portid != NETLINK_CB(skb).portid) { ret = -EPERM; goto err_out_unlock; } if (cmd != NULL) { switch (cmd->command) { case NFQNL_CFG_CMD_BIND: if (queue) { ret = -EBUSY; goto err_out_unlock; } queue = instance_create(q, queue_num, NETLINK_CB(skb).portid); if (IS_ERR(queue)) { ret = PTR_ERR(queue); goto err_out_unlock; } break; case NFQNL_CFG_CMD_UNBIND: if (!queue) { ret = -ENODEV; goto err_out_unlock; } instance_destroy(q, queue); goto err_out_unlock; case NFQNL_CFG_CMD_PF_BIND: case NFQNL_CFG_CMD_PF_UNBIND: break; default: ret = -ENOTSUPP; goto err_out_unlock; } } if (!queue) { ret = -ENODEV; goto err_out_unlock; } if (nfqa[NFQA_CFG_PARAMS]) { struct nfqnl_msg_config_params *params = nla_data(nfqa[NFQA_CFG_PARAMS]); nfqnl_set_mode(queue, params->copy_mode, ntohl(params->copy_range)); } if (nfqa[NFQA_CFG_QUEUE_MAXLEN]) { __be32 *queue_maxlen = nla_data(nfqa[NFQA_CFG_QUEUE_MAXLEN]); spin_lock_bh(&queue->lock); queue->queue_maxlen = ntohl(*queue_maxlen); spin_unlock_bh(&queue->lock); } if (nfqa[NFQA_CFG_FLAGS]) { spin_lock_bh(&queue->lock); queue->flags &= ~mask; queue->flags |= flags & mask; spin_unlock_bh(&queue->lock); } err_out_unlock: rcu_read_unlock(); return ret; } static const struct nfnl_callback nfqnl_cb[NFQNL_MSG_MAX] = { [NFQNL_MSG_PACKET] = { .call = nfqnl_recv_unsupp, .type = NFNL_CB_RCU, .attr_count = NFQA_MAX, }, [NFQNL_MSG_VERDICT] = { .call = nfqnl_recv_verdict, .type = NFNL_CB_RCU, .attr_count = NFQA_MAX, .policy = nfqa_verdict_policy }, [NFQNL_MSG_CONFIG] = { .call = nfqnl_recv_config, .type = NFNL_CB_MUTEX, .attr_count = NFQA_CFG_MAX, .policy = nfqa_cfg_policy }, [NFQNL_MSG_VERDICT_BATCH] = { .call = nfqnl_recv_verdict_batch, .type = NFNL_CB_RCU, .attr_count = NFQA_MAX, .policy = nfqa_verdict_batch_policy }, }; static const struct nfnetlink_subsystem nfqnl_subsys = { .name = "nf_queue", .subsys_id = NFNL_SUBSYS_QUEUE, .cb_count = NFQNL_MSG_MAX, .cb = nfqnl_cb, }; #ifdef CONFIG_PROC_FS struct iter_state { struct seq_net_private p; unsigned int bucket; }; static struct hlist_node *get_first(struct seq_file *seq) { struct iter_state *st = seq->private; struct net *net; struct nfnl_queue_net *q; if (!st) return NULL; net = seq_file_net(seq); q = nfnl_queue_pernet(net); for (st->bucket = 0; st->bucket < INSTANCE_BUCKETS; st->bucket++) { if (!hlist_empty(&q->instance_table[st->bucket])) return q->instance_table[st->bucket].first; } return NULL; } static struct hlist_node *get_next(struct seq_file *seq, struct hlist_node *h) { struct iter_state *st = seq->private; struct net *net = seq_file_net(seq); h = h->next; while (!h) { struct nfnl_queue_net *q; if (++st->bucket >= INSTANCE_BUCKETS) return NULL; q = nfnl_queue_pernet(net); h = q->instance_table[st->bucket].first; } return h; } static struct hlist_node *get_idx(struct seq_file *seq, loff_t pos) { struct hlist_node *head; head = get_first(seq); if (head) while (pos && (head = get_next(seq, head))) pos--; return pos ? NULL : head; } static void *seq_start(struct seq_file *s, loff_t *pos) __acquires(nfnl_queue_pernet(seq_file_net(s))->instances_lock) { spin_lock(&nfnl_queue_pernet(seq_file_net(s))->instances_lock); return get_idx(s, *pos); } static void *seq_next(struct seq_file *s, void *v, loff_t *pos) { (*pos)++; return get_next(s, v); } static void seq_stop(struct seq_file *s, void *v) __releases(nfnl_queue_pernet(seq_file_net(s))->instances_lock) { spin_unlock(&nfnl_queue_pernet(seq_file_net(s))->instances_lock); } static int seq_show(struct seq_file *s, void *v) { const struct nfqnl_instance *inst = v; seq_printf(s, "%5u %6u %5u %1u %5u %5u %5u %8u %2d\n", inst->queue_num, inst->peer_portid, inst->queue_total, inst->copy_mode, inst->copy_range, inst->queue_dropped, inst->queue_user_dropped, inst->id_sequence, 1); return 0; } static const struct seq_operations nfqnl_seq_ops = { .start = seq_start, .next = seq_next, .stop = seq_stop, .show = seq_show, }; #endif /* PROC_FS */ static int __net_init nfnl_queue_net_init(struct net *net) { unsigned int i; struct nfnl_queue_net *q = nfnl_queue_pernet(net); for (i = 0; i < INSTANCE_BUCKETS; i++) INIT_HLIST_HEAD(&q->instance_table[i]); spin_lock_init(&q->instances_lock); #ifdef CONFIG_PROC_FS if (!proc_create_net("nfnetlink_queue", 0440, net->nf.proc_netfilter, &nfqnl_seq_ops, sizeof(struct iter_state))) return -ENOMEM; #endif return 0; } static void __net_exit nfnl_queue_net_exit(struct net *net) { struct nfnl_queue_net *q = nfnl_queue_pernet(net); unsigned int i; #ifdef CONFIG_PROC_FS remove_proc_entry("nfnetlink_queue", net->nf.proc_netfilter); #endif for (i = 0; i < INSTANCE_BUCKETS; i++) WARN_ON_ONCE(!hlist_empty(&q->instance_table[i])); } static struct pernet_operations nfnl_queue_net_ops = { .init = nfnl_queue_net_init, .exit = nfnl_queue_net_exit, .id = &nfnl_queue_net_id, .size = sizeof(struct nfnl_queue_net), }; static int __init nfnetlink_queue_init(void) { int status; status = register_pernet_subsys(&nfnl_queue_net_ops); if (status < 0) { pr_err("failed to register pernet ops\n"); goto out; } netlink_register_notifier(&nfqnl_rtnl_notifier); status = nfnetlink_subsys_register(&nfqnl_subsys); if (status < 0) { pr_err("failed to create netlink socket\n"); goto cleanup_netlink_notifier; } status = register_netdevice_notifier(&nfqnl_dev_notifier); if (status < 0) { pr_err("failed to register netdevice notifier\n"); goto cleanup_netlink_subsys; } nf_register_queue_handler(&nfqh); return status; cleanup_netlink_subsys: nfnetlink_subsys_unregister(&nfqnl_subsys); cleanup_netlink_notifier: netlink_unregister_notifier(&nfqnl_rtnl_notifier); unregister_pernet_subsys(&nfnl_queue_net_ops); out: return status; } static void __exit nfnetlink_queue_fini(void) { nf_unregister_queue_handler(); unregister_netdevice_notifier(&nfqnl_dev_notifier); nfnetlink_subsys_unregister(&nfqnl_subsys); netlink_unregister_notifier(&nfqnl_rtnl_notifier); unregister_pernet_subsys(&nfnl_queue_net_ops); rcu_barrier(); /* Wait for completion of call_rcu()'s */ } MODULE_DESCRIPTION("netfilter packet queue handler"); MODULE_AUTHOR("Harald Welte <laforge@netfilter.org>"); MODULE_LICENSE("GPL"); MODULE_ALIAS_NFNL_SUBSYS(NFNL_SUBSYS_QUEUE); module_init(nfnetlink_queue_init); module_exit(nfnetlink_queue_fini); |
| 60 60 4 13 43 40 16 37 56 35 44 34 16 4 12 12 2 3 3 8 3 3 11 16 9 5 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 1 4 1 3 1 5 1 5 1 4 1 1 3 2 3 2 2 4 3 1 1 3 1 3 3 3 3 1 3 3 3 1 2 2 1 1 3 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 2 1 1 1 5 1 2 1 1 21 3 1 1 1 5 1 5 1 3 8 1 7 7 6 1 2 5 2 1 1 1 2 1 1 1 10 2 2 1 1 4 2 2 2 6 1 4 1 2 1 1 1 1 2 1 1 4 2 1 1 5 1 1 3 1 1 7 1 1 2 1 1 1 1 7 5 2 1 1 67 67 16 7 10 51 4 46 2 48 48 58 10 2 46 57 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080 1081 1082 1083 1084 1085 1086 1087 1088 1089 1090 1091 1092 1093 1094 1095 1096 1097 1098 1099 1100 1101 1102 1103 1104 1105 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125 1126 1127 1128 1129 1130 1131 1132 1133 1134 1135 1136 1137 1138 1139 1140 1141 1142 1143 1144 1145 1146 1147 1148 1149 1150 1151 1152 1153 1154 1155 1156 1157 1158 1159 1160 1161 1162 1163 1164 1165 1166 1167 1168 1169 1170 1171 1172 1173 1174 1175 1176 1177 1178 1179 1180 1181 1182 1183 1184 1185 1186 1187 1188 1189 1190 1191 1192 1193 1194 1195 1196 1197 1198 1199 1200 1201 1202 1203 1204 1205 1206 1207 1208 1209 1210 1211 1212 1213 1214 1215 1216 1217 1218 1219 1220 1221 1222 1223 1224 1225 1226 1227 1228 1229 1230 1231 1232 1233 1234 1235 1236 1237 1238 1239 1240 1241 1242 1243 1244 1245 1246 1247 1248 1249 1250 1251 1252 1253 1254 1255 1256 1257 1258 1259 1260 1261 1262 1263 1264 1265 1266 1267 1268 1269 1270 1271 1272 1273 1274 1275 1276 1277 1278 1279 1280 1281 1282 1283 1284 1285 1286 1287 1288 1289 1290 1291 1292 1293 1294 1295 1296 1297 1298 1299 1300 1301 1302 1303 1304 1305 1306 1307 1308 1309 1310 1311 1312 1313 1314 1315 1316 1317 1318 1319 1320 1321 1322 1323 1324 1325 1326 1327 1328 1329 1330 1331 1332 1333 1334 1335 1336 1337 1338 1339 1340 1341 1342 1343 1344 1345 1346 1347 1348 1349 1350 1351 1352 1353 1354 1355 1356 1357 1358 1359 1360 1361 1362 1363 1364 1365 1366 1367 1368 1369 1370 1371 1372 1373 1374 1375 1376 1377 1378 1379 1380 1381 1382 1383 1384 1385 1386 1387 1388 1389 1390 1391 1392 1393 1394 1395 1396 1397 1398 1399 1400 1401 1402 1403 1404 1405 1406 1407 1408 1409 1410 1411 1412 1413 1414 1415 1416 1417 1418 1419 1420 1421 1422 1423 1424 1425 1426 1427 1428 1429 1430 1431 1432 1433 1434 1435 1436 1437 1438 1439 1440 1441 1442 1443 1444 1445 1446 1447 1448 1449 1450 1451 1452 1453 1454 1455 1456 1457 1458 1459 1460 1461 1462 1463 1464 1465 1466 1467 1468 1469 1470 1471 1472 1473 1474 1475 1476 1477 1478 1479 1480 1481 1482 1483 1484 1485 1486 1487 1488 1489 1490 1491 1492 1493 1494 1495 1496 1497 1498 1499 1500 1501 1502 1503 1504 1505 1506 1507 1508 1509 1510 1511 1512 1513 1514 1515 1516 1517 1518 1519 1520 1521 1522 1523 1524 1525 1526 1527 1528 1529 1530 1531 1532 1533 1534 1535 1536 1537 1538 1539 1540 1541 1542 1543 1544 1545 1546 1547 1548 1549 1550 1551 1552 1553 1554 1555 1556 1557 1558 1559 1560 1561 1562 1563 1564 1565 1566 1567 1568 1569 1570 1571 1572 1573 1574 1575 1576 1577 1578 1579 1580 1581 1582 1583 1584 1585 1586 1587 1588 1589 1590 1591 1592 1593 1594 1595 1596 1597 1598 1599 1600 1601 1602 1603 1604 1605 1606 1607 1608 1609 1610 1611 1612 1613 1614 1615 1616 1617 1618 1619 1620 1621 1622 1623 1624 1625 1626 1627 1628 1629 1630 1631 1632 1633 1634 1635 1636 1637 1638 1639 1640 1641 1642 1643 1644 1645 1646 1647 1648 1649 1650 1651 1652 1653 1654 1655 1656 1657 1658 1659 1660 1661 1662 1663 1664 1665 1666 1667 1668 1669 1670 1671 1672 1673 1674 1675 1676 1677 1678 1679 1680 1681 1682 1683 1684 1685 1686 1687 1688 1689 1690 1691 1692 1693 1694 1695 1696 1697 1698 1699 1700 1701 1702 1703 1704 1705 1706 1707 1708 1709 1710 1711 1712 1713 1714 1715 1716 1717 1718 1719 1720 1721 1722 1723 1724 1725 1726 1727 1728 1729 1730 1731 1732 1733 1734 1735 1736 1737 1738 1739 1740 1741 1742 1743 1744 1745 1746 1747 1748 1749 1750 1751 1752 1753 1754 1755 1756 1757 1758 1759 1760 1761 1762 1763 1764 1765 1766 1767 1768 1769 1770 1771 1772 1773 1774 1775 1776 1777 1778 1779 1780 1781 1782 1783 1784 1785 1786 1787 1788 1789 1790 1791 1792 1793 1794 1795 1796 1797 1798 1799 1800 1801 1802 1803 1804 1805 1806 1807 1808 1809 1810 1811 1812 1813 1814 1815 1816 1817 1818 1819 1820 1821 1822 1823 1824 1825 1826 1827 1828 1829 1830 1831 1832 1833 1834 1835 1836 1837 1838 1839 1840 1841 1842 1843 1844 1845 1846 1847 1848 1849 1850 1851 1852 1853 1854 1855 1856 1857 1858 1859 1860 1861 1862 1863 1864 1865 1866 1867 1868 1869 1870 1871 1872 1873 1874 1875 1876 1877 1878 1879 1880 1881 1882 1883 1884 1885 1886 1887 1888 1889 1890 1891 1892 1893 1894 1895 1896 1897 1898 1899 1900 1901 1902 1903 1904 1905 1906 1907 1908 1909 1910 1911 1912 1913 1914 1915 1916 1917 1918 1919 1920 1921 1922 1923 1924 1925 1926 1927 1928 1929 1930 1931 1932 1933 1934 1935 1936 1937 1938 1939 1940 1941 1942 1943 1944 1945 1946 1947 1948 1949 1950 1951 1952 1953 1954 1955 1956 1957 1958 1959 1960 1961 1962 1963 1964 1965 1966 1967 1968 1969 1970 1971 1972 1973 1974 1975 1976 1977 1978 1979 1980 1981 1982 1983 1984 1985 1986 1987 1988 1989 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020 2021 2022 2023 2024 2025 2026 2027 2028 2029 2030 2031 2032 2033 2034 2035 2036 2037 2038 2039 2040 2041 2042 2043 2044 2045 2046 2047 2048 2049 2050 2051 2052 2053 2054 2055 2056 2057 2058 2059 2060 2061 2062 2063 2064 2065 2066 2067 2068 2069 2070 2071 2072 2073 2074 2075 2076 2077 2078 2079 2080 2081 2082 2083 2084 2085 2086 2087 2088 2089 2090 2091 2092 2093 2094 2095 2096 2097 2098 2099 2100 2101 2102 2103 2104 2105 2106 2107 2108 2109 2110 2111 2112 2113 2114 2115 2116 2117 2118 2119 2120 2121 2122 2123 2124 2125 2126 2127 2128 2129 2130 2131 2132 2133 2134 2135 2136 2137 2138 2139 2140 2141 2142 2143 2144 2145 2146 2147 2148 2149 2150 2151 2152 2153 2154 2155 2156 2157 2158 2159 2160 2161 2162 2163 2164 2165 2166 2167 2168 2169 2170 2171 2172 2173 2174 2175 2176 2177 2178 2179 2180 2181 2182 2183 2184 2185 2186 2187 2188 2189 2190 2191 2192 2193 2194 2195 2196 2197 2198 2199 2200 2201 2202 2203 2204 2205 2206 2207 2208 2209 2210 2211 2212 2213 2214 2215 2216 2217 2218 2219 2220 2221 2222 2223 2224 2225 2226 2227 2228 2229 2230 2231 2232 2233 2234 2235 2236 2237 2238 2239 2240 2241 2242 2243 2244 2245 2246 2247 2248 2249 2250 2251 2252 2253 2254 2255 2256 2257 2258 2259 2260 2261 2262 2263 2264 2265 2266 2267 2268 2269 2270 2271 2272 2273 2274 2275 2276 2277 2278 2279 2280 2281 2282 2283 2284 2285 2286 2287 2288 2289 2290 2291 2292 2293 2294 2295 2296 2297 2298 2299 2300 2301 2302 2303 2304 2305 2306 2307 2308 2309 2310 2311 2312 2313 2314 2315 2316 2317 2318 2319 2320 2321 2322 2323 2324 2325 2326 2327 2328 2329 2330 2331 2332 2333 2334 2335 2336 2337 2338 2339 2340 2341 2342 2343 2344 2345 2346 2347 2348 2349 2350 2351 2352 2353 2354 2355 2356 2357 2358 2359 2360 2361 2362 2363 2364 2365 2366 2367 2368 2369 2370 2371 2372 2373 2374 2375 2376 2377 2378 2379 2380 2381 2382 2383 2384 2385 2386 2387 2388 2389 2390 2391 2392 2393 2394 2395 2396 2397 2398 2399 2400 2401 2402 2403 2404 2405 2406 2407 2408 2409 2410 2411 2412 2413 2414 2415 2416 2417 2418 2419 2420 2421 2422 2423 2424 2425 2426 2427 2428 2429 2430 2431 2432 2433 2434 2435 2436 2437 2438 2439 2440 2441 2442 2443 2444 2445 2446 2447 2448 2449 2450 2451 2452 2453 2454 2455 2456 2457 2458 2459 2460 2461 2462 2463 2464 2465 2466 2467 2468 2469 2470 2471 2472 2473 2474 2475 2476 2477 2478 2479 2480 2481 2482 2483 2484 2485 2486 2487 2488 2489 2490 2491 2492 2493 2494 2495 2496 2497 2498 2499 2500 2501 2502 2503 2504 2505 2506 2507 2508 2509 2510 2511 2512 2513 2514 2515 2516 2517 2518 2519 2520 2521 2522 2523 2524 2525 2526 2527 2528 2529 2530 2531 2532 2533 2534 2535 2536 2537 2538 2539 2540 2541 2542 2543 2544 2545 2546 2547 2548 2549 2550 2551 2552 2553 2554 2555 2556 2557 2558 2559 2560 2561 2562 2563 2564 2565 2566 2567 2568 2569 2570 2571 2572 2573 2574 2575 2576 2577 2578 2579 2580 2581 2582 2583 2584 2585 2586 2587 2588 2589 2590 2591 2592 2593 2594 2595 2596 2597 2598 2599 2600 2601 2602 2603 2604 2605 2606 2607 2608 2609 2610 2611 2612 2613 2614 2615 2616 2617 2618 2619 2620 2621 2622 2623 2624 2625 2626 2627 2628 2629 2630 2631 2632 2633 2634 2635 2636 2637 2638 2639 2640 2641 2642 2643 2644 2645 2646 2647 2648 2649 2650 2651 2652 2653 2654 2655 2656 2657 2658 2659 2660 2661 2662 2663 2664 2665 2666 2667 2668 2669 2670 2671 2672 2673 2674 2675 2676 2677 2678 2679 2680 2681 2682 2683 2684 2685 2686 2687 2688 2689 2690 2691 2692 2693 2694 2695 2696 2697 2698 2699 2700 2701 2702 2703 2704 2705 2706 2707 2708 2709 2710 2711 2712 2713 2714 2715 2716 2717 2718 2719 2720 2721 2722 2723 2724 2725 2726 2727 2728 2729 2730 2731 2732 2733 2734 2735 2736 2737 2738 2739 2740 2741 2742 2743 2744 2745 2746 2747 2748 2749 2750 2751 2752 2753 2754 2755 2756 2757 2758 2759 2760 2761 2762 2763 2764 2765 2766 2767 2768 2769 2770 2771 2772 2773 2774 2775 2776 2777 2778 2779 2780 2781 2782 2783 2784 2785 2786 2787 2788 2789 2790 2791 2792 2793 2794 2795 2796 2797 2798 2799 2800 2801 2802 2803 2804 2805 2806 2807 2808 2809 2810 2811 2812 2813 2814 2815 2816 2817 2818 2819 2820 2821 2822 2823 2824 2825 2826 2827 2828 2829 2830 2831 2832 2833 2834 2835 2836 2837 2838 2839 2840 2841 2842 2843 2844 2845 2846 2847 2848 2849 2850 2851 2852 2853 2854 2855 2856 2857 2858 2859 2860 2861 2862 2863 2864 2865 2866 2867 2868 2869 2870 2871 2872 2873 2874 2875 2876 2877 2878 2879 2880 2881 2882 2883 2884 2885 2886 2887 2888 2889 2890 2891 2892 2893 2894 2895 2896 2897 2898 2899 2900 2901 2902 2903 2904 2905 2906 2907 2908 2909 2910 2911 2912 2913 2914 2915 2916 2917 2918 2919 2920 2921 2922 2923 2924 2925 2926 2927 2928 2929 2930 2931 2932 2933 2934 2935 2936 2937 2938 2939 2940 2941 2942 2943 2944 2945 2946 2947 2948 2949 2950 2951 2952 2953 2954 2955 2956 2957 2958 2959 2960 2961 2962 2963 2964 2965 2966 2967 2968 2969 2970 2971 2972 2973 2974 2975 2976 2977 2978 2979 2980 2981 2982 2983 2984 2985 2986 2987 2988 2989 2990 2991 2992 2993 2994 2995 2996 2997 2998 2999 3000 3001 3002 3003 3004 3005 3006 3007 3008 3009 3010 3011 3012 3013 3014 3015 3016 3017 3018 3019 3020 3021 3022 3023 3024 3025 3026 3027 3028 3029 3030 3031 3032 3033 3034 3035 3036 3037 3038 3039 3040 3041 3042 3043 3044 3045 3046 3047 3048 3049 3050 3051 3052 3053 3054 3055 3056 3057 3058 3059 3060 3061 3062 3063 3064 3065 3066 3067 3068 3069 3070 3071 3072 3073 3074 3075 3076 3077 3078 3079 3080 3081 3082 3083 3084 3085 3086 3087 3088 3089 3090 3091 3092 3093 3094 3095 3096 3097 3098 3099 3100 3101 3102 3103 3104 3105 3106 3107 3108 | // SPDX-License-Identifier: GPL-2.0-only /* * * Authors: * Alexander Aring <aar@pengutronix.de> * * Based on: net/wireless/nl80211.c */ #include <linux/rtnetlink.h> #include <net/cfg802154.h> #include <net/genetlink.h> #include <net/mac802154.h> #include <net/netlink.h> #include <net/nl802154.h> #include <net/sock.h> #include "nl802154.h" #include "rdev-ops.h" #include "core.h" /* the netlink family */ static struct genl_family nl802154_fam; /* multicast groups */ enum nl802154_multicast_groups { NL802154_MCGRP_CONFIG, NL802154_MCGRP_SCAN, }; static const struct genl_multicast_group nl802154_mcgrps[] = { [NL802154_MCGRP_CONFIG] = { .name = "config", }, [NL802154_MCGRP_SCAN] = { .name = "scan", }, }; /* returns ERR_PTR values */ static struct wpan_dev * __cfg802154_wpan_dev_from_attrs(struct net *netns, struct nlattr **attrs) { struct cfg802154_registered_device *rdev; struct wpan_dev *result = NULL; bool have_ifidx = attrs[NL802154_ATTR_IFINDEX]; bool have_wpan_dev_id = attrs[NL802154_ATTR_WPAN_DEV]; u64 wpan_dev_id; int wpan_phy_idx = -1; int ifidx = -1; ASSERT_RTNL(); if (!have_ifidx && !have_wpan_dev_id) return ERR_PTR(-EINVAL); if (have_ifidx) ifidx = nla_get_u32(attrs[NL802154_ATTR_IFINDEX]); if (have_wpan_dev_id) { wpan_dev_id = nla_get_u64(attrs[NL802154_ATTR_WPAN_DEV]); wpan_phy_idx = wpan_dev_id >> 32; } list_for_each_entry(rdev, &cfg802154_rdev_list, list) { struct wpan_dev *wpan_dev; if (wpan_phy_net(&rdev->wpan_phy) != netns) continue; if (have_wpan_dev_id && rdev->wpan_phy_idx != wpan_phy_idx) continue; list_for_each_entry(wpan_dev, &rdev->wpan_dev_list, list) { if (have_ifidx && wpan_dev->netdev && wpan_dev->netdev->ifindex == ifidx) { result = wpan_dev; break; } if (have_wpan_dev_id && wpan_dev->identifier == (u32)wpan_dev_id) { result = wpan_dev; break; } } if (result) break; } if (result) return result; return ERR_PTR(-ENODEV); } static struct cfg802154_registered_device * __cfg802154_rdev_from_attrs(struct net *netns, struct nlattr **attrs) { struct cfg802154_registered_device *rdev = NULL, *tmp; struct net_device *netdev; ASSERT_RTNL(); if (!attrs[NL802154_ATTR_WPAN_PHY] && !attrs[NL802154_ATTR_IFINDEX] && !attrs[NL802154_ATTR_WPAN_DEV]) return ERR_PTR(-EINVAL); if (attrs[NL802154_ATTR_WPAN_PHY]) rdev = cfg802154_rdev_by_wpan_phy_idx( nla_get_u32(attrs[NL802154_ATTR_WPAN_PHY])); if (attrs[NL802154_ATTR_WPAN_DEV]) { u64 wpan_dev_id = nla_get_u64(attrs[NL802154_ATTR_WPAN_DEV]); struct wpan_dev *wpan_dev; bool found = false; tmp = cfg802154_rdev_by_wpan_phy_idx(wpan_dev_id >> 32); if (tmp) { /* make sure wpan_dev exists */ list_for_each_entry(wpan_dev, &tmp->wpan_dev_list, list) { if (wpan_dev->identifier != (u32)wpan_dev_id) continue; found = true; break; } if (!found) tmp = NULL; if (rdev && tmp != rdev) return ERR_PTR(-EINVAL); rdev = tmp; } } if (attrs[NL802154_ATTR_IFINDEX]) { int ifindex = nla_get_u32(attrs[NL802154_ATTR_IFINDEX]); netdev = __dev_get_by_index(netns, ifindex); if (netdev) { if (netdev->ieee802154_ptr) tmp = wpan_phy_to_rdev( netdev->ieee802154_ptr->wpan_phy); else tmp = NULL; /* not wireless device -- return error */ if (!tmp) return ERR_PTR(-EINVAL); /* mismatch -- return error */ if (rdev && tmp != rdev) return ERR_PTR(-EINVAL); rdev = tmp; } } if (!rdev) return ERR_PTR(-ENODEV); if (netns != wpan_phy_net(&rdev->wpan_phy)) return ERR_PTR(-ENODEV); return rdev; } /* This function returns a pointer to the driver * that the genl_info item that is passed refers to. * * The result of this can be a PTR_ERR and hence must * be checked with IS_ERR() for errors. */ static struct cfg802154_registered_device * cfg802154_get_dev_from_info(struct net *netns, struct genl_info *info) { return __cfg802154_rdev_from_attrs(netns, info->attrs); } /* policy for the attributes */ static const struct nla_policy nl802154_policy[NL802154_ATTR_MAX+1] = { [NL802154_ATTR_WPAN_PHY] = { .type = NLA_U32 }, [NL802154_ATTR_WPAN_PHY_NAME] = { .type = NLA_NUL_STRING, .len = 20-1 }, [NL802154_ATTR_IFINDEX] = { .type = NLA_U32 }, [NL802154_ATTR_IFTYPE] = { .type = NLA_U32 }, [NL802154_ATTR_IFNAME] = { .type = NLA_NUL_STRING, .len = IFNAMSIZ-1 }, [NL802154_ATTR_WPAN_DEV] = { .type = NLA_U64 }, [NL802154_ATTR_PAGE] = NLA_POLICY_MAX(NLA_U8, IEEE802154_MAX_PAGE), [NL802154_ATTR_CHANNEL] = NLA_POLICY_MAX(NLA_U8, IEEE802154_MAX_CHANNEL), [NL802154_ATTR_TX_POWER] = { .type = NLA_S32, }, [NL802154_ATTR_CCA_MODE] = { .type = NLA_U32, }, [NL802154_ATTR_CCA_OPT] = { .type = NLA_U32, }, [NL802154_ATTR_CCA_ED_LEVEL] = { .type = NLA_S32, }, [NL802154_ATTR_SUPPORTED_CHANNEL] = { .type = NLA_U32, }, [NL802154_ATTR_PAN_ID] = { .type = NLA_U16, }, [NL802154_ATTR_EXTENDED_ADDR] = { .type = NLA_U64 }, [NL802154_ATTR_SHORT_ADDR] = { .type = NLA_U16, }, [NL802154_ATTR_MIN_BE] = { .type = NLA_U8, }, [NL802154_ATTR_MAX_BE] = { .type = NLA_U8, }, [NL802154_ATTR_MAX_CSMA_BACKOFFS] = { .type = NLA_U8, }, [NL802154_ATTR_MAX_FRAME_RETRIES] = { .type = NLA_S8, }, [NL802154_ATTR_LBT_MODE] = { .type = NLA_U8, }, [NL802154_ATTR_WPAN_PHY_CAPS] = { .type = NLA_NESTED }, [NL802154_ATTR_SUPPORTED_COMMANDS] = { .type = NLA_NESTED }, [NL802154_ATTR_ACKREQ_DEFAULT] = { .type = NLA_U8 }, [NL802154_ATTR_PID] = { .type = NLA_U32 }, [NL802154_ATTR_NETNS_FD] = { .type = NLA_U32 }, [NL802154_ATTR_COORDINATOR] = { .type = NLA_NESTED }, [NL802154_ATTR_SCAN_TYPE] = NLA_POLICY_RANGE(NLA_U8, NL802154_SCAN_ED, NL802154_SCAN_RIT_PASSIVE), [NL802154_ATTR_SCAN_CHANNELS] = NLA_POLICY_MASK(NLA_U32, GENMASK(IEEE802154_MAX_CHANNEL, 0)), [NL802154_ATTR_SCAN_PREAMBLE_CODES] = { .type = NLA_REJECT }, [NL802154_ATTR_SCAN_MEAN_PRF] = { .type = NLA_REJECT }, [NL802154_ATTR_SCAN_DURATION] = NLA_POLICY_MAX(NLA_U8, IEEE802154_MAX_SCAN_DURATION), [NL802154_ATTR_SCAN_DONE_REASON] = NLA_POLICY_RANGE(NLA_U8, NL802154_SCAN_DONE_REASON_FINISHED, NL802154_SCAN_DONE_REASON_ABORTED), [NL802154_ATTR_BEACON_INTERVAL] = NLA_POLICY_MAX(NLA_U8, IEEE802154_ACTIVE_SCAN_DURATION), [NL802154_ATTR_MAX_ASSOCIATIONS] = { .type = NLA_U32 }, [NL802154_ATTR_PEER] = { .type = NLA_NESTED }, #ifdef CONFIG_IEEE802154_NL802154_EXPERIMENTAL [NL802154_ATTR_SEC_ENABLED] = { .type = NLA_U8, }, [NL802154_ATTR_SEC_OUT_LEVEL] = { .type = NLA_U32, }, [NL802154_ATTR_SEC_OUT_KEY_ID] = { .type = NLA_NESTED, }, [NL802154_ATTR_SEC_FRAME_COUNTER] = { .type = NLA_U32 }, [NL802154_ATTR_SEC_LEVEL] = { .type = NLA_NESTED }, [NL802154_ATTR_SEC_DEVICE] = { .type = NLA_NESTED }, [NL802154_ATTR_SEC_DEVKEY] = { .type = NLA_NESTED }, [NL802154_ATTR_SEC_KEY] = { .type = NLA_NESTED }, #endif /* CONFIG_IEEE802154_NL802154_EXPERIMENTAL */ }; static int nl802154_prepare_wpan_dev_dump(struct sk_buff *skb, struct netlink_callback *cb, struct cfg802154_registered_device **rdev, struct wpan_dev **wpan_dev) { const struct genl_dumpit_info *info = genl_dumpit_info(cb); int err; rtnl_lock(); if (!cb->args[0]) { *wpan_dev = __cfg802154_wpan_dev_from_attrs(sock_net(skb->sk), info->info.attrs); if (IS_ERR(*wpan_dev)) { err = PTR_ERR(*wpan_dev); goto out_unlock; } *rdev = wpan_phy_to_rdev((*wpan_dev)->wpan_phy); /* 0 is the first index - add 1 to parse only once */ cb->args[0] = (*rdev)->wpan_phy_idx + 1; cb->args[1] = (*wpan_dev)->identifier; } else { /* subtract the 1 again here */ struct wpan_phy *wpan_phy = wpan_phy_idx_to_wpan_phy(cb->args[0] - 1); struct wpan_dev *tmp; if (!wpan_phy) { err = -ENODEV; goto out_unlock; } *rdev = wpan_phy_to_rdev(wpan_phy); *wpan_dev = NULL; list_for_each_entry(tmp, &(*rdev)->wpan_dev_list, list) { if (tmp->identifier == cb->args[1]) { *wpan_dev = tmp; break; } } if (!*wpan_dev) { err = -ENODEV; goto out_unlock; } } return 0; out_unlock: rtnl_unlock(); return err; } static void nl802154_finish_wpan_dev_dump(struct cfg802154_registered_device *rdev) { rtnl_unlock(); } /* message building helper */ static inline void *nl802154hdr_put(struct sk_buff *skb, u32 portid, u32 seq, int flags, u8 cmd) { /* since there is no private header just add the generic one */ return genlmsg_put(skb, portid, seq, &nl802154_fam, flags, cmd); } static int nl802154_put_flags(struct sk_buff *msg, int attr, u32 mask) { struct nlattr *nl_flags = nla_nest_start_noflag(msg, attr); int i; if (!nl_flags) return -ENOBUFS; i = 0; while (mask) { if ((mask & 1) && nla_put_flag(msg, i)) return -ENOBUFS; mask >>= 1; i++; } nla_nest_end(msg, nl_flags); return 0; } static int nl802154_send_wpan_phy_channels(struct cfg802154_registered_device *rdev, struct sk_buff *msg) { struct nlattr *nl_page; unsigned long page; nl_page = nla_nest_start_noflag(msg, NL802154_ATTR_CHANNELS_SUPPORTED); if (!nl_page) return -ENOBUFS; for (page = 0; page <= IEEE802154_MAX_PAGE; page++) { if (nla_put_u32(msg, NL802154_ATTR_SUPPORTED_CHANNEL, rdev->wpan_phy.supported.channels[page])) return -ENOBUFS; } nla_nest_end(msg, nl_page); return 0; } static int nl802154_put_capabilities(struct sk_buff *msg, struct cfg802154_registered_device *rdev) { const struct wpan_phy_supported *caps = &rdev->wpan_phy.supported; struct nlattr *nl_caps, *nl_channels; int i; nl_caps = nla_nest_start_noflag(msg, NL802154_ATTR_WPAN_PHY_CAPS); if (!nl_caps) return -ENOBUFS; nl_channels = nla_nest_start_noflag(msg, NL802154_CAP_ATTR_CHANNELS); if (!nl_channels) return -ENOBUFS; for (i = 0; i <= IEEE802154_MAX_PAGE; i++) { if (caps->channels[i]) { if (nl802154_put_flags(msg, i, caps->channels[i])) return -ENOBUFS; } } nla_nest_end(msg, nl_channels); if (rdev->wpan_phy.flags & WPAN_PHY_FLAG_CCA_ED_LEVEL) { struct nlattr *nl_ed_lvls; nl_ed_lvls = nla_nest_start_noflag(msg, NL802154_CAP_ATTR_CCA_ED_LEVELS); if (!nl_ed_lvls) return -ENOBUFS; for (i = 0; i < caps->cca_ed_levels_size; i++) { if (nla_put_s32(msg, i, caps->cca_ed_levels[i])) return -ENOBUFS; } nla_nest_end(msg, nl_ed_lvls); } if (rdev->wpan_phy.flags & WPAN_PHY_FLAG_TXPOWER) { struct nlattr *nl_tx_pwrs; nl_tx_pwrs = nla_nest_start_noflag(msg, NL802154_CAP_ATTR_TX_POWERS); if (!nl_tx_pwrs) return -ENOBUFS; for (i = 0; i < caps->tx_powers_size; i++) { if (nla_put_s32(msg, i, caps->tx_powers[i])) return -ENOBUFS; } nla_nest_end(msg, nl_tx_pwrs); } if (rdev->wpan_phy.flags & WPAN_PHY_FLAG_CCA_MODE) { if (nl802154_put_flags(msg, NL802154_CAP_ATTR_CCA_MODES, caps->cca_modes) || nl802154_put_flags(msg, NL802154_CAP_ATTR_CCA_OPTS, caps->cca_opts)) return -ENOBUFS; } if (nla_put_u8(msg, NL802154_CAP_ATTR_MIN_MINBE, caps->min_minbe) || nla_put_u8(msg, NL802154_CAP_ATTR_MAX_MINBE, caps->max_minbe) || nla_put_u8(msg, NL802154_CAP_ATTR_MIN_MAXBE, caps->min_maxbe) || nla_put_u8(msg, NL802154_CAP_ATTR_MAX_MAXBE, caps->max_maxbe) || nla_put_u8(msg, NL802154_CAP_ATTR_MIN_CSMA_BACKOFFS, caps->min_csma_backoffs) || nla_put_u8(msg, NL802154_CAP_ATTR_MAX_CSMA_BACKOFFS, caps->max_csma_backoffs) || nla_put_s8(msg, NL802154_CAP_ATTR_MIN_FRAME_RETRIES, caps->min_frame_retries) || nla_put_s8(msg, NL802154_CAP_ATTR_MAX_FRAME_RETRIES, caps->max_frame_retries) || nl802154_put_flags(msg, NL802154_CAP_ATTR_IFTYPES, caps->iftypes) || nla_put_u32(msg, NL802154_CAP_ATTR_LBT, caps->lbt)) return -ENOBUFS; nla_nest_end(msg, nl_caps); return 0; } static int nl802154_send_wpan_phy(struct cfg802154_registered_device *rdev, enum nl802154_commands cmd, struct sk_buff *msg, u32 portid, u32 seq, int flags) { struct nlattr *nl_cmds; void *hdr; int i; hdr = nl802154hdr_put(msg, portid, seq, flags, cmd); if (!hdr) return -ENOBUFS; if (nla_put_u32(msg, NL802154_ATTR_WPAN_PHY, rdev->wpan_phy_idx) || nla_put_string(msg, NL802154_ATTR_WPAN_PHY_NAME, wpan_phy_name(&rdev->wpan_phy)) || nla_put_u32(msg, NL802154_ATTR_GENERATION, cfg802154_rdev_list_generation)) goto nla_put_failure; if (cmd != NL802154_CMD_NEW_WPAN_PHY) goto finish; /* DUMP PHY PIB */ /* current channel settings */ if (nla_put_u8(msg, NL802154_ATTR_PAGE, rdev->wpan_phy.current_page) || nla_put_u8(msg, NL802154_ATTR_CHANNEL, rdev->wpan_phy.current_channel)) goto nla_put_failure; /* TODO remove this behaviour, we still keep support it for a while * so users can change the behaviour to the new one. */ if (nl802154_send_wpan_phy_channels(rdev, msg)) goto nla_put_failure; /* cca mode */ if (rdev->wpan_phy.flags & WPAN_PHY_FLAG_CCA_MODE) { if (nla_put_u32(msg, NL802154_ATTR_CCA_MODE, rdev->wpan_phy.cca.mode)) goto nla_put_failure; if (rdev->wpan_phy.cca.mode == NL802154_CCA_ENERGY_CARRIER) { if (nla_put_u32(msg, NL802154_ATTR_CCA_OPT, rdev->wpan_phy.cca.opt)) goto nla_put_failure; } } if (rdev->wpan_phy.flags & WPAN_PHY_FLAG_TXPOWER) { if (nla_put_s32(msg, NL802154_ATTR_TX_POWER, rdev->wpan_phy.transmit_power)) goto nla_put_failure; } if (rdev->wpan_phy.flags & WPAN_PHY_FLAG_CCA_ED_LEVEL) { if (nla_put_s32(msg, NL802154_ATTR_CCA_ED_LEVEL, rdev->wpan_phy.cca_ed_level)) goto nla_put_failure; } if (nl802154_put_capabilities(msg, rdev)) goto nla_put_failure; nl_cmds = nla_nest_start_noflag(msg, NL802154_ATTR_SUPPORTED_COMMANDS); if (!nl_cmds) goto nla_put_failure; i = 0; #define CMD(op, n) \ do { \ if (rdev->ops->op) { \ i++; \ if (nla_put_u32(msg, i, NL802154_CMD_ ## n)) \ goto nla_put_failure; \ } \ } while (0) CMD(add_virtual_intf, NEW_INTERFACE); CMD(del_virtual_intf, DEL_INTERFACE); CMD(set_channel, SET_CHANNEL); CMD(set_pan_id, SET_PAN_ID); CMD(set_short_addr, SET_SHORT_ADDR); CMD(set_backoff_exponent, SET_BACKOFF_EXPONENT); CMD(set_max_csma_backoffs, SET_MAX_CSMA_BACKOFFS); CMD(set_max_frame_retries, SET_MAX_FRAME_RETRIES); CMD(set_lbt_mode, SET_LBT_MODE); CMD(set_ackreq_default, SET_ACKREQ_DEFAULT); if (rdev->wpan_phy.flags & WPAN_PHY_FLAG_TXPOWER) CMD(set_tx_power, SET_TX_POWER); if (rdev->wpan_phy.flags & WPAN_PHY_FLAG_CCA_ED_LEVEL) CMD(set_cca_ed_level, SET_CCA_ED_LEVEL); if (rdev->wpan_phy.flags & WPAN_PHY_FLAG_CCA_MODE) CMD(set_cca_mode, SET_CCA_MODE); #undef CMD nla_nest_end(msg, nl_cmds); finish: genlmsg_end(msg, hdr); return 0; nla_put_failure: genlmsg_cancel(msg, hdr); return -EMSGSIZE; } struct nl802154_dump_wpan_phy_state { s64 filter_wpan_phy; long start; }; static int nl802154_dump_wpan_phy_parse(struct sk_buff *skb, struct netlink_callback *cb, struct nl802154_dump_wpan_phy_state *state) { const struct genl_dumpit_info *info = genl_dumpit_info(cb); struct nlattr **tb = info->info.attrs; if (tb[NL802154_ATTR_WPAN_PHY]) state->filter_wpan_phy = nla_get_u32(tb[NL802154_ATTR_WPAN_PHY]); if (tb[NL802154_ATTR_WPAN_DEV]) state->filter_wpan_phy = nla_get_u64(tb[NL802154_ATTR_WPAN_DEV]) >> 32; if (tb[NL802154_ATTR_IFINDEX]) { struct net_device *netdev; struct cfg802154_registered_device *rdev; int ifidx = nla_get_u32(tb[NL802154_ATTR_IFINDEX]); netdev = __dev_get_by_index(&init_net, ifidx); if (!netdev) return -ENODEV; if (netdev->ieee802154_ptr) { rdev = wpan_phy_to_rdev( netdev->ieee802154_ptr->wpan_phy); state->filter_wpan_phy = rdev->wpan_phy_idx; } } return 0; } static int nl802154_dump_wpan_phy(struct sk_buff *skb, struct netlink_callback *cb) { int idx = 0, ret; struct nl802154_dump_wpan_phy_state *state = (void *)cb->args[0]; struct cfg802154_registered_device *rdev; rtnl_lock(); if (!state) { state = kzalloc(sizeof(*state), GFP_KERNEL); if (!state) { rtnl_unlock(); return -ENOMEM; } state->filter_wpan_phy = -1; ret = nl802154_dump_wpan_phy_parse(skb, cb, state); if (ret) { kfree(state); rtnl_unlock(); return ret; } cb->args[0] = (long)state; } list_for_each_entry(rdev, &cfg802154_rdev_list, list) { if (!net_eq(wpan_phy_net(&rdev->wpan_phy), sock_net(skb->sk))) continue; if (++idx <= state->start) continue; if (state->filter_wpan_phy != -1 && state->filter_wpan_phy != rdev->wpan_phy_idx) continue; /* attempt to fit multiple wpan_phy data chunks into the skb */ ret = nl802154_send_wpan_phy(rdev, NL802154_CMD_NEW_WPAN_PHY, skb, NETLINK_CB(cb->skb).portid, cb->nlh->nlmsg_seq, NLM_F_MULTI); if (ret < 0) { if ((ret == -ENOBUFS || ret == -EMSGSIZE) && !skb->len && cb->min_dump_alloc < 4096) { cb->min_dump_alloc = 4096; rtnl_unlock(); return 1; } idx--; break; } break; } rtnl_unlock(); state->start = idx; return skb->len; } static int nl802154_dump_wpan_phy_done(struct netlink_callback *cb) { kfree((void *)cb->args[0]); return 0; } static int nl802154_get_wpan_phy(struct sk_buff *skb, struct genl_info *info) { struct sk_buff *msg; struct cfg802154_registered_device *rdev = info->user_ptr[0]; msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL); if (!msg) return -ENOMEM; if (nl802154_send_wpan_phy(rdev, NL802154_CMD_NEW_WPAN_PHY, msg, info->snd_portid, info->snd_seq, 0) < 0) { nlmsg_free(msg); return -ENOBUFS; } return genlmsg_reply(msg, info); } static inline u64 wpan_dev_id(struct wpan_dev *wpan_dev) { return (u64)wpan_dev->identifier | ((u64)wpan_phy_to_rdev(wpan_dev->wpan_phy)->wpan_phy_idx << 32); } #ifdef CONFIG_IEEE802154_NL802154_EXPERIMENTAL #include <net/ieee802154_netdev.h> static int ieee802154_llsec_send_key_id(struct sk_buff *msg, const struct ieee802154_llsec_key_id *desc) { struct nlattr *nl_dev_addr; if (nla_put_u32(msg, NL802154_KEY_ID_ATTR_MODE, desc->mode)) return -ENOBUFS; switch (desc->mode) { case NL802154_KEY_ID_MODE_IMPLICIT: nl_dev_addr = nla_nest_start_noflag(msg, NL802154_KEY_ID_ATTR_IMPLICIT); if (!nl_dev_addr) return -ENOBUFS; if (nla_put_le16(msg, NL802154_DEV_ADDR_ATTR_PAN_ID, desc->device_addr.pan_id) || nla_put_u32(msg, NL802154_DEV_ADDR_ATTR_MODE, desc->device_addr.mode)) return -ENOBUFS; switch (desc->device_addr.mode) { case NL802154_DEV_ADDR_SHORT: if (nla_put_le16(msg, NL802154_DEV_ADDR_ATTR_SHORT, desc->device_addr.short_addr)) return -ENOBUFS; break; case NL802154_DEV_ADDR_EXTENDED: if (nla_put_le64(msg, NL802154_DEV_ADDR_ATTR_EXTENDED, desc->device_addr.extended_addr, NL802154_DEV_ADDR_ATTR_PAD)) return -ENOBUFS; break; default: /* userspace should handle unknown */ break; } nla_nest_end(msg, nl_dev_addr); break; case NL802154_KEY_ID_MODE_INDEX: break; case NL802154_KEY_ID_MODE_INDEX_SHORT: /* TODO renmae short_source? */ if (nla_put_le32(msg, NL802154_KEY_ID_ATTR_SOURCE_SHORT, desc->short_source)) return -ENOBUFS; break; case NL802154_KEY_ID_MODE_INDEX_EXTENDED: if (nla_put_le64(msg, NL802154_KEY_ID_ATTR_SOURCE_EXTENDED, desc->extended_source, NL802154_KEY_ID_ATTR_PAD)) return -ENOBUFS; break; default: /* userspace should handle unknown */ break; } /* TODO key_id to key_idx ? Check naming */ if (desc->mode != NL802154_KEY_ID_MODE_IMPLICIT) { if (nla_put_u8(msg, NL802154_KEY_ID_ATTR_INDEX, desc->id)) return -ENOBUFS; } return 0; } static int nl802154_get_llsec_params(struct sk_buff *msg, struct cfg802154_registered_device *rdev, struct wpan_dev *wpan_dev) { struct nlattr *nl_key_id; struct ieee802154_llsec_params params; int ret; ret = rdev_get_llsec_params(rdev, wpan_dev, ¶ms); if (ret < 0) return ret; if (nla_put_u8(msg, NL802154_ATTR_SEC_ENABLED, params.enabled) || nla_put_u32(msg, NL802154_ATTR_SEC_OUT_LEVEL, params.out_level) || nla_put_be32(msg, NL802154_ATTR_SEC_FRAME_COUNTER, params.frame_counter)) return -ENOBUFS; nl_key_id = nla_nest_start_noflag(msg, NL802154_ATTR_SEC_OUT_KEY_ID); if (!nl_key_id) return -ENOBUFS; ret = ieee802154_llsec_send_key_id(msg, ¶ms.out_key); if (ret < 0) return ret; nla_nest_end(msg, nl_key_id); return 0; } #endif /* CONFIG_IEEE802154_NL802154_EXPERIMENTAL */ static int nl802154_send_iface(struct sk_buff *msg, u32 portid, u32 seq, int flags, struct cfg802154_registered_device *rdev, struct wpan_dev *wpan_dev) { struct net_device *dev = wpan_dev->netdev; void *hdr; hdr = nl802154hdr_put(msg, portid, seq, flags, NL802154_CMD_NEW_INTERFACE); if (!hdr) return -1; if (dev && (nla_put_u32(msg, NL802154_ATTR_IFINDEX, dev->ifindex) || nla_put_string(msg, NL802154_ATTR_IFNAME, dev->name))) goto nla_put_failure; if (nla_put_u32(msg, NL802154_ATTR_WPAN_PHY, rdev->wpan_phy_idx) || nla_put_u32(msg, NL802154_ATTR_IFTYPE, wpan_dev->iftype) || nla_put_u64_64bit(msg, NL802154_ATTR_WPAN_DEV, wpan_dev_id(wpan_dev), NL802154_ATTR_PAD) || nla_put_u32(msg, NL802154_ATTR_GENERATION, rdev->devlist_generation ^ (cfg802154_rdev_list_generation << 2))) goto nla_put_failure; /* address settings */ if (nla_put_le64(msg, NL802154_ATTR_EXTENDED_ADDR, wpan_dev->extended_addr, NL802154_ATTR_PAD) || nla_put_le16(msg, NL802154_ATTR_SHORT_ADDR, wpan_dev->short_addr) || nla_put_le16(msg, NL802154_ATTR_PAN_ID, wpan_dev->pan_id)) goto nla_put_failure; /* ARET handling */ if (nla_put_s8(msg, NL802154_ATTR_MAX_FRAME_RETRIES, wpan_dev->frame_retries) || nla_put_u8(msg, NL802154_ATTR_MAX_BE, wpan_dev->max_be) || nla_put_u8(msg, NL802154_ATTR_MAX_CSMA_BACKOFFS, wpan_dev->csma_retries) || nla_put_u8(msg, NL802154_ATTR_MIN_BE, wpan_dev->min_be)) goto nla_put_failure; /* listen before transmit */ if (nla_put_u8(msg, NL802154_ATTR_LBT_MODE, wpan_dev->lbt)) goto nla_put_failure; /* ackreq default behaviour */ if (nla_put_u8(msg, NL802154_ATTR_ACKREQ_DEFAULT, wpan_dev->ackreq)) goto nla_put_failure; #ifdef CONFIG_IEEE802154_NL802154_EXPERIMENTAL if (wpan_dev->iftype == NL802154_IFTYPE_MONITOR) goto out; if (nl802154_get_llsec_params(msg, rdev, wpan_dev) < 0) goto nla_put_failure; out: #endif /* CONFIG_IEEE802154_NL802154_EXPERIMENTAL */ genlmsg_end(msg, hdr); return 0; nla_put_failure: genlmsg_cancel(msg, hdr); return -EMSGSIZE; } static int nl802154_dump_interface(struct sk_buff *skb, struct netlink_callback *cb) { int wp_idx = 0; int if_idx = 0; int wp_start = cb->args[0]; int if_start = cb->args[1]; struct cfg802154_registered_device *rdev; struct wpan_dev *wpan_dev; rtnl_lock(); list_for_each_entry(rdev, &cfg802154_rdev_list, list) { if (!net_eq(wpan_phy_net(&rdev->wpan_phy), sock_net(skb->sk))) continue; if (wp_idx < wp_start) { wp_idx++; continue; } if_idx = 0; list_for_each_entry(wpan_dev, &rdev->wpan_dev_list, list) { if (if_idx < if_start) { if_idx++; continue; } if (nl802154_send_iface(skb, NETLINK_CB(cb->skb).portid, cb->nlh->nlmsg_seq, NLM_F_MULTI, rdev, wpan_dev) < 0) { goto out; } if_idx++; } wp_idx++; } out: rtnl_unlock(); cb->args[0] = wp_idx; cb->args[1] = if_idx; return skb->len; } static int nl802154_get_interface(struct sk_buff *skb, struct genl_info *info) { struct sk_buff *msg; struct cfg802154_registered_device *rdev = info->user_ptr[0]; struct wpan_dev *wdev = info->user_ptr[1]; msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL); if (!msg) return -ENOMEM; if (nl802154_send_iface(msg, info->snd_portid, info->snd_seq, 0, rdev, wdev) < 0) { nlmsg_free(msg); return -ENOBUFS; } return genlmsg_reply(msg, info); } static int nl802154_new_interface(struct sk_buff *skb, struct genl_info *info) { struct cfg802154_registered_device *rdev = info->user_ptr[0]; enum nl802154_iftype type = NL802154_IFTYPE_UNSPEC; __le64 extended_addr = cpu_to_le64(0x0000000000000000ULL); /* TODO avoid failing a new interface * creation due to pending removal? */ if (!info->attrs[NL802154_ATTR_IFNAME]) return -EINVAL; if (info->attrs[NL802154_ATTR_IFTYPE]) { type = nla_get_u32(info->attrs[NL802154_ATTR_IFTYPE]); if (type > NL802154_IFTYPE_MAX || !(rdev->wpan_phy.supported.iftypes & BIT(type))) return -EINVAL; } if (info->attrs[NL802154_ATTR_EXTENDED_ADDR]) extended_addr = nla_get_le64(info->attrs[NL802154_ATTR_EXTENDED_ADDR]); if (!rdev->ops->add_virtual_intf) return -EOPNOTSUPP; return rdev_add_virtual_intf(rdev, nla_data(info->attrs[NL802154_ATTR_IFNAME]), NET_NAME_USER, type, extended_addr); } static int nl802154_del_interface(struct sk_buff *skb, struct genl_info *info) { struct cfg802154_registered_device *rdev = info->user_ptr[0]; struct wpan_dev *wpan_dev = info->user_ptr[1]; if (!rdev->ops->del_virtual_intf) return -EOPNOTSUPP; /* If we remove a wpan device without a netdev then clear * user_ptr[1] so that nl802154_post_doit won't dereference it * to check if it needs to do dev_put(). Otherwise it crashes * since the wpan_dev has been freed, unlike with a netdev where * we need the dev_put() for the netdev to really be freed. */ if (!wpan_dev->netdev) info->user_ptr[1] = NULL; return rdev_del_virtual_intf(rdev, wpan_dev); } static int nl802154_set_channel(struct sk_buff *skb, struct genl_info *info) { struct cfg802154_registered_device *rdev = info->user_ptr[0]; u8 channel, page; if (!info->attrs[NL802154_ATTR_PAGE] || !info->attrs[NL802154_ATTR_CHANNEL]) return -EINVAL; page = nla_get_u8(info->attrs[NL802154_ATTR_PAGE]); channel = nla_get_u8(info->attrs[NL802154_ATTR_CHANNEL]); /* check 802.15.4 constraints */ if (!ieee802154_chan_is_valid(&rdev->wpan_phy, page, channel)) return -EINVAL; return rdev_set_channel(rdev, page, channel); } static int nl802154_set_cca_mode(struct sk_buff *skb, struct genl_info *info) { struct cfg802154_registered_device *rdev = info->user_ptr[0]; struct wpan_phy_cca cca; if (!(rdev->wpan_phy.flags & WPAN_PHY_FLAG_CCA_MODE)) return -EOPNOTSUPP; if (!info->attrs[NL802154_ATTR_CCA_MODE]) return -EINVAL; cca.mode = nla_get_u32(info->attrs[NL802154_ATTR_CCA_MODE]); /* checking 802.15.4 constraints */ if (cca.mode < NL802154_CCA_ENERGY || cca.mode > NL802154_CCA_ATTR_MAX || !(rdev->wpan_phy.supported.cca_modes & BIT(cca.mode))) return -EINVAL; if (cca.mode == NL802154_CCA_ENERGY_CARRIER) { if (!info->attrs[NL802154_ATTR_CCA_OPT]) return -EINVAL; cca.opt = nla_get_u32(info->attrs[NL802154_ATTR_CCA_OPT]); if (cca.opt > NL802154_CCA_OPT_ATTR_MAX || !(rdev->wpan_phy.supported.cca_opts & BIT(cca.opt))) return -EINVAL; } return rdev_set_cca_mode(rdev, &cca); } static int nl802154_set_cca_ed_level(struct sk_buff *skb, struct genl_info *info) { struct cfg802154_registered_device *rdev = info->user_ptr[0]; s32 ed_level; int i; if (!(rdev->wpan_phy.flags & WPAN_PHY_FLAG_CCA_ED_LEVEL)) return -EOPNOTSUPP; if (!info->attrs[NL802154_ATTR_CCA_ED_LEVEL]) return -EINVAL; ed_level = nla_get_s32(info->attrs[NL802154_ATTR_CCA_ED_LEVEL]); for (i = 0; i < rdev->wpan_phy.supported.cca_ed_levels_size; i++) { if (ed_level == rdev->wpan_phy.supported.cca_ed_levels[i]) return rdev_set_cca_ed_level(rdev, ed_level); } return -EINVAL; } static int nl802154_set_tx_power(struct sk_buff *skb, struct genl_info *info) { struct cfg802154_registered_device *rdev = info->user_ptr[0]; s32 power; int i; if (!(rdev->wpan_phy.flags & WPAN_PHY_FLAG_TXPOWER)) return -EOPNOTSUPP; if (!info->attrs[NL802154_ATTR_TX_POWER]) return -EINVAL; power = nla_get_s32(info->attrs[NL802154_ATTR_TX_POWER]); for (i = 0; i < rdev->wpan_phy.supported.tx_powers_size; i++) { if (power == rdev->wpan_phy.supported.tx_powers[i]) return rdev_set_tx_power(rdev, power); } return -EINVAL; } static int nl802154_set_pan_id(struct sk_buff *skb, struct genl_info *info) { struct cfg802154_registered_device *rdev = info->user_ptr[0]; struct net_device *dev = info->user_ptr[1]; struct wpan_dev *wpan_dev = dev->ieee802154_ptr; __le16 pan_id; /* conflict here while tx/rx calls */ if (netif_running(dev)) return -EBUSY; if (wpan_dev->lowpan_dev) { if (netif_running(wpan_dev->lowpan_dev)) return -EBUSY; } /* don't change address fields on monitor */ if (wpan_dev->iftype == NL802154_IFTYPE_MONITOR || !info->attrs[NL802154_ATTR_PAN_ID]) return -EINVAL; pan_id = nla_get_le16(info->attrs[NL802154_ATTR_PAN_ID]); /* Only allow changing the PAN ID when the device has no more * associations ongoing to avoid confusing peers. */ if (cfg802154_device_is_associated(wpan_dev)) { NL_SET_ERR_MSG(info->extack, "Existing associations, changing PAN ID forbidden"); return -EINVAL; } return rdev_set_pan_id(rdev, wpan_dev, pan_id); } static int nl802154_set_short_addr(struct sk_buff *skb, struct genl_info *info) { struct cfg802154_registered_device *rdev = info->user_ptr[0]; struct net_device *dev = info->user_ptr[1]; struct wpan_dev *wpan_dev = dev->ieee802154_ptr; __le16 short_addr; /* conflict here while tx/rx calls */ if (netif_running(dev)) return -EBUSY; if (wpan_dev->lowpan_dev) { if (netif_running(wpan_dev->lowpan_dev)) return -EBUSY; } /* don't change address fields on monitor */ if (wpan_dev->iftype == NL802154_IFTYPE_MONITOR || !info->attrs[NL802154_ATTR_SHORT_ADDR]) return -EINVAL; short_addr = nla_get_le16(info->attrs[NL802154_ATTR_SHORT_ADDR]); /* The short address only has a meaning when part of a PAN, after a * proper association procedure. However, we want to still offer the * possibility to create static networks so changing the short address * is only allowed when not already associated to other devices with * the official handshake. */ if (cfg802154_device_is_associated(wpan_dev)) { NL_SET_ERR_MSG(info->extack, "Existing associations, changing short address forbidden"); return -EINVAL; } return rdev_set_short_addr(rdev, wpan_dev, short_addr); } static int nl802154_set_backoff_exponent(struct sk_buff *skb, struct genl_info *info) { struct cfg802154_registered_device *rdev = info->user_ptr[0]; struct net_device *dev = info->user_ptr[1]; struct wpan_dev *wpan_dev = dev->ieee802154_ptr; u8 min_be, max_be; /* should be set on netif open inside phy settings */ if (netif_running(dev)) return -EBUSY; if (!info->attrs[NL802154_ATTR_MIN_BE] || !info->attrs[NL802154_ATTR_MAX_BE]) return -EINVAL; min_be = nla_get_u8(info->attrs[NL802154_ATTR_MIN_BE]); max_be = nla_get_u8(info->attrs[NL802154_ATTR_MAX_BE]); /* check 802.15.4 constraints */ if (min_be < rdev->wpan_phy.supported.min_minbe || min_be > rdev->wpan_phy.supported.max_minbe || max_be < rdev->wpan_phy.supported.min_maxbe || max_be > rdev->wpan_phy.supported.max_maxbe || min_be > max_be) return -EINVAL; return rdev_set_backoff_exponent(rdev, wpan_dev, min_be, max_be); } static int nl802154_set_max_csma_backoffs(struct sk_buff *skb, struct genl_info *info) { struct cfg802154_registered_device *rdev = info->user_ptr[0]; struct net_device *dev = info->user_ptr[1]; struct wpan_dev *wpan_dev = dev->ieee802154_ptr; u8 max_csma_backoffs; /* conflict here while other running iface settings */ if (netif_running(dev)) return -EBUSY; if (!info->attrs[NL802154_ATTR_MAX_CSMA_BACKOFFS]) return -EINVAL; max_csma_backoffs = nla_get_u8( info->attrs[NL802154_ATTR_MAX_CSMA_BACKOFFS]); /* check 802.15.4 constraints */ if (max_csma_backoffs < rdev->wpan_phy.supported.min_csma_backoffs || max_csma_backoffs > rdev->wpan_phy.supported.max_csma_backoffs) return -EINVAL; return rdev_set_max_csma_backoffs(rdev, wpan_dev, max_csma_backoffs); } static int nl802154_set_max_frame_retries(struct sk_buff *skb, struct genl_info *info) { struct cfg802154_registered_device *rdev = info->user_ptr[0]; struct net_device *dev = info->user_ptr[1]; struct wpan_dev *wpan_dev = dev->ieee802154_ptr; s8 max_frame_retries; if (netif_running(dev)) return -EBUSY; if (!info->attrs[NL802154_ATTR_MAX_FRAME_RETRIES]) return -EINVAL; max_frame_retries = nla_get_s8( info->attrs[NL802154_ATTR_MAX_FRAME_RETRIES]); /* check 802.15.4 constraints */ if (max_frame_retries < rdev->wpan_phy.supported.min_frame_retries || max_frame_retries > rdev->wpan_phy.supported.max_frame_retries) return -EINVAL; return rdev_set_max_frame_retries(rdev, wpan_dev, max_frame_retries); } static int nl802154_set_lbt_mode(struct sk_buff *skb, struct genl_info *info) { struct cfg802154_registered_device *rdev = info->user_ptr[0]; struct net_device *dev = info->user_ptr[1]; struct wpan_dev *wpan_dev = dev->ieee802154_ptr; int mode; if (netif_running(dev)) return -EBUSY; if (!info->attrs[NL802154_ATTR_LBT_MODE]) return -EINVAL; mode = nla_get_u8(info->attrs[NL802154_ATTR_LBT_MODE]); if (mode != 0 && mode != 1) return -EINVAL; if (!wpan_phy_supported_bool(mode, rdev->wpan_phy.supported.lbt)) return -EINVAL; return rdev_set_lbt_mode(rdev, wpan_dev, mode); } static int nl802154_set_ackreq_default(struct sk_buff *skb, struct genl_info *info) { struct cfg802154_registered_device *rdev = info->user_ptr[0]; struct net_device *dev = info->user_ptr[1]; struct wpan_dev *wpan_dev = dev->ieee802154_ptr; int ackreq; if (netif_running(dev)) return -EBUSY; if (!info->attrs[NL802154_ATTR_ACKREQ_DEFAULT]) return -EINVAL; ackreq = nla_get_u8(info->attrs[NL802154_ATTR_ACKREQ_DEFAULT]); if (ackreq != 0 && ackreq != 1) return -EINVAL; return rdev_set_ackreq_default(rdev, wpan_dev, ackreq); } static int nl802154_wpan_phy_netns(struct sk_buff *skb, struct genl_info *info) { struct cfg802154_registered_device *rdev = info->user_ptr[0]; struct net *net; int err; if (info->attrs[NL802154_ATTR_PID]) { u32 pid = nla_get_u32(info->attrs[NL802154_ATTR_PID]); net = get_net_ns_by_pid(pid); } else if (info->attrs[NL802154_ATTR_NETNS_FD]) { u32 fd = nla_get_u32(info->attrs[NL802154_ATTR_NETNS_FD]); net = get_net_ns_by_fd(fd); } else { return -EINVAL; } if (IS_ERR(net)) return PTR_ERR(net); err = 0; /* check if anything to do */ if (!net_eq(wpan_phy_net(&rdev->wpan_phy), net)) err = cfg802154_switch_netns(rdev, net); put_net(net); return err; } static int nl802154_prep_scan_event_msg(struct sk_buff *msg, struct cfg802154_registered_device *rdev, struct wpan_dev *wpan_dev, u32 portid, u32 seq, int flags, u8 cmd, struct ieee802154_coord_desc *desc) { struct nlattr *nla; void *hdr; hdr = nl802154hdr_put(msg, portid, seq, flags, cmd); if (!hdr) return -ENOBUFS; if (nla_put_u32(msg, NL802154_ATTR_WPAN_PHY, rdev->wpan_phy_idx)) goto nla_put_failure; if (wpan_dev->netdev && nla_put_u32(msg, NL802154_ATTR_IFINDEX, wpan_dev->netdev->ifindex)) goto nla_put_failure; if (nla_put_u64_64bit(msg, NL802154_ATTR_WPAN_DEV, wpan_dev_id(wpan_dev), NL802154_ATTR_PAD)) goto nla_put_failure; nla = nla_nest_start_noflag(msg, NL802154_ATTR_COORDINATOR); if (!nla) goto nla_put_failure; if (nla_put(msg, NL802154_COORD_PANID, IEEE802154_PAN_ID_LEN, &desc->addr.pan_id)) goto nla_put_failure; if (desc->addr.mode == IEEE802154_ADDR_SHORT) { if (nla_put(msg, NL802154_COORD_ADDR, IEEE802154_SHORT_ADDR_LEN, &desc->addr.short_addr)) goto nla_put_failure; } else { if (nla_put(msg, NL802154_COORD_ADDR, IEEE802154_EXTENDED_ADDR_LEN, &desc->addr.extended_addr)) goto nla_put_failure; } if (nla_put_u8(msg, NL802154_COORD_CHANNEL, desc->channel)) goto nla_put_failure; if (nla_put_u8(msg, NL802154_COORD_PAGE, desc->page)) goto nla_put_failure; if (nla_put_u16(msg, NL802154_COORD_SUPERFRAME_SPEC, desc->superframe_spec)) goto nla_put_failure; if (nla_put_u8(msg, NL802154_COORD_LINK_QUALITY, desc->link_quality)) goto nla_put_failure; if (desc->gts_permit && nla_put_flag(msg, NL802154_COORD_GTS_PERMIT)) goto nla_put_failure; /* TODO: NL802154_COORD_PAYLOAD_DATA if any */ nla_nest_end(msg, nla); genlmsg_end(msg, hdr); return 0; nla_put_failure: genlmsg_cancel(msg, hdr); return -EMSGSIZE; } int nl802154_scan_event(struct wpan_phy *wpan_phy, struct wpan_dev *wpan_dev, struct ieee802154_coord_desc *desc) { struct cfg802154_registered_device *rdev = wpan_phy_to_rdev(wpan_phy); struct sk_buff *msg; int ret; msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_ATOMIC); if (!msg) return -ENOMEM; ret = nl802154_prep_scan_event_msg(msg, rdev, wpan_dev, 0, 0, 0, NL802154_CMD_SCAN_EVENT, desc); if (ret < 0) { nlmsg_free(msg); return ret; } return genlmsg_multicast_netns(&nl802154_fam, wpan_phy_net(wpan_phy), msg, 0, NL802154_MCGRP_SCAN, GFP_ATOMIC); } EXPORT_SYMBOL_GPL(nl802154_scan_event); static int nl802154_trigger_scan(struct sk_buff *skb, struct genl_info *info) { struct cfg802154_registered_device *rdev = info->user_ptr[0]; struct net_device *dev = info->user_ptr[1]; struct wpan_dev *wpan_dev = dev->ieee802154_ptr; struct wpan_phy *wpan_phy = &rdev->wpan_phy; struct cfg802154_scan_request *request; u8 type; int err; if (wpan_dev->iftype == NL802154_IFTYPE_MONITOR) { NL_SET_ERR_MSG(info->extack, "Monitors are not allowed to perform scans"); return -EOPNOTSUPP; } if (!info->attrs[NL802154_ATTR_SCAN_TYPE]) { NL_SET_ERR_MSG(info->extack, "Malformed request, missing scan type"); return -EINVAL; } if (wpan_phy->flags & WPAN_PHY_FLAG_DATAGRAMS_ONLY) { NL_SET_ERR_MSG(info->extack, "PHY only supports datagrams"); return -EOPNOTSUPP; } request = kzalloc(sizeof(*request), GFP_KERNEL); if (!request) return -ENOMEM; request->wpan_dev = wpan_dev; request->wpan_phy = wpan_phy; type = nla_get_u8(info->attrs[NL802154_ATTR_SCAN_TYPE]); switch (type) { case NL802154_SCAN_ACTIVE: case NL802154_SCAN_PASSIVE: request->type = type; break; default: NL_SET_ERR_MSG_FMT(info->extack, "Unsupported scan type: %d", type); err = -EINVAL; goto free_request; } /* Use current page by default */ request->page = nla_get_u8_default(info->attrs[NL802154_ATTR_PAGE], wpan_phy->current_page); /* Scan all supported channels by default */ request->channels = nla_get_u32_default(info->attrs[NL802154_ATTR_SCAN_CHANNELS], wpan_phy->supported.channels[request->page]); /* Use maximum duration order by default */ request->duration = nla_get_u8_default(info->attrs[NL802154_ATTR_SCAN_DURATION], IEEE802154_MAX_SCAN_DURATION); err = rdev_trigger_scan(rdev, request); if (err) { pr_err("Failure starting scanning (%d)\n", err); goto free_request; } return 0; free_request: kfree(request); return err; } static int nl802154_prep_scan_msg(struct sk_buff *msg, struct cfg802154_registered_device *rdev, struct wpan_dev *wpan_dev, u32 portid, u32 seq, int flags, u8 cmd, u8 arg) { void *hdr; hdr = nl802154hdr_put(msg, portid, seq, flags, cmd); if (!hdr) return -ENOBUFS; if (nla_put_u32(msg, NL802154_ATTR_WPAN_PHY, rdev->wpan_phy_idx)) goto nla_put_failure; if (wpan_dev->netdev && nla_put_u32(msg, NL802154_ATTR_IFINDEX, wpan_dev->netdev->ifindex)) goto nla_put_failure; if (nla_put_u64_64bit(msg, NL802154_ATTR_WPAN_DEV, wpan_dev_id(wpan_dev), NL802154_ATTR_PAD)) goto nla_put_failure; if (cmd == NL802154_CMD_SCAN_DONE && nla_put_u8(msg, NL802154_ATTR_SCAN_DONE_REASON, arg)) goto nla_put_failure; genlmsg_end(msg, hdr); return 0; nla_put_failure: genlmsg_cancel(msg, hdr); return -EMSGSIZE; } static int nl802154_send_scan_msg(struct cfg802154_registered_device *rdev, struct wpan_dev *wpan_dev, u8 cmd, u8 arg) { struct sk_buff *msg; int ret; msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL); if (!msg) return -ENOMEM; ret = nl802154_prep_scan_msg(msg, rdev, wpan_dev, 0, 0, 0, cmd, arg); if (ret < 0) { nlmsg_free(msg); return ret; } return genlmsg_multicast_netns(&nl802154_fam, wpan_phy_net(&rdev->wpan_phy), msg, 0, NL802154_MCGRP_SCAN, GFP_KERNEL); } int nl802154_scan_started(struct wpan_phy *wpan_phy, struct wpan_dev *wpan_dev) { struct cfg802154_registered_device *rdev = wpan_phy_to_rdev(wpan_phy); int err; /* Ignore errors when there are no listeners */ err = nl802154_send_scan_msg(rdev, wpan_dev, NL802154_CMD_TRIGGER_SCAN, 0); if (err == -ESRCH) err = 0; return err; } EXPORT_SYMBOL_GPL(nl802154_scan_started); int nl802154_scan_done(struct wpan_phy *wpan_phy, struct wpan_dev *wpan_dev, enum nl802154_scan_done_reasons reason) { struct cfg802154_registered_device *rdev = wpan_phy_to_rdev(wpan_phy); int err; /* Ignore errors when there are no listeners */ err = nl802154_send_scan_msg(rdev, wpan_dev, NL802154_CMD_SCAN_DONE, reason); if (err == -ESRCH) err = 0; return err; } EXPORT_SYMBOL_GPL(nl802154_scan_done); static int nl802154_abort_scan(struct sk_buff *skb, struct genl_info *info) { struct cfg802154_registered_device *rdev = info->user_ptr[0]; struct net_device *dev = info->user_ptr[1]; struct wpan_dev *wpan_dev = dev->ieee802154_ptr; /* Resources are released in the notification helper above */ return rdev_abort_scan(rdev, wpan_dev); } static int nl802154_send_beacons(struct sk_buff *skb, struct genl_info *info) { struct cfg802154_registered_device *rdev = info->user_ptr[0]; struct net_device *dev = info->user_ptr[1]; struct wpan_dev *wpan_dev = dev->ieee802154_ptr; struct wpan_phy *wpan_phy = &rdev->wpan_phy; struct cfg802154_beacon_request *request; int err; if (wpan_dev->iftype != NL802154_IFTYPE_COORD) { NL_SET_ERR_MSG(info->extack, "Only coordinators can send beacons"); return -EOPNOTSUPP; } if (wpan_dev->pan_id == cpu_to_le16(IEEE802154_PANID_BROADCAST)) { NL_SET_ERR_MSG(info->extack, "Device is not part of any PAN"); return -EPERM; } if (wpan_phy->flags & WPAN_PHY_FLAG_DATAGRAMS_ONLY) { NL_SET_ERR_MSG(info->extack, "PHY only supports datagrams"); return -EOPNOTSUPP; } request = kzalloc(sizeof(*request), GFP_KERNEL); if (!request) return -ENOMEM; request->wpan_dev = wpan_dev; request->wpan_phy = wpan_phy; /* Use maximum duration order by default */ request->interval = nla_get_u8_default(info->attrs[NL802154_ATTR_BEACON_INTERVAL], IEEE802154_MAX_SCAN_DURATION); err = rdev_send_beacons(rdev, request); if (err) { pr_err("Failure starting sending beacons (%d)\n", err); goto free_request; } return 0; free_request: kfree(request); return err; } void nl802154_beaconing_done(struct wpan_dev *wpan_dev) { /* NOP */ } EXPORT_SYMBOL_GPL(nl802154_beaconing_done); static int nl802154_stop_beacons(struct sk_buff *skb, struct genl_info *info) { struct cfg802154_registered_device *rdev = info->user_ptr[0]; struct net_device *dev = info->user_ptr[1]; struct wpan_dev *wpan_dev = dev->ieee802154_ptr; /* Resources are released in the notification helper above */ return rdev_stop_beacons(rdev, wpan_dev); } static int nl802154_associate(struct sk_buff *skb, struct genl_info *info) { struct cfg802154_registered_device *rdev = info->user_ptr[0]; struct net_device *dev = info->user_ptr[1]; struct wpan_dev *wpan_dev; struct wpan_phy *wpan_phy; struct ieee802154_addr coord; int err; wpan_dev = dev->ieee802154_ptr; wpan_phy = &rdev->wpan_phy; if (wpan_phy->flags & WPAN_PHY_FLAG_DATAGRAMS_ONLY) { NL_SET_ERR_MSG(info->extack, "PHY only supports datagrams"); return -EOPNOTSUPP; } if (!info->attrs[NL802154_ATTR_PAN_ID] || !info->attrs[NL802154_ATTR_EXTENDED_ADDR]) return -EINVAL; coord.pan_id = nla_get_le16(info->attrs[NL802154_ATTR_PAN_ID]); coord.mode = IEEE802154_ADDR_LONG; coord.extended_addr = nla_get_le64(info->attrs[NL802154_ATTR_EXTENDED_ADDR]); mutex_lock(&wpan_dev->association_lock); err = rdev_associate(rdev, wpan_dev, &coord); mutex_unlock(&wpan_dev->association_lock); if (err) pr_err("Association with PAN ID 0x%x failed (%d)\n", le16_to_cpu(coord.pan_id), err); return err; } static int nl802154_disassociate(struct sk_buff *skb, struct genl_info *info) { struct cfg802154_registered_device *rdev = info->user_ptr[0]; struct net_device *dev = info->user_ptr[1]; struct wpan_dev *wpan_dev = dev->ieee802154_ptr; struct wpan_phy *wpan_phy = &rdev->wpan_phy; struct ieee802154_addr target; if (wpan_phy->flags & WPAN_PHY_FLAG_DATAGRAMS_ONLY) { NL_SET_ERR_MSG(info->extack, "PHY only supports datagrams"); return -EOPNOTSUPP; } target.pan_id = wpan_dev->pan_id; if (info->attrs[NL802154_ATTR_EXTENDED_ADDR]) { target.mode = IEEE802154_ADDR_LONG; target.extended_addr = nla_get_le64(info->attrs[NL802154_ATTR_EXTENDED_ADDR]); } else if (info->attrs[NL802154_ATTR_SHORT_ADDR]) { target.mode = IEEE802154_ADDR_SHORT; target.short_addr = nla_get_le16(info->attrs[NL802154_ATTR_SHORT_ADDR]); } else { NL_SET_ERR_MSG(info->extack, "Device address is missing"); return -EINVAL; } mutex_lock(&wpan_dev->association_lock); rdev_disassociate(rdev, wpan_dev, &target); mutex_unlock(&wpan_dev->association_lock); return 0; } static int nl802154_set_max_associations(struct sk_buff *skb, struct genl_info *info) { struct net_device *dev = info->user_ptr[1]; struct wpan_dev *wpan_dev = dev->ieee802154_ptr; unsigned int max_assoc; if (!info->attrs[NL802154_ATTR_MAX_ASSOCIATIONS]) { NL_SET_ERR_MSG(info->extack, "No maximum number of association given"); return -EINVAL; } max_assoc = nla_get_u32(info->attrs[NL802154_ATTR_MAX_ASSOCIATIONS]); mutex_lock(&wpan_dev->association_lock); cfg802154_set_max_associations(wpan_dev, max_assoc); mutex_unlock(&wpan_dev->association_lock); return 0; } static int nl802154_send_peer_info(struct sk_buff *msg, struct netlink_callback *cb, u32 seq, int flags, struct cfg802154_registered_device *rdev, struct wpan_dev *wpan_dev, struct ieee802154_pan_device *peer, enum nl802154_peer_type type) { struct nlattr *nla; void *hdr; ASSERT_RTNL(); hdr = nl802154hdr_put(msg, NETLINK_CB(cb->skb).portid, seq, flags, NL802154_CMD_LIST_ASSOCIATIONS); if (!hdr) return -ENOBUFS; genl_dump_check_consistent(cb, hdr); nla = nla_nest_start_noflag(msg, NL802154_ATTR_PEER); if (!nla) goto nla_put_failure; if (nla_put_u8(msg, NL802154_DEV_ADDR_ATTR_PEER_TYPE, type)) goto nla_put_failure; if (nla_put_u8(msg, NL802154_DEV_ADDR_ATTR_MODE, peer->mode)) goto nla_put_failure; if (nla_put(msg, NL802154_DEV_ADDR_ATTR_SHORT, IEEE802154_SHORT_ADDR_LEN, &peer->short_addr)) goto nla_put_failure; if (nla_put(msg, NL802154_DEV_ADDR_ATTR_EXTENDED, IEEE802154_EXTENDED_ADDR_LEN, &peer->extended_addr)) goto nla_put_failure; nla_nest_end(msg, nla); genlmsg_end(msg, hdr); return 0; nla_put_failure: genlmsg_cancel(msg, hdr); return -EMSGSIZE; } static int nl802154_list_associations(struct sk_buff *skb, struct netlink_callback *cb) { struct cfg802154_registered_device *rdev; struct ieee802154_pan_device *child; struct wpan_dev *wpan_dev; int err; err = nl802154_prepare_wpan_dev_dump(skb, cb, &rdev, &wpan_dev); if (err) return err; mutex_lock(&wpan_dev->association_lock); if (cb->args[2]) goto out; if (wpan_dev->parent) { err = nl802154_send_peer_info(skb, cb, cb->nlh->nlmsg_seq, NLM_F_MULTI, rdev, wpan_dev, wpan_dev->parent, NL802154_PEER_TYPE_PARENT); if (err < 0) goto out_err; } list_for_each_entry(child, &wpan_dev->children, node) { err = nl802154_send_peer_info(skb, cb, cb->nlh->nlmsg_seq, NLM_F_MULTI, rdev, wpan_dev, child, NL802154_PEER_TYPE_CHILD); if (err < 0) goto out_err; } cb->args[2] = 1; out: err = skb->len; out_err: mutex_unlock(&wpan_dev->association_lock); nl802154_finish_wpan_dev_dump(rdev); return err; } #ifdef CONFIG_IEEE802154_NL802154_EXPERIMENTAL static const struct nla_policy nl802154_dev_addr_policy[NL802154_DEV_ADDR_ATTR_MAX + 1] = { [NL802154_DEV_ADDR_ATTR_PAN_ID] = { .type = NLA_U16 }, [NL802154_DEV_ADDR_ATTR_MODE] = { .type = NLA_U32 }, [NL802154_DEV_ADDR_ATTR_SHORT] = { .type = NLA_U16 }, [NL802154_DEV_ADDR_ATTR_EXTENDED] = { .type = NLA_U64 }, }; static int ieee802154_llsec_parse_dev_addr(struct nlattr *nla, struct ieee802154_addr *addr) { struct nlattr *attrs[NL802154_DEV_ADDR_ATTR_MAX + 1]; if (!nla || nla_parse_nested_deprecated(attrs, NL802154_DEV_ADDR_ATTR_MAX, nla, nl802154_dev_addr_policy, NULL)) return -EINVAL; if (!attrs[NL802154_DEV_ADDR_ATTR_PAN_ID] || !attrs[NL802154_DEV_ADDR_ATTR_MODE]) return -EINVAL; addr->pan_id = nla_get_le16(attrs[NL802154_DEV_ADDR_ATTR_PAN_ID]); addr->mode = nla_get_u32(attrs[NL802154_DEV_ADDR_ATTR_MODE]); switch (addr->mode) { case NL802154_DEV_ADDR_SHORT: if (!attrs[NL802154_DEV_ADDR_ATTR_SHORT]) return -EINVAL; addr->short_addr = nla_get_le16(attrs[NL802154_DEV_ADDR_ATTR_SHORT]); break; case NL802154_DEV_ADDR_EXTENDED: if (!attrs[NL802154_DEV_ADDR_ATTR_EXTENDED]) return -EINVAL; addr->extended_addr = nla_get_le64(attrs[NL802154_DEV_ADDR_ATTR_EXTENDED]); break; default: return -EINVAL; } return 0; } static const struct nla_policy nl802154_key_id_policy[NL802154_KEY_ID_ATTR_MAX + 1] = { [NL802154_KEY_ID_ATTR_MODE] = { .type = NLA_U32 }, [NL802154_KEY_ID_ATTR_INDEX] = { .type = NLA_U8 }, [NL802154_KEY_ID_ATTR_IMPLICIT] = { .type = NLA_NESTED }, [NL802154_KEY_ID_ATTR_SOURCE_SHORT] = { .type = NLA_U32 }, [NL802154_KEY_ID_ATTR_SOURCE_EXTENDED] = { .type = NLA_U64 }, }; static int ieee802154_llsec_parse_key_id(struct nlattr *nla, struct ieee802154_llsec_key_id *desc) { struct nlattr *attrs[NL802154_KEY_ID_ATTR_MAX + 1]; if (!nla || nla_parse_nested_deprecated(attrs, NL802154_KEY_ID_ATTR_MAX, nla, nl802154_key_id_policy, NULL)) return -EINVAL; if (!attrs[NL802154_KEY_ID_ATTR_MODE]) return -EINVAL; desc->mode = nla_get_u32(attrs[NL802154_KEY_ID_ATTR_MODE]); switch (desc->mode) { case NL802154_KEY_ID_MODE_IMPLICIT: if (!attrs[NL802154_KEY_ID_ATTR_IMPLICIT]) return -EINVAL; if (ieee802154_llsec_parse_dev_addr(attrs[NL802154_KEY_ID_ATTR_IMPLICIT], &desc->device_addr) < 0) return -EINVAL; break; case NL802154_KEY_ID_MODE_INDEX: break; case NL802154_KEY_ID_MODE_INDEX_SHORT: if (!attrs[NL802154_KEY_ID_ATTR_SOURCE_SHORT]) return -EINVAL; desc->short_source = nla_get_le32(attrs[NL802154_KEY_ID_ATTR_SOURCE_SHORT]); break; case NL802154_KEY_ID_MODE_INDEX_EXTENDED: if (!attrs[NL802154_KEY_ID_ATTR_SOURCE_EXTENDED]) return -EINVAL; desc->extended_source = nla_get_le64(attrs[NL802154_KEY_ID_ATTR_SOURCE_EXTENDED]); break; default: return -EINVAL; } if (desc->mode != NL802154_KEY_ID_MODE_IMPLICIT) { if (!attrs[NL802154_KEY_ID_ATTR_INDEX]) return -EINVAL; /* TODO change id to idx */ desc->id = nla_get_u8(attrs[NL802154_KEY_ID_ATTR_INDEX]); } return 0; } static int nl802154_set_llsec_params(struct sk_buff *skb, struct genl_info *info) { struct cfg802154_registered_device *rdev = info->user_ptr[0]; struct net_device *dev = info->user_ptr[1]; struct wpan_dev *wpan_dev = dev->ieee802154_ptr; struct ieee802154_llsec_params params; u32 changed = 0; int ret; if (wpan_dev->iftype == NL802154_IFTYPE_MONITOR) return -EOPNOTSUPP; if (info->attrs[NL802154_ATTR_SEC_ENABLED]) { u8 enabled; enabled = nla_get_u8(info->attrs[NL802154_ATTR_SEC_ENABLED]); if (enabled != 0 && enabled != 1) return -EINVAL; params.enabled = nla_get_u8(info->attrs[NL802154_ATTR_SEC_ENABLED]); changed |= IEEE802154_LLSEC_PARAM_ENABLED; } if (info->attrs[NL802154_ATTR_SEC_OUT_KEY_ID]) { ret = ieee802154_llsec_parse_key_id(info->attrs[NL802154_ATTR_SEC_OUT_KEY_ID], ¶ms.out_key); if (ret < 0) return ret; changed |= IEEE802154_LLSEC_PARAM_OUT_KEY; } if (info->attrs[NL802154_ATTR_SEC_OUT_LEVEL]) { params.out_level = nla_get_u32(info->attrs[NL802154_ATTR_SEC_OUT_LEVEL]); if (params.out_level > NL802154_SECLEVEL_MAX) return -EINVAL; changed |= IEEE802154_LLSEC_PARAM_OUT_LEVEL; } if (info->attrs[NL802154_ATTR_SEC_FRAME_COUNTER]) { params.frame_counter = nla_get_be32(info->attrs[NL802154_ATTR_SEC_FRAME_COUNTER]); changed |= IEEE802154_LLSEC_PARAM_FRAME_COUNTER; } return rdev_set_llsec_params(rdev, wpan_dev, ¶ms, changed); } static int nl802154_send_key(struct sk_buff *msg, u32 cmd, u32 portid, u32 seq, int flags, struct cfg802154_registered_device *rdev, struct net_device *dev, const struct ieee802154_llsec_key_entry *key) { void *hdr; u32 commands[NL802154_CMD_FRAME_NR_IDS / 32]; struct nlattr *nl_key, *nl_key_id; hdr = nl802154hdr_put(msg, portid, seq, flags, cmd); if (!hdr) return -ENOBUFS; if (nla_put_u32(msg, NL802154_ATTR_IFINDEX, dev->ifindex)) goto nla_put_failure; nl_key = nla_nest_start_noflag(msg, NL802154_ATTR_SEC_KEY); if (!nl_key) goto nla_put_failure; nl_key_id = nla_nest_start_noflag(msg, NL802154_KEY_ATTR_ID); if (!nl_key_id) goto nla_put_failure; if (ieee802154_llsec_send_key_id(msg, &key->id) < 0) goto nla_put_failure; nla_nest_end(msg, nl_key_id); if (nla_put_u8(msg, NL802154_KEY_ATTR_USAGE_FRAMES, key->key->frame_types)) goto nla_put_failure; if (key->key->frame_types & BIT(NL802154_FRAME_CMD)) { /* TODO for each nested */ memset(commands, 0, sizeof(commands)); commands[7] = key->key->cmd_frame_ids; if (nla_put(msg, NL802154_KEY_ATTR_USAGE_CMDS, sizeof(commands), commands)) goto nla_put_failure; } if (nla_put(msg, NL802154_KEY_ATTR_BYTES, NL802154_KEY_SIZE, key->key->key)) goto nla_put_failure; nla_nest_end(msg, nl_key); genlmsg_end(msg, hdr); return 0; nla_put_failure: genlmsg_cancel(msg, hdr); return -EMSGSIZE; } static int nl802154_dump_llsec_key(struct sk_buff *skb, struct netlink_callback *cb) { struct cfg802154_registered_device *rdev = NULL; struct ieee802154_llsec_key_entry *key; struct ieee802154_llsec_table *table; struct wpan_dev *wpan_dev; int err; err = nl802154_prepare_wpan_dev_dump(skb, cb, &rdev, &wpan_dev); if (err) return err; if (wpan_dev->iftype == NL802154_IFTYPE_MONITOR) { err = skb->len; goto out_err; } if (!wpan_dev->netdev) { err = -EINVAL; goto out_err; } rdev_lock_llsec_table(rdev, wpan_dev); rdev_get_llsec_table(rdev, wpan_dev, &table); /* TODO make it like station dump */ if (cb->args[2]) goto out; list_for_each_entry(key, &table->keys, list) { if (nl802154_send_key(skb, NL802154_CMD_NEW_SEC_KEY, NETLINK_CB(cb->skb).portid, cb->nlh->nlmsg_seq, NLM_F_MULTI, rdev, wpan_dev->netdev, key) < 0) { /* TODO */ err = -EIO; rdev_unlock_llsec_table(rdev, wpan_dev); goto out_err; } } cb->args[2] = 1; out: rdev_unlock_llsec_table(rdev, wpan_dev); err = skb->len; out_err: nl802154_finish_wpan_dev_dump(rdev); return err; } static const struct nla_policy nl802154_key_policy[NL802154_KEY_ATTR_MAX + 1] = { [NL802154_KEY_ATTR_ID] = { NLA_NESTED }, /* TODO handle it as for_each_nested and NLA_FLAG? */ [NL802154_KEY_ATTR_USAGE_FRAMES] = { NLA_U8 }, /* TODO handle it as for_each_nested, not static array? */ [NL802154_KEY_ATTR_USAGE_CMDS] = { .len = NL802154_CMD_FRAME_NR_IDS / 8 }, [NL802154_KEY_ATTR_BYTES] = { .len = NL802154_KEY_SIZE }, }; static int nl802154_add_llsec_key(struct sk_buff *skb, struct genl_info *info) { struct cfg802154_registered_device *rdev = info->user_ptr[0]; struct net_device *dev = info->user_ptr[1]; struct wpan_dev *wpan_dev = dev->ieee802154_ptr; struct nlattr *attrs[NL802154_KEY_ATTR_MAX + 1]; struct ieee802154_llsec_key key = { }; struct ieee802154_llsec_key_id id = { }; u32 commands[NL802154_CMD_FRAME_NR_IDS / 32] = { }; if (wpan_dev->iftype == NL802154_IFTYPE_MONITOR) return -EOPNOTSUPP; if (!info->attrs[NL802154_ATTR_SEC_KEY] || nla_parse_nested_deprecated(attrs, NL802154_KEY_ATTR_MAX, info->attrs[NL802154_ATTR_SEC_KEY], nl802154_key_policy, info->extack)) return -EINVAL; if (!attrs[NL802154_KEY_ATTR_USAGE_FRAMES] || !attrs[NL802154_KEY_ATTR_BYTES]) return -EINVAL; if (ieee802154_llsec_parse_key_id(attrs[NL802154_KEY_ATTR_ID], &id) < 0) return -ENOBUFS; key.frame_types = nla_get_u8(attrs[NL802154_KEY_ATTR_USAGE_FRAMES]); if (key.frame_types > BIT(NL802154_FRAME_MAX) || ((key.frame_types & BIT(NL802154_FRAME_CMD)) && !attrs[NL802154_KEY_ATTR_USAGE_CMDS])) return -EINVAL; if (attrs[NL802154_KEY_ATTR_USAGE_CMDS]) { /* TODO for each nested */ nla_memcpy(commands, attrs[NL802154_KEY_ATTR_USAGE_CMDS], NL802154_CMD_FRAME_NR_IDS / 8); /* TODO understand the -EINVAL logic here? last condition */ if (commands[0] || commands[1] || commands[2] || commands[3] || commands[4] || commands[5] || commands[6] || commands[7] > BIT(NL802154_CMD_FRAME_MAX)) return -EINVAL; key.cmd_frame_ids = commands[7]; } else { key.cmd_frame_ids = 0; } nla_memcpy(key.key, attrs[NL802154_KEY_ATTR_BYTES], NL802154_KEY_SIZE); if (ieee802154_llsec_parse_key_id(attrs[NL802154_KEY_ATTR_ID], &id) < 0) return -ENOBUFS; return rdev_add_llsec_key(rdev, wpan_dev, &id, &key); } static int nl802154_del_llsec_key(struct sk_buff *skb, struct genl_info *info) { struct cfg802154_registered_device *rdev = info->user_ptr[0]; struct net_device *dev = info->user_ptr[1]; struct wpan_dev *wpan_dev = dev->ieee802154_ptr; struct nlattr *attrs[NL802154_KEY_ATTR_MAX + 1]; struct ieee802154_llsec_key_id id; if (wpan_dev->iftype == NL802154_IFTYPE_MONITOR) return -EOPNOTSUPP; if (!info->attrs[NL802154_ATTR_SEC_KEY] || nla_parse_nested_deprecated(attrs, NL802154_KEY_ATTR_MAX, info->attrs[NL802154_ATTR_SEC_KEY], nl802154_key_policy, info->extack)) return -EINVAL; if (ieee802154_llsec_parse_key_id(attrs[NL802154_KEY_ATTR_ID], &id) < 0) return -ENOBUFS; return rdev_del_llsec_key(rdev, wpan_dev, &id); } static int nl802154_send_device(struct sk_buff *msg, u32 cmd, u32 portid, u32 seq, int flags, struct cfg802154_registered_device *rdev, struct net_device *dev, const struct ieee802154_llsec_device *dev_desc) { void *hdr; struct nlattr *nl_device; hdr = nl802154hdr_put(msg, portid, seq, flags, cmd); if (!hdr) return -ENOBUFS; if (nla_put_u32(msg, NL802154_ATTR_IFINDEX, dev->ifindex)) goto nla_put_failure; nl_device = nla_nest_start_noflag(msg, NL802154_ATTR_SEC_DEVICE); if (!nl_device) goto nla_put_failure; if (nla_put_u32(msg, NL802154_DEV_ATTR_FRAME_COUNTER, dev_desc->frame_counter) || nla_put_le16(msg, NL802154_DEV_ATTR_PAN_ID, dev_desc->pan_id) || nla_put_le16(msg, NL802154_DEV_ATTR_SHORT_ADDR, dev_desc->short_addr) || nla_put_le64(msg, NL802154_DEV_ATTR_EXTENDED_ADDR, dev_desc->hwaddr, NL802154_DEV_ATTR_PAD) || nla_put_u8(msg, NL802154_DEV_ATTR_SECLEVEL_EXEMPT, dev_desc->seclevel_exempt) || nla_put_u32(msg, NL802154_DEV_ATTR_KEY_MODE, dev_desc->key_mode)) goto nla_put_failure; nla_nest_end(msg, nl_device); genlmsg_end(msg, hdr); return 0; nla_put_failure: genlmsg_cancel(msg, hdr); return -EMSGSIZE; } static int nl802154_dump_llsec_dev(struct sk_buff *skb, struct netlink_callback *cb) { struct cfg802154_registered_device *rdev = NULL; struct ieee802154_llsec_device *dev; struct ieee802154_llsec_table *table; struct wpan_dev *wpan_dev; int err; err = nl802154_prepare_wpan_dev_dump(skb, cb, &rdev, &wpan_dev); if (err) return err; if (wpan_dev->iftype == NL802154_IFTYPE_MONITOR) { err = skb->len; goto out_err; } if (!wpan_dev->netdev) { err = -EINVAL; goto out_err; } rdev_lock_llsec_table(rdev, wpan_dev); rdev_get_llsec_table(rdev, wpan_dev, &table); /* TODO make it like station dump */ if (cb->args[2]) goto out; list_for_each_entry(dev, &table->devices, list) { if (nl802154_send_device(skb, NL802154_CMD_NEW_SEC_LEVEL, NETLINK_CB(cb->skb).portid, cb->nlh->nlmsg_seq, NLM_F_MULTI, rdev, wpan_dev->netdev, dev) < 0) { /* TODO */ err = -EIO; rdev_unlock_llsec_table(rdev, wpan_dev); goto out_err; } } cb->args[2] = 1; out: rdev_unlock_llsec_table(rdev, wpan_dev); err = skb->len; out_err: nl802154_finish_wpan_dev_dump(rdev); return err; } static const struct nla_policy nl802154_dev_policy[NL802154_DEV_ATTR_MAX + 1] = { [NL802154_DEV_ATTR_FRAME_COUNTER] = { NLA_U32 }, [NL802154_DEV_ATTR_PAN_ID] = { .type = NLA_U16 }, [NL802154_DEV_ATTR_SHORT_ADDR] = { .type = NLA_U16 }, [NL802154_DEV_ATTR_EXTENDED_ADDR] = { .type = NLA_U64 }, [NL802154_DEV_ATTR_SECLEVEL_EXEMPT] = { NLA_U8 }, [NL802154_DEV_ATTR_KEY_MODE] = { NLA_U32 }, }; static int ieee802154_llsec_parse_device(struct nlattr *nla, struct ieee802154_llsec_device *dev) { struct nlattr *attrs[NL802154_DEV_ATTR_MAX + 1]; if (!nla || nla_parse_nested_deprecated(attrs, NL802154_DEV_ATTR_MAX, nla, nl802154_dev_policy, NULL)) return -EINVAL; memset(dev, 0, sizeof(*dev)); if (!attrs[NL802154_DEV_ATTR_FRAME_COUNTER] || !attrs[NL802154_DEV_ATTR_PAN_ID] || !attrs[NL802154_DEV_ATTR_SHORT_ADDR] || !attrs[NL802154_DEV_ATTR_EXTENDED_ADDR] || !attrs[NL802154_DEV_ATTR_SECLEVEL_EXEMPT] || !attrs[NL802154_DEV_ATTR_KEY_MODE]) return -EINVAL; /* TODO be32 */ dev->frame_counter = nla_get_u32(attrs[NL802154_DEV_ATTR_FRAME_COUNTER]); dev->pan_id = nla_get_le16(attrs[NL802154_DEV_ATTR_PAN_ID]); dev->short_addr = nla_get_le16(attrs[NL802154_DEV_ATTR_SHORT_ADDR]); /* TODO rename hwaddr to extended_addr */ dev->hwaddr = nla_get_le64(attrs[NL802154_DEV_ATTR_EXTENDED_ADDR]); dev->seclevel_exempt = nla_get_u8(attrs[NL802154_DEV_ATTR_SECLEVEL_EXEMPT]); dev->key_mode = nla_get_u32(attrs[NL802154_DEV_ATTR_KEY_MODE]); if (dev->key_mode > NL802154_DEVKEY_MAX || (dev->seclevel_exempt != 0 && dev->seclevel_exempt != 1)) return -EINVAL; return 0; } static int nl802154_add_llsec_dev(struct sk_buff *skb, struct genl_info *info) { struct cfg802154_registered_device *rdev = info->user_ptr[0]; struct net_device *dev = info->user_ptr[1]; struct wpan_dev *wpan_dev = dev->ieee802154_ptr; struct ieee802154_llsec_device dev_desc; if (wpan_dev->iftype == NL802154_IFTYPE_MONITOR) return -EOPNOTSUPP; if (ieee802154_llsec_parse_device(info->attrs[NL802154_ATTR_SEC_DEVICE], &dev_desc) < 0) return -EINVAL; return rdev_add_device(rdev, wpan_dev, &dev_desc); } static int nl802154_del_llsec_dev(struct sk_buff *skb, struct genl_info *info) { struct cfg802154_registered_device *rdev = info->user_ptr[0]; struct net_device *dev = info->user_ptr[1]; struct wpan_dev *wpan_dev = dev->ieee802154_ptr; struct nlattr *attrs[NL802154_DEV_ATTR_MAX + 1]; __le64 extended_addr; if (wpan_dev->iftype == NL802154_IFTYPE_MONITOR) return -EOPNOTSUPP; if (!info->attrs[NL802154_ATTR_SEC_DEVICE] || nla_parse_nested_deprecated(attrs, NL802154_DEV_ATTR_MAX, info->attrs[NL802154_ATTR_SEC_DEVICE], nl802154_dev_policy, info->extack)) return -EINVAL; if (!attrs[NL802154_DEV_ATTR_EXTENDED_ADDR]) return -EINVAL; extended_addr = nla_get_le64(attrs[NL802154_DEV_ATTR_EXTENDED_ADDR]); return rdev_del_device(rdev, wpan_dev, extended_addr); } static int nl802154_send_devkey(struct sk_buff *msg, u32 cmd, u32 portid, u32 seq, int flags, struct cfg802154_registered_device *rdev, struct net_device *dev, __le64 extended_addr, const struct ieee802154_llsec_device_key *devkey) { void *hdr; struct nlattr *nl_devkey, *nl_key_id; hdr = nl802154hdr_put(msg, portid, seq, flags, cmd); if (!hdr) return -ENOBUFS; if (nla_put_u32(msg, NL802154_ATTR_IFINDEX, dev->ifindex)) goto nla_put_failure; nl_devkey = nla_nest_start_noflag(msg, NL802154_ATTR_SEC_DEVKEY); if (!nl_devkey) goto nla_put_failure; if (nla_put_le64(msg, NL802154_DEVKEY_ATTR_EXTENDED_ADDR, extended_addr, NL802154_DEVKEY_ATTR_PAD) || nla_put_u32(msg, NL802154_DEVKEY_ATTR_FRAME_COUNTER, devkey->frame_counter)) goto nla_put_failure; nl_key_id = nla_nest_start_noflag(msg, NL802154_DEVKEY_ATTR_ID); if (!nl_key_id) goto nla_put_failure; if (ieee802154_llsec_send_key_id(msg, &devkey->key_id) < 0) goto nla_put_failure; nla_nest_end(msg, nl_key_id); nla_nest_end(msg, nl_devkey); genlmsg_end(msg, hdr); return 0; nla_put_failure: genlmsg_cancel(msg, hdr); return -EMSGSIZE; } static int nl802154_dump_llsec_devkey(struct sk_buff *skb, struct netlink_callback *cb) { struct cfg802154_registered_device *rdev = NULL; struct ieee802154_llsec_device_key *kpos; struct ieee802154_llsec_device *dpos; struct ieee802154_llsec_table *table; struct wpan_dev *wpan_dev; int err; err = nl802154_prepare_wpan_dev_dump(skb, cb, &rdev, &wpan_dev); if (err) return err; if (wpan_dev->iftype == NL802154_IFTYPE_MONITOR) { err = skb->len; goto out_err; } if (!wpan_dev->netdev) { err = -EINVAL; goto out_err; } rdev_lock_llsec_table(rdev, wpan_dev); rdev_get_llsec_table(rdev, wpan_dev, &table); /* TODO make it like station dump */ if (cb->args[2]) goto out; /* TODO look if remove devkey and do some nested attribute */ list_for_each_entry(dpos, &table->devices, list) { list_for_each_entry(kpos, &dpos->keys, list) { if (nl802154_send_devkey(skb, NL802154_CMD_NEW_SEC_LEVEL, NETLINK_CB(cb->skb).portid, cb->nlh->nlmsg_seq, NLM_F_MULTI, rdev, wpan_dev->netdev, dpos->hwaddr, kpos) < 0) { /* TODO */ err = -EIO; rdev_unlock_llsec_table(rdev, wpan_dev); goto out_err; } } } cb->args[2] = 1; out: rdev_unlock_llsec_table(rdev, wpan_dev); err = skb->len; out_err: nl802154_finish_wpan_dev_dump(rdev); return err; } static const struct nla_policy nl802154_devkey_policy[NL802154_DEVKEY_ATTR_MAX + 1] = { [NL802154_DEVKEY_ATTR_FRAME_COUNTER] = { NLA_U32 }, [NL802154_DEVKEY_ATTR_EXTENDED_ADDR] = { NLA_U64 }, [NL802154_DEVKEY_ATTR_ID] = { NLA_NESTED }, }; static int nl802154_add_llsec_devkey(struct sk_buff *skb, struct genl_info *info) { struct cfg802154_registered_device *rdev = info->user_ptr[0]; struct net_device *dev = info->user_ptr[1]; struct wpan_dev *wpan_dev = dev->ieee802154_ptr; struct nlattr *attrs[NL802154_DEVKEY_ATTR_MAX + 1]; struct ieee802154_llsec_device_key key; __le64 extended_addr; if (wpan_dev->iftype == NL802154_IFTYPE_MONITOR) return -EOPNOTSUPP; if (!info->attrs[NL802154_ATTR_SEC_DEVKEY] || nla_parse_nested_deprecated(attrs, NL802154_DEVKEY_ATTR_MAX, info->attrs[NL802154_ATTR_SEC_DEVKEY], nl802154_devkey_policy, info->extack) < 0) return -EINVAL; if (!attrs[NL802154_DEVKEY_ATTR_FRAME_COUNTER] || !attrs[NL802154_DEVKEY_ATTR_EXTENDED_ADDR]) return -EINVAL; /* TODO change key.id ? */ if (ieee802154_llsec_parse_key_id(attrs[NL802154_DEVKEY_ATTR_ID], &key.key_id) < 0) return -ENOBUFS; /* TODO be32 */ key.frame_counter = nla_get_u32(attrs[NL802154_DEVKEY_ATTR_FRAME_COUNTER]); /* TODO change naming hwaddr -> extended_addr * check unique identifier short+pan OR extended_addr */ extended_addr = nla_get_le64(attrs[NL802154_DEVKEY_ATTR_EXTENDED_ADDR]); return rdev_add_devkey(rdev, wpan_dev, extended_addr, &key); } static int nl802154_del_llsec_devkey(struct sk_buff *skb, struct genl_info *info) { struct cfg802154_registered_device *rdev = info->user_ptr[0]; struct net_device *dev = info->user_ptr[1]; struct wpan_dev *wpan_dev = dev->ieee802154_ptr; struct nlattr *attrs[NL802154_DEVKEY_ATTR_MAX + 1]; struct ieee802154_llsec_device_key key; __le64 extended_addr; if (wpan_dev->iftype == NL802154_IFTYPE_MONITOR) return -EOPNOTSUPP; if (!info->attrs[NL802154_ATTR_SEC_DEVKEY] || nla_parse_nested_deprecated(attrs, NL802154_DEVKEY_ATTR_MAX, info->attrs[NL802154_ATTR_SEC_DEVKEY], nl802154_devkey_policy, info->extack)) return -EINVAL; if (!attrs[NL802154_DEVKEY_ATTR_EXTENDED_ADDR]) return -EINVAL; /* TODO change key.id ? */ if (ieee802154_llsec_parse_key_id(attrs[NL802154_DEVKEY_ATTR_ID], &key.key_id) < 0) return -ENOBUFS; /* TODO change naming hwaddr -> extended_addr * check unique identifier short+pan OR extended_addr */ extended_addr = nla_get_le64(attrs[NL802154_DEVKEY_ATTR_EXTENDED_ADDR]); return rdev_del_devkey(rdev, wpan_dev, extended_addr, &key); } static int nl802154_send_seclevel(struct sk_buff *msg, u32 cmd, u32 portid, u32 seq, int flags, struct cfg802154_registered_device *rdev, struct net_device *dev, const struct ieee802154_llsec_seclevel *sl) { void *hdr; struct nlattr *nl_seclevel; hdr = nl802154hdr_put(msg, portid, seq, flags, cmd); if (!hdr) return -ENOBUFS; if (nla_put_u32(msg, NL802154_ATTR_IFINDEX, dev->ifindex)) goto nla_put_failure; nl_seclevel = nla_nest_start_noflag(msg, NL802154_ATTR_SEC_LEVEL); if (!nl_seclevel) goto nla_put_failure; if (nla_put_u32(msg, NL802154_SECLEVEL_ATTR_FRAME, sl->frame_type) || nla_put_u32(msg, NL802154_SECLEVEL_ATTR_LEVELS, sl->sec_levels) || nla_put_u8(msg, NL802154_SECLEVEL_ATTR_DEV_OVERRIDE, sl->device_override)) goto nla_put_failure; if (sl->frame_type == NL802154_FRAME_CMD) { if (nla_put_u32(msg, NL802154_SECLEVEL_ATTR_CMD_FRAME, sl->cmd_frame_id)) goto nla_put_failure; } nla_nest_end(msg, nl_seclevel); genlmsg_end(msg, hdr); return 0; nla_put_failure: genlmsg_cancel(msg, hdr); return -EMSGSIZE; } static int nl802154_dump_llsec_seclevel(struct sk_buff *skb, struct netlink_callback *cb) { struct cfg802154_registered_device *rdev = NULL; struct ieee802154_llsec_seclevel *sl; struct ieee802154_llsec_table *table; struct wpan_dev *wpan_dev; int err; err = nl802154_prepare_wpan_dev_dump(skb, cb, &rdev, &wpan_dev); if (err) return err; if (wpan_dev->iftype == NL802154_IFTYPE_MONITOR) { err = skb->len; goto out_err; } if (!wpan_dev->netdev) { err = -EINVAL; goto out_err; } rdev_lock_llsec_table(rdev, wpan_dev); rdev_get_llsec_table(rdev, wpan_dev, &table); /* TODO make it like station dump */ if (cb->args[2]) goto out; list_for_each_entry(sl, &table->security_levels, list) { if (nl802154_send_seclevel(skb, NL802154_CMD_NEW_SEC_LEVEL, NETLINK_CB(cb->skb).portid, cb->nlh->nlmsg_seq, NLM_F_MULTI, rdev, wpan_dev->netdev, sl) < 0) { /* TODO */ err = -EIO; rdev_unlock_llsec_table(rdev, wpan_dev); goto out_err; } } cb->args[2] = 1; out: rdev_unlock_llsec_table(rdev, wpan_dev); err = skb->len; out_err: nl802154_finish_wpan_dev_dump(rdev); return err; } static const struct nla_policy nl802154_seclevel_policy[NL802154_SECLEVEL_ATTR_MAX + 1] = { [NL802154_SECLEVEL_ATTR_LEVELS] = { .type = NLA_U8 }, [NL802154_SECLEVEL_ATTR_FRAME] = { .type = NLA_U32 }, [NL802154_SECLEVEL_ATTR_CMD_FRAME] = { .type = NLA_U32 }, [NL802154_SECLEVEL_ATTR_DEV_OVERRIDE] = { .type = NLA_U8 }, }; static int llsec_parse_seclevel(struct nlattr *nla, struct ieee802154_llsec_seclevel *sl) { struct nlattr *attrs[NL802154_SECLEVEL_ATTR_MAX + 1]; if (!nla || nla_parse_nested_deprecated(attrs, NL802154_SECLEVEL_ATTR_MAX, nla, nl802154_seclevel_policy, NULL)) return -EINVAL; memset(sl, 0, sizeof(*sl)); if (!attrs[NL802154_SECLEVEL_ATTR_LEVELS] || !attrs[NL802154_SECLEVEL_ATTR_FRAME] || !attrs[NL802154_SECLEVEL_ATTR_DEV_OVERRIDE]) return -EINVAL; sl->sec_levels = nla_get_u8(attrs[NL802154_SECLEVEL_ATTR_LEVELS]); sl->frame_type = nla_get_u32(attrs[NL802154_SECLEVEL_ATTR_FRAME]); sl->device_override = nla_get_u8(attrs[NL802154_SECLEVEL_ATTR_DEV_OVERRIDE]); if (sl->frame_type > NL802154_FRAME_MAX || (sl->device_override != 0 && sl->device_override != 1)) return -EINVAL; if (sl->frame_type == NL802154_FRAME_CMD) { if (!attrs[NL802154_SECLEVEL_ATTR_CMD_FRAME]) return -EINVAL; sl->cmd_frame_id = nla_get_u32(attrs[NL802154_SECLEVEL_ATTR_CMD_FRAME]); if (sl->cmd_frame_id > NL802154_CMD_FRAME_MAX) return -EINVAL; } return 0; } static int nl802154_add_llsec_seclevel(struct sk_buff *skb, struct genl_info *info) { struct cfg802154_registered_device *rdev = info->user_ptr[0]; struct net_device *dev = info->user_ptr[1]; struct wpan_dev *wpan_dev = dev->ieee802154_ptr; struct ieee802154_llsec_seclevel sl; if (wpan_dev->iftype == NL802154_IFTYPE_MONITOR) return -EOPNOTSUPP; if (llsec_parse_seclevel(info->attrs[NL802154_ATTR_SEC_LEVEL], &sl) < 0) return -EINVAL; return rdev_add_seclevel(rdev, wpan_dev, &sl); } static int nl802154_del_llsec_seclevel(struct sk_buff *skb, struct genl_info *info) { struct cfg802154_registered_device *rdev = info->user_ptr[0]; struct net_device *dev = info->user_ptr[1]; struct wpan_dev *wpan_dev = dev->ieee802154_ptr; struct ieee802154_llsec_seclevel sl; if (wpan_dev->iftype == NL802154_IFTYPE_MONITOR) return -EOPNOTSUPP; if (llsec_parse_seclevel(info->attrs[NL802154_ATTR_SEC_LEVEL], &sl) < 0) return -EINVAL; return rdev_del_seclevel(rdev, wpan_dev, &sl); } #endif /* CONFIG_IEEE802154_NL802154_EXPERIMENTAL */ #define NL802154_FLAG_NEED_WPAN_PHY 0x01 #define NL802154_FLAG_NEED_NETDEV 0x02 #define NL802154_FLAG_NEED_RTNL 0x04 #define NL802154_FLAG_CHECK_NETDEV_UP 0x08 #define NL802154_FLAG_NEED_WPAN_DEV 0x10 static int nl802154_pre_doit(const struct genl_split_ops *ops, struct sk_buff *skb, struct genl_info *info) { struct cfg802154_registered_device *rdev; struct wpan_dev *wpan_dev; struct net_device *dev; bool rtnl = ops->internal_flags & NL802154_FLAG_NEED_RTNL; if (rtnl) rtnl_lock(); if (ops->internal_flags & NL802154_FLAG_NEED_WPAN_PHY) { rdev = cfg802154_get_dev_from_info(genl_info_net(info), info); if (IS_ERR(rdev)) { if (rtnl) rtnl_unlock(); return PTR_ERR(rdev); } info->user_ptr[0] = rdev; } else if (ops->internal_flags & NL802154_FLAG_NEED_NETDEV || ops->internal_flags & NL802154_FLAG_NEED_WPAN_DEV) { ASSERT_RTNL(); wpan_dev = __cfg802154_wpan_dev_from_attrs(genl_info_net(info), info->attrs); if (IS_ERR(wpan_dev)) { if (rtnl) rtnl_unlock(); return PTR_ERR(wpan_dev); } dev = wpan_dev->netdev; rdev = wpan_phy_to_rdev(wpan_dev->wpan_phy); if (ops->internal_flags & NL802154_FLAG_NEED_NETDEV) { if (!dev) { if (rtnl) rtnl_unlock(); return -EINVAL; } info->user_ptr[1] = dev; } else { info->user_ptr[1] = wpan_dev; } if (dev) { if (ops->internal_flags & NL802154_FLAG_CHECK_NETDEV_UP && !netif_running(dev)) { if (rtnl) rtnl_unlock(); return -ENETDOWN; } dev_hold(dev); } info->user_ptr[0] = rdev; } return 0; } static void nl802154_post_doit(const struct genl_split_ops *ops, struct sk_buff *skb, struct genl_info *info) { if (info->user_ptr[1]) { if (ops->internal_flags & NL802154_FLAG_NEED_WPAN_DEV) { struct wpan_dev *wpan_dev = info->user_ptr[1]; dev_put(wpan_dev->netdev); } else { dev_put(info->user_ptr[1]); } } if (ops->internal_flags & NL802154_FLAG_NEED_RTNL) rtnl_unlock(); } static const struct genl_ops nl802154_ops[] = { { .cmd = NL802154_CMD_GET_WPAN_PHY, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP_STRICT, .doit = nl802154_get_wpan_phy, .dumpit = nl802154_dump_wpan_phy, .done = nl802154_dump_wpan_phy_done, /* can be retrieved by unprivileged users */ .internal_flags = NL802154_FLAG_NEED_WPAN_PHY | NL802154_FLAG_NEED_RTNL, }, { .cmd = NL802154_CMD_GET_INTERFACE, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, .doit = nl802154_get_interface, .dumpit = nl802154_dump_interface, /* can be retrieved by unprivileged users */ .internal_flags = NL802154_FLAG_NEED_WPAN_DEV | NL802154_FLAG_NEED_RTNL, }, { .cmd = NL802154_CMD_NEW_INTERFACE, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, .doit = nl802154_new_interface, .flags = GENL_ADMIN_PERM, .internal_flags = NL802154_FLAG_NEED_WPAN_PHY | NL802154_FLAG_NEED_RTNL, }, { .cmd = NL802154_CMD_DEL_INTERFACE, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, .doit = nl802154_del_interface, .flags = GENL_ADMIN_PERM, .internal_flags = NL802154_FLAG_NEED_WPAN_DEV | NL802154_FLAG_NEED_RTNL, }, { .cmd = NL802154_CMD_SET_CHANNEL, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, .doit = nl802154_set_channel, .flags = GENL_ADMIN_PERM, .internal_flags = NL802154_FLAG_NEED_WPAN_PHY | NL802154_FLAG_NEED_RTNL, }, { .cmd = NL802154_CMD_SET_CCA_MODE, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, .doit = nl802154_set_cca_mode, .flags = GENL_ADMIN_PERM, .internal_flags = NL802154_FLAG_NEED_WPAN_PHY | NL802154_FLAG_NEED_RTNL, }, { .cmd = NL802154_CMD_SET_CCA_ED_LEVEL, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, .doit = nl802154_set_cca_ed_level, .flags = GENL_ADMIN_PERM, .internal_flags = NL802154_FLAG_NEED_WPAN_PHY | NL802154_FLAG_NEED_RTNL, }, { .cmd = NL802154_CMD_SET_TX_POWER, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, .doit = nl802154_set_tx_power, .flags = GENL_ADMIN_PERM, .internal_flags = NL802154_FLAG_NEED_WPAN_PHY | NL802154_FLAG_NEED_RTNL, }, { .cmd = NL802154_CMD_SET_WPAN_PHY_NETNS, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, .doit = nl802154_wpan_phy_netns, .flags = GENL_ADMIN_PERM, .internal_flags = NL802154_FLAG_NEED_WPAN_PHY | NL802154_FLAG_NEED_RTNL, }, { .cmd = NL802154_CMD_SET_PAN_ID, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, .doit = nl802154_set_pan_id, .flags = GENL_ADMIN_PERM, .internal_flags = NL802154_FLAG_NEED_NETDEV | NL802154_FLAG_NEED_RTNL, }, { .cmd = NL802154_CMD_SET_SHORT_ADDR, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, .doit = nl802154_set_short_addr, .flags = GENL_ADMIN_PERM, .internal_flags = NL802154_FLAG_NEED_NETDEV | NL802154_FLAG_NEED_RTNL, }, { .cmd = NL802154_CMD_SET_BACKOFF_EXPONENT, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, .doit = nl802154_set_backoff_exponent, .flags = GENL_ADMIN_PERM, .internal_flags = NL802154_FLAG_NEED_NETDEV | NL802154_FLAG_NEED_RTNL, }, { .cmd = NL802154_CMD_SET_MAX_CSMA_BACKOFFS, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, .doit = nl802154_set_max_csma_backoffs, .flags = GENL_ADMIN_PERM, .internal_flags = NL802154_FLAG_NEED_NETDEV | NL802154_FLAG_NEED_RTNL, }, { .cmd = NL802154_CMD_SET_MAX_FRAME_RETRIES, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, .doit = nl802154_set_max_frame_retries, .flags = GENL_ADMIN_PERM, .internal_flags = NL802154_FLAG_NEED_NETDEV | NL802154_FLAG_NEED_RTNL, }, { .cmd = NL802154_CMD_SET_LBT_MODE, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, .doit = nl802154_set_lbt_mode, .flags = GENL_ADMIN_PERM, .internal_flags = NL802154_FLAG_NEED_NETDEV | NL802154_FLAG_NEED_RTNL, }, { .cmd = NL802154_CMD_SET_ACKREQ_DEFAULT, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, .doit = nl802154_set_ackreq_default, .flags = GENL_ADMIN_PERM, .internal_flags = NL802154_FLAG_NEED_NETDEV | NL802154_FLAG_NEED_RTNL, }, { .cmd = NL802154_CMD_TRIGGER_SCAN, .doit = nl802154_trigger_scan, .flags = GENL_ADMIN_PERM, .internal_flags = NL802154_FLAG_NEED_NETDEV | NL802154_FLAG_CHECK_NETDEV_UP | NL802154_FLAG_NEED_RTNL, }, { .cmd = NL802154_CMD_ABORT_SCAN, .doit = nl802154_abort_scan, .flags = GENL_ADMIN_PERM, .internal_flags = NL802154_FLAG_NEED_NETDEV | NL802154_FLAG_CHECK_NETDEV_UP | NL802154_FLAG_NEED_RTNL, }, { .cmd = NL802154_CMD_SEND_BEACONS, .doit = nl802154_send_beacons, .flags = GENL_ADMIN_PERM, .internal_flags = NL802154_FLAG_NEED_NETDEV | NL802154_FLAG_CHECK_NETDEV_UP | NL802154_FLAG_NEED_RTNL, }, { .cmd = NL802154_CMD_STOP_BEACONS, .doit = nl802154_stop_beacons, .flags = GENL_ADMIN_PERM, .internal_flags = NL802154_FLAG_NEED_NETDEV | NL802154_FLAG_CHECK_NETDEV_UP | NL802154_FLAG_NEED_RTNL, }, { .cmd = NL802154_CMD_ASSOCIATE, .doit = nl802154_associate, .flags = GENL_ADMIN_PERM, .internal_flags = NL802154_FLAG_NEED_NETDEV | NL802154_FLAG_CHECK_NETDEV_UP | NL802154_FLAG_NEED_RTNL, }, { .cmd = NL802154_CMD_DISASSOCIATE, .doit = nl802154_disassociate, .flags = GENL_ADMIN_PERM, .internal_flags = NL802154_FLAG_NEED_NETDEV | NL802154_FLAG_CHECK_NETDEV_UP | NL802154_FLAG_NEED_RTNL, }, { .cmd = NL802154_CMD_SET_MAX_ASSOCIATIONS, .doit = nl802154_set_max_associations, .flags = GENL_ADMIN_PERM, .internal_flags = NL802154_FLAG_NEED_NETDEV | NL802154_FLAG_NEED_RTNL, }, { .cmd = NL802154_CMD_LIST_ASSOCIATIONS, .dumpit = nl802154_list_associations, /* can be retrieved by unprivileged users */ }, #ifdef CONFIG_IEEE802154_NL802154_EXPERIMENTAL { .cmd = NL802154_CMD_SET_SEC_PARAMS, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, .doit = nl802154_set_llsec_params, .flags = GENL_ADMIN_PERM, .internal_flags = NL802154_FLAG_NEED_NETDEV | NL802154_FLAG_NEED_RTNL, }, { .cmd = NL802154_CMD_GET_SEC_KEY, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP_STRICT, /* TODO .doit by matching key id? */ .dumpit = nl802154_dump_llsec_key, .flags = GENL_ADMIN_PERM, .internal_flags = NL802154_FLAG_NEED_NETDEV | NL802154_FLAG_NEED_RTNL, }, { .cmd = NL802154_CMD_NEW_SEC_KEY, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, .doit = nl802154_add_llsec_key, .flags = GENL_ADMIN_PERM, .internal_flags = NL802154_FLAG_NEED_NETDEV | NL802154_FLAG_NEED_RTNL, }, { .cmd = NL802154_CMD_DEL_SEC_KEY, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, .doit = nl802154_del_llsec_key, .flags = GENL_ADMIN_PERM, .internal_flags = NL802154_FLAG_NEED_NETDEV | NL802154_FLAG_NEED_RTNL, }, /* TODO unique identifier must short+pan OR extended_addr */ { .cmd = NL802154_CMD_GET_SEC_DEV, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP_STRICT, /* TODO .doit by matching extended_addr? */ .dumpit = nl802154_dump_llsec_dev, .flags = GENL_ADMIN_PERM, .internal_flags = NL802154_FLAG_NEED_NETDEV | NL802154_FLAG_NEED_RTNL, }, { .cmd = NL802154_CMD_NEW_SEC_DEV, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, .doit = nl802154_add_llsec_dev, .flags = GENL_ADMIN_PERM, .internal_flags = NL802154_FLAG_NEED_NETDEV | NL802154_FLAG_NEED_RTNL, }, { .cmd = NL802154_CMD_DEL_SEC_DEV, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, .doit = nl802154_del_llsec_dev, .flags = GENL_ADMIN_PERM, .internal_flags = NL802154_FLAG_NEED_NETDEV | NL802154_FLAG_NEED_RTNL, }, /* TODO remove complete devkey, put it as nested? */ { .cmd = NL802154_CMD_GET_SEC_DEVKEY, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP_STRICT, /* TODO doit by matching ??? */ .dumpit = nl802154_dump_llsec_devkey, .flags = GENL_ADMIN_PERM, .internal_flags = NL802154_FLAG_NEED_NETDEV | NL802154_FLAG_NEED_RTNL, }, { .cmd = NL802154_CMD_NEW_SEC_DEVKEY, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, .doit = nl802154_add_llsec_devkey, .flags = GENL_ADMIN_PERM, .internal_flags = NL802154_FLAG_NEED_NETDEV | NL802154_FLAG_NEED_RTNL, }, { .cmd = NL802154_CMD_DEL_SEC_DEVKEY, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, .doit = nl802154_del_llsec_devkey, .flags = GENL_ADMIN_PERM, .internal_flags = NL802154_FLAG_NEED_NETDEV | NL802154_FLAG_NEED_RTNL, }, { .cmd = NL802154_CMD_GET_SEC_LEVEL, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP_STRICT, /* TODO .doit by matching frame_type? */ .dumpit = nl802154_dump_llsec_seclevel, .flags = GENL_ADMIN_PERM, .internal_flags = NL802154_FLAG_NEED_NETDEV | NL802154_FLAG_NEED_RTNL, }, { .cmd = NL802154_CMD_NEW_SEC_LEVEL, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, .doit = nl802154_add_llsec_seclevel, .flags = GENL_ADMIN_PERM, .internal_flags = NL802154_FLAG_NEED_NETDEV | NL802154_FLAG_NEED_RTNL, }, { .cmd = NL802154_CMD_DEL_SEC_LEVEL, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, /* TODO match frame_type only? */ .doit = nl802154_del_llsec_seclevel, .flags = GENL_ADMIN_PERM, .internal_flags = NL802154_FLAG_NEED_NETDEV | NL802154_FLAG_NEED_RTNL, }, #endif /* CONFIG_IEEE802154_NL802154_EXPERIMENTAL */ }; static struct genl_family nl802154_fam __ro_after_init = { .name = NL802154_GENL_NAME, /* have users key off the name instead */ .hdrsize = 0, /* no private header */ .version = 1, /* no particular meaning now */ .maxattr = NL802154_ATTR_MAX, .policy = nl802154_policy, .netnsok = true, .pre_doit = nl802154_pre_doit, .post_doit = nl802154_post_doit, .module = THIS_MODULE, .ops = nl802154_ops, .n_ops = ARRAY_SIZE(nl802154_ops), .resv_start_op = NL802154_CMD_DEL_SEC_LEVEL + 1, .mcgrps = nl802154_mcgrps, .n_mcgrps = ARRAY_SIZE(nl802154_mcgrps), }; /* initialisation/exit functions */ int __init nl802154_init(void) { return genl_register_family(&nl802154_fam); } void nl802154_exit(void) { genl_unregister_family(&nl802154_fam); } |
| 2 2 3 1 2 1 3 2 6 1 2 1 1 3 2 1 22 1 8 1 12 4 1 15 14 4 11 2 2 4 8 3 1 6 1 2 2 2 2 2 12 15 12 6 6 4 10 1 1 1 6 6 6 2 6 6 2 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 | // SPDX-License-Identifier: GPL-2.0-or-later #include <linux/plist.h> #include <linux/sched/signal.h> #include "futex.h" #include "../locking/rtmutex_common.h" /* * On PREEMPT_RT, the hash bucket lock is a 'sleeping' spinlock with an * underlying rtmutex. The task which is about to be requeued could have * just woken up (timeout, signal). After the wake up the task has to * acquire hash bucket lock, which is held by the requeue code. As a task * can only be blocked on _ONE_ rtmutex at a time, the proxy lock blocking * and the hash bucket lock blocking would collide and corrupt state. * * On !PREEMPT_RT this is not a problem and everything could be serialized * on hash bucket lock, but aside of having the benefit of common code, * this allows to avoid doing the requeue when the task is already on the * way out and taking the hash bucket lock of the original uaddr1 when the * requeue has been completed. * * The following state transitions are valid: * * On the waiter side: * Q_REQUEUE_PI_NONE -> Q_REQUEUE_PI_IGNORE * Q_REQUEUE_PI_IN_PROGRESS -> Q_REQUEUE_PI_WAIT * * On the requeue side: * Q_REQUEUE_PI_NONE -> Q_REQUEUE_PI_INPROGRESS * Q_REQUEUE_PI_IN_PROGRESS -> Q_REQUEUE_PI_DONE/LOCKED * Q_REQUEUE_PI_IN_PROGRESS -> Q_REQUEUE_PI_NONE (requeue failed) * Q_REQUEUE_PI_WAIT -> Q_REQUEUE_PI_DONE/LOCKED * Q_REQUEUE_PI_WAIT -> Q_REQUEUE_PI_IGNORE (requeue failed) * * The requeue side ignores a waiter with state Q_REQUEUE_PI_IGNORE as this * signals that the waiter is already on the way out. It also means that * the waiter is still on the 'wait' futex, i.e. uaddr1. * * The waiter side signals early wakeup to the requeue side either through * setting state to Q_REQUEUE_PI_IGNORE or to Q_REQUEUE_PI_WAIT depending * on the current state. In case of Q_REQUEUE_PI_IGNORE it can immediately * proceed to take the hash bucket lock of uaddr1. If it set state to WAIT, * which means the wakeup is interleaving with a requeue in progress it has * to wait for the requeue side to change the state. Either to DONE/LOCKED * or to IGNORE. DONE/LOCKED means the waiter q is now on the uaddr2 futex * and either blocked (DONE) or has acquired it (LOCKED). IGNORE is set by * the requeue side when the requeue attempt failed via deadlock detection * and therefore the waiter q is still on the uaddr1 futex. */ enum { Q_REQUEUE_PI_NONE = 0, Q_REQUEUE_PI_IGNORE, Q_REQUEUE_PI_IN_PROGRESS, Q_REQUEUE_PI_WAIT, Q_REQUEUE_PI_DONE, Q_REQUEUE_PI_LOCKED, }; const struct futex_q futex_q_init = { /* list gets initialized in futex_queue()*/ .wake = futex_wake_mark, .key = FUTEX_KEY_INIT, .bitset = FUTEX_BITSET_MATCH_ANY, .requeue_state = ATOMIC_INIT(Q_REQUEUE_PI_NONE), }; /** * requeue_futex() - Requeue a futex_q from one hb to another * @q: the futex_q to requeue * @hb1: the source hash_bucket * @hb2: the target hash_bucket * @key2: the new key for the requeued futex_q */ static inline void requeue_futex(struct futex_q *q, struct futex_hash_bucket *hb1, struct futex_hash_bucket *hb2, union futex_key *key2) { /* * If key1 and key2 hash to the same bucket, no need to * requeue. */ if (likely(&hb1->chain != &hb2->chain)) { plist_del(&q->list, &hb1->chain); futex_hb_waiters_dec(hb1); futex_hb_waiters_inc(hb2); plist_add(&q->list, &hb2->chain); q->lock_ptr = &hb2->lock; /* * hb1 and hb2 belong to the same futex_hash_bucket_private * because if we managed get a reference on hb1 then it can't be * replaced. Therefore we avoid put(hb1)+get(hb2) here. */ } q->key = *key2; } static inline bool futex_requeue_pi_prepare(struct futex_q *q, struct futex_pi_state *pi_state) { int old, new; /* * Set state to Q_REQUEUE_PI_IN_PROGRESS unless an early wakeup has * already set Q_REQUEUE_PI_IGNORE to signal that requeue should * ignore the waiter. */ old = atomic_read_acquire(&q->requeue_state); do { if (old == Q_REQUEUE_PI_IGNORE) return false; /* * futex_proxy_trylock_atomic() might have set it to * IN_PROGRESS and a interleaved early wake to WAIT. * * It was considered to have an extra state for that * trylock, but that would just add more conditionals * all over the place for a dubious value. */ if (old != Q_REQUEUE_PI_NONE) break; new = Q_REQUEUE_PI_IN_PROGRESS; } while (!atomic_try_cmpxchg(&q->requeue_state, &old, new)); q->pi_state = pi_state; return true; } static inline void futex_requeue_pi_complete(struct futex_q *q, int locked) { int old, new; old = atomic_read_acquire(&q->requeue_state); do { if (old == Q_REQUEUE_PI_IGNORE) return; if (locked >= 0) { /* Requeue succeeded. Set DONE or LOCKED */ WARN_ON_ONCE(old != Q_REQUEUE_PI_IN_PROGRESS && old != Q_REQUEUE_PI_WAIT); new = Q_REQUEUE_PI_DONE + locked; } else if (old == Q_REQUEUE_PI_IN_PROGRESS) { /* Deadlock, no early wakeup interleave */ new = Q_REQUEUE_PI_NONE; } else { /* Deadlock, early wakeup interleave. */ WARN_ON_ONCE(old != Q_REQUEUE_PI_WAIT); new = Q_REQUEUE_PI_IGNORE; } } while (!atomic_try_cmpxchg(&q->requeue_state, &old, new)); #ifdef CONFIG_PREEMPT_RT /* If the waiter interleaved with the requeue let it know */ if (unlikely(old == Q_REQUEUE_PI_WAIT)) rcuwait_wake_up(&q->requeue_wait); #endif } static inline int futex_requeue_pi_wakeup_sync(struct futex_q *q) { int old, new; old = atomic_read_acquire(&q->requeue_state); do { /* Is requeue done already? */ if (old >= Q_REQUEUE_PI_DONE) return old; /* * If not done, then tell the requeue code to either ignore * the waiter or to wake it up once the requeue is done. */ new = Q_REQUEUE_PI_WAIT; if (old == Q_REQUEUE_PI_NONE) new = Q_REQUEUE_PI_IGNORE; } while (!atomic_try_cmpxchg(&q->requeue_state, &old, new)); /* If the requeue was in progress, wait for it to complete */ if (old == Q_REQUEUE_PI_IN_PROGRESS) { #ifdef CONFIG_PREEMPT_RT rcuwait_wait_event(&q->requeue_wait, atomic_read(&q->requeue_state) != Q_REQUEUE_PI_WAIT, TASK_UNINTERRUPTIBLE); #else (void)atomic_cond_read_relaxed(&q->requeue_state, VAL != Q_REQUEUE_PI_WAIT); #endif } /* * Requeue is now either prohibited or complete. Reread state * because during the wait above it might have changed. Nothing * will modify q->requeue_state after this point. */ return atomic_read(&q->requeue_state); } /** * requeue_pi_wake_futex() - Wake a task that acquired the lock during requeue * @q: the futex_q * @key: the key of the requeue target futex * @hb: the hash_bucket of the requeue target futex * * During futex_requeue, with requeue_pi=1, it is possible to acquire the * target futex if it is uncontended or via a lock steal. * * 1) Set @q::key to the requeue target futex key so the waiter can detect * the wakeup on the right futex. * * 2) Dequeue @q from the hash bucket. * * 3) Set @q::rt_waiter to NULL so the woken up task can detect atomic lock * acquisition. * * 4) Set the q->lock_ptr to the requeue target hb->lock for the case that * the waiter has to fixup the pi state. * * 5) Complete the requeue state so the waiter can make progress. After * this point the waiter task can return from the syscall immediately in * case that the pi state does not have to be fixed up. * * 6) Wake the waiter task. * * Must be called with both q->lock_ptr and hb->lock held. */ static inline void requeue_pi_wake_futex(struct futex_q *q, union futex_key *key, struct futex_hash_bucket *hb) { q->key = *key; __futex_unqueue(q); WARN_ON(!q->rt_waiter); q->rt_waiter = NULL; /* * Acquire a reference for the waiter to ensure valid * futex_q::lock_ptr. */ futex_hash_get(hb); q->drop_hb_ref = true; q->lock_ptr = &hb->lock; /* Signal locked state to the waiter */ futex_requeue_pi_complete(q, 1); wake_up_state(q->task, TASK_NORMAL); } /** * futex_proxy_trylock_atomic() - Attempt an atomic lock for the top waiter * @pifutex: the user address of the to futex * @hb1: the from futex hash bucket, must be locked by the caller * @hb2: the to futex hash bucket, must be locked by the caller * @key1: the from futex key * @key2: the to futex key * @ps: address to store the pi_state pointer * @exiting: Pointer to store the task pointer of the owner task * which is in the middle of exiting * @set_waiters: force setting the FUTEX_WAITERS bit (1) or not (0) * * Try and get the lock on behalf of the top waiter if we can do it atomically. * Wake the top waiter if we succeed. If the caller specified set_waiters, * then direct futex_lock_pi_atomic() to force setting the FUTEX_WAITERS bit. * hb1 and hb2 must be held by the caller. * * @exiting is only set when the return value is -EBUSY. If so, this holds * a refcount on the exiting task on return and the caller needs to drop it * after waiting for the exit to complete. * * Return: * - 0 - failed to acquire the lock atomically; * - >0 - acquired the lock, return value is vpid of the top_waiter * - <0 - error */ static int futex_proxy_trylock_atomic(u32 __user *pifutex, struct futex_hash_bucket *hb1, struct futex_hash_bucket *hb2, union futex_key *key1, union futex_key *key2, struct futex_pi_state **ps, struct task_struct **exiting, int set_waiters) { struct futex_q *top_waiter; u32 curval; int ret; if (futex_get_value_locked(&curval, pifutex)) return -EFAULT; if (unlikely(should_fail_futex(true))) return -EFAULT; /* * Find the top_waiter and determine if there are additional waiters. * If the caller intends to requeue more than 1 waiter to pifutex, * force futex_lock_pi_atomic() to set the FUTEX_WAITERS bit now, * as we have means to handle the possible fault. If not, don't set * the bit unnecessarily as it will force the subsequent unlock to enter * the kernel. */ top_waiter = futex_top_waiter(hb1, key1); /* There are no waiters, nothing for us to do. */ if (!top_waiter) return 0; /* * Ensure that this is a waiter sitting in futex_wait_requeue_pi() * and waiting on the 'waitqueue' futex which is always !PI. */ if (!top_waiter->rt_waiter || top_waiter->pi_state) return -EINVAL; /* Ensure we requeue to the expected futex. */ if (!futex_match(top_waiter->requeue_pi_key, key2)) return -EINVAL; /* Ensure that this does not race against an early wakeup */ if (!futex_requeue_pi_prepare(top_waiter, NULL)) return -EAGAIN; /* * Try to take the lock for top_waiter and set the FUTEX_WAITERS bit * in the contended case or if @set_waiters is true. * * In the contended case PI state is attached to the lock owner. If * the user space lock can be acquired then PI state is attached to * the new owner (@top_waiter->task) when @set_waiters is true. */ ret = futex_lock_pi_atomic(pifutex, hb2, key2, ps, top_waiter->task, exiting, set_waiters); if (ret == 1) { /* * Lock was acquired in user space and PI state was * attached to @top_waiter->task. That means state is fully * consistent and the waiter can return to user space * immediately after the wakeup. */ requeue_pi_wake_futex(top_waiter, key2, hb2); } else if (ret < 0) { /* Rewind top_waiter::requeue_state */ futex_requeue_pi_complete(top_waiter, ret); } else { /* * futex_lock_pi_atomic() did not acquire the user space * futex, but managed to establish the proxy lock and pi * state. top_waiter::requeue_state cannot be fixed up here * because the waiter is not enqueued on the rtmutex * yet. This is handled at the callsite depending on the * result of rt_mutex_start_proxy_lock() which is * guaranteed to be reached with this function returning 0. */ } return ret; } /** * futex_requeue() - Requeue waiters from uaddr1 to uaddr2 * @uaddr1: source futex user address * @flags1: futex flags (FLAGS_SHARED, etc.) * @uaddr2: target futex user address * @flags2: futex flags (FLAGS_SHARED, etc.) * @nr_wake: number of waiters to wake (must be 1 for requeue_pi) * @nr_requeue: number of waiters to requeue (0-INT_MAX) * @cmpval: @uaddr1 expected value (or %NULL) * @requeue_pi: if we are attempting to requeue from a non-pi futex to a * pi futex (pi to pi requeue is not supported) * * Requeue waiters on uaddr1 to uaddr2. In the requeue_pi case, try to acquire * uaddr2 atomically on behalf of the top waiter. * * Return: * - >=0 - on success, the number of tasks requeued or woken; * - <0 - on error */ int futex_requeue(u32 __user *uaddr1, unsigned int flags1, u32 __user *uaddr2, unsigned int flags2, int nr_wake, int nr_requeue, u32 *cmpval, int requeue_pi) { union futex_key key1 = FUTEX_KEY_INIT, key2 = FUTEX_KEY_INIT; int task_count = 0, ret; struct futex_pi_state *pi_state = NULL; struct futex_q *this, *next; DEFINE_WAKE_Q(wake_q); if (nr_wake < 0 || nr_requeue < 0) return -EINVAL; /* * When PI not supported: return -ENOSYS if requeue_pi is true, * consequently the compiler knows requeue_pi is always false past * this point which will optimize away all the conditional code * further down. */ if (!IS_ENABLED(CONFIG_FUTEX_PI) && requeue_pi) return -ENOSYS; if (requeue_pi) { /* * Requeue PI only works on two distinct uaddrs. This * check is only valid for private futexes. See below. */ if (uaddr1 == uaddr2) return -EINVAL; /* * futex_requeue() allows the caller to define the number * of waiters to wake up via the @nr_wake argument. With * REQUEUE_PI, waking up more than one waiter is creating * more problems than it solves. Waking up a waiter makes * only sense if the PI futex @uaddr2 is uncontended as * this allows the requeue code to acquire the futex * @uaddr2 before waking the waiter. The waiter can then * return to user space without further action. A secondary * wakeup would just make the futex_wait_requeue_pi() * handling more complex, because that code would have to * look up pi_state and do more or less all the handling * which the requeue code has to do for the to be requeued * waiters. So restrict the number of waiters to wake to * one, and only wake it up when the PI futex is * uncontended. Otherwise requeue it and let the unlock of * the PI futex handle the wakeup. * * All REQUEUE_PI users, e.g. pthread_cond_signal() and * pthread_cond_broadcast() must use nr_wake=1. */ if (nr_wake != 1) return -EINVAL; /* * requeue_pi requires a pi_state, try to allocate it now * without any locks in case it fails. */ if (refill_pi_state_cache()) return -ENOMEM; } retry: ret = get_futex_key(uaddr1, flags1, &key1, FUTEX_READ); if (unlikely(ret != 0)) return ret; ret = get_futex_key(uaddr2, flags2, &key2, requeue_pi ? FUTEX_WRITE : FUTEX_READ); if (unlikely(ret != 0)) return ret; /* * The check above which compares uaddrs is not sufficient for * shared futexes. We need to compare the keys: */ if (requeue_pi && futex_match(&key1, &key2)) return -EINVAL; retry_private: if (1) { CLASS(hb, hb1)(&key1); CLASS(hb, hb2)(&key2); futex_hb_waiters_inc(hb2); double_lock_hb(hb1, hb2); if (likely(cmpval != NULL)) { u32 curval; ret = futex_get_value_locked(&curval, uaddr1); if (unlikely(ret)) { futex_hb_waiters_dec(hb2); double_unlock_hb(hb1, hb2); ret = get_user(curval, uaddr1); if (ret) return ret; if (!(flags1 & FLAGS_SHARED)) goto retry_private; goto retry; } if (curval != *cmpval) { ret = -EAGAIN; goto out_unlock; } } if (requeue_pi) { struct task_struct *exiting = NULL; /* * Attempt to acquire uaddr2 and wake the top waiter. If we * intend to requeue waiters, force setting the FUTEX_WAITERS * bit. We force this here where we are able to easily handle * faults rather in the requeue loop below. * * Updates topwaiter::requeue_state if a top waiter exists. */ ret = futex_proxy_trylock_atomic(uaddr2, hb1, hb2, &key1, &key2, &pi_state, &exiting, nr_requeue); /* * At this point the top_waiter has either taken uaddr2 or * is waiting on it. In both cases pi_state has been * established and an initial refcount on it. In case of an * error there's nothing. * * The top waiter's requeue_state is up to date: * * - If the lock was acquired atomically (ret == 1), then * the state is Q_REQUEUE_PI_LOCKED. * * The top waiter has been dequeued and woken up and can * return to user space immediately. The kernel/user * space state is consistent. In case that there must be * more waiters requeued the WAITERS bit in the user * space futex is set so the top waiter task has to go * into the syscall slowpath to unlock the futex. This * will block until this requeue operation has been * completed and the hash bucket locks have been * dropped. * * - If the trylock failed with an error (ret < 0) then * the state is either Q_REQUEUE_PI_NONE, i.e. "nothing * happened", or Q_REQUEUE_PI_IGNORE when there was an * interleaved early wakeup. * * - If the trylock did not succeed (ret == 0) then the * state is either Q_REQUEUE_PI_IN_PROGRESS or * Q_REQUEUE_PI_WAIT if an early wakeup interleaved. * This will be cleaned up in the loop below, which * cannot fail because futex_proxy_trylock_atomic() did * the same sanity checks for requeue_pi as the loop * below does. */ switch (ret) { case 0: /* We hold a reference on the pi state. */ break; case 1: /* * futex_proxy_trylock_atomic() acquired the user space * futex. Adjust task_count. */ task_count++; ret = 0; break; /* * If the above failed, then pi_state is NULL and * waiter::requeue_state is correct. */ case -EFAULT: futex_hb_waiters_dec(hb2); double_unlock_hb(hb1, hb2); ret = fault_in_user_writeable(uaddr2); if (!ret) goto retry; return ret; case -EBUSY: case -EAGAIN: /* * Two reasons for this: * - EBUSY: Owner is exiting and we just wait for the * exit to complete. * - EAGAIN: The user space value changed. */ futex_hb_waiters_dec(hb2); double_unlock_hb(hb1, hb2); /* * Handle the case where the owner is in the middle of * exiting. Wait for the exit to complete otherwise * this task might loop forever, aka. live lock. */ wait_for_owner_exiting(ret, exiting); cond_resched(); goto retry; default: goto out_unlock; } } plist_for_each_entry_safe(this, next, &hb1->chain, list) { if (task_count - nr_wake >= nr_requeue) break; if (!futex_match(&this->key, &key1)) continue; /* * FUTEX_WAIT_REQUEUE_PI and FUTEX_CMP_REQUEUE_PI should always * be paired with each other and no other futex ops. * * We should never be requeueing a futex_q with a pi_state, * which is awaiting a futex_unlock_pi(). */ if ((requeue_pi && !this->rt_waiter) || (!requeue_pi && this->rt_waiter) || this->pi_state) { ret = -EINVAL; break; } /* Plain futexes just wake or requeue and are done */ if (!requeue_pi) { if (++task_count <= nr_wake) this->wake(&wake_q, this); else requeue_futex(this, hb1, hb2, &key2); continue; } /* Ensure we requeue to the expected futex for requeue_pi. */ if (!futex_match(this->requeue_pi_key, &key2)) { ret = -EINVAL; break; } /* * Requeue nr_requeue waiters and possibly one more in the case * of requeue_pi if we couldn't acquire the lock atomically. * * Prepare the waiter to take the rt_mutex. Take a refcount * on the pi_state and store the pointer in the futex_q * object of the waiter. */ get_pi_state(pi_state); /* Don't requeue when the waiter is already on the way out. */ if (!futex_requeue_pi_prepare(this, pi_state)) { /* * Early woken waiter signaled that it is on the * way out. Drop the pi_state reference and try the * next waiter. @this->pi_state is still NULL. */ put_pi_state(pi_state); continue; } ret = rt_mutex_start_proxy_lock(&pi_state->pi_mutex, this->rt_waiter, this->task); if (ret == 1) { /* * We got the lock. We do neither drop the refcount * on pi_state nor clear this->pi_state because the * waiter needs the pi_state for cleaning up the * user space value. It will drop the refcount * after doing so. this::requeue_state is updated * in the wakeup as well. */ requeue_pi_wake_futex(this, &key2, hb2); task_count++; } else if (!ret) { /* Waiter is queued, move it to hb2 */ requeue_futex(this, hb1, hb2, &key2); futex_requeue_pi_complete(this, 0); task_count++; } else { /* * rt_mutex_start_proxy_lock() detected a potential * deadlock when we tried to queue that waiter. * Drop the pi_state reference which we took above * and remove the pointer to the state from the * waiters futex_q object. */ this->pi_state = NULL; put_pi_state(pi_state); futex_requeue_pi_complete(this, ret); /* * We stop queueing more waiters and let user space * deal with the mess. */ break; } } /* * We took an extra initial reference to the pi_state in * futex_proxy_trylock_atomic(). We need to drop it here again. */ put_pi_state(pi_state); out_unlock: futex_hb_waiters_dec(hb2); double_unlock_hb(hb1, hb2); } wake_up_q(&wake_q); return ret ? ret : task_count; } /** * handle_early_requeue_pi_wakeup() - Handle early wakeup on the initial futex * @hb: the hash_bucket futex_q was original enqueued on * @q: the futex_q woken while waiting to be requeued * @timeout: the timeout associated with the wait (NULL if none) * * Determine the cause for the early wakeup. * * Return: * -EWOULDBLOCK or -ETIMEDOUT or -ERESTARTNOINTR */ static inline int handle_early_requeue_pi_wakeup(struct futex_hash_bucket *hb, struct futex_q *q, struct hrtimer_sleeper *timeout) { int ret; /* * With the hb lock held, we avoid races while we process the wakeup. * We only need to hold hb (and not hb2) to ensure atomicity as the * wakeup code can't change q.key from uaddr to uaddr2 if we hold hb. * It can't be requeued from uaddr2 to something else since we don't * support a PI aware source futex for requeue. */ WARN_ON_ONCE(&hb->lock != q->lock_ptr); /* * We were woken prior to requeue by a timeout or a signal. * Unqueue the futex_q and determine which it was. */ plist_del(&q->list, &hb->chain); futex_hb_waiters_dec(hb); /* Handle spurious wakeups gracefully */ ret = -EWOULDBLOCK; if (timeout && !timeout->task) ret = -ETIMEDOUT; else if (signal_pending(current)) ret = -ERESTARTNOINTR; return ret; } /** * futex_wait_requeue_pi() - Wait on uaddr and take uaddr2 * @uaddr: the futex we initially wait on (non-pi) * @flags: futex flags (FLAGS_SHARED, FLAGS_CLOCKRT, etc.), they must be * the same type, no requeueing from private to shared, etc. * @val: the expected value of uaddr * @abs_time: absolute timeout * @bitset: 32 bit wakeup bitset set by userspace, defaults to all * @uaddr2: the pi futex we will take prior to returning to user-space * * The caller will wait on uaddr and will be requeued by futex_requeue() to * uaddr2 which must be PI aware and unique from uaddr. Normal wakeup will wake * on uaddr2 and complete the acquisition of the rt_mutex prior to returning to * userspace. This ensures the rt_mutex maintains an owner when it has waiters; * without one, the pi logic would not know which task to boost/deboost, if * there was a need to. * * We call schedule in futex_wait_queue() when we enqueue and return there * via the following-- * 1) wakeup on uaddr2 after an atomic lock acquisition by futex_requeue() * 2) wakeup on uaddr2 after a requeue * 3) signal * 4) timeout * * If 3, cleanup and return -ERESTARTNOINTR. * * If 2, we may then block on trying to take the rt_mutex and return via: * 5) successful lock * 6) signal * 7) timeout * 8) other lock acquisition failure * * If 6, return -EWOULDBLOCK (restarting the syscall would do the same). * * If 4 or 7, we cleanup and return with -ETIMEDOUT. * * Return: * - 0 - On success; * - <0 - On error */ int futex_wait_requeue_pi(u32 __user *uaddr, unsigned int flags, u32 val, ktime_t *abs_time, u32 bitset, u32 __user *uaddr2) { struct hrtimer_sleeper timeout, *to; struct rt_mutex_waiter rt_waiter; union futex_key key2 = FUTEX_KEY_INIT; struct futex_q q = futex_q_init; struct rt_mutex_base *pi_mutex; int res, ret; if (!IS_ENABLED(CONFIG_FUTEX_PI)) return -ENOSYS; if (uaddr == uaddr2) return -EINVAL; if (!bitset) return -EINVAL; to = futex_setup_timer(abs_time, &timeout, flags, current->timer_slack_ns); /* * The waiter is allocated on our stack, manipulated by the requeue * code while we sleep on uaddr. */ rt_mutex_init_waiter(&rt_waiter); ret = get_futex_key(uaddr2, flags, &key2, FUTEX_WRITE); if (unlikely(ret != 0)) goto out; q.bitset = bitset; q.rt_waiter = &rt_waiter; q.requeue_pi_key = &key2; /* * Prepare to wait on uaddr. On success, it holds hb->lock and q * is initialized. */ ret = futex_wait_setup(uaddr, val, flags, &q, &key2, current); if (ret) goto out; /* Queue the futex_q, drop the hb lock, wait for wakeup. */ futex_do_wait(&q, to); switch (futex_requeue_pi_wakeup_sync(&q)) { case Q_REQUEUE_PI_IGNORE: { CLASS(hb, hb)(&q.key); /* The waiter is still on uaddr1 */ spin_lock(&hb->lock); ret = handle_early_requeue_pi_wakeup(hb, &q, to); spin_unlock(&hb->lock); } break; case Q_REQUEUE_PI_LOCKED: /* The requeue acquired the lock */ if (q.pi_state && (q.pi_state->owner != current)) { futex_q_lockptr_lock(&q); ret = fixup_pi_owner(uaddr2, &q, true); /* * Drop the reference to the pi state which the * requeue_pi() code acquired for us. */ put_pi_state(q.pi_state); spin_unlock(q.lock_ptr); /* * Adjust the return value. It's either -EFAULT or * success (1) but the caller expects 0 for success. */ ret = ret < 0 ? ret : 0; } break; case Q_REQUEUE_PI_DONE: /* Requeue completed. Current is 'pi_blocked_on' the rtmutex */ pi_mutex = &q.pi_state->pi_mutex; ret = rt_mutex_wait_proxy_lock(pi_mutex, to, &rt_waiter); /* * See futex_unlock_pi()'s cleanup: comment. */ if (ret && !rt_mutex_cleanup_proxy_lock(pi_mutex, &rt_waiter)) ret = 0; futex_q_lockptr_lock(&q); debug_rt_mutex_free_waiter(&rt_waiter); /* * Fixup the pi_state owner and possibly acquire the lock if we * haven't already. */ res = fixup_pi_owner(uaddr2, &q, !ret); /* * If fixup_pi_owner() returned an error, propagate that. If it * acquired the lock, clear -ETIMEDOUT or -EINTR. */ if (res) ret = (res < 0) ? res : 0; futex_unqueue_pi(&q); spin_unlock(q.lock_ptr); if (ret == -EINTR) { /* * We've already been requeued, but cannot restart * by calling futex_lock_pi() directly. We could * restart this syscall, but it would detect that * the user space "val" changed and return * -EWOULDBLOCK. Save the overhead of the restart * and return -EWOULDBLOCK directly. */ ret = -EWOULDBLOCK; } break; default: BUG(); } if (q.drop_hb_ref) { CLASS(hb, hb)(&q.key); /* Additional reference from requeue_pi_wake_futex() */ futex_hash_put(hb); } out: if (to) { hrtimer_cancel(&to->timer); destroy_hrtimer_on_stack(&to->timer); } return ret; } |
| 55 6 50 53 2 53 14 1 1 1 10 1 1 1 6 15 1 1 13 12 15 22 1 1 1 2 17 1 2 22 24 24 87 1 1 39 1 2 45 78 1 2 1 1 1 1 71 25 50 68 1 28 33 29 1 2 1 1 11 1 2 17 1 16 3 3 8 8 2 7 4 12 3 8 9 13 4 9 2 4 1 1 2 3 6 1 1 1 3 1 27 1 27 27 7 8 1 7 7 23 3 21 10 13 30 32 33 34 33 14 20 33 34 20 1 2 1 17 18 3 12 6 1 17 19 2 18 18 19 1 1 246 16 885 20 475 213 248 353 515 2 858 9 859 8 2 847 688 551 269 691 90 596 190 761 49 777 422 379 91 92 91 59 59 18 869 128 758 154 150 3 3 2 1 733 8 137 603 786 1 1 1 174 612 1 768 15 773 9 625 154 8 775 1 4 5 5 5 13 13 1 11 770 6 6 1 277 549 729 64 64 665 665 21 1 1 17 16 3 14 32 841 831 20 9 836 324 525 46 512 233 233 93 328 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080 1081 1082 1083 1084 1085 1086 1087 1088 1089 1090 1091 1092 1093 1094 1095 1096 1097 1098 1099 1100 1101 1102 1103 1104 1105 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125 1126 1127 1128 1129 1130 1131 1132 1133 1134 1135 1136 1137 1138 1139 1140 1141 1142 1143 1144 1145 1146 1147 1148 1149 1150 1151 1152 1153 1154 1155 1156 1157 1158 1159 1160 1161 1162 1163 1164 1165 1166 1167 1168 1169 1170 1171 1172 1173 1174 1175 1176 1177 1178 1179 1180 1181 1182 1183 1184 1185 1186 1187 1188 1189 1190 1191 1192 1193 1194 1195 1196 1197 1198 1199 1200 1201 1202 1203 1204 1205 1206 1207 1208 1209 1210 1211 1212 1213 1214 1215 1216 1217 1218 1219 1220 1221 1222 1223 1224 1225 1226 1227 1228 1229 1230 1231 1232 1233 1234 1235 1236 1237 1238 1239 1240 1241 1242 1243 1244 1245 1246 1247 1248 1249 1250 1251 1252 1253 1254 1255 1256 1257 1258 1259 1260 1261 1262 1263 1264 1265 1266 1267 1268 1269 1270 1271 1272 1273 1274 1275 1276 1277 1278 1279 1280 1281 1282 1283 1284 1285 1286 1287 1288 1289 1290 1291 1292 1293 1294 1295 1296 1297 1298 1299 1300 1301 1302 1303 1304 1305 1306 1307 1308 1309 1310 1311 1312 1313 1314 1315 1316 1317 1318 1319 1320 1321 1322 1323 1324 1325 1326 1327 1328 1329 1330 1331 1332 1333 1334 1335 1336 1337 1338 1339 1340 1341 1342 1343 1344 1345 1346 1347 1348 1349 1350 1351 1352 1353 1354 1355 1356 1357 1358 1359 1360 1361 1362 1363 1364 1365 1366 1367 1368 1369 1370 1371 1372 1373 1374 1375 1376 1377 1378 1379 1380 1381 1382 1383 1384 1385 1386 1387 1388 1389 1390 1391 1392 1393 1394 1395 1396 1397 1398 1399 1400 1401 1402 1403 1404 1405 1406 1407 1408 1409 1410 1411 1412 1413 1414 1415 1416 1417 1418 1419 1420 1421 1422 1423 1424 1425 1426 1427 1428 1429 1430 1431 1432 1433 1434 1435 1436 1437 1438 1439 1440 1441 1442 1443 1444 1445 1446 1447 1448 1449 1450 1451 1452 1453 1454 1455 1456 1457 1458 1459 1460 1461 1462 1463 1464 1465 1466 1467 1468 1469 1470 1471 1472 1473 1474 1475 1476 1477 1478 1479 1480 1481 1482 1483 1484 1485 1486 1487 1488 1489 1490 1491 1492 1493 1494 1495 1496 1497 1498 1499 1500 1501 1502 1503 1504 1505 1506 1507 1508 1509 1510 1511 1512 1513 1514 1515 1516 1517 1518 1519 1520 1521 1522 1523 1524 1525 1526 1527 1528 1529 1530 1531 1532 1533 1534 1535 1536 1537 1538 1539 1540 1541 1542 1543 1544 1545 1546 1547 1548 1549 1550 1551 1552 1553 1554 1555 1556 1557 1558 1559 1560 1561 1562 1563 1564 1565 1566 1567 1568 1569 1570 1571 1572 1573 1574 1575 1576 1577 1578 1579 1580 1581 1582 1583 1584 1585 1586 1587 1588 1589 1590 1591 1592 1593 1594 1595 1596 1597 1598 1599 1600 1601 1602 1603 1604 1605 1606 1607 1608 1609 1610 1611 1612 1613 1614 1615 1616 1617 1618 1619 1620 1621 1622 1623 1624 1625 1626 1627 1628 1629 1630 1631 1632 1633 1634 1635 1636 1637 1638 1639 1640 1641 1642 1643 1644 1645 1646 1647 1648 1649 1650 1651 1652 1653 1654 1655 1656 1657 1658 1659 1660 1661 1662 1663 | // SPDX-License-Identifier: GPL-2.0-only /* * linux/fs/open.c * * Copyright (C) 1991, 1992 Linus Torvalds */ #include <linux/string.h> #include <linux/mm.h> #include <linux/file.h> #include <linux/fdtable.h> #include <linux/fsnotify.h> #include <linux/module.h> #include <linux/tty.h> #include <linux/namei.h> #include <linux/backing-dev.h> #include <linux/capability.h> #include <linux/securebits.h> #include <linux/security.h> #include <linux/mount.h> #include <linux/fcntl.h> #include <linux/slab.h> #include <linux/uaccess.h> #include <linux/fs.h> #include <linux/personality.h> #include <linux/pagemap.h> #include <linux/syscalls.h> #include <linux/rcupdate.h> #include <linux/audit.h> #include <linux/falloc.h> #include <linux/fs_struct.h> #include <linux/dnotify.h> #include <linux/compat.h> #include <linux/mnt_idmapping.h> #include <linux/filelock.h> #include "internal.h" int do_truncate(struct mnt_idmap *idmap, struct dentry *dentry, loff_t length, unsigned int time_attrs, struct file *filp) { int ret; struct iattr newattrs; /* Not pretty: "inode->i_size" shouldn't really be signed. But it is. */ if (length < 0) return -EINVAL; newattrs.ia_size = length; newattrs.ia_valid = ATTR_SIZE | time_attrs; if (filp) { newattrs.ia_file = filp; newattrs.ia_valid |= ATTR_FILE; } /* Remove suid, sgid, and file capabilities on truncate too */ ret = dentry_needs_remove_privs(idmap, dentry); if (ret < 0) return ret; if (ret) newattrs.ia_valid |= ret | ATTR_FORCE; ret = inode_lock_killable(dentry->d_inode); if (ret) return ret; /* Note any delegations or leases have already been broken: */ ret = notify_change(idmap, dentry, &newattrs, NULL); inode_unlock(dentry->d_inode); return ret; } int vfs_truncate(const struct path *path, loff_t length) { struct mnt_idmap *idmap; struct inode *inode; int error; inode = path->dentry->d_inode; /* For directories it's -EISDIR, for other non-regulars - -EINVAL */ if (S_ISDIR(inode->i_mode)) return -EISDIR; if (!S_ISREG(inode->i_mode)) return -EINVAL; idmap = mnt_idmap(path->mnt); error = inode_permission(idmap, inode, MAY_WRITE); if (error) return error; error = fsnotify_truncate_perm(path, length); if (error) return error; error = mnt_want_write(path->mnt); if (error) return error; error = -EPERM; if (IS_APPEND(inode)) goto mnt_drop_write_and_out; error = get_write_access(inode); if (error) goto mnt_drop_write_and_out; /* * Make sure that there are no leases. get_write_access() protects * against the truncate racing with a lease-granting setlease(). */ error = break_lease(inode, O_WRONLY); if (error) goto put_write_and_out; error = security_path_truncate(path); if (!error) error = do_truncate(idmap, path->dentry, length, 0, NULL); put_write_and_out: put_write_access(inode); mnt_drop_write_and_out: mnt_drop_write(path->mnt); return error; } EXPORT_SYMBOL_GPL(vfs_truncate); int do_sys_truncate(const char __user *pathname, loff_t length) { unsigned int lookup_flags = LOOKUP_FOLLOW; struct path path; int error; if (length < 0) /* sorry, but loff_t says... */ return -EINVAL; retry: error = user_path_at(AT_FDCWD, pathname, lookup_flags, &path); if (!error) { error = vfs_truncate(&path, length); path_put(&path); } if (retry_estale(error, lookup_flags)) { lookup_flags |= LOOKUP_REVAL; goto retry; } return error; } SYSCALL_DEFINE2(truncate, const char __user *, path, long, length) { return do_sys_truncate(path, length); } #ifdef CONFIG_COMPAT COMPAT_SYSCALL_DEFINE2(truncate, const char __user *, path, compat_off_t, length) { return do_sys_truncate(path, length); } #endif int do_ftruncate(struct file *file, loff_t length, int small) { struct inode *inode; struct dentry *dentry; int error; /* explicitly opened as large or we are on 64-bit box */ if (file->f_flags & O_LARGEFILE) small = 0; dentry = file->f_path.dentry; inode = dentry->d_inode; if (!S_ISREG(inode->i_mode) || !(file->f_mode & FMODE_WRITE)) return -EINVAL; /* Cannot ftruncate over 2^31 bytes without large file support */ if (small && length > MAX_NON_LFS) return -EINVAL; /* Check IS_APPEND on real upper inode */ if (IS_APPEND(file_inode(file))) return -EPERM; error = security_file_truncate(file); if (error) return error; error = fsnotify_truncate_perm(&file->f_path, length); if (error) return error; sb_start_write(inode->i_sb); error = do_truncate(file_mnt_idmap(file), dentry, length, ATTR_MTIME | ATTR_CTIME, file); sb_end_write(inode->i_sb); return error; } int do_sys_ftruncate(unsigned int fd, loff_t length, int small) { if (length < 0) return -EINVAL; CLASS(fd, f)(fd); if (fd_empty(f)) return -EBADF; return do_ftruncate(fd_file(f), length, small); } SYSCALL_DEFINE2(ftruncate, unsigned int, fd, off_t, length) { return do_sys_ftruncate(fd, length, 1); } #ifdef CONFIG_COMPAT COMPAT_SYSCALL_DEFINE2(ftruncate, unsigned int, fd, compat_off_t, length) { return do_sys_ftruncate(fd, length, 1); } #endif /* LFS versions of truncate are only needed on 32 bit machines */ #if BITS_PER_LONG == 32 SYSCALL_DEFINE2(truncate64, const char __user *, path, loff_t, length) { return do_sys_truncate(path, length); } SYSCALL_DEFINE2(ftruncate64, unsigned int, fd, loff_t, length) { return do_sys_ftruncate(fd, length, 0); } #endif /* BITS_PER_LONG == 32 */ #if defined(CONFIG_COMPAT) && defined(__ARCH_WANT_COMPAT_TRUNCATE64) COMPAT_SYSCALL_DEFINE3(truncate64, const char __user *, pathname, compat_arg_u64_dual(length)) { return ksys_truncate(pathname, compat_arg_u64_glue(length)); } #endif #if defined(CONFIG_COMPAT) && defined(__ARCH_WANT_COMPAT_FTRUNCATE64) COMPAT_SYSCALL_DEFINE3(ftruncate64, unsigned int, fd, compat_arg_u64_dual(length)) { return ksys_ftruncate(fd, compat_arg_u64_glue(length)); } #endif int vfs_fallocate(struct file *file, int mode, loff_t offset, loff_t len) { struct inode *inode = file_inode(file); int ret; loff_t sum; if (offset < 0 || len <= 0) return -EINVAL; if (mode & ~(FALLOC_FL_MODE_MASK | FALLOC_FL_KEEP_SIZE)) return -EOPNOTSUPP; /* * Modes are exclusive, even if that is not obvious from the encoding * as bit masks and the mix with the flag in the same namespace. * * To make things even more complicated, FALLOC_FL_ALLOCATE_RANGE is * encoded as no bit set. */ switch (mode & FALLOC_FL_MODE_MASK) { case FALLOC_FL_ALLOCATE_RANGE: case FALLOC_FL_UNSHARE_RANGE: case FALLOC_FL_ZERO_RANGE: break; case FALLOC_FL_PUNCH_HOLE: if (!(mode & FALLOC_FL_KEEP_SIZE)) return -EOPNOTSUPP; break; case FALLOC_FL_COLLAPSE_RANGE: case FALLOC_FL_INSERT_RANGE: if (mode & FALLOC_FL_KEEP_SIZE) return -EOPNOTSUPP; break; default: return -EOPNOTSUPP; } if (!(file->f_mode & FMODE_WRITE)) return -EBADF; /* * On append-only files only space preallocation is supported. */ if ((mode & ~FALLOC_FL_KEEP_SIZE) && IS_APPEND(inode)) return -EPERM; if (IS_IMMUTABLE(inode)) return -EPERM; /* * We cannot allow any fallocate operation on an active swapfile */ if (IS_SWAPFILE(inode)) return -ETXTBSY; /* * Revalidate the write permissions, in case security policy has * changed since the files were opened. */ ret = security_file_permission(file, MAY_WRITE); if (ret) return ret; ret = fsnotify_file_area_perm(file, MAY_WRITE, &offset, len); if (ret) return ret; if (S_ISFIFO(inode->i_mode)) return -ESPIPE; if (S_ISDIR(inode->i_mode)) return -EISDIR; if (!S_ISREG(inode->i_mode) && !S_ISBLK(inode->i_mode)) return -ENODEV; /* Check for wraparound */ if (check_add_overflow(offset, len, &sum)) return -EFBIG; if (sum > inode->i_sb->s_maxbytes) return -EFBIG; if (!file->f_op->fallocate) return -EOPNOTSUPP; file_start_write(file); ret = file->f_op->fallocate(file, mode, offset, len); /* * Create inotify and fanotify events. * * To keep the logic simple always create events if fallocate succeeds. * This implies that events are even created if the file size remains * unchanged, e.g. when using flag FALLOC_FL_KEEP_SIZE. */ if (ret == 0) fsnotify_modify(file); file_end_write(file); return ret; } EXPORT_SYMBOL_GPL(vfs_fallocate); int ksys_fallocate(int fd, int mode, loff_t offset, loff_t len) { CLASS(fd, f)(fd); if (fd_empty(f)) return -EBADF; return vfs_fallocate(fd_file(f), mode, offset, len); } SYSCALL_DEFINE4(fallocate, int, fd, int, mode, loff_t, offset, loff_t, len) { return ksys_fallocate(fd, mode, offset, len); } #if defined(CONFIG_COMPAT) && defined(__ARCH_WANT_COMPAT_FALLOCATE) COMPAT_SYSCALL_DEFINE6(fallocate, int, fd, int, mode, compat_arg_u64_dual(offset), compat_arg_u64_dual(len)) { return ksys_fallocate(fd, mode, compat_arg_u64_glue(offset), compat_arg_u64_glue(len)); } #endif /* * access() needs to use the real uid/gid, not the effective uid/gid. * We do this by temporarily clearing all FS-related capabilities and * switching the fsuid/fsgid around to the real ones. * * Creating new credentials is expensive, so we try to skip doing it, * which we can if the result would match what we already got. */ static bool access_need_override_creds(int flags) { const struct cred *cred; if (flags & AT_EACCESS) return false; cred = current_cred(); if (!uid_eq(cred->fsuid, cred->uid) || !gid_eq(cred->fsgid, cred->gid)) return true; if (!issecure(SECURE_NO_SETUID_FIXUP)) { kuid_t root_uid = make_kuid(cred->user_ns, 0); if (!uid_eq(cred->uid, root_uid)) { if (!cap_isclear(cred->cap_effective)) return true; } else { if (!cap_isidentical(cred->cap_effective, cred->cap_permitted)) return true; } } return false; } static const struct cred *access_override_creds(void) { struct cred *override_cred; override_cred = prepare_creds(); if (!override_cred) return NULL; /* * XXX access_need_override_creds performs checks in hopes of skipping * this work. Make sure it stays in sync if making any changes in this * routine. */ override_cred->fsuid = override_cred->uid; override_cred->fsgid = override_cred->gid; if (!issecure(SECURE_NO_SETUID_FIXUP)) { /* Clear the capabilities if we switch to a non-root user */ kuid_t root_uid = make_kuid(override_cred->user_ns, 0); if (!uid_eq(override_cred->uid, root_uid)) cap_clear(override_cred->cap_effective); else override_cred->cap_effective = override_cred->cap_permitted; } /* * The new set of credentials can *only* be used in * task-synchronous circumstances, and does not need * RCU freeing, unless somebody then takes a separate * reference to it. * * NOTE! This is _only_ true because this credential * is used purely for override_creds() that installs * it as the subjective cred. Other threads will be * accessing ->real_cred, not the subjective cred. * * If somebody _does_ make a copy of this (using the * 'get_current_cred()' function), that will clear the * non_rcu field, because now that other user may be * expecting RCU freeing. But normal thread-synchronous * cred accesses will keep things non-racy to avoid RCU * freeing. */ override_cred->non_rcu = 1; return override_creds(override_cred); } static int do_faccessat(int dfd, const char __user *filename, int mode, int flags) { struct path path; struct inode *inode; int res; unsigned int lookup_flags = LOOKUP_FOLLOW; const struct cred *old_cred = NULL; if (mode & ~S_IRWXO) /* where's F_OK, X_OK, W_OK, R_OK? */ return -EINVAL; if (flags & ~(AT_EACCESS | AT_SYMLINK_NOFOLLOW | AT_EMPTY_PATH)) return -EINVAL; if (flags & AT_SYMLINK_NOFOLLOW) lookup_flags &= ~LOOKUP_FOLLOW; if (flags & AT_EMPTY_PATH) lookup_flags |= LOOKUP_EMPTY; if (access_need_override_creds(flags)) { old_cred = access_override_creds(); if (!old_cred) return -ENOMEM; } retry: res = user_path_at(dfd, filename, lookup_flags, &path); if (res) goto out; inode = d_backing_inode(path.dentry); if ((mode & MAY_EXEC) && S_ISREG(inode->i_mode)) { /* * MAY_EXEC on regular files is denied if the fs is mounted * with the "noexec" flag. */ res = -EACCES; if (path_noexec(&path)) goto out_path_release; } res = inode_permission(mnt_idmap(path.mnt), inode, mode | MAY_ACCESS); /* SuS v2 requires we report a read only fs too */ if (res || !(mode & S_IWOTH) || special_file(inode->i_mode)) goto out_path_release; /* * This is a rare case where using __mnt_is_readonly() * is OK without a mnt_want/drop_write() pair. Since * no actual write to the fs is performed here, we do * not need to telegraph to that to anyone. * * By doing this, we accept that this access is * inherently racy and know that the fs may change * state before we even see this result. */ if (__mnt_is_readonly(path.mnt)) res = -EROFS; out_path_release: path_put(&path); if (retry_estale(res, lookup_flags)) { lookup_flags |= LOOKUP_REVAL; goto retry; } out: if (old_cred) put_cred(revert_creds(old_cred)); return res; } SYSCALL_DEFINE3(faccessat, int, dfd, const char __user *, filename, int, mode) { return do_faccessat(dfd, filename, mode, 0); } SYSCALL_DEFINE4(faccessat2, int, dfd, const char __user *, filename, int, mode, int, flags) { return do_faccessat(dfd, filename, mode, flags); } SYSCALL_DEFINE2(access, const char __user *, filename, int, mode) { return do_faccessat(AT_FDCWD, filename, mode, 0); } SYSCALL_DEFINE1(chdir, const char __user *, filename) { struct path path; int error; unsigned int lookup_flags = LOOKUP_FOLLOW | LOOKUP_DIRECTORY; retry: error = user_path_at(AT_FDCWD, filename, lookup_flags, &path); if (error) goto out; error = path_permission(&path, MAY_EXEC | MAY_CHDIR); if (error) goto dput_and_out; set_fs_pwd(current->fs, &path); dput_and_out: path_put(&path); if (retry_estale(error, lookup_flags)) { lookup_flags |= LOOKUP_REVAL; goto retry; } out: return error; } SYSCALL_DEFINE1(fchdir, unsigned int, fd) { CLASS(fd_raw, f)(fd); int error; if (fd_empty(f)) return -EBADF; if (!d_can_lookup(fd_file(f)->f_path.dentry)) return -ENOTDIR; error = file_permission(fd_file(f), MAY_EXEC | MAY_CHDIR); if (!error) set_fs_pwd(current->fs, &fd_file(f)->f_path); return error; } SYSCALL_DEFINE1(chroot, const char __user *, filename) { struct path path; int error; unsigned int lookup_flags = LOOKUP_FOLLOW | LOOKUP_DIRECTORY; retry: error = user_path_at(AT_FDCWD, filename, lookup_flags, &path); if (error) goto out; error = path_permission(&path, MAY_EXEC | MAY_CHDIR); if (error) goto dput_and_out; error = -EPERM; if (!ns_capable(current_user_ns(), CAP_SYS_CHROOT)) goto dput_and_out; error = security_path_chroot(&path); if (error) goto dput_and_out; set_fs_root(current->fs, &path); error = 0; dput_and_out: path_put(&path); if (retry_estale(error, lookup_flags)) { lookup_flags |= LOOKUP_REVAL; goto retry; } out: return error; } int chmod_common(const struct path *path, umode_t mode) { struct inode *inode = path->dentry->d_inode; struct inode *delegated_inode = NULL; struct iattr newattrs; int error; error = mnt_want_write(path->mnt); if (error) return error; retry_deleg: error = inode_lock_killable(inode); if (error) goto out_mnt_unlock; error = security_path_chmod(path, mode); if (error) goto out_unlock; newattrs.ia_mode = (mode & S_IALLUGO) | (inode->i_mode & ~S_IALLUGO); newattrs.ia_valid = ATTR_MODE | ATTR_CTIME; error = notify_change(mnt_idmap(path->mnt), path->dentry, &newattrs, &delegated_inode); out_unlock: inode_unlock(inode); if (delegated_inode) { error = break_deleg_wait(&delegated_inode); if (!error) goto retry_deleg; } out_mnt_unlock: mnt_drop_write(path->mnt); return error; } int vfs_fchmod(struct file *file, umode_t mode) { audit_file(file); return chmod_common(&file->f_path, mode); } SYSCALL_DEFINE2(fchmod, unsigned int, fd, umode_t, mode) { CLASS(fd, f)(fd); if (fd_empty(f)) return -EBADF; return vfs_fchmod(fd_file(f), mode); } static int do_fchmodat(int dfd, const char __user *filename, umode_t mode, unsigned int flags) { struct path path; int error; unsigned int lookup_flags; if (unlikely(flags & ~(AT_SYMLINK_NOFOLLOW | AT_EMPTY_PATH))) return -EINVAL; lookup_flags = (flags & AT_SYMLINK_NOFOLLOW) ? 0 : LOOKUP_FOLLOW; if (flags & AT_EMPTY_PATH) lookup_flags |= LOOKUP_EMPTY; retry: error = user_path_at(dfd, filename, lookup_flags, &path); if (!error) { error = chmod_common(&path, mode); path_put(&path); if (retry_estale(error, lookup_flags)) { lookup_flags |= LOOKUP_REVAL; goto retry; } } return error; } SYSCALL_DEFINE4(fchmodat2, int, dfd, const char __user *, filename, umode_t, mode, unsigned int, flags) { return do_fchmodat(dfd, filename, mode, flags); } SYSCALL_DEFINE3(fchmodat, int, dfd, const char __user *, filename, umode_t, mode) { return do_fchmodat(dfd, filename, mode, 0); } SYSCALL_DEFINE2(chmod, const char __user *, filename, umode_t, mode) { return do_fchmodat(AT_FDCWD, filename, mode, 0); } /* * Check whether @kuid is valid and if so generate and set vfsuid_t in * ia_vfsuid. * * Return: true if @kuid is valid, false if not. */ static inline bool setattr_vfsuid(struct iattr *attr, kuid_t kuid) { if (!uid_valid(kuid)) return false; attr->ia_valid |= ATTR_UID; attr->ia_vfsuid = VFSUIDT_INIT(kuid); return true; } /* * Check whether @kgid is valid and if so generate and set vfsgid_t in * ia_vfsgid. * * Return: true if @kgid is valid, false if not. */ static inline bool setattr_vfsgid(struct iattr *attr, kgid_t kgid) { if (!gid_valid(kgid)) return false; attr->ia_valid |= ATTR_GID; attr->ia_vfsgid = VFSGIDT_INIT(kgid); return true; } int chown_common(const struct path *path, uid_t user, gid_t group) { struct mnt_idmap *idmap; struct user_namespace *fs_userns; struct inode *inode = path->dentry->d_inode; struct inode *delegated_inode = NULL; int error; struct iattr newattrs; kuid_t uid; kgid_t gid; uid = make_kuid(current_user_ns(), user); gid = make_kgid(current_user_ns(), group); idmap = mnt_idmap(path->mnt); fs_userns = i_user_ns(inode); retry_deleg: newattrs.ia_vfsuid = INVALID_VFSUID; newattrs.ia_vfsgid = INVALID_VFSGID; newattrs.ia_valid = ATTR_CTIME; if ((user != (uid_t)-1) && !setattr_vfsuid(&newattrs, uid)) return -EINVAL; if ((group != (gid_t)-1) && !setattr_vfsgid(&newattrs, gid)) return -EINVAL; error = inode_lock_killable(inode); if (error) return error; if (!S_ISDIR(inode->i_mode)) newattrs.ia_valid |= ATTR_KILL_SUID | ATTR_KILL_PRIV | setattr_should_drop_sgid(idmap, inode); /* Continue to send actual fs values, not the mount values. */ error = security_path_chown( path, from_vfsuid(idmap, fs_userns, newattrs.ia_vfsuid), from_vfsgid(idmap, fs_userns, newattrs.ia_vfsgid)); if (!error) error = notify_change(idmap, path->dentry, &newattrs, &delegated_inode); inode_unlock(inode); if (delegated_inode) { error = break_deleg_wait(&delegated_inode); if (!error) goto retry_deleg; } return error; } int do_fchownat(int dfd, const char __user *filename, uid_t user, gid_t group, int flag) { struct path path; int error = -EINVAL; int lookup_flags; if ((flag & ~(AT_SYMLINK_NOFOLLOW | AT_EMPTY_PATH)) != 0) goto out; lookup_flags = (flag & AT_SYMLINK_NOFOLLOW) ? 0 : LOOKUP_FOLLOW; if (flag & AT_EMPTY_PATH) lookup_flags |= LOOKUP_EMPTY; retry: error = user_path_at(dfd, filename, lookup_flags, &path); if (error) goto out; error = mnt_want_write(path.mnt); if (error) goto out_release; error = chown_common(&path, user, group); mnt_drop_write(path.mnt); out_release: path_put(&path); if (retry_estale(error, lookup_flags)) { lookup_flags |= LOOKUP_REVAL; goto retry; } out: return error; } SYSCALL_DEFINE5(fchownat, int, dfd, const char __user *, filename, uid_t, user, gid_t, group, int, flag) { return do_fchownat(dfd, filename, user, group, flag); } SYSCALL_DEFINE3(chown, const char __user *, filename, uid_t, user, gid_t, group) { return do_fchownat(AT_FDCWD, filename, user, group, 0); } SYSCALL_DEFINE3(lchown, const char __user *, filename, uid_t, user, gid_t, group) { return do_fchownat(AT_FDCWD, filename, user, group, AT_SYMLINK_NOFOLLOW); } int vfs_fchown(struct file *file, uid_t user, gid_t group) { int error; error = mnt_want_write_file(file); if (error) return error; audit_file(file); error = chown_common(&file->f_path, user, group); mnt_drop_write_file(file); return error; } int ksys_fchown(unsigned int fd, uid_t user, gid_t group) { CLASS(fd, f)(fd); if (fd_empty(f)) return -EBADF; return vfs_fchown(fd_file(f), user, group); } SYSCALL_DEFINE3(fchown, unsigned int, fd, uid_t, user, gid_t, group) { return ksys_fchown(fd, user, group); } static inline int file_get_write_access(struct file *f) { int error; error = get_write_access(f->f_inode); if (unlikely(error)) return error; error = mnt_get_write_access(f->f_path.mnt); if (unlikely(error)) goto cleanup_inode; if (unlikely(f->f_mode & FMODE_BACKING)) { error = mnt_get_write_access(backing_file_user_path(f)->mnt); if (unlikely(error)) goto cleanup_mnt; } return 0; cleanup_mnt: mnt_put_write_access(f->f_path.mnt); cleanup_inode: put_write_access(f->f_inode); return error; } static int do_dentry_open(struct file *f, int (*open)(struct inode *, struct file *)) { static const struct file_operations empty_fops = {}; struct inode *inode = f->f_path.dentry->d_inode; int error; path_get(&f->f_path); f->f_inode = inode; f->f_mapping = inode->i_mapping; f->f_wb_err = filemap_sample_wb_err(f->f_mapping); f->f_sb_err = file_sample_sb_err(f); if (unlikely(f->f_flags & O_PATH)) { f->f_mode = FMODE_PATH | FMODE_OPENED; file_set_fsnotify_mode(f, FMODE_NONOTIFY); f->f_op = &empty_fops; return 0; } if ((f->f_mode & (FMODE_READ | FMODE_WRITE)) == FMODE_READ) { i_readcount_inc(inode); } else if (f->f_mode & FMODE_WRITE && !special_file(inode->i_mode)) { error = file_get_write_access(f); if (unlikely(error)) goto cleanup_file; f->f_mode |= FMODE_WRITER; } /* POSIX.1-2008/SUSv4 Section XSI 2.9.7 */ if (S_ISREG(inode->i_mode) || S_ISDIR(inode->i_mode)) f->f_mode |= FMODE_ATOMIC_POS; f->f_op = fops_get(inode->i_fop); if (WARN_ON(!f->f_op)) { error = -ENODEV; goto cleanup_all; } error = security_file_open(f); if (error) goto cleanup_all; /* * Set FMODE_NONOTIFY_* bits according to existing permission watches. * If FMODE_NONOTIFY mode was already set for an fanotify fd or for a * pseudo file, this call will not change the mode. */ file_set_fsnotify_mode_from_watchers(f); error = fsnotify_open_perm(f); if (error) goto cleanup_all; error = break_lease(file_inode(f), f->f_flags); if (error) goto cleanup_all; /* normally all 3 are set; ->open() can clear them if needed */ f->f_mode |= FMODE_LSEEK | FMODE_PREAD | FMODE_PWRITE; if (!open) open = f->f_op->open; if (open) { error = open(inode, f); if (error) goto cleanup_all; } f->f_mode |= FMODE_OPENED; if ((f->f_mode & FMODE_READ) && likely(f->f_op->read || f->f_op->read_iter)) f->f_mode |= FMODE_CAN_READ; if ((f->f_mode & FMODE_WRITE) && likely(f->f_op->write || f->f_op->write_iter)) f->f_mode |= FMODE_CAN_WRITE; if ((f->f_mode & FMODE_LSEEK) && !f->f_op->llseek) f->f_mode &= ~FMODE_LSEEK; if (f->f_mapping->a_ops && f->f_mapping->a_ops->direct_IO) f->f_mode |= FMODE_CAN_ODIRECT; f->f_flags &= ~(O_CREAT | O_EXCL | O_NOCTTY | O_TRUNC); f->f_iocb_flags = iocb_flags(f); file_ra_state_init(&f->f_ra, f->f_mapping->host->i_mapping); if ((f->f_flags & O_DIRECT) && !(f->f_mode & FMODE_CAN_ODIRECT)) return -EINVAL; /* * XXX: Huge page cache doesn't support writing yet. Drop all page * cache for this file before processing writes. */ if (f->f_mode & FMODE_WRITE) { /* * Depends on full fence from get_write_access() to synchronize * against collapse_file() regarding i_writecount and nr_thps * updates. Ensures subsequent insertion of THPs into the page * cache will fail. */ if (filemap_nr_thps(inode->i_mapping)) { struct address_space *mapping = inode->i_mapping; filemap_invalidate_lock(inode->i_mapping); /* * unmap_mapping_range just need to be called once * here, because the private pages is not need to be * unmapped mapping (e.g. data segment of dynamic * shared libraries here). */ unmap_mapping_range(mapping, 0, 0, 0); truncate_inode_pages(mapping, 0); filemap_invalidate_unlock(inode->i_mapping); } } return 0; cleanup_all: if (WARN_ON_ONCE(error > 0)) error = -EINVAL; fops_put(f->f_op); put_file_access(f); cleanup_file: path_put(&f->f_path); f->f_path.mnt = NULL; f->f_path.dentry = NULL; f->f_inode = NULL; return error; } /** * finish_open - finish opening a file * @file: file pointer * @dentry: pointer to dentry * @open: open callback * * This can be used to finish opening a file passed to i_op->atomic_open(). * * If the open callback is set to NULL, then the standard f_op->open() * filesystem callback is substituted. * * NB: the dentry reference is _not_ consumed. If, for example, the dentry is * the return value of d_splice_alias(), then the caller needs to perform dput() * on it after finish_open(). * * Returns zero on success or -errno if the open failed. */ int finish_open(struct file *file, struct dentry *dentry, int (*open)(struct inode *, struct file *)) { BUG_ON(file->f_mode & FMODE_OPENED); /* once it's opened, it's opened */ file->f_path.dentry = dentry; return do_dentry_open(file, open); } EXPORT_SYMBOL(finish_open); /** * finish_no_open - finish ->atomic_open() without opening the file * * @file: file pointer * @dentry: dentry or NULL (as returned from ->lookup()) * * This can be used to set the result of a successful lookup in ->atomic_open(). * * NB: unlike finish_open() this function does consume the dentry reference and * the caller need not dput() it. * * Returns "0" which must be the return value of ->atomic_open() after having * called this function. */ int finish_no_open(struct file *file, struct dentry *dentry) { file->f_path.dentry = dentry; return 0; } EXPORT_SYMBOL(finish_no_open); char *file_path(struct file *filp, char *buf, int buflen) { return d_path(&filp->f_path, buf, buflen); } EXPORT_SYMBOL(file_path); /** * vfs_open - open the file at the given path * @path: path to open * @file: newly allocated file with f_flag initialized */ int vfs_open(const struct path *path, struct file *file) { int ret; file->f_path = *path; ret = do_dentry_open(file, NULL); if (!ret) { /* * Once we return a file with FMODE_OPENED, __fput() will call * fsnotify_close(), so we need fsnotify_open() here for * symmetry. */ fsnotify_open(file); } return ret; } struct file *dentry_open(const struct path *path, int flags, const struct cred *cred) { int error; struct file *f; /* We must always pass in a valid mount pointer. */ BUG_ON(!path->mnt); f = alloc_empty_file(flags, cred); if (!IS_ERR(f)) { error = vfs_open(path, f); if (error) { fput(f); f = ERR_PTR(error); } } return f; } EXPORT_SYMBOL(dentry_open); struct file *dentry_open_nonotify(const struct path *path, int flags, const struct cred *cred) { struct file *f = alloc_empty_file(flags, cred); if (!IS_ERR(f)) { int error; file_set_fsnotify_mode(f, FMODE_NONOTIFY); error = vfs_open(path, f); if (error) { fput(f); f = ERR_PTR(error); } } return f; } /** * dentry_create - Create and open a file * @path: path to create * @flags: O_ flags * @mode: mode bits for new file * @cred: credentials to use * * Caller must hold the parent directory's lock, and have prepared * a negative dentry, placed in @path->dentry, for the new file. * * Caller sets @path->mnt to the vfsmount of the filesystem where * the new file is to be created. The parent directory and the * negative dentry must reside on the same filesystem instance. * * On success, returns a "struct file *". Otherwise a ERR_PTR * is returned. */ struct file *dentry_create(const struct path *path, int flags, umode_t mode, const struct cred *cred) { struct file *f; int error; f = alloc_empty_file(flags, cred); if (IS_ERR(f)) return f; error = vfs_create(mnt_idmap(path->mnt), d_inode(path->dentry->d_parent), path->dentry, mode, true); if (!error) error = vfs_open(path, f); if (unlikely(error)) { fput(f); return ERR_PTR(error); } return f; } EXPORT_SYMBOL(dentry_create); /** * kernel_file_open - open a file for kernel internal use * @path: path of the file to open * @flags: open flags * @cred: credentials for open * * Open a file for use by in-kernel consumers. The file is not accounted * against nr_files and must not be installed into the file descriptor * table. * * Return: Opened file on success, an error pointer on failure. */ struct file *kernel_file_open(const struct path *path, int flags, const struct cred *cred) { struct file *f; int error; f = alloc_empty_file_noaccount(flags, cred); if (IS_ERR(f)) return f; f->f_path = *path; error = do_dentry_open(f, NULL); if (error) { fput(f); return ERR_PTR(error); } fsnotify_open(f); return f; } EXPORT_SYMBOL_GPL(kernel_file_open); #define WILL_CREATE(flags) (flags & (O_CREAT | __O_TMPFILE)) #define O_PATH_FLAGS (O_DIRECTORY | O_NOFOLLOW | O_PATH | O_CLOEXEC) inline struct open_how build_open_how(int flags, umode_t mode) { struct open_how how = { .flags = flags & VALID_OPEN_FLAGS, .mode = mode & S_IALLUGO, }; /* O_PATH beats everything else. */ if (how.flags & O_PATH) how.flags &= O_PATH_FLAGS; /* Modes should only be set for create-like flags. */ if (!WILL_CREATE(how.flags)) how.mode = 0; return how; } inline int build_open_flags(const struct open_how *how, struct open_flags *op) { u64 flags = how->flags; u64 strip = O_CLOEXEC; int lookup_flags = 0; int acc_mode = ACC_MODE(flags); BUILD_BUG_ON_MSG(upper_32_bits(VALID_OPEN_FLAGS), "struct open_flags doesn't yet handle flags > 32 bits"); /* * Strip flags that aren't relevant in determining struct open_flags. */ flags &= ~strip; /* * Older syscalls implicitly clear all of the invalid flags or argument * values before calling build_open_flags(), but openat2(2) checks all * of its arguments. */ if (flags & ~VALID_OPEN_FLAGS) return -EINVAL; if (how->resolve & ~VALID_RESOLVE_FLAGS) return -EINVAL; /* Scoping flags are mutually exclusive. */ if ((how->resolve & RESOLVE_BENEATH) && (how->resolve & RESOLVE_IN_ROOT)) return -EINVAL; /* Deal with the mode. */ if (WILL_CREATE(flags)) { if (how->mode & ~S_IALLUGO) return -EINVAL; op->mode = how->mode | S_IFREG; } else { if (how->mode != 0) return -EINVAL; op->mode = 0; } /* * Block bugs where O_DIRECTORY | O_CREAT created regular files. * Note, that blocking O_DIRECTORY | O_CREAT here also protects * O_TMPFILE below which requires O_DIRECTORY being raised. */ if ((flags & (O_DIRECTORY | O_CREAT)) == (O_DIRECTORY | O_CREAT)) return -EINVAL; /* Now handle the creative implementation of O_TMPFILE. */ if (flags & __O_TMPFILE) { /* * In order to ensure programs get explicit errors when trying * to use O_TMPFILE on old kernels we enforce that O_DIRECTORY * is raised alongside __O_TMPFILE. */ if (!(flags & O_DIRECTORY)) return -EINVAL; if (!(acc_mode & MAY_WRITE)) return -EINVAL; } if (flags & O_PATH) { /* O_PATH only permits certain other flags to be set. */ if (flags & ~O_PATH_FLAGS) return -EINVAL; acc_mode = 0; } /* * O_SYNC is implemented as __O_SYNC|O_DSYNC. As many places only * check for O_DSYNC if the need any syncing at all we enforce it's * always set instead of having to deal with possibly weird behaviour * for malicious applications setting only __O_SYNC. */ if (flags & __O_SYNC) flags |= O_DSYNC; op->open_flag = flags; /* O_TRUNC implies we need access checks for write permissions */ if (flags & O_TRUNC) acc_mode |= MAY_WRITE; /* Allow the LSM permission hook to distinguish append access from general write access. */ if (flags & O_APPEND) acc_mode |= MAY_APPEND; op->acc_mode = acc_mode; op->intent = flags & O_PATH ? 0 : LOOKUP_OPEN; if (flags & O_CREAT) { op->intent |= LOOKUP_CREATE; if (flags & O_EXCL) { op->intent |= LOOKUP_EXCL; flags |= O_NOFOLLOW; } } if (flags & O_DIRECTORY) lookup_flags |= LOOKUP_DIRECTORY; if (!(flags & O_NOFOLLOW)) lookup_flags |= LOOKUP_FOLLOW; if (how->resolve & RESOLVE_NO_XDEV) lookup_flags |= LOOKUP_NO_XDEV; if (how->resolve & RESOLVE_NO_MAGICLINKS) lookup_flags |= LOOKUP_NO_MAGICLINKS; if (how->resolve & RESOLVE_NO_SYMLINKS) lookup_flags |= LOOKUP_NO_SYMLINKS; if (how->resolve & RESOLVE_BENEATH) lookup_flags |= LOOKUP_BENEATH; if (how->resolve & RESOLVE_IN_ROOT) lookup_flags |= LOOKUP_IN_ROOT; if (how->resolve & RESOLVE_CACHED) { /* Don't bother even trying for create/truncate/tmpfile open */ if (flags & (O_TRUNC | O_CREAT | __O_TMPFILE)) return -EAGAIN; lookup_flags |= LOOKUP_CACHED; } op->lookup_flags = lookup_flags; return 0; } /** * file_open_name - open file and return file pointer * * @name: struct filename containing path to open * @flags: open flags as per the open(2) second argument * @mode: mode for the new file if O_CREAT is set, else ignored * * This is the helper to open a file from kernelspace if you really * have to. But in generally you should not do this, so please move * along, nothing to see here.. */ struct file *file_open_name(struct filename *name, int flags, umode_t mode) { struct open_flags op; struct open_how how = build_open_how(flags, mode); int err = build_open_flags(&how, &op); if (err) return ERR_PTR(err); return do_filp_open(AT_FDCWD, name, &op); } /** * filp_open - open file and return file pointer * * @filename: path to open * @flags: open flags as per the open(2) second argument * @mode: mode for the new file if O_CREAT is set, else ignored * * This is the helper to open a file from kernelspace if you really * have to. But in generally you should not do this, so please move * along, nothing to see here.. */ struct file *filp_open(const char *filename, int flags, umode_t mode) { struct filename *name = getname_kernel(filename); struct file *file = ERR_CAST(name); if (!IS_ERR(name)) { file = file_open_name(name, flags, mode); putname(name); } return file; } EXPORT_SYMBOL(filp_open); struct file *file_open_root(const struct path *root, const char *filename, int flags, umode_t mode) { struct open_flags op; struct open_how how = build_open_how(flags, mode); int err = build_open_flags(&how, &op); if (err) return ERR_PTR(err); return do_file_open_root(root, filename, &op); } EXPORT_SYMBOL(file_open_root); static int do_sys_openat2(int dfd, const char __user *filename, struct open_how *how) { struct open_flags op; struct filename *tmp; int err, fd; err = build_open_flags(how, &op); if (unlikely(err)) return err; tmp = getname(filename); if (IS_ERR(tmp)) return PTR_ERR(tmp); fd = get_unused_fd_flags(how->flags); if (likely(fd >= 0)) { struct file *f = do_filp_open(dfd, tmp, &op); if (IS_ERR(f)) { put_unused_fd(fd); fd = PTR_ERR(f); } else { fd_install(fd, f); } } putname(tmp); return fd; } int do_sys_open(int dfd, const char __user *filename, int flags, umode_t mode) { struct open_how how = build_open_how(flags, mode); return do_sys_openat2(dfd, filename, &how); } SYSCALL_DEFINE3(open, const char __user *, filename, int, flags, umode_t, mode) { if (force_o_largefile()) flags |= O_LARGEFILE; return do_sys_open(AT_FDCWD, filename, flags, mode); } SYSCALL_DEFINE4(openat, int, dfd, const char __user *, filename, int, flags, umode_t, mode) { if (force_o_largefile()) flags |= O_LARGEFILE; return do_sys_open(dfd, filename, flags, mode); } SYSCALL_DEFINE4(openat2, int, dfd, const char __user *, filename, struct open_how __user *, how, size_t, usize) { int err; struct open_how tmp; BUILD_BUG_ON(sizeof(struct open_how) < OPEN_HOW_SIZE_VER0); BUILD_BUG_ON(sizeof(struct open_how) != OPEN_HOW_SIZE_LATEST); if (unlikely(usize < OPEN_HOW_SIZE_VER0)) return -EINVAL; if (unlikely(usize > PAGE_SIZE)) return -E2BIG; err = copy_struct_from_user(&tmp, sizeof(tmp), how, usize); if (err) return err; audit_openat2_how(&tmp); /* O_LARGEFILE is only allowed for non-O_PATH. */ if (!(tmp.flags & O_PATH) && force_o_largefile()) tmp.flags |= O_LARGEFILE; return do_sys_openat2(dfd, filename, &tmp); } #ifdef CONFIG_COMPAT /* * Exactly like sys_open(), except that it doesn't set the * O_LARGEFILE flag. */ COMPAT_SYSCALL_DEFINE3(open, const char __user *, filename, int, flags, umode_t, mode) { return do_sys_open(AT_FDCWD, filename, flags, mode); } /* * Exactly like sys_openat(), except that it doesn't set the * O_LARGEFILE flag. */ COMPAT_SYSCALL_DEFINE4(openat, int, dfd, const char __user *, filename, int, flags, umode_t, mode) { return do_sys_open(dfd, filename, flags, mode); } #endif #ifndef __alpha__ /* * For backward compatibility? Maybe this should be moved * into arch/i386 instead? */ SYSCALL_DEFINE2(creat, const char __user *, pathname, umode_t, mode) { int flags = O_CREAT | O_WRONLY | O_TRUNC; if (force_o_largefile()) flags |= O_LARGEFILE; return do_sys_open(AT_FDCWD, pathname, flags, mode); } #endif /* * "id" is the POSIX thread ID. We use the * files pointer for this.. */ static int filp_flush(struct file *filp, fl_owner_t id) { int retval = 0; if (CHECK_DATA_CORRUPTION(file_count(filp) == 0, filp, "VFS: Close: file count is 0 (f_op=%ps)", filp->f_op)) { return 0; } if (filp->f_op->flush) retval = filp->f_op->flush(filp, id); if (likely(!(filp->f_mode & FMODE_PATH))) { dnotify_flush(filp, id); locks_remove_posix(filp, id); } return retval; } int filp_close(struct file *filp, fl_owner_t id) { int retval; retval = filp_flush(filp, id); fput_close(filp); return retval; } EXPORT_SYMBOL(filp_close); /* * Careful here! We test whether the file pointer is NULL before * releasing the fd. This ensures that one clone task can't release * an fd while another clone is opening it. */ SYSCALL_DEFINE1(close, unsigned int, fd) { int retval; struct file *file; file = file_close_fd(fd); if (!file) return -EBADF; retval = filp_flush(file, current->files); /* * We're returning to user space. Don't bother * with any delayed fput() cases. */ fput_close_sync(file); if (likely(retval == 0)) return 0; /* can't restart close syscall because file table entry was cleared */ if (retval == -ERESTARTSYS || retval == -ERESTARTNOINTR || retval == -ERESTARTNOHAND || retval == -ERESTART_RESTARTBLOCK) retval = -EINTR; return retval; } /* * This routine simulates a hangup on the tty, to arrange that users * are given clean terminals at login time. */ SYSCALL_DEFINE0(vhangup) { if (capable(CAP_SYS_TTY_CONFIG)) { tty_vhangup_self(); return 0; } return -EPERM; } /* * Called when an inode is about to be open. * We use this to disallow opening large files on 32bit systems if * the caller didn't specify O_LARGEFILE. On 64bit systems we force * on this flag in sys_open. */ int generic_file_open(struct inode * inode, struct file * filp) { if (!(filp->f_flags & O_LARGEFILE) && i_size_read(inode) > MAX_NON_LFS) return -EOVERFLOW; return 0; } EXPORT_SYMBOL(generic_file_open); /* * This is used by subsystems that don't want seekable * file descriptors. The function is not supposed to ever fail, the only * reason it returns an 'int' and not 'void' is so that it can be plugged * directly into file_operations structure. */ int nonseekable_open(struct inode *inode, struct file *filp) { filp->f_mode &= ~(FMODE_LSEEK | FMODE_PREAD | FMODE_PWRITE); return 0; } EXPORT_SYMBOL(nonseekable_open); /* * stream_open is used by subsystems that want stream-like file descriptors. * Such file descriptors are not seekable and don't have notion of position * (file.f_pos is always 0 and ppos passed to .read()/.write() is always NULL). * Contrary to file descriptors of other regular files, .read() and .write() * can run simultaneously. * * stream_open never fails and is marked to return int so that it could be * directly used as file_operations.open . */ int stream_open(struct inode *inode, struct file *filp) { filp->f_mode &= ~(FMODE_LSEEK | FMODE_PREAD | FMODE_PWRITE | FMODE_ATOMIC_POS); filp->f_mode |= FMODE_STREAM; return 0; } EXPORT_SYMBOL(stream_open); |
| 178 100 15 172 87 3 10 5 3 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 | // SPDX-License-Identifier: GPL-2.0-or-later /* SCTP kernel implementation * (C) Copyright IBM Corp. 2001, 2004 * Copyright (c) 1999-2000 Cisco, Inc. * Copyright (c) 1999-2001 Motorola, Inc. * Copyright (c) 2001 Intel Corp. * Copyright (c) 2001 Nokia, Inc. * * This file is part of the SCTP kernel implementation * * These are the state tables for the SCTP state machine. * * Please send any bug reports or fixes you make to the * email address(es): * lksctp developers <linux-sctp@vger.kernel.org> * * Written or modified by: * La Monte H.P. Yarroll <piggy@acm.org> * Karl Knutson <karl@athena.chicago.il.us> * Jon Grimm <jgrimm@us.ibm.com> * Hui Huang <hui.huang@nokia.com> * Daisy Chang <daisyc@us.ibm.com> * Ardelle Fan <ardelle.fan@intel.com> * Sridhar Samudrala <sri@us.ibm.com> */ #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt #include <linux/skbuff.h> #include <net/sctp/sctp.h> #include <net/sctp/sm.h> static const struct sctp_sm_table_entry primitive_event_table[SCTP_NUM_PRIMITIVE_TYPES][SCTP_STATE_NUM_STATES]; static const struct sctp_sm_table_entry other_event_table[SCTP_NUM_OTHER_TYPES][SCTP_STATE_NUM_STATES]; static const struct sctp_sm_table_entry timeout_event_table[SCTP_NUM_TIMEOUT_TYPES][SCTP_STATE_NUM_STATES]; static const struct sctp_sm_table_entry *sctp_chunk_event_lookup( struct net *net, enum sctp_cid cid, enum sctp_state state); static const struct sctp_sm_table_entry bug = { .fn = sctp_sf_bug, .name = "sctp_sf_bug" }; #define DO_LOOKUP(_max, _type, _table) \ ({ \ const struct sctp_sm_table_entry *rtn; \ \ if ((event_subtype._type > (_max))) { \ pr_warn("table %p possible attack: event %d exceeds max %d\n", \ _table, event_subtype._type, _max); \ rtn = &bug; \ } else \ rtn = &_table[event_subtype._type][(int)state]; \ \ rtn; \ }) const struct sctp_sm_table_entry *sctp_sm_lookup_event( struct net *net, enum sctp_event_type event_type, enum sctp_state state, union sctp_subtype event_subtype) { switch (event_type) { case SCTP_EVENT_T_CHUNK: return sctp_chunk_event_lookup(net, event_subtype.chunk, state); case SCTP_EVENT_T_TIMEOUT: return DO_LOOKUP(SCTP_EVENT_TIMEOUT_MAX, timeout, timeout_event_table); case SCTP_EVENT_T_OTHER: return DO_LOOKUP(SCTP_EVENT_OTHER_MAX, other, other_event_table); case SCTP_EVENT_T_PRIMITIVE: return DO_LOOKUP(SCTP_EVENT_PRIMITIVE_MAX, primitive, primitive_event_table); default: /* Yikes! We got an illegal event type. */ return &bug; } } #define TYPE_SCTP_FUNC(func) {.fn = func, .name = #func} #define TYPE_SCTP_DATA { \ /* SCTP_STATE_CLOSED */ \ TYPE_SCTP_FUNC(sctp_sf_ootb), \ /* SCTP_STATE_COOKIE_WAIT */ \ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \ /* SCTP_STATE_COOKIE_ECHOED */ \ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \ /* SCTP_STATE_ESTABLISHED */ \ TYPE_SCTP_FUNC(sctp_sf_eat_data_6_2), \ /* SCTP_STATE_SHUTDOWN_PENDING */ \ TYPE_SCTP_FUNC(sctp_sf_eat_data_6_2), \ /* SCTP_STATE_SHUTDOWN_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_eat_data_fast_4_4), \ /* SCTP_STATE_SHUTDOWN_RECEIVED */ \ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \ /* SCTP_STATE_SHUTDOWN_ACK_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \ } /* TYPE_SCTP_DATA */ #define TYPE_SCTP_INIT { \ /* SCTP_STATE_CLOSED */ \ TYPE_SCTP_FUNC(sctp_sf_do_5_1B_init), \ /* SCTP_STATE_COOKIE_WAIT */ \ TYPE_SCTP_FUNC(sctp_sf_do_5_2_1_siminit), \ /* SCTP_STATE_COOKIE_ECHOED */ \ TYPE_SCTP_FUNC(sctp_sf_do_5_2_1_siminit), \ /* SCTP_STATE_ESTABLISHED */ \ TYPE_SCTP_FUNC(sctp_sf_do_5_2_2_dupinit), \ /* SCTP_STATE_SHUTDOWN_PENDING */ \ TYPE_SCTP_FUNC(sctp_sf_do_5_2_2_dupinit), \ /* SCTP_STATE_SHUTDOWN_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_do_5_2_2_dupinit), \ /* SCTP_STATE_SHUTDOWN_RECEIVED */ \ TYPE_SCTP_FUNC(sctp_sf_do_5_2_2_dupinit), \ /* SCTP_STATE_SHUTDOWN_ACK_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_do_9_2_reshutack), \ } /* TYPE_SCTP_INIT */ #define TYPE_SCTP_INIT_ACK { \ /* SCTP_STATE_CLOSED */ \ TYPE_SCTP_FUNC(sctp_sf_do_5_2_3_initack), \ /* SCTP_STATE_COOKIE_WAIT */ \ TYPE_SCTP_FUNC(sctp_sf_do_5_1C_ack), \ /* SCTP_STATE_COOKIE_ECHOED */ \ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \ /* SCTP_STATE_ESTABLISHED */ \ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \ /* SCTP_STATE_SHUTDOWN_PENDING */ \ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \ /* SCTP_STATE_SHUTDOWN_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \ /* SCTP_STATE_SHUTDOWN_RECEIVED */ \ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \ /* SCTP_STATE_SHUTDOWN_ACK_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \ } /* TYPE_SCTP_INIT_ACK */ #define TYPE_SCTP_SACK { \ /* SCTP_STATE_CLOSED */ \ TYPE_SCTP_FUNC(sctp_sf_ootb), \ /* SCTP_STATE_COOKIE_WAIT */ \ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \ /* SCTP_STATE_COOKIE_ECHOED */ \ TYPE_SCTP_FUNC(sctp_sf_eat_sack_6_2), \ /* SCTP_STATE_ESTABLISHED */ \ TYPE_SCTP_FUNC(sctp_sf_eat_sack_6_2), \ /* SCTP_STATE_SHUTDOWN_PENDING */ \ TYPE_SCTP_FUNC(sctp_sf_eat_sack_6_2), \ /* SCTP_STATE_SHUTDOWN_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \ /* SCTP_STATE_SHUTDOWN_RECEIVED */ \ TYPE_SCTP_FUNC(sctp_sf_eat_sack_6_2), \ /* SCTP_STATE_SHUTDOWN_ACK_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \ } /* TYPE_SCTP_SACK */ #define TYPE_SCTP_HEARTBEAT { \ /* SCTP_STATE_CLOSED */ \ TYPE_SCTP_FUNC(sctp_sf_ootb), \ /* SCTP_STATE_COOKIE_WAIT */ \ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \ /* SCTP_STATE_COOKIE_ECHOED */ \ TYPE_SCTP_FUNC(sctp_sf_beat_8_3), \ /* SCTP_STATE_ESTABLISHED */ \ TYPE_SCTP_FUNC(sctp_sf_beat_8_3), \ /* SCTP_STATE_SHUTDOWN_PENDING */ \ TYPE_SCTP_FUNC(sctp_sf_beat_8_3), \ /* SCTP_STATE_SHUTDOWN_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_beat_8_3), \ /* SCTP_STATE_SHUTDOWN_RECEIVED */ \ TYPE_SCTP_FUNC(sctp_sf_beat_8_3), \ /* SCTP_STATE_SHUTDOWN_ACK_SENT */ \ /* This should not happen, but we are nice. */ \ TYPE_SCTP_FUNC(sctp_sf_beat_8_3), \ } /* TYPE_SCTP_HEARTBEAT */ #define TYPE_SCTP_HEARTBEAT_ACK { \ /* SCTP_STATE_CLOSED */ \ TYPE_SCTP_FUNC(sctp_sf_ootb), \ /* SCTP_STATE_COOKIE_WAIT */ \ TYPE_SCTP_FUNC(sctp_sf_violation), \ /* SCTP_STATE_COOKIE_ECHOED */ \ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \ /* SCTP_STATE_ESTABLISHED */ \ TYPE_SCTP_FUNC(sctp_sf_backbeat_8_3), \ /* SCTP_STATE_SHUTDOWN_PENDING */ \ TYPE_SCTP_FUNC(sctp_sf_backbeat_8_3), \ /* SCTP_STATE_SHUTDOWN_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_backbeat_8_3), \ /* SCTP_STATE_SHUTDOWN_RECEIVED */ \ TYPE_SCTP_FUNC(sctp_sf_backbeat_8_3), \ /* SCTP_STATE_SHUTDOWN_ACK_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \ } /* TYPE_SCTP_HEARTBEAT_ACK */ #define TYPE_SCTP_ABORT { \ /* SCTP_STATE_CLOSED */ \ TYPE_SCTP_FUNC(sctp_sf_pdiscard), \ /* SCTP_STATE_COOKIE_WAIT */ \ TYPE_SCTP_FUNC(sctp_sf_cookie_wait_abort), \ /* SCTP_STATE_COOKIE_ECHOED */ \ TYPE_SCTP_FUNC(sctp_sf_cookie_echoed_abort), \ /* SCTP_STATE_ESTABLISHED */ \ TYPE_SCTP_FUNC(sctp_sf_do_9_1_abort), \ /* SCTP_STATE_SHUTDOWN_PENDING */ \ TYPE_SCTP_FUNC(sctp_sf_shutdown_pending_abort), \ /* SCTP_STATE_SHUTDOWN_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_shutdown_sent_abort), \ /* SCTP_STATE_SHUTDOWN_RECEIVED */ \ TYPE_SCTP_FUNC(sctp_sf_do_9_1_abort), \ /* SCTP_STATE_SHUTDOWN_ACK_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_shutdown_ack_sent_abort), \ } /* TYPE_SCTP_ABORT */ #define TYPE_SCTP_SHUTDOWN { \ /* SCTP_STATE_CLOSED */ \ TYPE_SCTP_FUNC(sctp_sf_ootb), \ /* SCTP_STATE_COOKIE_WAIT */ \ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \ /* SCTP_STATE_COOKIE_ECHOED */ \ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \ /* SCTP_STATE_ESTABLISHED */ \ TYPE_SCTP_FUNC(sctp_sf_do_9_2_shutdown), \ /* SCTP_STATE_SHUTDOWN_PENDING */ \ TYPE_SCTP_FUNC(sctp_sf_do_9_2_shutdown), \ /* SCTP_STATE_SHUTDOWN_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_do_9_2_shutdown_ack), \ /* SCTP_STATE_SHUTDOWN_RECEIVED */ \ TYPE_SCTP_FUNC(sctp_sf_do_9_2_shut_ctsn), \ /* SCTP_STATE_SHUTDOWN_ACK_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \ } /* TYPE_SCTP_SHUTDOWN */ #define TYPE_SCTP_SHUTDOWN_ACK { \ /* SCTP_STATE_CLOSED */ \ TYPE_SCTP_FUNC(sctp_sf_ootb), \ /* SCTP_STATE_COOKIE_WAIT */ \ TYPE_SCTP_FUNC(sctp_sf_do_8_5_1_E_sa), \ /* SCTP_STATE_COOKIE_ECHOED */ \ TYPE_SCTP_FUNC(sctp_sf_do_8_5_1_E_sa), \ /* SCTP_STATE_ESTABLISHED */ \ TYPE_SCTP_FUNC(sctp_sf_violation), \ /* SCTP_STATE_SHUTDOWN_PENDING */ \ TYPE_SCTP_FUNC(sctp_sf_violation), \ /* SCTP_STATE_SHUTDOWN_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_do_9_2_final), \ /* SCTP_STATE_SHUTDOWN_RECEIVED */ \ TYPE_SCTP_FUNC(sctp_sf_violation), \ /* SCTP_STATE_SHUTDOWN_ACK_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_do_9_2_final), \ } /* TYPE_SCTP_SHUTDOWN_ACK */ #define TYPE_SCTP_ERROR { \ /* SCTP_STATE_CLOSED */ \ TYPE_SCTP_FUNC(sctp_sf_ootb), \ /* SCTP_STATE_COOKIE_WAIT */ \ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \ /* SCTP_STATE_COOKIE_ECHOED */ \ TYPE_SCTP_FUNC(sctp_sf_cookie_echoed_err), \ /* SCTP_STATE_ESTABLISHED */ \ TYPE_SCTP_FUNC(sctp_sf_operr_notify), \ /* SCTP_STATE_SHUTDOWN_PENDING */ \ TYPE_SCTP_FUNC(sctp_sf_operr_notify), \ /* SCTP_STATE_SHUTDOWN_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \ /* SCTP_STATE_SHUTDOWN_RECEIVED */ \ TYPE_SCTP_FUNC(sctp_sf_operr_notify), \ /* SCTP_STATE_SHUTDOWN_ACK_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \ } /* TYPE_SCTP_ERROR */ #define TYPE_SCTP_COOKIE_ECHO { \ /* SCTP_STATE_CLOSED */ \ TYPE_SCTP_FUNC(sctp_sf_do_5_1D_ce), \ /* SCTP_STATE_COOKIE_WAIT */ \ TYPE_SCTP_FUNC(sctp_sf_do_5_2_4_dupcook), \ /* SCTP_STATE_COOKIE_ECHOED */ \ TYPE_SCTP_FUNC(sctp_sf_do_5_2_4_dupcook), \ /* SCTP_STATE_ESTABLISHED */ \ TYPE_SCTP_FUNC(sctp_sf_do_5_2_4_dupcook), \ /* SCTP_STATE_SHUTDOWN_PENDING */ \ TYPE_SCTP_FUNC(sctp_sf_do_5_2_4_dupcook), \ /* SCTP_STATE_SHUTDOWN_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_do_5_2_4_dupcook), \ /* SCTP_STATE_SHUTDOWN_RECEIVED */ \ TYPE_SCTP_FUNC(sctp_sf_do_5_2_4_dupcook), \ /* SCTP_STATE_SHUTDOWN_ACK_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_do_5_2_4_dupcook), \ } /* TYPE_SCTP_COOKIE_ECHO */ #define TYPE_SCTP_COOKIE_ACK { \ /* SCTP_STATE_CLOSED */ \ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \ /* SCTP_STATE_COOKIE_WAIT */ \ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \ /* SCTP_STATE_COOKIE_ECHOED */ \ TYPE_SCTP_FUNC(sctp_sf_do_5_1E_ca), \ /* SCTP_STATE_ESTABLISHED */ \ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \ /* SCTP_STATE_SHUTDOWN_PENDING */ \ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \ /* SCTP_STATE_SHUTDOWN_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \ /* SCTP_STATE_SHUTDOWN_RECEIVED */ \ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \ /* SCTP_STATE_SHUTDOWN_ACK_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \ } /* TYPE_SCTP_COOKIE_ACK */ #define TYPE_SCTP_ECN_ECNE { \ /* SCTP_STATE_CLOSED */ \ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \ /* SCTP_STATE_COOKIE_WAIT */ \ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \ /* SCTP_STATE_COOKIE_ECHOED */ \ TYPE_SCTP_FUNC(sctp_sf_do_ecne), \ /* SCTP_STATE_ESTABLISHED */ \ TYPE_SCTP_FUNC(sctp_sf_do_ecne), \ /* SCTP_STATE_SHUTDOWN_PENDING */ \ TYPE_SCTP_FUNC(sctp_sf_do_ecne), \ /* SCTP_STATE_SHUTDOWN_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_do_ecne), \ /* SCTP_STATE_SHUTDOWN_RECEIVED */ \ TYPE_SCTP_FUNC(sctp_sf_do_ecne), \ /* SCTP_STATE_SHUTDOWN_ACK_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \ } /* TYPE_SCTP_ECN_ECNE */ #define TYPE_SCTP_ECN_CWR { \ /* SCTP_STATE_CLOSED */ \ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \ /* SCTP_STATE_COOKIE_WAIT */ \ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \ /* SCTP_STATE_COOKIE_ECHOED */ \ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \ /* SCTP_STATE_ESTABLISHED */ \ TYPE_SCTP_FUNC(sctp_sf_do_ecn_cwr), \ /* SCTP_STATE_SHUTDOWN_PENDING */ \ TYPE_SCTP_FUNC(sctp_sf_do_ecn_cwr), \ /* SCTP_STATE_SHUTDOWN_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_do_ecn_cwr), \ /* SCTP_STATE_SHUTDOWN_RECEIVED */ \ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \ /* SCTP_STATE_SHUTDOWN_ACK_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \ } /* TYPE_SCTP_ECN_CWR */ #define TYPE_SCTP_SHUTDOWN_COMPLETE { \ /* SCTP_STATE_CLOSED */ \ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \ /* SCTP_STATE_COOKIE_WAIT */ \ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \ /* SCTP_STATE_COOKIE_ECHOED */ \ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \ /* SCTP_STATE_ESTABLISHED */ \ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \ /* SCTP_STATE_SHUTDOWN_PENDING */ \ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \ /* SCTP_STATE_SHUTDOWN_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \ /* SCTP_STATE_SHUTDOWN_RECEIVED */ \ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \ /* SCTP_STATE_SHUTDOWN_ACK_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_do_4_C), \ } /* TYPE_SCTP_SHUTDOWN_COMPLETE */ /* The primary index for this table is the chunk type. * The secondary index for this table is the state. * * For base protocol (RFC 2960). */ static const struct sctp_sm_table_entry chunk_event_table[SCTP_NUM_BASE_CHUNK_TYPES][SCTP_STATE_NUM_STATES] = { TYPE_SCTP_DATA, TYPE_SCTP_INIT, TYPE_SCTP_INIT_ACK, TYPE_SCTP_SACK, TYPE_SCTP_HEARTBEAT, TYPE_SCTP_HEARTBEAT_ACK, TYPE_SCTP_ABORT, TYPE_SCTP_SHUTDOWN, TYPE_SCTP_SHUTDOWN_ACK, TYPE_SCTP_ERROR, TYPE_SCTP_COOKIE_ECHO, TYPE_SCTP_COOKIE_ACK, TYPE_SCTP_ECN_ECNE, TYPE_SCTP_ECN_CWR, TYPE_SCTP_SHUTDOWN_COMPLETE, }; /* state_fn_t chunk_event_table[][] */ #define TYPE_SCTP_ASCONF { \ /* SCTP_STATE_CLOSED */ \ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \ /* SCTP_STATE_COOKIE_WAIT */ \ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \ /* SCTP_STATE_COOKIE_ECHOED */ \ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \ /* SCTP_STATE_ESTABLISHED */ \ TYPE_SCTP_FUNC(sctp_sf_do_asconf), \ /* SCTP_STATE_SHUTDOWN_PENDING */ \ TYPE_SCTP_FUNC(sctp_sf_do_asconf), \ /* SCTP_STATE_SHUTDOWN_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_do_asconf), \ /* SCTP_STATE_SHUTDOWN_RECEIVED */ \ TYPE_SCTP_FUNC(sctp_sf_do_asconf), \ /* SCTP_STATE_SHUTDOWN_ACK_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \ } /* TYPE_SCTP_ASCONF */ #define TYPE_SCTP_ASCONF_ACK { \ /* SCTP_STATE_CLOSED */ \ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \ /* SCTP_STATE_COOKIE_WAIT */ \ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \ /* SCTP_STATE_COOKIE_ECHOED */ \ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \ /* SCTP_STATE_ESTABLISHED */ \ TYPE_SCTP_FUNC(sctp_sf_do_asconf_ack), \ /* SCTP_STATE_SHUTDOWN_PENDING */ \ TYPE_SCTP_FUNC(sctp_sf_do_asconf_ack), \ /* SCTP_STATE_SHUTDOWN_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_do_asconf_ack), \ /* SCTP_STATE_SHUTDOWN_RECEIVED */ \ TYPE_SCTP_FUNC(sctp_sf_do_asconf_ack), \ /* SCTP_STATE_SHUTDOWN_ACK_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \ } /* TYPE_SCTP_ASCONF_ACK */ /* The primary index for this table is the chunk type. * The secondary index for this table is the state. */ static const struct sctp_sm_table_entry addip_chunk_event_table[SCTP_NUM_ADDIP_CHUNK_TYPES][SCTP_STATE_NUM_STATES] = { TYPE_SCTP_ASCONF, TYPE_SCTP_ASCONF_ACK, }; /*state_fn_t addip_chunk_event_table[][] */ #define TYPE_SCTP_FWD_TSN { \ /* SCTP_STATE_CLOSED */ \ TYPE_SCTP_FUNC(sctp_sf_ootb), \ /* SCTP_STATE_COOKIE_WAIT */ \ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \ /* SCTP_STATE_COOKIE_ECHOED */ \ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \ /* SCTP_STATE_ESTABLISHED */ \ TYPE_SCTP_FUNC(sctp_sf_eat_fwd_tsn), \ /* SCTP_STATE_SHUTDOWN_PENDING */ \ TYPE_SCTP_FUNC(sctp_sf_eat_fwd_tsn), \ /* SCTP_STATE_SHUTDOWN_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_eat_fwd_tsn_fast), \ /* SCTP_STATE_SHUTDOWN_RECEIVED */ \ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \ /* SCTP_STATE_SHUTDOWN_ACK_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \ } /* TYPE_SCTP_FWD_TSN */ /* The primary index for this table is the chunk type. * The secondary index for this table is the state. */ static const struct sctp_sm_table_entry prsctp_chunk_event_table[SCTP_NUM_PRSCTP_CHUNK_TYPES][SCTP_STATE_NUM_STATES] = { TYPE_SCTP_FWD_TSN, }; /*state_fn_t prsctp_chunk_event_table[][] */ #define TYPE_SCTP_RECONF { \ /* SCTP_STATE_CLOSED */ \ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \ /* SCTP_STATE_COOKIE_WAIT */ \ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \ /* SCTP_STATE_COOKIE_ECHOED */ \ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \ /* SCTP_STATE_ESTABLISHED */ \ TYPE_SCTP_FUNC(sctp_sf_do_reconf), \ /* SCTP_STATE_SHUTDOWN_PENDING */ \ TYPE_SCTP_FUNC(sctp_sf_do_reconf), \ /* SCTP_STATE_SHUTDOWN_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \ /* SCTP_STATE_SHUTDOWN_RECEIVED */ \ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \ /* SCTP_STATE_SHUTDOWN_ACK_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \ } /* TYPE_SCTP_RECONF */ /* The primary index for this table is the chunk type. * The secondary index for this table is the state. */ static const struct sctp_sm_table_entry reconf_chunk_event_table[SCTP_NUM_RECONF_CHUNK_TYPES][SCTP_STATE_NUM_STATES] = { TYPE_SCTP_RECONF, }; /*state_fn_t reconf_chunk_event_table[][] */ #define TYPE_SCTP_AUTH { \ /* SCTP_STATE_CLOSED */ \ TYPE_SCTP_FUNC(sctp_sf_ootb), \ /* SCTP_STATE_COOKIE_WAIT */ \ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \ /* SCTP_STATE_COOKIE_ECHOED */ \ TYPE_SCTP_FUNC(sctp_sf_eat_auth), \ /* SCTP_STATE_ESTABLISHED */ \ TYPE_SCTP_FUNC(sctp_sf_eat_auth), \ /* SCTP_STATE_SHUTDOWN_PENDING */ \ TYPE_SCTP_FUNC(sctp_sf_eat_auth), \ /* SCTP_STATE_SHUTDOWN_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_eat_auth), \ /* SCTP_STATE_SHUTDOWN_RECEIVED */ \ TYPE_SCTP_FUNC(sctp_sf_eat_auth), \ /* SCTP_STATE_SHUTDOWN_ACK_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_eat_auth), \ } /* TYPE_SCTP_AUTH */ /* The primary index for this table is the chunk type. * The secondary index for this table is the state. */ static const struct sctp_sm_table_entry auth_chunk_event_table[SCTP_NUM_AUTH_CHUNK_TYPES][SCTP_STATE_NUM_STATES] = { TYPE_SCTP_AUTH, }; /*state_fn_t auth_chunk_event_table[][] */ static const struct sctp_sm_table_entry pad_chunk_event_table[SCTP_STATE_NUM_STATES] = { /* SCTP_STATE_CLOSED */ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), /* SCTP_STATE_COOKIE_WAIT */ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), /* SCTP_STATE_COOKIE_ECHOED */ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), /* SCTP_STATE_ESTABLISHED */ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), /* SCTP_STATE_SHUTDOWN_PENDING */ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), /* SCTP_STATE_SHUTDOWN_SENT */ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), /* SCTP_STATE_SHUTDOWN_RECEIVED */ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), /* SCTP_STATE_SHUTDOWN_ACK_SENT */ TYPE_SCTP_FUNC(sctp_sf_discard_chunk), }; /* chunk pad */ static const struct sctp_sm_table_entry chunk_event_table_unknown[SCTP_STATE_NUM_STATES] = { /* SCTP_STATE_CLOSED */ TYPE_SCTP_FUNC(sctp_sf_ootb), /* SCTP_STATE_COOKIE_WAIT */ TYPE_SCTP_FUNC(sctp_sf_unk_chunk), /* SCTP_STATE_COOKIE_ECHOED */ TYPE_SCTP_FUNC(sctp_sf_unk_chunk), /* SCTP_STATE_ESTABLISHED */ TYPE_SCTP_FUNC(sctp_sf_unk_chunk), /* SCTP_STATE_SHUTDOWN_PENDING */ TYPE_SCTP_FUNC(sctp_sf_unk_chunk), /* SCTP_STATE_SHUTDOWN_SENT */ TYPE_SCTP_FUNC(sctp_sf_unk_chunk), /* SCTP_STATE_SHUTDOWN_RECEIVED */ TYPE_SCTP_FUNC(sctp_sf_unk_chunk), /* SCTP_STATE_SHUTDOWN_ACK_SENT */ TYPE_SCTP_FUNC(sctp_sf_unk_chunk), }; /* chunk unknown */ #define TYPE_SCTP_PRIMITIVE_ASSOCIATE { \ /* SCTP_STATE_CLOSED */ \ TYPE_SCTP_FUNC(sctp_sf_do_prm_asoc), \ /* SCTP_STATE_COOKIE_WAIT */ \ TYPE_SCTP_FUNC(sctp_sf_not_impl), \ /* SCTP_STATE_COOKIE_ECHOED */ \ TYPE_SCTP_FUNC(sctp_sf_not_impl), \ /* SCTP_STATE_ESTABLISHED */ \ TYPE_SCTP_FUNC(sctp_sf_not_impl), \ /* SCTP_STATE_SHUTDOWN_PENDING */ \ TYPE_SCTP_FUNC(sctp_sf_not_impl), \ /* SCTP_STATE_SHUTDOWN_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_not_impl), \ /* SCTP_STATE_SHUTDOWN_RECEIVED */ \ TYPE_SCTP_FUNC(sctp_sf_not_impl), \ /* SCTP_STATE_SHUTDOWN_ACK_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_not_impl), \ } /* TYPE_SCTP_PRIMITIVE_ASSOCIATE */ #define TYPE_SCTP_PRIMITIVE_SHUTDOWN { \ /* SCTP_STATE_CLOSED */ \ TYPE_SCTP_FUNC(sctp_sf_error_closed), \ /* SCTP_STATE_COOKIE_WAIT */ \ TYPE_SCTP_FUNC(sctp_sf_cookie_wait_prm_shutdown), \ /* SCTP_STATE_COOKIE_ECHOED */ \ TYPE_SCTP_FUNC(sctp_sf_cookie_echoed_prm_shutdown),\ /* SCTP_STATE_ESTABLISHED */ \ TYPE_SCTP_FUNC(sctp_sf_do_9_2_prm_shutdown), \ /* SCTP_STATE_SHUTDOWN_PENDING */ \ TYPE_SCTP_FUNC(sctp_sf_ignore_primitive), \ /* SCTP_STATE_SHUTDOWN_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_ignore_primitive), \ /* SCTP_STATE_SHUTDOWN_RECEIVED */ \ TYPE_SCTP_FUNC(sctp_sf_ignore_primitive), \ /* SCTP_STATE_SHUTDOWN_ACK_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_ignore_primitive), \ } /* TYPE_SCTP_PRIMITIVE_SHUTDOWN */ #define TYPE_SCTP_PRIMITIVE_ABORT { \ /* SCTP_STATE_CLOSED */ \ TYPE_SCTP_FUNC(sctp_sf_error_closed), \ /* SCTP_STATE_COOKIE_WAIT */ \ TYPE_SCTP_FUNC(sctp_sf_cookie_wait_prm_abort), \ /* SCTP_STATE_COOKIE_ECHOED */ \ TYPE_SCTP_FUNC(sctp_sf_cookie_echoed_prm_abort), \ /* SCTP_STATE_ESTABLISHED */ \ TYPE_SCTP_FUNC(sctp_sf_do_9_1_prm_abort), \ /* SCTP_STATE_SHUTDOWN_PENDING */ \ TYPE_SCTP_FUNC(sctp_sf_shutdown_pending_prm_abort), \ /* SCTP_STATE_SHUTDOWN_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_shutdown_sent_prm_abort), \ /* SCTP_STATE_SHUTDOWN_RECEIVED */ \ TYPE_SCTP_FUNC(sctp_sf_do_9_1_prm_abort), \ /* SCTP_STATE_SHUTDOWN_ACK_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_shutdown_ack_sent_prm_abort), \ } /* TYPE_SCTP_PRIMITIVE_ABORT */ #define TYPE_SCTP_PRIMITIVE_SEND { \ /* SCTP_STATE_CLOSED */ \ TYPE_SCTP_FUNC(sctp_sf_error_closed), \ /* SCTP_STATE_COOKIE_WAIT */ \ TYPE_SCTP_FUNC(sctp_sf_do_prm_send), \ /* SCTP_STATE_COOKIE_ECHOED */ \ TYPE_SCTP_FUNC(sctp_sf_do_prm_send), \ /* SCTP_STATE_ESTABLISHED */ \ TYPE_SCTP_FUNC(sctp_sf_do_prm_send), \ /* SCTP_STATE_SHUTDOWN_PENDING */ \ TYPE_SCTP_FUNC(sctp_sf_error_shutdown), \ /* SCTP_STATE_SHUTDOWN_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_error_shutdown), \ /* SCTP_STATE_SHUTDOWN_RECEIVED */ \ TYPE_SCTP_FUNC(sctp_sf_error_shutdown), \ /* SCTP_STATE_SHUTDOWN_ACK_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_error_shutdown), \ } /* TYPE_SCTP_PRIMITIVE_SEND */ #define TYPE_SCTP_PRIMITIVE_REQUESTHEARTBEAT { \ /* SCTP_STATE_CLOSED */ \ TYPE_SCTP_FUNC(sctp_sf_error_closed), \ /* SCTP_STATE_COOKIE_WAIT */ \ TYPE_SCTP_FUNC(sctp_sf_do_prm_requestheartbeat), \ /* SCTP_STATE_COOKIE_ECHOED */ \ TYPE_SCTP_FUNC(sctp_sf_do_prm_requestheartbeat), \ /* SCTP_STATE_ESTABLISHED */ \ TYPE_SCTP_FUNC(sctp_sf_do_prm_requestheartbeat), \ /* SCTP_STATE_SHUTDOWN_PENDING */ \ TYPE_SCTP_FUNC(sctp_sf_do_prm_requestheartbeat), \ /* SCTP_STATE_SHUTDOWN_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_do_prm_requestheartbeat), \ /* SCTP_STATE_SHUTDOWN_RECEIVED */ \ TYPE_SCTP_FUNC(sctp_sf_do_prm_requestheartbeat), \ /* SCTP_STATE_SHUTDOWN_ACK_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_do_prm_requestheartbeat), \ } /* TYPE_SCTP_PRIMITIVE_REQUESTHEARTBEAT */ #define TYPE_SCTP_PRIMITIVE_ASCONF { \ /* SCTP_STATE_CLOSED */ \ TYPE_SCTP_FUNC(sctp_sf_error_closed), \ /* SCTP_STATE_COOKIE_WAIT */ \ TYPE_SCTP_FUNC(sctp_sf_error_closed), \ /* SCTP_STATE_COOKIE_ECHOED */ \ TYPE_SCTP_FUNC(sctp_sf_error_closed), \ /* SCTP_STATE_ESTABLISHED */ \ TYPE_SCTP_FUNC(sctp_sf_do_prm_asconf), \ /* SCTP_STATE_SHUTDOWN_PENDING */ \ TYPE_SCTP_FUNC(sctp_sf_do_prm_asconf), \ /* SCTP_STATE_SHUTDOWN_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_do_prm_asconf), \ /* SCTP_STATE_SHUTDOWN_RECEIVED */ \ TYPE_SCTP_FUNC(sctp_sf_do_prm_asconf), \ /* SCTP_STATE_SHUTDOWN_ACK_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_error_shutdown), \ } /* TYPE_SCTP_PRIMITIVE_ASCONF */ #define TYPE_SCTP_PRIMITIVE_RECONF { \ /* SCTP_STATE_CLOSED */ \ TYPE_SCTP_FUNC(sctp_sf_error_closed), \ /* SCTP_STATE_COOKIE_WAIT */ \ TYPE_SCTP_FUNC(sctp_sf_error_closed), \ /* SCTP_STATE_COOKIE_ECHOED */ \ TYPE_SCTP_FUNC(sctp_sf_error_closed), \ /* SCTP_STATE_ESTABLISHED */ \ TYPE_SCTP_FUNC(sctp_sf_do_prm_reconf), \ /* SCTP_STATE_SHUTDOWN_PENDING */ \ TYPE_SCTP_FUNC(sctp_sf_do_prm_reconf), \ /* SCTP_STATE_SHUTDOWN_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_do_prm_reconf), \ /* SCTP_STATE_SHUTDOWN_RECEIVED */ \ TYPE_SCTP_FUNC(sctp_sf_do_prm_reconf), \ /* SCTP_STATE_SHUTDOWN_ACK_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_error_shutdown), \ } /* TYPE_SCTP_PRIMITIVE_RECONF */ /* The primary index for this table is the primitive type. * The secondary index for this table is the state. */ static const struct sctp_sm_table_entry primitive_event_table[SCTP_NUM_PRIMITIVE_TYPES][SCTP_STATE_NUM_STATES] = { TYPE_SCTP_PRIMITIVE_ASSOCIATE, TYPE_SCTP_PRIMITIVE_SHUTDOWN, TYPE_SCTP_PRIMITIVE_ABORT, TYPE_SCTP_PRIMITIVE_SEND, TYPE_SCTP_PRIMITIVE_REQUESTHEARTBEAT, TYPE_SCTP_PRIMITIVE_ASCONF, TYPE_SCTP_PRIMITIVE_RECONF, }; #define TYPE_SCTP_OTHER_NO_PENDING_TSN { \ /* SCTP_STATE_CLOSED */ \ TYPE_SCTP_FUNC(sctp_sf_ignore_other), \ /* SCTP_STATE_COOKIE_WAIT */ \ TYPE_SCTP_FUNC(sctp_sf_ignore_other), \ /* SCTP_STATE_COOKIE_ECHOED */ \ TYPE_SCTP_FUNC(sctp_sf_ignore_other), \ /* SCTP_STATE_ESTABLISHED */ \ TYPE_SCTP_FUNC(sctp_sf_do_no_pending_tsn), \ /* SCTP_STATE_SHUTDOWN_PENDING */ \ TYPE_SCTP_FUNC(sctp_sf_do_9_2_start_shutdown), \ /* SCTP_STATE_SHUTDOWN_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_ignore_other), \ /* SCTP_STATE_SHUTDOWN_RECEIVED */ \ TYPE_SCTP_FUNC(sctp_sf_do_9_2_shutdown_ack), \ /* SCTP_STATE_SHUTDOWN_ACK_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_ignore_other), \ } #define TYPE_SCTP_OTHER_ICMP_PROTO_UNREACH { \ /* SCTP_STATE_CLOSED */ \ TYPE_SCTP_FUNC(sctp_sf_ignore_other), \ /* SCTP_STATE_COOKIE_WAIT */ \ TYPE_SCTP_FUNC(sctp_sf_cookie_wait_icmp_abort), \ /* SCTP_STATE_COOKIE_ECHOED */ \ TYPE_SCTP_FUNC(sctp_sf_ignore_other), \ /* SCTP_STATE_ESTABLISHED */ \ TYPE_SCTP_FUNC(sctp_sf_ignore_other), \ /* SCTP_STATE_SHUTDOWN_PENDING */ \ TYPE_SCTP_FUNC(sctp_sf_ignore_other), \ /* SCTP_STATE_SHUTDOWN_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_ignore_other), \ /* SCTP_STATE_SHUTDOWN_RECEIVED */ \ TYPE_SCTP_FUNC(sctp_sf_ignore_other), \ /* SCTP_STATE_SHUTDOWN_ACK_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_ignore_other), \ } static const struct sctp_sm_table_entry other_event_table[SCTP_NUM_OTHER_TYPES][SCTP_STATE_NUM_STATES] = { TYPE_SCTP_OTHER_NO_PENDING_TSN, TYPE_SCTP_OTHER_ICMP_PROTO_UNREACH, }; #define TYPE_SCTP_EVENT_TIMEOUT_NONE { \ /* SCTP_STATE_CLOSED */ \ TYPE_SCTP_FUNC(sctp_sf_bug), \ /* SCTP_STATE_COOKIE_WAIT */ \ TYPE_SCTP_FUNC(sctp_sf_bug), \ /* SCTP_STATE_COOKIE_ECHOED */ \ TYPE_SCTP_FUNC(sctp_sf_bug), \ /* SCTP_STATE_ESTABLISHED */ \ TYPE_SCTP_FUNC(sctp_sf_bug), \ /* SCTP_STATE_SHUTDOWN_PENDING */ \ TYPE_SCTP_FUNC(sctp_sf_bug), \ /* SCTP_STATE_SHUTDOWN_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_bug), \ /* SCTP_STATE_SHUTDOWN_RECEIVED */ \ TYPE_SCTP_FUNC(sctp_sf_bug), \ /* SCTP_STATE_SHUTDOWN_ACK_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_bug), \ } #define TYPE_SCTP_EVENT_TIMEOUT_T1_COOKIE { \ /* SCTP_STATE_CLOSED */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ /* SCTP_STATE_COOKIE_WAIT */ \ TYPE_SCTP_FUNC(sctp_sf_bug), \ /* SCTP_STATE_COOKIE_ECHOED */ \ TYPE_SCTP_FUNC(sctp_sf_t1_cookie_timer_expire), \ /* SCTP_STATE_ESTABLISHED */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ /* SCTP_STATE_SHUTDOWN_PENDING */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ /* SCTP_STATE_SHUTDOWN_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ /* SCTP_STATE_SHUTDOWN_RECEIVED */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ /* SCTP_STATE_SHUTDOWN_ACK_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ } #define TYPE_SCTP_EVENT_TIMEOUT_T1_INIT { \ /* SCTP_STATE_CLOSED */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ /* SCTP_STATE_COOKIE_WAIT */ \ TYPE_SCTP_FUNC(sctp_sf_t1_init_timer_expire), \ /* SCTP_STATE_COOKIE_ECHOED */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ /* SCTP_STATE_ESTABLISHED */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ /* SCTP_STATE_SHUTDOWN_PENDING */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ /* SCTP_STATE_SHUTDOWN_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ /* SCTP_STATE_SHUTDOWN_RECEIVED */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ /* SCTP_STATE_SHUTDOWN_ACK_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ } #define TYPE_SCTP_EVENT_TIMEOUT_T2_SHUTDOWN { \ /* SCTP_STATE_CLOSED */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ /* SCTP_STATE_COOKIE_WAIT */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ /* SCTP_STATE_COOKIE_ECHOED */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ /* SCTP_STATE_ESTABLISHED */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ /* SCTP_STATE_SHUTDOWN_PENDING */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ /* SCTP_STATE_SHUTDOWN_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_t2_timer_expire), \ /* SCTP_STATE_SHUTDOWN_RECEIVED */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ /* SCTP_STATE_SHUTDOWN_ACK_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_t2_timer_expire), \ } #define TYPE_SCTP_EVENT_TIMEOUT_T3_RTX { \ /* SCTP_STATE_CLOSED */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ /* SCTP_STATE_COOKIE_WAIT */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ /* SCTP_STATE_COOKIE_ECHOED */ \ TYPE_SCTP_FUNC(sctp_sf_do_6_3_3_rtx), \ /* SCTP_STATE_ESTABLISHED */ \ TYPE_SCTP_FUNC(sctp_sf_do_6_3_3_rtx), \ /* SCTP_STATE_SHUTDOWN_PENDING */ \ TYPE_SCTP_FUNC(sctp_sf_do_6_3_3_rtx), \ /* SCTP_STATE_SHUTDOWN_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ /* SCTP_STATE_SHUTDOWN_RECEIVED */ \ TYPE_SCTP_FUNC(sctp_sf_do_6_3_3_rtx), \ /* SCTP_STATE_SHUTDOWN_ACK_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ } #define TYPE_SCTP_EVENT_TIMEOUT_T4_RTO { \ /* SCTP_STATE_CLOSED */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ /* SCTP_STATE_COOKIE_WAIT */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ /* SCTP_STATE_COOKIE_ECHOED */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ /* SCTP_STATE_ESTABLISHED */ \ TYPE_SCTP_FUNC(sctp_sf_t4_timer_expire), \ /* SCTP_STATE_SHUTDOWN_PENDING */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ /* SCTP_STATE_SHUTDOWN_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ /* SCTP_STATE_SHUTDOWN_RECEIVED */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ /* SCTP_STATE_SHUTDOWN_ACK_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ } #define TYPE_SCTP_EVENT_TIMEOUT_T5_SHUTDOWN_GUARD { \ /* SCTP_STATE_CLOSED */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ /* SCTP_STATE_COOKIE_WAIT */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ /* SCTP_STATE_COOKIE_ECHOED */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ /* SCTP_STATE_ESTABLISHED */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ /* SCTP_STATE_SHUTDOWN_PENDING */ \ TYPE_SCTP_FUNC(sctp_sf_t5_timer_expire), \ /* SCTP_STATE_SHUTDOWN_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_t5_timer_expire), \ /* SCTP_STATE_SHUTDOWN_RECEIVED */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ /* SCTP_STATE_SHUTDOWN_ACK_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ } #define TYPE_SCTP_EVENT_TIMEOUT_HEARTBEAT { \ /* SCTP_STATE_CLOSED */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ /* SCTP_STATE_COOKIE_WAIT */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ /* SCTP_STATE_COOKIE_ECHOED */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ /* SCTP_STATE_ESTABLISHED */ \ TYPE_SCTP_FUNC(sctp_sf_sendbeat_8_3), \ /* SCTP_STATE_SHUTDOWN_PENDING */ \ TYPE_SCTP_FUNC(sctp_sf_sendbeat_8_3), \ /* SCTP_STATE_SHUTDOWN_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ /* SCTP_STATE_SHUTDOWN_RECEIVED */ \ TYPE_SCTP_FUNC(sctp_sf_sendbeat_8_3), \ /* SCTP_STATE_SHUTDOWN_ACK_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ } #define TYPE_SCTP_EVENT_TIMEOUT_SACK { \ /* SCTP_STATE_CLOSED */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ /* SCTP_STATE_COOKIE_WAIT */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ /* SCTP_STATE_COOKIE_ECHOED */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ /* SCTP_STATE_ESTABLISHED */ \ TYPE_SCTP_FUNC(sctp_sf_do_6_2_sack), \ /* SCTP_STATE_SHUTDOWN_PENDING */ \ TYPE_SCTP_FUNC(sctp_sf_do_6_2_sack), \ /* SCTP_STATE_SHUTDOWN_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_do_6_2_sack), \ /* SCTP_STATE_SHUTDOWN_RECEIVED */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ /* SCTP_STATE_SHUTDOWN_ACK_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ } #define TYPE_SCTP_EVENT_TIMEOUT_AUTOCLOSE { \ /* SCTP_STATE_CLOSED */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ /* SCTP_STATE_COOKIE_WAIT */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ /* SCTP_STATE_COOKIE_ECHOED */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ /* SCTP_STATE_ESTABLISHED */ \ TYPE_SCTP_FUNC(sctp_sf_autoclose_timer_expire), \ /* SCTP_STATE_SHUTDOWN_PENDING */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ /* SCTP_STATE_SHUTDOWN_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ /* SCTP_STATE_SHUTDOWN_RECEIVED */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ /* SCTP_STATE_SHUTDOWN_ACK_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ } #define TYPE_SCTP_EVENT_TIMEOUT_RECONF { \ /* SCTP_STATE_CLOSED */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ /* SCTP_STATE_COOKIE_WAIT */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ /* SCTP_STATE_COOKIE_ECHOED */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ /* SCTP_STATE_ESTABLISHED */ \ TYPE_SCTP_FUNC(sctp_sf_send_reconf), \ /* SCTP_STATE_SHUTDOWN_PENDING */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ /* SCTP_STATE_SHUTDOWN_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ /* SCTP_STATE_SHUTDOWN_RECEIVED */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ /* SCTP_STATE_SHUTDOWN_ACK_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ } #define TYPE_SCTP_EVENT_TIMEOUT_PROBE { \ /* SCTP_STATE_CLOSED */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ /* SCTP_STATE_COOKIE_WAIT */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ /* SCTP_STATE_COOKIE_ECHOED */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ /* SCTP_STATE_ESTABLISHED */ \ TYPE_SCTP_FUNC(sctp_sf_send_probe), \ /* SCTP_STATE_SHUTDOWN_PENDING */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ /* SCTP_STATE_SHUTDOWN_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ /* SCTP_STATE_SHUTDOWN_RECEIVED */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ /* SCTP_STATE_SHUTDOWN_ACK_SENT */ \ TYPE_SCTP_FUNC(sctp_sf_timer_ignore), \ } static const struct sctp_sm_table_entry timeout_event_table[SCTP_NUM_TIMEOUT_TYPES][SCTP_STATE_NUM_STATES] = { TYPE_SCTP_EVENT_TIMEOUT_NONE, TYPE_SCTP_EVENT_TIMEOUT_T1_COOKIE, TYPE_SCTP_EVENT_TIMEOUT_T1_INIT, TYPE_SCTP_EVENT_TIMEOUT_T2_SHUTDOWN, TYPE_SCTP_EVENT_TIMEOUT_T3_RTX, TYPE_SCTP_EVENT_TIMEOUT_T4_RTO, TYPE_SCTP_EVENT_TIMEOUT_T5_SHUTDOWN_GUARD, TYPE_SCTP_EVENT_TIMEOUT_HEARTBEAT, TYPE_SCTP_EVENT_TIMEOUT_RECONF, TYPE_SCTP_EVENT_TIMEOUT_PROBE, TYPE_SCTP_EVENT_TIMEOUT_SACK, TYPE_SCTP_EVENT_TIMEOUT_AUTOCLOSE, }; static const struct sctp_sm_table_entry *sctp_chunk_event_lookup( struct net *net, enum sctp_cid cid, enum sctp_state state) { if (state > SCTP_STATE_MAX) return &bug; if (cid == SCTP_CID_I_DATA) cid = SCTP_CID_DATA; if (cid <= SCTP_CID_BASE_MAX) return &chunk_event_table[cid][state]; switch ((u16)cid) { case SCTP_CID_FWD_TSN: case SCTP_CID_I_FWD_TSN: return &prsctp_chunk_event_table[0][state]; case SCTP_CID_ASCONF: return &addip_chunk_event_table[0][state]; case SCTP_CID_ASCONF_ACK: return &addip_chunk_event_table[1][state]; case SCTP_CID_RECONF: return &reconf_chunk_event_table[0][state]; case SCTP_CID_AUTH: return &auth_chunk_event_table[0][state]; case SCTP_CID_PAD: return &pad_chunk_event_table[state]; } return &chunk_event_table_unknown[state]; } |
| 52 34 15 20 21 34 35 4 33 5 5 5 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 | /* * net/tipc/name_distr.c: TIPC name distribution code * * Copyright (c) 2000-2006, 2014-2019, Ericsson AB * Copyright (c) 2005, 2010-2011, Wind River Systems * Copyright (c) 2020-2021, Red Hat Inc * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. Neither the names of the copyright holders nor the names of its * contributors may be used to endorse or promote products derived from * this software without specific prior written permission. * * Alternatively, this software may be distributed under the terms of the * GNU General Public License ("GPL") version 2 as published by the Free * Software Foundation. * * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE * POSSIBILITY OF SUCH DAMAGE. */ #include "core.h" #include "link.h" #include "name_distr.h" int sysctl_tipc_named_timeout __read_mostly = 2000; /** * publ_to_item - add publication info to a publication message * @p: publication info * @i: location of item in the message */ static void publ_to_item(struct distr_item *i, struct publication *p) { i->type = htonl(p->sr.type); i->lower = htonl(p->sr.lower); i->upper = htonl(p->sr.upper); i->port = htonl(p->sk.ref); i->key = htonl(p->key); } /** * named_prepare_buf - allocate & initialize a publication message * @net: the associated network namespace * @type: message type * @size: payload size * @dest: destination node * * The buffer returned is of size INT_H_SIZE + payload size */ static struct sk_buff *named_prepare_buf(struct net *net, u32 type, u32 size, u32 dest) { struct sk_buff *buf = tipc_buf_acquire(INT_H_SIZE + size, GFP_ATOMIC); u32 self = tipc_own_addr(net); struct tipc_msg *msg; if (buf != NULL) { msg = buf_msg(buf); tipc_msg_init(self, msg, NAME_DISTRIBUTOR, type, INT_H_SIZE, dest); msg_set_size(msg, INT_H_SIZE + size); } return buf; } /** * tipc_named_publish - tell other nodes about a new publication by this node * @net: the associated network namespace * @p: the new publication */ struct sk_buff *tipc_named_publish(struct net *net, struct publication *p) { struct name_table *nt = tipc_name_table(net); struct distr_item *item; struct sk_buff *skb; if (p->scope == TIPC_NODE_SCOPE) { list_add_tail_rcu(&p->binding_node, &nt->node_scope); return NULL; } write_lock_bh(&nt->cluster_scope_lock); list_add_tail(&p->binding_node, &nt->cluster_scope); write_unlock_bh(&nt->cluster_scope_lock); skb = named_prepare_buf(net, PUBLICATION, ITEM_SIZE, 0); if (!skb) { pr_warn("Publication distribution failure\n"); return NULL; } msg_set_named_seqno(buf_msg(skb), nt->snd_nxt++); msg_set_non_legacy(buf_msg(skb)); item = (struct distr_item *)msg_data(buf_msg(skb)); publ_to_item(item, p); return skb; } /** * tipc_named_withdraw - tell other nodes about a withdrawn publication by this node * @net: the associated network namespace * @p: the withdrawn publication */ struct sk_buff *tipc_named_withdraw(struct net *net, struct publication *p) { struct name_table *nt = tipc_name_table(net); struct distr_item *item; struct sk_buff *skb; write_lock_bh(&nt->cluster_scope_lock); list_del(&p->binding_node); write_unlock_bh(&nt->cluster_scope_lock); if (p->scope == TIPC_NODE_SCOPE) return NULL; skb = named_prepare_buf(net, WITHDRAWAL, ITEM_SIZE, 0); if (!skb) { pr_warn("Withdrawal distribution failure\n"); return NULL; } msg_set_named_seqno(buf_msg(skb), nt->snd_nxt++); msg_set_non_legacy(buf_msg(skb)); item = (struct distr_item *)msg_data(buf_msg(skb)); publ_to_item(item, p); return skb; } /** * named_distribute - prepare name info for bulk distribution to another node * @net: the associated network namespace * @list: list of messages (buffers) to be returned from this function * @dnode: node to be updated * @pls: linked list of publication items to be packed into buffer chain * @seqno: sequence number for this message */ static void named_distribute(struct net *net, struct sk_buff_head *list, u32 dnode, struct list_head *pls, u16 seqno) { struct publication *publ; struct sk_buff *skb = NULL; struct distr_item *item = NULL; u32 msg_dsz = ((tipc_node_get_mtu(net, dnode, 0, false) - INT_H_SIZE) / ITEM_SIZE) * ITEM_SIZE; u32 msg_rem = msg_dsz; struct tipc_msg *hdr; list_for_each_entry(publ, pls, binding_node) { /* Prepare next buffer: */ if (!skb) { skb = named_prepare_buf(net, PUBLICATION, msg_rem, dnode); if (!skb) { pr_warn("Bulk publication failure\n"); return; } hdr = buf_msg(skb); msg_set_bc_ack_invalid(hdr, true); msg_set_bulk(hdr); msg_set_non_legacy(hdr); item = (struct distr_item *)msg_data(hdr); } /* Pack publication into message: */ publ_to_item(item, publ); item++; msg_rem -= ITEM_SIZE; /* Append full buffer to list: */ if (!msg_rem) { __skb_queue_tail(list, skb); skb = NULL; msg_rem = msg_dsz; } } if (skb) { hdr = buf_msg(skb); msg_set_size(hdr, INT_H_SIZE + (msg_dsz - msg_rem)); skb_trim(skb, INT_H_SIZE + (msg_dsz - msg_rem)); __skb_queue_tail(list, skb); } hdr = buf_msg(skb_peek_tail(list)); msg_set_last_bulk(hdr); msg_set_named_seqno(hdr, seqno); } /** * tipc_named_node_up - tell specified node about all publications by this node * @net: the associated network namespace * @dnode: destination node * @capabilities: peer node's capabilities */ void tipc_named_node_up(struct net *net, u32 dnode, u16 capabilities) { struct name_table *nt = tipc_name_table(net); struct tipc_net *tn = tipc_net(net); struct sk_buff_head head; u16 seqno; __skb_queue_head_init(&head); spin_lock_bh(&tn->nametbl_lock); if (!(capabilities & TIPC_NAMED_BCAST)) nt->rc_dests++; seqno = nt->snd_nxt; spin_unlock_bh(&tn->nametbl_lock); read_lock_bh(&nt->cluster_scope_lock); named_distribute(net, &head, dnode, &nt->cluster_scope, seqno); tipc_node_xmit(net, &head, dnode, 0); read_unlock_bh(&nt->cluster_scope_lock); } /** * tipc_publ_purge - remove publication associated with a failed node * @net: the associated network namespace * @p: the publication to remove * @addr: failed node's address * * Invoked for each publication issued by a newly failed node. * Removes publication structure from name table & deletes it. */ static void tipc_publ_purge(struct net *net, struct publication *p, u32 addr) { struct tipc_net *tn = tipc_net(net); struct publication *_p; struct tipc_uaddr ua; tipc_uaddr(&ua, TIPC_SERVICE_RANGE, p->scope, p->sr.type, p->sr.lower, p->sr.upper); spin_lock_bh(&tn->nametbl_lock); _p = tipc_nametbl_remove_publ(net, &ua, &p->sk, p->key); if (_p) tipc_node_unsubscribe(net, &_p->binding_node, addr); spin_unlock_bh(&tn->nametbl_lock); if (_p) kfree_rcu(_p, rcu); } void tipc_publ_notify(struct net *net, struct list_head *nsub_list, u32 addr, u16 capabilities) { struct name_table *nt = tipc_name_table(net); struct tipc_net *tn = tipc_net(net); struct publication *publ, *tmp; list_for_each_entry_safe(publ, tmp, nsub_list, binding_node) tipc_publ_purge(net, publ, addr); spin_lock_bh(&tn->nametbl_lock); if (!(capabilities & TIPC_NAMED_BCAST)) nt->rc_dests--; spin_unlock_bh(&tn->nametbl_lock); } /** * tipc_update_nametbl - try to process a nametable update and notify * subscribers * @net: the associated network namespace * @i: location of item in the message * @node: node address * @dtype: name distributor message type * * tipc_nametbl_lock must be held. * Return: the publication item if successful, otherwise NULL. */ static bool tipc_update_nametbl(struct net *net, struct distr_item *i, u32 node, u32 dtype) { struct publication *p = NULL; struct tipc_socket_addr sk; struct tipc_uaddr ua; u32 key = ntohl(i->key); tipc_uaddr(&ua, TIPC_SERVICE_RANGE, TIPC_CLUSTER_SCOPE, ntohl(i->type), ntohl(i->lower), ntohl(i->upper)); sk.ref = ntohl(i->port); sk.node = node; if (dtype == PUBLICATION) { p = tipc_nametbl_insert_publ(net, &ua, &sk, key); if (p) { tipc_node_subscribe(net, &p->binding_node, node); return true; } } else if (dtype == WITHDRAWAL) { p = tipc_nametbl_remove_publ(net, &ua, &sk, key); if (p) { tipc_node_unsubscribe(net, &p->binding_node, node); kfree_rcu(p, rcu); return true; } pr_warn_ratelimited("Failed to remove binding %u,%u from %u\n", ua.sr.type, ua.sr.lower, node); } else { pr_warn_ratelimited("Unknown name table message received\n"); } return false; } static struct sk_buff *tipc_named_dequeue(struct sk_buff_head *namedq, u16 *rcv_nxt, bool *open) { struct sk_buff *skb, *tmp; struct tipc_msg *hdr; u16 seqno; spin_lock_bh(&namedq->lock); skb_queue_walk_safe(namedq, skb, tmp) { if (unlikely(skb_linearize(skb))) { __skb_unlink(skb, namedq); kfree_skb(skb); continue; } hdr = buf_msg(skb); seqno = msg_named_seqno(hdr); if (msg_is_last_bulk(hdr)) { *rcv_nxt = seqno; *open = true; } if (msg_is_bulk(hdr) || msg_is_legacy(hdr)) { __skb_unlink(skb, namedq); spin_unlock_bh(&namedq->lock); return skb; } if (*open && (*rcv_nxt == seqno)) { (*rcv_nxt)++; __skb_unlink(skb, namedq); spin_unlock_bh(&namedq->lock); return skb; } if (less(seqno, *rcv_nxt)) { __skb_unlink(skb, namedq); kfree_skb(skb); continue; } } spin_unlock_bh(&namedq->lock); return NULL; } /** * tipc_named_rcv - process name table update messages sent by another node * @net: the associated network namespace * @namedq: queue to receive from * @rcv_nxt: store last received seqno here * @open: last bulk msg was received (FIXME) */ void tipc_named_rcv(struct net *net, struct sk_buff_head *namedq, u16 *rcv_nxt, bool *open) { struct tipc_net *tn = tipc_net(net); struct distr_item *item; struct tipc_msg *hdr; struct sk_buff *skb; u32 count, node; spin_lock_bh(&tn->nametbl_lock); while ((skb = tipc_named_dequeue(namedq, rcv_nxt, open))) { hdr = buf_msg(skb); node = msg_orignode(hdr); item = (struct distr_item *)msg_data(hdr); count = msg_data_sz(hdr) / ITEM_SIZE; while (count--) { tipc_update_nametbl(net, item, node, msg_type(hdr)); item++; } kfree_skb(skb); } spin_unlock_bh(&tn->nametbl_lock); } /** * tipc_named_reinit - re-initialize local publications * @net: the associated network namespace * * This routine is called whenever TIPC networking is enabled. * All name table entries published by this node are updated to reflect * the node's new network address. */ void tipc_named_reinit(struct net *net) { struct name_table *nt = tipc_name_table(net); struct tipc_net *tn = tipc_net(net); struct publication *p; u32 self = tipc_own_addr(net); spin_lock_bh(&tn->nametbl_lock); list_for_each_entry_rcu(p, &nt->node_scope, binding_node) p->sk.node = self; list_for_each_entry_rcu(p, &nt->cluster_scope, binding_node) p->sk.node = self; nt->rc_dests = 0; spin_unlock_bh(&tn->nametbl_lock); } |
| 2 468 365 108 9 6 2 2 2 2 2 5 1 1 3 1 1 5 1 1 1 1 1 1 1 1 1 2 1 1 3 1 2 2 1 6 1 1 3 1 1 2 3 2 2 1 1 1 1 1 2 1 1 2 1 1 1 1 58 15 1 1 5 1 2 1 1 8 1 1 1 2 3 1 2 1 1 1 1 4 1 1 2 1 1 1 1 1 1 1 1 1 1 13 6 6 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080 1081 1082 1083 1084 1085 1086 1087 1088 1089 1090 1091 1092 1093 1094 1095 1096 1097 1098 1099 1100 1101 1102 1103 1104 1105 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125 1126 1127 1128 1129 1130 1131 1132 1133 1134 1135 1136 1137 1138 1139 1140 1141 1142 1143 1144 1145 1146 1147 1148 1149 1150 1151 1152 1153 1154 1155 1156 1157 1158 1159 1160 1161 1162 1163 1164 1165 1166 1167 1168 1169 1170 1171 1172 1173 1174 1175 1176 1177 1178 1179 1180 1181 1182 1183 1184 1185 1186 1187 1188 1189 1190 1191 1192 1193 1194 1195 1196 1197 1198 1199 1200 1201 1202 1203 1204 1205 1206 1207 1208 1209 1210 1211 1212 1213 1214 1215 1216 1217 1218 1219 1220 1221 1222 1223 1224 1225 1226 1227 1228 1229 1230 1231 1232 1233 1234 1235 1236 1237 1238 1239 1240 1241 1242 1243 1244 1245 1246 1247 1248 1249 1250 1251 1252 1253 1254 1255 1256 1257 1258 1259 1260 1261 1262 1263 1264 1265 1266 1267 1268 1269 1270 1271 1272 1273 1274 1275 1276 1277 1278 1279 1280 1281 1282 1283 1284 1285 1286 1287 1288 1289 1290 1291 1292 1293 1294 1295 1296 1297 1298 1299 1300 1301 1302 1303 1304 1305 1306 1307 1308 1309 1310 1311 1312 1313 1314 1315 1316 1317 1318 1319 1320 1321 1322 1323 1324 1325 1326 1327 1328 1329 1330 1331 1332 1333 1334 1335 1336 1337 1338 1339 1340 1341 1342 1343 1344 1345 1346 1347 1348 1349 1350 1351 1352 1353 1354 1355 1356 1357 1358 1359 1360 1361 1362 1363 1364 1365 1366 1367 1368 1369 1370 1371 1372 1373 1374 1375 1376 1377 1378 1379 1380 1381 1382 1383 1384 1385 1386 1387 1388 1389 1390 1391 1392 1393 1394 1395 1396 1397 1398 1399 1400 1401 1402 1403 1404 1405 1406 1407 1408 1409 1410 1411 1412 1413 1414 1415 1416 1417 1418 1419 1420 1421 1422 1423 1424 1425 1426 1427 1428 1429 1430 1431 1432 1433 1434 1435 1436 1437 1438 1439 1440 1441 1442 1443 1444 1445 1446 1447 1448 1449 1450 1451 1452 1453 1454 1455 1456 1457 1458 1459 1460 1461 1462 1463 1464 1465 1466 1467 1468 1469 1470 1471 1472 1473 1474 1475 1476 1477 1478 1479 1480 1481 1482 1483 1484 1485 1486 1487 1488 1489 1490 1491 1492 1493 1494 1495 1496 1497 1498 1499 1500 1501 1502 1503 1504 1505 1506 1507 1508 1509 1510 1511 1512 1513 1514 1515 1516 1517 1518 1519 1520 1521 1522 1523 1524 1525 1526 1527 1528 1529 1530 1531 1532 1533 1534 1535 1536 1537 1538 1539 1540 1541 1542 1543 1544 1545 1546 1547 1548 1549 1550 1551 1552 1553 1554 1555 1556 1557 1558 1559 1560 1561 1562 1563 1564 1565 1566 1567 1568 1569 1570 1571 1572 1573 1574 1575 1576 1577 1578 1579 1580 1581 1582 1583 1584 1585 1586 1587 1588 1589 1590 1591 1592 1593 1594 1595 1596 1597 1598 1599 1600 1601 1602 1603 1604 1605 1606 1607 1608 1609 1610 1611 1612 1613 1614 1615 1616 1617 1618 1619 1620 1621 1622 1623 1624 1625 1626 1627 1628 1629 1630 1631 1632 1633 1634 1635 1636 1637 1638 1639 1640 1641 1642 1643 1644 1645 1646 1647 1648 1649 1650 1651 1652 1653 1654 1655 1656 1657 1658 1659 1660 1661 1662 1663 1664 1665 1666 1667 1668 1669 1670 1671 1672 1673 1674 1675 1676 1677 1678 1679 1680 1681 1682 1683 1684 1685 1686 1687 1688 1689 1690 1691 1692 1693 1694 1695 1696 1697 1698 1699 1700 1701 1702 1703 1704 1705 1706 1707 1708 1709 1710 1711 1712 1713 1714 1715 1716 1717 1718 1719 1720 1721 1722 1723 1724 1725 1726 1727 1728 1729 1730 1731 1732 1733 1734 1735 1736 1737 1738 1739 1740 1741 1742 1743 1744 1745 1746 1747 1748 1749 1750 1751 1752 1753 1754 1755 1756 1757 1758 1759 1760 1761 1762 1763 1764 1765 1766 1767 1768 1769 1770 1771 1772 1773 1774 1775 1776 1777 1778 1779 1780 1781 1782 1783 1784 1785 1786 1787 1788 1789 1790 1791 1792 1793 1794 1795 1796 1797 1798 1799 1800 1801 1802 1803 1804 1805 1806 1807 1808 1809 1810 1811 1812 1813 1814 1815 1816 1817 1818 1819 1820 1821 1822 1823 1824 1825 1826 1827 1828 1829 1830 1831 1832 1833 1834 1835 1836 1837 1838 1839 1840 1841 1842 1843 1844 1845 1846 1847 1848 1849 1850 1851 1852 1853 1854 1855 | // SPDX-License-Identifier: GPL-2.0-or-later /* * X.25 Packet Layer release 002 * * This is ALPHA test software. This code may break your machine, * randomly fail to work with new releases, misbehave and/or generally * screw up. It might even work. * * This code REQUIRES 2.1.15 or higher * * History * X.25 001 Jonathan Naylor Started coding. * X.25 002 Jonathan Naylor Centralised disconnect handling. * New timer architecture. * 2000-03-11 Henner Eisen MSG_EOR handling more POSIX compliant. * 2000-03-22 Daniela Squassoni Allowed disabling/enabling of * facilities negotiation and increased * the throughput upper limit. * 2000-08-27 Arnaldo C. Melo s/suser/capable/ + micro cleanups * 2000-09-04 Henner Eisen Set sock->state in x25_accept(). * Fixed x25_output() related skb leakage. * 2000-10-02 Henner Eisen Made x25_kick() single threaded per socket. * 2000-10-27 Henner Eisen MSG_DONTWAIT for fragment allocation. * 2000-11-14 Henner Eisen Closing datalink from NETDEV_GOING_DOWN * 2002-10-06 Arnaldo C. Melo Get rid of cli/sti, move proc stuff to * x25_proc.c, using seq_file * 2005-04-02 Shaun Pereira Selective sub address matching * with call user data * 2005-04-15 Shaun Pereira Fast select with no restriction on * response */ #define pr_fmt(fmt) "X25: " fmt #include <linux/module.h> #include <linux/capability.h> #include <linux/errno.h> #include <linux/kernel.h> #include <linux/sched/signal.h> #include <linux/timer.h> #include <linux/string.h> #include <linux/net.h> #include <linux/netdevice.h> #include <linux/if_arp.h> #include <linux/skbuff.h> #include <linux/slab.h> #include <net/sock.h> #include <net/tcp_states.h> #include <linux/uaccess.h> #include <linux/fcntl.h> #include <linux/termios.h> /* For TIOCINQ/OUTQ */ #include <linux/notifier.h> #include <linux/init.h> #include <linux/compat.h> #include <linux/ctype.h> #include <net/x25.h> #include <net/compat.h> int sysctl_x25_restart_request_timeout = X25_DEFAULT_T20; int sysctl_x25_call_request_timeout = X25_DEFAULT_T21; int sysctl_x25_reset_request_timeout = X25_DEFAULT_T22; int sysctl_x25_clear_request_timeout = X25_DEFAULT_T23; int sysctl_x25_ack_holdback_timeout = X25_DEFAULT_T2; int sysctl_x25_forward = 0; HLIST_HEAD(x25_list); DEFINE_RWLOCK(x25_list_lock); static const struct proto_ops x25_proto_ops; static const struct x25_address null_x25_address = {" "}; #ifdef CONFIG_COMPAT struct compat_x25_subscrip_struct { char device[200-sizeof(compat_ulong_t)]; compat_ulong_t global_facil_mask; compat_uint_t extended; }; #endif int x25_parse_address_block(struct sk_buff *skb, struct x25_address *called_addr, struct x25_address *calling_addr) { unsigned char len; int needed; int rc; if (!pskb_may_pull(skb, 1)) { /* packet has no address block */ rc = 0; goto empty; } len = *skb->data; needed = 1 + ((len >> 4) + (len & 0x0f) + 1) / 2; if (!pskb_may_pull(skb, needed)) { /* packet is too short to hold the addresses it claims to hold */ rc = -1; goto empty; } return x25_addr_ntoa(skb->data, called_addr, calling_addr); empty: *called_addr->x25_addr = 0; *calling_addr->x25_addr = 0; return rc; } int x25_addr_ntoa(unsigned char *p, struct x25_address *called_addr, struct x25_address *calling_addr) { unsigned int called_len, calling_len; char *called, *calling; unsigned int i; called_len = (*p >> 0) & 0x0F; calling_len = (*p >> 4) & 0x0F; called = called_addr->x25_addr; calling = calling_addr->x25_addr; p++; for (i = 0; i < (called_len + calling_len); i++) { if (i < called_len) { if (i % 2 != 0) { *called++ = ((*p >> 0) & 0x0F) + '0'; p++; } else { *called++ = ((*p >> 4) & 0x0F) + '0'; } } else { if (i % 2 != 0) { *calling++ = ((*p >> 0) & 0x0F) + '0'; p++; } else { *calling++ = ((*p >> 4) & 0x0F) + '0'; } } } *called = *calling = '\0'; return 1 + (called_len + calling_len + 1) / 2; } int x25_addr_aton(unsigned char *p, struct x25_address *called_addr, struct x25_address *calling_addr) { unsigned int called_len, calling_len; char *called, *calling; int i; called = called_addr->x25_addr; calling = calling_addr->x25_addr; called_len = strlen(called); calling_len = strlen(calling); *p++ = (calling_len << 4) | (called_len << 0); for (i = 0; i < (called_len + calling_len); i++) { if (i < called_len) { if (i % 2 != 0) { *p |= (*called++ - '0') << 0; p++; } else { *p = 0x00; *p |= (*called++ - '0') << 4; } } else { if (i % 2 != 0) { *p |= (*calling++ - '0') << 0; p++; } else { *p = 0x00; *p |= (*calling++ - '0') << 4; } } } return 1 + (called_len + calling_len + 1) / 2; } /* * Socket removal during an interrupt is now safe. */ static void x25_remove_socket(struct sock *sk) { write_lock_bh(&x25_list_lock); sk_del_node_init(sk); write_unlock_bh(&x25_list_lock); } /* * Handle device status changes. */ static int x25_device_event(struct notifier_block *this, unsigned long event, void *ptr) { struct net_device *dev = netdev_notifier_info_to_dev(ptr); struct x25_neigh *nb; if (!net_eq(dev_net(dev), &init_net)) return NOTIFY_DONE; if (dev->type == ARPHRD_X25) { switch (event) { case NETDEV_REGISTER: case NETDEV_POST_TYPE_CHANGE: x25_link_device_up(dev); break; case NETDEV_DOWN: nb = x25_get_neigh(dev); if (nb) { x25_link_terminated(nb); x25_neigh_put(nb); } x25_route_device_down(dev); break; case NETDEV_PRE_TYPE_CHANGE: case NETDEV_UNREGISTER: x25_link_device_down(dev); break; case NETDEV_CHANGE: if (!netif_carrier_ok(dev)) { nb = x25_get_neigh(dev); if (nb) { x25_link_terminated(nb); x25_neigh_put(nb); } } break; } } return NOTIFY_DONE; } /* * Add a socket to the bound sockets list. */ static void x25_insert_socket(struct sock *sk) { write_lock_bh(&x25_list_lock); sk_add_node(sk, &x25_list); write_unlock_bh(&x25_list_lock); } /* * Find a socket that wants to accept the Call Request we just * received. Check the full list for an address/cud match. * If no cuds match return the next_best thing, an address match. * Note: if a listening socket has cud set it must only get calls * with matching cud. */ static struct sock *x25_find_listener(struct x25_address *addr, struct sk_buff *skb) { struct sock *s; struct sock *next_best; read_lock_bh(&x25_list_lock); next_best = NULL; sk_for_each(s, &x25_list) if ((!strcmp(addr->x25_addr, x25_sk(s)->source_addr.x25_addr) || !strcmp(x25_sk(s)->source_addr.x25_addr, null_x25_address.x25_addr)) && s->sk_state == TCP_LISTEN) { /* * Found a listening socket, now check the incoming * call user data vs this sockets call user data */ if (x25_sk(s)->cudmatchlength > 0 && skb->len >= x25_sk(s)->cudmatchlength) { if((memcmp(x25_sk(s)->calluserdata.cuddata, skb->data, x25_sk(s)->cudmatchlength)) == 0) { sock_hold(s); goto found; } } else next_best = s; } if (next_best) { s = next_best; sock_hold(s); goto found; } s = NULL; found: read_unlock_bh(&x25_list_lock); return s; } /* * Find a connected X.25 socket given my LCI and neighbour. */ static struct sock *__x25_find_socket(unsigned int lci, struct x25_neigh *nb) { struct sock *s; sk_for_each(s, &x25_list) if (x25_sk(s)->lci == lci && x25_sk(s)->neighbour == nb) { sock_hold(s); goto found; } s = NULL; found: return s; } struct sock *x25_find_socket(unsigned int lci, struct x25_neigh *nb) { struct sock *s; read_lock_bh(&x25_list_lock); s = __x25_find_socket(lci, nb); read_unlock_bh(&x25_list_lock); return s; } /* * Find a unique LCI for a given device. */ static unsigned int x25_new_lci(struct x25_neigh *nb) { unsigned int lci = 1; struct sock *sk; while ((sk = x25_find_socket(lci, nb)) != NULL) { sock_put(sk); if (++lci == 4096) { lci = 0; break; } cond_resched(); } return lci; } /* * Deferred destroy. */ static void __x25_destroy_socket(struct sock *); /* * handler for deferred kills. */ static void x25_destroy_timer(struct timer_list *t) { struct sock *sk = timer_container_of(sk, t, sk_timer); x25_destroy_socket_from_timer(sk); } /* * This is called from user mode and the timers. Thus it protects itself * against interrupting users but doesn't worry about being called during * work. Once it is removed from the queue no interrupt or bottom half * will touch it and we are (fairly 8-) ) safe. * Not static as it's used by the timer */ static void __x25_destroy_socket(struct sock *sk) { struct sk_buff *skb; x25_stop_heartbeat(sk); x25_stop_timer(sk); x25_remove_socket(sk); x25_clear_queues(sk); /* Flush the queues */ while ((skb = skb_dequeue(&sk->sk_receive_queue)) != NULL) { if (skb->sk != sk) { /* A pending connection */ /* * Queue the unaccepted socket for death */ skb->sk->sk_state = TCP_LISTEN; sock_set_flag(skb->sk, SOCK_DEAD); x25_start_heartbeat(skb->sk); x25_sk(skb->sk)->state = X25_STATE_0; } kfree_skb(skb); } if (sk_has_allocations(sk)) { /* Defer: outstanding buffers */ sk->sk_timer.expires = jiffies + 10 * HZ; sk->sk_timer.function = x25_destroy_timer; add_timer(&sk->sk_timer); } else { /* drop last reference so sock_put will free */ __sock_put(sk); } } void x25_destroy_socket_from_timer(struct sock *sk) { sock_hold(sk); bh_lock_sock(sk); __x25_destroy_socket(sk); bh_unlock_sock(sk); sock_put(sk); } /* * Handling for system calls applied via the various interfaces to a * X.25 socket object. */ static int x25_setsockopt(struct socket *sock, int level, int optname, sockptr_t optval, unsigned int optlen) { int opt; struct sock *sk = sock->sk; int rc = -ENOPROTOOPT; if (level != SOL_X25 || optname != X25_QBITINCL) goto out; rc = -EINVAL; if (optlen < sizeof(int)) goto out; rc = -EFAULT; if (copy_from_sockptr(&opt, optval, sizeof(int))) goto out; if (opt) set_bit(X25_Q_BIT_FLAG, &x25_sk(sk)->flags); else clear_bit(X25_Q_BIT_FLAG, &x25_sk(sk)->flags); rc = 0; out: return rc; } static int x25_getsockopt(struct socket *sock, int level, int optname, char __user *optval, int __user *optlen) { struct sock *sk = sock->sk; int val, len, rc = -ENOPROTOOPT; if (level != SOL_X25 || optname != X25_QBITINCL) goto out; rc = -EFAULT; if (get_user(len, optlen)) goto out; rc = -EINVAL; if (len < 0) goto out; len = min_t(unsigned int, len, sizeof(int)); rc = -EFAULT; if (put_user(len, optlen)) goto out; val = test_bit(X25_Q_BIT_FLAG, &x25_sk(sk)->flags); rc = copy_to_user(optval, &val, len) ? -EFAULT : 0; out: return rc; } static int x25_listen(struct socket *sock, int backlog) { struct sock *sk = sock->sk; int rc = -EOPNOTSUPP; lock_sock(sk); if (sock->state != SS_UNCONNECTED) { rc = -EINVAL; release_sock(sk); return rc; } if (sk->sk_state != TCP_LISTEN) { memset(&x25_sk(sk)->dest_addr, 0, X25_ADDR_LEN); sk->sk_max_ack_backlog = backlog; sk->sk_state = TCP_LISTEN; rc = 0; } release_sock(sk); return rc; } static struct proto x25_proto = { .name = "X25", .owner = THIS_MODULE, .obj_size = sizeof(struct x25_sock), }; static struct sock *x25_alloc_socket(struct net *net, int kern) { struct x25_sock *x25; struct sock *sk = sk_alloc(net, AF_X25, GFP_ATOMIC, &x25_proto, kern); if (!sk) goto out; sock_init_data(NULL, sk); x25 = x25_sk(sk); skb_queue_head_init(&x25->ack_queue); skb_queue_head_init(&x25->fragment_queue); skb_queue_head_init(&x25->interrupt_in_queue); skb_queue_head_init(&x25->interrupt_out_queue); out: return sk; } static int x25_create(struct net *net, struct socket *sock, int protocol, int kern) { struct sock *sk; struct x25_sock *x25; int rc = -EAFNOSUPPORT; if (!net_eq(net, &init_net)) goto out; rc = -ESOCKTNOSUPPORT; if (sock->type != SOCK_SEQPACKET) goto out; rc = -EINVAL; if (protocol) goto out; rc = -ENOMEM; if ((sk = x25_alloc_socket(net, kern)) == NULL) goto out; x25 = x25_sk(sk); sock_init_data(sock, sk); x25_init_timers(sk); sock->ops = &x25_proto_ops; sk->sk_protocol = protocol; sk->sk_backlog_rcv = x25_backlog_rcv; x25->t21 = sysctl_x25_call_request_timeout; x25->t22 = sysctl_x25_reset_request_timeout; x25->t23 = sysctl_x25_clear_request_timeout; x25->t2 = sysctl_x25_ack_holdback_timeout; x25->state = X25_STATE_0; x25->cudmatchlength = 0; set_bit(X25_ACCPT_APPRV_FLAG, &x25->flags); /* normally no cud */ /* on call accept */ x25->facilities.winsize_in = X25_DEFAULT_WINDOW_SIZE; x25->facilities.winsize_out = X25_DEFAULT_WINDOW_SIZE; x25->facilities.pacsize_in = X25_DEFAULT_PACKET_SIZE; x25->facilities.pacsize_out = X25_DEFAULT_PACKET_SIZE; x25->facilities.throughput = 0; /* by default don't negotiate throughput */ x25->facilities.reverse = X25_DEFAULT_REVERSE; x25->dte_facilities.calling_len = 0; x25->dte_facilities.called_len = 0; memset(x25->dte_facilities.called_ae, '\0', sizeof(x25->dte_facilities.called_ae)); memset(x25->dte_facilities.calling_ae, '\0', sizeof(x25->dte_facilities.calling_ae)); rc = 0; out: return rc; } static struct sock *x25_make_new(struct sock *osk) { struct sock *sk = NULL; struct x25_sock *x25, *ox25; if (osk->sk_type != SOCK_SEQPACKET) goto out; if ((sk = x25_alloc_socket(sock_net(osk), 0)) == NULL) goto out; x25 = x25_sk(sk); sk->sk_type = osk->sk_type; sk->sk_priority = READ_ONCE(osk->sk_priority); sk->sk_protocol = osk->sk_protocol; sk->sk_rcvbuf = osk->sk_rcvbuf; sk->sk_sndbuf = osk->sk_sndbuf; sk->sk_state = TCP_ESTABLISHED; sk->sk_backlog_rcv = osk->sk_backlog_rcv; sock_copy_flags(sk, osk); ox25 = x25_sk(osk); x25->t21 = ox25->t21; x25->t22 = ox25->t22; x25->t23 = ox25->t23; x25->t2 = ox25->t2; x25->flags = ox25->flags; x25->facilities = ox25->facilities; x25->dte_facilities = ox25->dte_facilities; x25->cudmatchlength = ox25->cudmatchlength; clear_bit(X25_INTERRUPT_FLAG, &x25->flags); x25_init_timers(sk); out: return sk; } static int x25_release(struct socket *sock) { struct sock *sk = sock->sk; struct x25_sock *x25; if (!sk) return 0; x25 = x25_sk(sk); sock_hold(sk); lock_sock(sk); switch (x25->state) { case X25_STATE_0: case X25_STATE_2: x25_disconnect(sk, 0, 0, 0); __x25_destroy_socket(sk); goto out; case X25_STATE_1: case X25_STATE_3: case X25_STATE_4: x25_clear_queues(sk); x25_write_internal(sk, X25_CLEAR_REQUEST); x25_start_t23timer(sk); x25->state = X25_STATE_2; sk->sk_state = TCP_CLOSE; sk->sk_shutdown |= SEND_SHUTDOWN; sk->sk_state_change(sk); sock_set_flag(sk, SOCK_DEAD); sock_set_flag(sk, SOCK_DESTROY); break; case X25_STATE_5: x25_write_internal(sk, X25_CLEAR_REQUEST); x25_disconnect(sk, 0, 0, 0); __x25_destroy_socket(sk); goto out; } sock_orphan(sk); out: release_sock(sk); sock_put(sk); return 0; } static int x25_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len) { struct sock *sk = sock->sk; struct sockaddr_x25 *addr = (struct sockaddr_x25 *)uaddr; int len, i, rc = 0; if (addr_len != sizeof(struct sockaddr_x25) || addr->sx25_family != AF_X25 || strnlen(addr->sx25_addr.x25_addr, X25_ADDR_LEN) == X25_ADDR_LEN) { rc = -EINVAL; goto out; } /* check for the null_x25_address */ if (strcmp(addr->sx25_addr.x25_addr, null_x25_address.x25_addr)) { len = strlen(addr->sx25_addr.x25_addr); for (i = 0; i < len; i++) { if (!isdigit(addr->sx25_addr.x25_addr[i])) { rc = -EINVAL; goto out; } } } lock_sock(sk); if (sock_flag(sk, SOCK_ZAPPED)) { x25_sk(sk)->source_addr = addr->sx25_addr; x25_insert_socket(sk); sock_reset_flag(sk, SOCK_ZAPPED); } else { rc = -EINVAL; } release_sock(sk); net_dbg_ratelimited("x25_bind: socket is bound\n"); out: return rc; } static int x25_wait_for_connection_establishment(struct sock *sk) { DECLARE_WAITQUEUE(wait, current); int rc; add_wait_queue_exclusive(sk_sleep(sk), &wait); for (;;) { __set_current_state(TASK_INTERRUPTIBLE); rc = -ERESTARTSYS; if (signal_pending(current)) break; rc = sock_error(sk); if (rc) { sk->sk_socket->state = SS_UNCONNECTED; break; } rc = -ENOTCONN; if (sk->sk_state == TCP_CLOSE) { sk->sk_socket->state = SS_UNCONNECTED; break; } rc = 0; if (sk->sk_state != TCP_ESTABLISHED) { release_sock(sk); schedule(); lock_sock(sk); } else break; } __set_current_state(TASK_RUNNING); remove_wait_queue(sk_sleep(sk), &wait); return rc; } static int x25_connect(struct socket *sock, struct sockaddr *uaddr, int addr_len, int flags) { struct sock *sk = sock->sk; struct x25_sock *x25 = x25_sk(sk); struct sockaddr_x25 *addr = (struct sockaddr_x25 *)uaddr; struct x25_route *rt; int rc = 0; lock_sock(sk); if (sk->sk_state == TCP_ESTABLISHED && sock->state == SS_CONNECTING) { sock->state = SS_CONNECTED; goto out; /* Connect completed during a ERESTARTSYS event */ } rc = -ECONNREFUSED; if (sk->sk_state == TCP_CLOSE && sock->state == SS_CONNECTING) { sock->state = SS_UNCONNECTED; goto out; } rc = -EISCONN; /* No reconnect on a seqpacket socket */ if (sk->sk_state == TCP_ESTABLISHED) goto out; rc = -EALREADY; /* Do nothing if call is already in progress */ if (sk->sk_state == TCP_SYN_SENT) goto out; sk->sk_state = TCP_CLOSE; sock->state = SS_UNCONNECTED; rc = -EINVAL; if (addr_len != sizeof(struct sockaddr_x25) || addr->sx25_family != AF_X25 || strnlen(addr->sx25_addr.x25_addr, X25_ADDR_LEN) == X25_ADDR_LEN) goto out; rc = -ENETUNREACH; rt = x25_get_route(&addr->sx25_addr); if (!rt) goto out; x25->neighbour = x25_get_neigh(rt->dev); if (!x25->neighbour) goto out_put_route; x25_limit_facilities(&x25->facilities, x25->neighbour); x25->lci = x25_new_lci(x25->neighbour); if (!x25->lci) goto out_put_neigh; rc = -EINVAL; if (sock_flag(sk, SOCK_ZAPPED)) /* Must bind first - autobinding does not work */ goto out_put_neigh; if (!strcmp(x25->source_addr.x25_addr, null_x25_address.x25_addr)) memset(&x25->source_addr, '\0', X25_ADDR_LEN); x25->dest_addr = addr->sx25_addr; /* Move to connecting socket, start sending Connect Requests */ sock->state = SS_CONNECTING; sk->sk_state = TCP_SYN_SENT; x25->state = X25_STATE_1; x25_write_internal(sk, X25_CALL_REQUEST); x25_start_heartbeat(sk); x25_start_t21timer(sk); /* Now the loop */ rc = -EINPROGRESS; if (sk->sk_state != TCP_ESTABLISHED && (flags & O_NONBLOCK)) goto out; rc = x25_wait_for_connection_establishment(sk); if (rc) goto out_put_neigh; sock->state = SS_CONNECTED; rc = 0; out_put_neigh: if (rc && x25->neighbour) { read_lock_bh(&x25_list_lock); x25_neigh_put(x25->neighbour); x25->neighbour = NULL; read_unlock_bh(&x25_list_lock); x25->state = X25_STATE_0; } out_put_route: x25_route_put(rt); out: release_sock(sk); return rc; } static int x25_wait_for_data(struct sock *sk, long timeout) { DECLARE_WAITQUEUE(wait, current); int rc = 0; add_wait_queue_exclusive(sk_sleep(sk), &wait); for (;;) { __set_current_state(TASK_INTERRUPTIBLE); if (sk->sk_shutdown & RCV_SHUTDOWN) break; rc = -ERESTARTSYS; if (signal_pending(current)) break; rc = -EAGAIN; if (!timeout) break; rc = 0; if (skb_queue_empty(&sk->sk_receive_queue)) { release_sock(sk); timeout = schedule_timeout(timeout); lock_sock(sk); } else break; } __set_current_state(TASK_RUNNING); remove_wait_queue(sk_sleep(sk), &wait); return rc; } static int x25_accept(struct socket *sock, struct socket *newsock, struct proto_accept_arg *arg) { struct sock *sk = sock->sk; struct sock *newsk; struct sk_buff *skb; int rc = -EINVAL; if (!sk) goto out; rc = -EOPNOTSUPP; if (sk->sk_type != SOCK_SEQPACKET) goto out; lock_sock(sk); rc = -EINVAL; if (sk->sk_state != TCP_LISTEN) goto out2; rc = x25_wait_for_data(sk, sk->sk_rcvtimeo); if (rc) goto out2; skb = skb_dequeue(&sk->sk_receive_queue); rc = -EINVAL; if (!skb->sk) goto out2; newsk = skb->sk; sock_graft(newsk, newsock); /* Now attach up the new socket */ skb->sk = NULL; kfree_skb(skb); sk_acceptq_removed(sk); newsock->state = SS_CONNECTED; rc = 0; out2: release_sock(sk); out: return rc; } static int x25_getname(struct socket *sock, struct sockaddr *uaddr, int peer) { struct sockaddr_x25 *sx25 = (struct sockaddr_x25 *)uaddr; struct sock *sk = sock->sk; struct x25_sock *x25 = x25_sk(sk); int rc = 0; if (peer) { if (sk->sk_state != TCP_ESTABLISHED) { rc = -ENOTCONN; goto out; } sx25->sx25_addr = x25->dest_addr; } else sx25->sx25_addr = x25->source_addr; sx25->sx25_family = AF_X25; rc = sizeof(*sx25); out: return rc; } int x25_rx_call_request(struct sk_buff *skb, struct x25_neigh *nb, unsigned int lci) { struct sock *sk; struct sock *make; struct x25_sock *makex25; struct x25_address source_addr, dest_addr; struct x25_facilities facilities; struct x25_dte_facilities dte_facilities; int len, addr_len, rc; /* * Remove the LCI and frame type. */ skb_pull(skb, X25_STD_MIN_LEN); /* * Extract the X.25 addresses and convert them to ASCII strings, * and remove them. * * Address block is mandatory in call request packets */ addr_len = x25_parse_address_block(skb, &source_addr, &dest_addr); if (addr_len <= 0) goto out_clear_request; skb_pull(skb, addr_len); /* * Get the length of the facilities, skip past them for the moment * get the call user data because this is needed to determine * the correct listener * * Facilities length is mandatory in call request packets */ if (!pskb_may_pull(skb, 1)) goto out_clear_request; len = skb->data[0] + 1; if (!pskb_may_pull(skb, len)) goto out_clear_request; skb_pull(skb,len); /* * Ensure that the amount of call user data is valid. */ if (skb->len > X25_MAX_CUD_LEN) goto out_clear_request; /* * Get all the call user data so it can be used in * x25_find_listener and skb_copy_from_linear_data up ahead. */ if (!pskb_may_pull(skb, skb->len)) goto out_clear_request; /* * Find a listener for the particular address/cud pair. */ sk = x25_find_listener(&source_addr,skb); skb_push(skb,len); if (sk != NULL && sk_acceptq_is_full(sk)) { goto out_sock_put; } /* * We dont have any listeners for this incoming call. * Try forwarding it. */ if (sk == NULL) { skb_push(skb, addr_len + X25_STD_MIN_LEN); if (sysctl_x25_forward && x25_forward_call(&dest_addr, nb, skb, lci) > 0) { /* Call was forwarded, dont process it any more */ kfree_skb(skb); rc = 1; goto out; } else { /* No listeners, can't forward, clear the call */ goto out_clear_request; } } /* * Try to reach a compromise on the requested facilities. */ len = x25_negotiate_facilities(skb, sk, &facilities, &dte_facilities); if (len == -1) goto out_sock_put; /* * current neighbour/link might impose additional limits * on certain facilities */ x25_limit_facilities(&facilities, nb); /* * Try to create a new socket. */ make = x25_make_new(sk); if (!make) goto out_sock_put; /* * Remove the facilities */ skb_pull(skb, len); skb->sk = make; make->sk_state = TCP_ESTABLISHED; makex25 = x25_sk(make); makex25->lci = lci; makex25->dest_addr = dest_addr; makex25->source_addr = source_addr; x25_neigh_hold(nb); makex25->neighbour = nb; makex25->facilities = facilities; makex25->dte_facilities= dte_facilities; makex25->vc_facil_mask = x25_sk(sk)->vc_facil_mask; /* ensure no reverse facil on accept */ makex25->vc_facil_mask &= ~X25_MASK_REVERSE; /* ensure no calling address extension on accept */ makex25->vc_facil_mask &= ~X25_MASK_CALLING_AE; makex25->cudmatchlength = x25_sk(sk)->cudmatchlength; /* Normally all calls are accepted immediately */ if (test_bit(X25_ACCPT_APPRV_FLAG, &makex25->flags)) { x25_write_internal(make, X25_CALL_ACCEPTED); makex25->state = X25_STATE_3; } else { makex25->state = X25_STATE_5; } /* * Incoming Call User Data. */ skb_copy_from_linear_data(skb, makex25->calluserdata.cuddata, skb->len); makex25->calluserdata.cudlength = skb->len; sk_acceptq_added(sk); x25_insert_socket(make); skb_queue_head(&sk->sk_receive_queue, skb); x25_start_heartbeat(make); if (!sock_flag(sk, SOCK_DEAD)) sk->sk_data_ready(sk); rc = 1; sock_put(sk); out: return rc; out_sock_put: sock_put(sk); out_clear_request: rc = 0; x25_transmit_clear_request(nb, lci, 0x01); goto out; } static int x25_sendmsg(struct socket *sock, struct msghdr *msg, size_t len) { struct sock *sk = sock->sk; struct x25_sock *x25 = x25_sk(sk); DECLARE_SOCKADDR(struct sockaddr_x25 *, usx25, msg->msg_name); struct sockaddr_x25 sx25; struct sk_buff *skb; unsigned char *asmptr; int noblock = msg->msg_flags & MSG_DONTWAIT; size_t size; int qbit = 0, rc = -EINVAL; lock_sock(sk); if (msg->msg_flags & ~(MSG_DONTWAIT|MSG_OOB|MSG_EOR|MSG_CMSG_COMPAT)) goto out; /* we currently don't support segmented records at the user interface */ if (!(msg->msg_flags & (MSG_EOR|MSG_OOB))) goto out; rc = -EADDRNOTAVAIL; if (sock_flag(sk, SOCK_ZAPPED)) goto out; rc = -EPIPE; if (sk->sk_shutdown & SEND_SHUTDOWN) { send_sig(SIGPIPE, current, 0); goto out; } rc = -ENETUNREACH; if (!x25->neighbour) goto out; if (usx25) { rc = -EINVAL; if (msg->msg_namelen < sizeof(sx25)) goto out; memcpy(&sx25, usx25, sizeof(sx25)); rc = -EISCONN; if (strcmp(x25->dest_addr.x25_addr, sx25.sx25_addr.x25_addr)) goto out; rc = -EINVAL; if (sx25.sx25_family != AF_X25) goto out; } else { /* * FIXME 1003.1g - if the socket is like this because * it has become closed (not started closed) we ought * to SIGPIPE, EPIPE; */ rc = -ENOTCONN; if (sk->sk_state != TCP_ESTABLISHED) goto out; sx25.sx25_family = AF_X25; sx25.sx25_addr = x25->dest_addr; } /* Sanity check the packet size */ if (len > 65535) { rc = -EMSGSIZE; goto out; } net_dbg_ratelimited("x25_sendmsg: sendto: Addresses built.\n"); /* Build a packet */ net_dbg_ratelimited("x25_sendmsg: sendto: building packet.\n"); if ((msg->msg_flags & MSG_OOB) && len > 32) len = 32; size = len + X25_MAX_L2_LEN + X25_EXT_MIN_LEN; release_sock(sk); skb = sock_alloc_send_skb(sk, size, noblock, &rc); lock_sock(sk); if (!skb) goto out; X25_SKB_CB(skb)->flags = msg->msg_flags; skb_reserve(skb, X25_MAX_L2_LEN + X25_EXT_MIN_LEN); /* * Put the data on the end */ net_dbg_ratelimited("x25_sendmsg: Copying user data\n"); skb_reset_transport_header(skb); skb_put(skb, len); rc = memcpy_from_msg(skb_transport_header(skb), msg, len); if (rc) goto out_kfree_skb; /* * If the Q BIT Include socket option is in force, the first * byte of the user data is the logical value of the Q Bit. */ if (test_bit(X25_Q_BIT_FLAG, &x25->flags)) { if (!pskb_may_pull(skb, 1)) goto out_kfree_skb; qbit = skb->data[0]; skb_pull(skb, 1); } /* * Push down the X.25 header */ net_dbg_ratelimited("x25_sendmsg: Building X.25 Header.\n"); if (msg->msg_flags & MSG_OOB) { if (x25->neighbour->extended) { asmptr = skb_push(skb, X25_STD_MIN_LEN); *asmptr++ = ((x25->lci >> 8) & 0x0F) | X25_GFI_EXTSEQ; *asmptr++ = (x25->lci >> 0) & 0xFF; *asmptr++ = X25_INTERRUPT; } else { asmptr = skb_push(skb, X25_STD_MIN_LEN); *asmptr++ = ((x25->lci >> 8) & 0x0F) | X25_GFI_STDSEQ; *asmptr++ = (x25->lci >> 0) & 0xFF; *asmptr++ = X25_INTERRUPT; } } else { if (x25->neighbour->extended) { /* Build an Extended X.25 header */ asmptr = skb_push(skb, X25_EXT_MIN_LEN); *asmptr++ = ((x25->lci >> 8) & 0x0F) | X25_GFI_EXTSEQ; *asmptr++ = (x25->lci >> 0) & 0xFF; *asmptr++ = X25_DATA; *asmptr++ = X25_DATA; } else { /* Build an Standard X.25 header */ asmptr = skb_push(skb, X25_STD_MIN_LEN); *asmptr++ = ((x25->lci >> 8) & 0x0F) | X25_GFI_STDSEQ; *asmptr++ = (x25->lci >> 0) & 0xFF; *asmptr++ = X25_DATA; } if (qbit) skb->data[0] |= X25_Q_BIT; } net_dbg_ratelimited("x25_sendmsg: Built header.\n"); net_dbg_ratelimited("x25_sendmsg: Transmitting buffer\n"); rc = -ENOTCONN; if (sk->sk_state != TCP_ESTABLISHED) goto out_kfree_skb; if (msg->msg_flags & MSG_OOB) skb_queue_tail(&x25->interrupt_out_queue, skb); else { rc = x25_output(sk, skb); len = rc; if (rc < 0) kfree_skb(skb); else if (test_bit(X25_Q_BIT_FLAG, &x25->flags)) len++; } x25_kick(sk); rc = len; out: release_sock(sk); return rc; out_kfree_skb: kfree_skb(skb); goto out; } static int x25_recvmsg(struct socket *sock, struct msghdr *msg, size_t size, int flags) { struct sock *sk = sock->sk; struct x25_sock *x25 = x25_sk(sk); DECLARE_SOCKADDR(struct sockaddr_x25 *, sx25, msg->msg_name); size_t copied; int qbit, header_len; struct sk_buff *skb; unsigned char *asmptr; int rc = -ENOTCONN; lock_sock(sk); if (x25->neighbour == NULL) goto out; header_len = x25->neighbour->extended ? X25_EXT_MIN_LEN : X25_STD_MIN_LEN; /* * This works for seqpacket too. The receiver has ordered the queue for * us! We do one quick check first though */ if (sk->sk_state != TCP_ESTABLISHED) goto out; if (flags & MSG_OOB) { rc = -EINVAL; if (sock_flag(sk, SOCK_URGINLINE) || !skb_peek(&x25->interrupt_in_queue)) goto out; skb = skb_dequeue(&x25->interrupt_in_queue); if (!pskb_may_pull(skb, X25_STD_MIN_LEN)) goto out_free_dgram; skb_pull(skb, X25_STD_MIN_LEN); /* * No Q bit information on Interrupt data. */ if (test_bit(X25_Q_BIT_FLAG, &x25->flags)) { asmptr = skb_push(skb, 1); *asmptr = 0x00; } msg->msg_flags |= MSG_OOB; } else { /* Now we can treat all alike */ release_sock(sk); skb = skb_recv_datagram(sk, flags, &rc); lock_sock(sk); if (!skb) goto out; if (!pskb_may_pull(skb, header_len)) goto out_free_dgram; qbit = (skb->data[0] & X25_Q_BIT) == X25_Q_BIT; skb_pull(skb, header_len); if (test_bit(X25_Q_BIT_FLAG, &x25->flags)) { asmptr = skb_push(skb, 1); *asmptr = qbit; } } skb_reset_transport_header(skb); copied = skb->len; if (copied > size) { copied = size; msg->msg_flags |= MSG_TRUNC; } /* Currently, each datagram always contains a complete record */ msg->msg_flags |= MSG_EOR; rc = skb_copy_datagram_msg(skb, 0, msg, copied); if (rc) goto out_free_dgram; if (sx25) { sx25->sx25_family = AF_X25; sx25->sx25_addr = x25->dest_addr; msg->msg_namelen = sizeof(*sx25); } x25_check_rbuf(sk); rc = copied; out_free_dgram: skb_free_datagram(sk, skb); out: release_sock(sk); return rc; } static int x25_ioctl(struct socket *sock, unsigned int cmd, unsigned long arg) { struct sock *sk = sock->sk; struct x25_sock *x25 = x25_sk(sk); void __user *argp = (void __user *)arg; int rc; switch (cmd) { case TIOCOUTQ: { int amount; amount = sk->sk_sndbuf - sk_wmem_alloc_get(sk); if (amount < 0) amount = 0; rc = put_user(amount, (unsigned int __user *)argp); break; } case TIOCINQ: { struct sk_buff *skb; int amount = 0; /* * These two are safe on a single CPU system as * only user tasks fiddle here */ lock_sock(sk); if ((skb = skb_peek(&sk->sk_receive_queue)) != NULL) amount = skb->len; release_sock(sk); rc = put_user(amount, (unsigned int __user *)argp); break; } case SIOCGIFADDR: case SIOCSIFADDR: case SIOCGIFDSTADDR: case SIOCSIFDSTADDR: case SIOCGIFBRDADDR: case SIOCSIFBRDADDR: case SIOCGIFNETMASK: case SIOCSIFNETMASK: case SIOCGIFMETRIC: case SIOCSIFMETRIC: rc = -EINVAL; break; case SIOCADDRT: case SIOCDELRT: rc = -EPERM; if (!capable(CAP_NET_ADMIN)) break; rc = x25_route_ioctl(cmd, argp); break; case SIOCX25GSUBSCRIP: rc = x25_subscr_ioctl(cmd, argp); break; case SIOCX25SSUBSCRIP: rc = -EPERM; if (!capable(CAP_NET_ADMIN)) break; rc = x25_subscr_ioctl(cmd, argp); break; case SIOCX25GFACILITIES: { lock_sock(sk); rc = copy_to_user(argp, &x25->facilities, sizeof(x25->facilities)) ? -EFAULT : 0; release_sock(sk); break; } case SIOCX25SFACILITIES: { struct x25_facilities facilities; rc = -EFAULT; if (copy_from_user(&facilities, argp, sizeof(facilities))) break; rc = -EINVAL; lock_sock(sk); if (sk->sk_state != TCP_LISTEN && sk->sk_state != TCP_CLOSE) goto out_fac_release; if (facilities.pacsize_in < X25_PS16 || facilities.pacsize_in > X25_PS4096) goto out_fac_release; if (facilities.pacsize_out < X25_PS16 || facilities.pacsize_out > X25_PS4096) goto out_fac_release; if (facilities.winsize_in < 1 || facilities.winsize_in > 127) goto out_fac_release; if (facilities.throughput) { int out = facilities.throughput & 0xf0; int in = facilities.throughput & 0x0f; if (!out) facilities.throughput |= X25_DEFAULT_THROUGHPUT << 4; else if (out < 0x30 || out > 0xD0) goto out_fac_release; if (!in) facilities.throughput |= X25_DEFAULT_THROUGHPUT; else if (in < 0x03 || in > 0x0D) goto out_fac_release; } if (facilities.reverse && (facilities.reverse & 0x81) != 0x81) goto out_fac_release; x25->facilities = facilities; rc = 0; out_fac_release: release_sock(sk); break; } case SIOCX25GDTEFACILITIES: { lock_sock(sk); rc = copy_to_user(argp, &x25->dte_facilities, sizeof(x25->dte_facilities)); release_sock(sk); if (rc) rc = -EFAULT; break; } case SIOCX25SDTEFACILITIES: { struct x25_dte_facilities dtefacs; rc = -EFAULT; if (copy_from_user(&dtefacs, argp, sizeof(dtefacs))) break; rc = -EINVAL; lock_sock(sk); if (sk->sk_state != TCP_LISTEN && sk->sk_state != TCP_CLOSE) goto out_dtefac_release; if (dtefacs.calling_len > X25_MAX_AE_LEN) goto out_dtefac_release; if (dtefacs.called_len > X25_MAX_AE_LEN) goto out_dtefac_release; x25->dte_facilities = dtefacs; rc = 0; out_dtefac_release: release_sock(sk); break; } case SIOCX25GCALLUSERDATA: { lock_sock(sk); rc = copy_to_user(argp, &x25->calluserdata, sizeof(x25->calluserdata)) ? -EFAULT : 0; release_sock(sk); break; } case SIOCX25SCALLUSERDATA: { struct x25_calluserdata calluserdata; rc = -EFAULT; if (copy_from_user(&calluserdata, argp, sizeof(calluserdata))) break; rc = -EINVAL; if (calluserdata.cudlength > X25_MAX_CUD_LEN) break; lock_sock(sk); x25->calluserdata = calluserdata; release_sock(sk); rc = 0; break; } case SIOCX25GCAUSEDIAG: { lock_sock(sk); rc = copy_to_user(argp, &x25->causediag, sizeof(x25->causediag)) ? -EFAULT : 0; release_sock(sk); break; } case SIOCX25SCAUSEDIAG: { struct x25_causediag causediag; rc = -EFAULT; if (copy_from_user(&causediag, argp, sizeof(causediag))) break; lock_sock(sk); x25->causediag = causediag; release_sock(sk); rc = 0; break; } case SIOCX25SCUDMATCHLEN: { struct x25_subaddr sub_addr; rc = -EINVAL; lock_sock(sk); if(sk->sk_state != TCP_CLOSE) goto out_cud_release; rc = -EFAULT; if (copy_from_user(&sub_addr, argp, sizeof(sub_addr))) goto out_cud_release; rc = -EINVAL; if (sub_addr.cudmatchlength > X25_MAX_CUD_LEN) goto out_cud_release; x25->cudmatchlength = sub_addr.cudmatchlength; rc = 0; out_cud_release: release_sock(sk); break; } case SIOCX25CALLACCPTAPPRV: { rc = -EINVAL; lock_sock(sk); if (sk->sk_state == TCP_CLOSE) { clear_bit(X25_ACCPT_APPRV_FLAG, &x25->flags); rc = 0; } release_sock(sk); break; } case SIOCX25SENDCALLACCPT: { rc = -EINVAL; lock_sock(sk); if (sk->sk_state != TCP_ESTABLISHED) goto out_sendcallaccpt_release; /* must call accptapprv above */ if (test_bit(X25_ACCPT_APPRV_FLAG, &x25->flags)) goto out_sendcallaccpt_release; x25_write_internal(sk, X25_CALL_ACCEPTED); x25->state = X25_STATE_3; rc = 0; out_sendcallaccpt_release: release_sock(sk); break; } default: rc = -ENOIOCTLCMD; break; } return rc; } static const struct net_proto_family x25_family_ops = { .family = AF_X25, .create = x25_create, .owner = THIS_MODULE, }; #ifdef CONFIG_COMPAT static int compat_x25_subscr_ioctl(unsigned int cmd, struct compat_x25_subscrip_struct __user *x25_subscr32) { struct compat_x25_subscrip_struct x25_subscr; struct x25_neigh *nb; struct net_device *dev; int rc = -EINVAL; rc = -EFAULT; if (copy_from_user(&x25_subscr, x25_subscr32, sizeof(*x25_subscr32))) goto out; rc = -EINVAL; dev = x25_dev_get(x25_subscr.device); if (dev == NULL) goto out; nb = x25_get_neigh(dev); if (nb == NULL) goto out_dev_put; dev_put(dev); if (cmd == SIOCX25GSUBSCRIP) { read_lock_bh(&x25_neigh_list_lock); x25_subscr.extended = nb->extended; x25_subscr.global_facil_mask = nb->global_facil_mask; read_unlock_bh(&x25_neigh_list_lock); rc = copy_to_user(x25_subscr32, &x25_subscr, sizeof(*x25_subscr32)) ? -EFAULT : 0; } else { rc = -EINVAL; if (x25_subscr.extended == 0 || x25_subscr.extended == 1) { rc = 0; write_lock_bh(&x25_neigh_list_lock); nb->extended = x25_subscr.extended; nb->global_facil_mask = x25_subscr.global_facil_mask; write_unlock_bh(&x25_neigh_list_lock); } } x25_neigh_put(nb); out: return rc; out_dev_put: dev_put(dev); goto out; } static int compat_x25_ioctl(struct socket *sock, unsigned int cmd, unsigned long arg) { void __user *argp = compat_ptr(arg); int rc = -ENOIOCTLCMD; switch(cmd) { case TIOCOUTQ: case TIOCINQ: rc = x25_ioctl(sock, cmd, (unsigned long)argp); break; case SIOCGIFADDR: case SIOCSIFADDR: case SIOCGIFDSTADDR: case SIOCSIFDSTADDR: case SIOCGIFBRDADDR: case SIOCSIFBRDADDR: case SIOCGIFNETMASK: case SIOCSIFNETMASK: case SIOCGIFMETRIC: case SIOCSIFMETRIC: rc = -EINVAL; break; case SIOCADDRT: case SIOCDELRT: rc = -EPERM; if (!capable(CAP_NET_ADMIN)) break; rc = x25_route_ioctl(cmd, argp); break; case SIOCX25GSUBSCRIP: rc = compat_x25_subscr_ioctl(cmd, argp); break; case SIOCX25SSUBSCRIP: rc = -EPERM; if (!capable(CAP_NET_ADMIN)) break; rc = compat_x25_subscr_ioctl(cmd, argp); break; case SIOCX25GFACILITIES: case SIOCX25SFACILITIES: case SIOCX25GDTEFACILITIES: case SIOCX25SDTEFACILITIES: case SIOCX25GCALLUSERDATA: case SIOCX25SCALLUSERDATA: case SIOCX25GCAUSEDIAG: case SIOCX25SCAUSEDIAG: case SIOCX25SCUDMATCHLEN: case SIOCX25CALLACCPTAPPRV: case SIOCX25SENDCALLACCPT: rc = x25_ioctl(sock, cmd, (unsigned long)argp); break; default: rc = -ENOIOCTLCMD; break; } return rc; } #endif static const struct proto_ops x25_proto_ops = { .family = AF_X25, .owner = THIS_MODULE, .release = x25_release, .bind = x25_bind, .connect = x25_connect, .socketpair = sock_no_socketpair, .accept = x25_accept, .getname = x25_getname, .poll = datagram_poll, .ioctl = x25_ioctl, #ifdef CONFIG_COMPAT .compat_ioctl = compat_x25_ioctl, #endif .gettstamp = sock_gettstamp, .listen = x25_listen, .shutdown = sock_no_shutdown, .setsockopt = x25_setsockopt, .getsockopt = x25_getsockopt, .sendmsg = x25_sendmsg, .recvmsg = x25_recvmsg, .mmap = sock_no_mmap, }; static struct packet_type x25_packet_type __read_mostly = { .type = cpu_to_be16(ETH_P_X25), .func = x25_lapb_receive_frame, }; static struct notifier_block x25_dev_notifier = { .notifier_call = x25_device_event, }; void x25_kill_by_neigh(struct x25_neigh *nb) { struct sock *s; write_lock_bh(&x25_list_lock); sk_for_each(s, &x25_list) { if (x25_sk(s)->neighbour == nb) { write_unlock_bh(&x25_list_lock); lock_sock(s); x25_disconnect(s, ENETUNREACH, 0, 0); release_sock(s); write_lock_bh(&x25_list_lock); } } write_unlock_bh(&x25_list_lock); /* Remove any related forwards */ x25_clear_forward_by_dev(nb->dev); } static int __init x25_init(void) { int rc; rc = proto_register(&x25_proto, 0); if (rc) goto out; rc = sock_register(&x25_family_ops); if (rc) goto out_proto; dev_add_pack(&x25_packet_type); rc = register_netdevice_notifier(&x25_dev_notifier); if (rc) goto out_sock; rc = x25_register_sysctl(); if (rc) goto out_dev; rc = x25_proc_init(); if (rc) goto out_sysctl; pr_info("Linux Version 0.2\n"); out: return rc; out_sysctl: x25_unregister_sysctl(); out_dev: unregister_netdevice_notifier(&x25_dev_notifier); out_sock: dev_remove_pack(&x25_packet_type); sock_unregister(AF_X25); out_proto: proto_unregister(&x25_proto); goto out; } module_init(x25_init); static void __exit x25_exit(void) { x25_proc_exit(); x25_link_free(); x25_route_free(); x25_unregister_sysctl(); unregister_netdevice_notifier(&x25_dev_notifier); dev_remove_pack(&x25_packet_type); sock_unregister(AF_X25); proto_unregister(&x25_proto); } module_exit(x25_exit); MODULE_AUTHOR("Jonathan Naylor <g4klx@g4klx.demon.co.uk>"); MODULE_DESCRIPTION("The X.25 Packet Layer network layer protocol"); MODULE_LICENSE("GPL"); MODULE_ALIAS_NETPROTO(PF_X25); |
| 14 1 14 14 14 14 13 13 13 13 13 13 13 13 169 169 168 169 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080 1081 1082 1083 1084 1085 1086 1087 1088 1089 1090 1091 1092 1093 1094 1095 1096 1097 1098 1099 1100 1101 1102 1103 1104 1105 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125 1126 1127 1128 1129 1130 1131 1132 1133 1134 1135 1136 1137 1138 1139 1140 1141 1142 1143 1144 1145 1146 1147 1148 1149 1150 1151 1152 1153 1154 1155 1156 1157 1158 1159 1160 1161 1162 1163 1164 1165 1166 1167 1168 1169 1170 1171 1172 1173 1174 1175 1176 1177 1178 1179 1180 1181 1182 1183 1184 1185 1186 1187 1188 1189 1190 1191 1192 1193 1194 1195 1196 1197 1198 1199 1200 1201 1202 1203 1204 1205 1206 1207 1208 1209 1210 1211 1212 1213 1214 1215 1216 1217 1218 1219 1220 1221 1222 1223 1224 1225 1226 1227 1228 1229 1230 1231 1232 1233 1234 1235 1236 1237 1238 1239 1240 1241 1242 1243 1244 1245 1246 1247 1248 1249 1250 1251 1252 1253 1254 1255 1256 1257 1258 1259 1260 1261 1262 1263 1264 1265 1266 1267 1268 1269 1270 1271 1272 1273 1274 1275 1276 1277 1278 1279 1280 1281 1282 1283 1284 1285 1286 1287 1288 1289 1290 1291 1292 1293 1294 1295 1296 1297 1298 1299 1300 1301 1302 1303 1304 1305 1306 1307 1308 1309 1310 1311 1312 1313 1314 1315 1316 1317 1318 1319 1320 1321 1322 1323 1324 1325 1326 1327 1328 1329 1330 1331 1332 1333 1334 1335 1336 1337 1338 1339 1340 1341 1342 1343 1344 1345 1346 1347 1348 1349 1350 1351 1352 1353 1354 1355 1356 1357 1358 1359 1360 1361 1362 1363 1364 1365 1366 1367 1368 1369 1370 1371 1372 1373 1374 1375 1376 1377 1378 1379 1380 1381 1382 1383 1384 1385 1386 1387 1388 1389 1390 1391 1392 1393 1394 1395 1396 1397 1398 1399 1400 1401 1402 1403 1404 1405 1406 1407 1408 1409 1410 1411 1412 1413 1414 1415 1416 1417 1418 1419 1420 1421 1422 1423 1424 1425 1426 1427 1428 1429 1430 1431 1432 1433 1434 1435 1436 1437 1438 1439 1440 1441 1442 1443 1444 1445 1446 1447 1448 1449 1450 1451 1452 1453 1454 1455 1456 1457 1458 1459 1460 1461 1462 1463 1464 1465 1466 1467 1468 1469 1470 1471 1472 1473 1474 1475 1476 1477 1478 1479 1480 1481 1482 1483 1484 1485 1486 1487 1488 1489 1490 1491 1492 1493 1494 1495 1496 1497 1498 1499 1500 1501 1502 1503 1504 1505 1506 1507 1508 1509 1510 1511 1512 1513 1514 1515 1516 1517 1518 1519 1520 1521 1522 1523 1524 1525 1526 1527 1528 1529 1530 1531 1532 1533 1534 1535 1536 1537 1538 1539 1540 1541 1542 1543 1544 1545 1546 1547 1548 1549 1550 1551 1552 1553 1554 1555 1556 1557 1558 1559 1560 1561 1562 1563 1564 1565 1566 1567 1568 1569 1570 1571 1572 1573 1574 1575 1576 1577 1578 1579 1580 1581 1582 1583 1584 1585 1586 1587 1588 1589 1590 1591 1592 1593 1594 1595 1596 1597 1598 1599 1600 1601 1602 1603 1604 1605 1606 1607 1608 1609 1610 1611 1612 1613 1614 1615 1616 1617 1618 1619 1620 1621 1622 1623 1624 1625 1626 1627 1628 1629 1630 1631 1632 1633 1634 1635 1636 1637 1638 1639 1640 1641 1642 1643 1644 1645 1646 1647 1648 1649 1650 1651 1652 1653 1654 1655 1656 1657 1658 1659 1660 1661 1662 1663 1664 1665 1666 1667 1668 1669 1670 1671 1672 1673 1674 1675 1676 1677 1678 1679 1680 1681 1682 1683 1684 1685 1686 1687 1688 1689 1690 1691 1692 1693 1694 1695 1696 1697 1698 1699 1700 1701 1702 1703 1704 1705 1706 1707 1708 1709 1710 1711 1712 1713 1714 1715 1716 1717 1718 1719 1720 1721 1722 1723 1724 1725 1726 1727 1728 1729 1730 1731 1732 1733 1734 1735 1736 1737 1738 1739 1740 1741 1742 1743 1744 1745 1746 1747 1748 1749 1750 1751 1752 1753 1754 1755 1756 1757 1758 1759 1760 1761 1762 1763 1764 1765 1766 1767 1768 1769 1770 1771 1772 1773 1774 1775 1776 1777 1778 1779 1780 1781 1782 1783 1784 1785 1786 1787 1788 1789 1790 1791 1792 1793 1794 1795 1796 1797 1798 1799 1800 1801 1802 1803 1804 1805 1806 1807 1808 1809 1810 1811 1812 1813 1814 1815 1816 1817 1818 1819 1820 1821 1822 1823 1824 1825 1826 1827 1828 1829 1830 1831 1832 1833 1834 1835 1836 1837 1838 1839 1840 1841 1842 1843 1844 1845 1846 1847 1848 1849 1850 1851 1852 1853 1854 1855 1856 1857 1858 1859 1860 1861 1862 1863 1864 1865 1866 1867 1868 1869 1870 1871 1872 1873 1874 1875 1876 1877 1878 1879 1880 1881 1882 1883 1884 1885 1886 1887 1888 1889 1890 1891 1892 1893 1894 1895 1896 1897 1898 1899 1900 1901 1902 1903 1904 1905 1906 1907 1908 1909 1910 1911 1912 | // SPDX-License-Identifier: GPL-2.0-or-later /* */ #include <linux/gfp.h> #include <linux/init.h> #include <linux/ratelimit.h> #include <linux/usb.h> #include <linux/usb/audio.h> #include <linux/slab.h> #include <sound/core.h> #include <sound/pcm.h> #include <sound/pcm_params.h> #include "usbaudio.h" #include "helper.h" #include "card.h" #include "endpoint.h" #include "pcm.h" #include "clock.h" #include "quirks.h" enum { EP_STATE_STOPPED, EP_STATE_RUNNING, EP_STATE_STOPPING, }; /* interface refcounting */ struct snd_usb_iface_ref { unsigned char iface; bool need_setup; int opened; int altset; struct list_head list; }; /* clock refcounting */ struct snd_usb_clock_ref { unsigned char clock; atomic_t locked; int opened; int rate; bool need_setup; struct list_head list; }; /* * snd_usb_endpoint is a model that abstracts everything related to an * USB endpoint and its streaming. * * There are functions to activate and deactivate the streaming URBs and * optional callbacks to let the pcm logic handle the actual content of the * packets for playback and record. Thus, the bus streaming and the audio * handlers are fully decoupled. * * There are two different types of endpoints in audio applications. * * SND_USB_ENDPOINT_TYPE_DATA handles full audio data payload for both * inbound and outbound traffic. * * SND_USB_ENDPOINT_TYPE_SYNC endpoints are for inbound traffic only and * expect the payload to carry Q10.14 / Q16.16 formatted sync information * (3 or 4 bytes). * * Each endpoint has to be configured prior to being used by calling * snd_usb_endpoint_set_params(). * * The model incorporates a reference counting, so that multiple users * can call snd_usb_endpoint_start() and snd_usb_endpoint_stop(), and * only the first user will effectively start the URBs, and only the last * one to stop it will tear the URBs down again. */ /* * convert a sampling rate into our full speed format (fs/1000 in Q16.16) * this will overflow at approx 524 kHz */ static inline unsigned get_usb_full_speed_rate(unsigned int rate) { return ((rate << 13) + 62) / 125; } /* * convert a sampling rate into USB high speed format (fs/8000 in Q16.16) * this will overflow at approx 4 MHz */ static inline unsigned get_usb_high_speed_rate(unsigned int rate) { return ((rate << 10) + 62) / 125; } /* * release a urb data */ static void release_urb_ctx(struct snd_urb_ctx *u) { if (u->urb && u->buffer_size) usb_free_coherent(u->ep->chip->dev, u->buffer_size, u->urb->transfer_buffer, u->urb->transfer_dma); usb_free_urb(u->urb); u->urb = NULL; u->buffer_size = 0; } static const char *usb_error_string(int err) { switch (err) { case -ENODEV: return "no device"; case -ENOENT: return "endpoint not enabled"; case -EPIPE: return "endpoint stalled"; case -ENOSPC: return "not enough bandwidth"; case -ESHUTDOWN: return "device disabled"; case -EHOSTUNREACH: return "device suspended"; case -EINVAL: case -EAGAIN: case -EFBIG: case -EMSGSIZE: return "internal error"; default: return "unknown error"; } } static inline bool ep_state_running(struct snd_usb_endpoint *ep) { return atomic_read(&ep->state) == EP_STATE_RUNNING; } static inline bool ep_state_update(struct snd_usb_endpoint *ep, int old, int new) { return atomic_try_cmpxchg(&ep->state, &old, new); } /** * snd_usb_endpoint_implicit_feedback_sink: Report endpoint usage type * * @ep: The snd_usb_endpoint * * Determine whether an endpoint is driven by an implicit feedback * data endpoint source. */ int snd_usb_endpoint_implicit_feedback_sink(struct snd_usb_endpoint *ep) { return ep->implicit_fb_sync && usb_pipeout(ep->pipe); } /* * Return the number of samples to be sent in the next packet * for streaming based on information derived from sync endpoints * * This won't be used for implicit feedback which takes the packet size * returned from the sync source */ static int slave_next_packet_size(struct snd_usb_endpoint *ep, unsigned int avail) { unsigned long flags; unsigned int phase; int ret; if (ep->fill_max) return ep->maxframesize; spin_lock_irqsave(&ep->lock, flags); phase = (ep->phase & 0xffff) + (ep->freqm << ep->datainterval); ret = min(phase >> 16, ep->maxframesize); if (avail && ret >= avail) ret = -EAGAIN; else ep->phase = phase; spin_unlock_irqrestore(&ep->lock, flags); return ret; } /* * Return the number of samples to be sent in the next packet * for adaptive and synchronous endpoints */ static int next_packet_size(struct snd_usb_endpoint *ep, unsigned int avail) { unsigned int sample_accum; int ret; if (ep->fill_max) return ep->maxframesize; sample_accum = ep->sample_accum + ep->sample_rem; if (sample_accum >= ep->pps) { sample_accum -= ep->pps; ret = ep->packsize[1]; } else { ret = ep->packsize[0]; } if (avail && ret >= avail) ret = -EAGAIN; else ep->sample_accum = sample_accum; return ret; } /* * snd_usb_endpoint_next_packet_size: Return the number of samples to be sent * in the next packet * * If the size is equal or exceeds @avail, don't proceed but return -EAGAIN * Exception: @avail = 0 for skipping the check. */ int snd_usb_endpoint_next_packet_size(struct snd_usb_endpoint *ep, struct snd_urb_ctx *ctx, int idx, unsigned int avail) { unsigned int packet; packet = ctx->packet_size[idx]; if (packet) { if (avail && packet >= avail) return -EAGAIN; return packet; } if (ep->sync_source) return slave_next_packet_size(ep, avail); else return next_packet_size(ep, avail); } static void call_retire_callback(struct snd_usb_endpoint *ep, struct urb *urb) { struct snd_usb_substream *data_subs; data_subs = READ_ONCE(ep->data_subs); if (data_subs && ep->retire_data_urb) ep->retire_data_urb(data_subs, urb); } static void retire_outbound_urb(struct snd_usb_endpoint *ep, struct snd_urb_ctx *urb_ctx) { call_retire_callback(ep, urb_ctx->urb); } static void snd_usb_handle_sync_urb(struct snd_usb_endpoint *ep, struct snd_usb_endpoint *sender, const struct urb *urb); static void retire_inbound_urb(struct snd_usb_endpoint *ep, struct snd_urb_ctx *urb_ctx) { struct urb *urb = urb_ctx->urb; struct snd_usb_endpoint *sync_sink; if (unlikely(ep->skip_packets > 0)) { ep->skip_packets--; return; } sync_sink = READ_ONCE(ep->sync_sink); if (sync_sink) snd_usb_handle_sync_urb(sync_sink, ep, urb); call_retire_callback(ep, urb); } static inline bool has_tx_length_quirk(struct snd_usb_audio *chip) { return chip->quirk_flags & QUIRK_FLAG_TX_LENGTH; } static void prepare_silent_urb(struct snd_usb_endpoint *ep, struct snd_urb_ctx *ctx) { struct urb *urb = ctx->urb; unsigned int offs = 0; unsigned int extra = 0; __le32 packet_length; int i; /* For tx_length_quirk, put packet length at start of packet */ if (has_tx_length_quirk(ep->chip)) extra = sizeof(packet_length); for (i = 0; i < ctx->packets; ++i) { unsigned int offset; unsigned int length; int counts; counts = snd_usb_endpoint_next_packet_size(ep, ctx, i, 0); length = counts * ep->stride; /* number of silent bytes */ offset = offs * ep->stride + extra * i; urb->iso_frame_desc[i].offset = offset; urb->iso_frame_desc[i].length = length + extra; if (extra) { packet_length = cpu_to_le32(length); memcpy(urb->transfer_buffer + offset, &packet_length, sizeof(packet_length)); } memset(urb->transfer_buffer + offset + extra, ep->silence_value, length); offs += counts; } urb->number_of_packets = ctx->packets; urb->transfer_buffer_length = offs * ep->stride + ctx->packets * extra; ctx->queued = 0; } /* * Prepare a PLAYBACK urb for submission to the bus. */ static int prepare_outbound_urb(struct snd_usb_endpoint *ep, struct snd_urb_ctx *ctx, bool in_stream_lock) { struct urb *urb = ctx->urb; unsigned char *cp = urb->transfer_buffer; struct snd_usb_substream *data_subs; urb->dev = ep->chip->dev; /* we need to set this at each time */ switch (ep->type) { case SND_USB_ENDPOINT_TYPE_DATA: data_subs = READ_ONCE(ep->data_subs); if (data_subs && ep->prepare_data_urb) return ep->prepare_data_urb(data_subs, urb, in_stream_lock); /* no data provider, so send silence */ prepare_silent_urb(ep, ctx); break; case SND_USB_ENDPOINT_TYPE_SYNC: if (snd_usb_get_speed(ep->chip->dev) >= USB_SPEED_HIGH) { /* * fill the length and offset of each urb descriptor. * the fixed 12.13 frequency is passed as 16.16 through the pipe. */ urb->iso_frame_desc[0].length = 4; urb->iso_frame_desc[0].offset = 0; cp[0] = ep->freqn; cp[1] = ep->freqn >> 8; cp[2] = ep->freqn >> 16; cp[3] = ep->freqn >> 24; } else { /* * fill the length and offset of each urb descriptor. * the fixed 10.14 frequency is passed through the pipe. */ urb->iso_frame_desc[0].length = 3; urb->iso_frame_desc[0].offset = 0; cp[0] = ep->freqn >> 2; cp[1] = ep->freqn >> 10; cp[2] = ep->freqn >> 18; } break; } return 0; } /* * Prepare a CAPTURE or SYNC urb for submission to the bus. */ static int prepare_inbound_urb(struct snd_usb_endpoint *ep, struct snd_urb_ctx *urb_ctx) { int i, offs; struct urb *urb = urb_ctx->urb; urb->dev = ep->chip->dev; /* we need to set this at each time */ switch (ep->type) { case SND_USB_ENDPOINT_TYPE_DATA: offs = 0; for (i = 0; i < urb_ctx->packets; i++) { urb->iso_frame_desc[i].offset = offs; urb->iso_frame_desc[i].length = ep->curpacksize; offs += ep->curpacksize; } urb->transfer_buffer_length = offs; urb->number_of_packets = urb_ctx->packets; break; case SND_USB_ENDPOINT_TYPE_SYNC: urb->iso_frame_desc[0].length = min(4u, ep->syncmaxsize); urb->iso_frame_desc[0].offset = 0; break; } return 0; } /* notify an error as XRUN to the assigned PCM data substream */ static void notify_xrun(struct snd_usb_endpoint *ep) { struct snd_usb_substream *data_subs; struct snd_pcm_substream *psubs; data_subs = READ_ONCE(ep->data_subs); if (!data_subs) return; psubs = data_subs->pcm_substream; if (psubs && psubs->runtime && psubs->runtime->state == SNDRV_PCM_STATE_RUNNING) snd_pcm_stop_xrun(psubs); } static struct snd_usb_packet_info * next_packet_fifo_enqueue(struct snd_usb_endpoint *ep) { struct snd_usb_packet_info *p; p = ep->next_packet + (ep->next_packet_head + ep->next_packet_queued) % ARRAY_SIZE(ep->next_packet); ep->next_packet_queued++; return p; } static struct snd_usb_packet_info * next_packet_fifo_dequeue(struct snd_usb_endpoint *ep) { struct snd_usb_packet_info *p; p = ep->next_packet + ep->next_packet_head; ep->next_packet_head++; ep->next_packet_head %= ARRAY_SIZE(ep->next_packet); ep->next_packet_queued--; return p; } static void push_back_to_ready_list(struct snd_usb_endpoint *ep, struct snd_urb_ctx *ctx) { unsigned long flags; spin_lock_irqsave(&ep->lock, flags); list_add_tail(&ctx->ready_list, &ep->ready_playback_urbs); spin_unlock_irqrestore(&ep->lock, flags); } /* * Send output urbs that have been prepared previously. URBs are dequeued * from ep->ready_playback_urbs and in case there aren't any available * or there are no packets that have been prepared, this function does * nothing. * * The reason why the functionality of sending and preparing URBs is separated * is that host controllers don't guarantee the order in which they return * inbound and outbound packets to their submitters. * * This function is used both for implicit feedback endpoints and in low- * latency playback mode. */ int snd_usb_queue_pending_output_urbs(struct snd_usb_endpoint *ep, bool in_stream_lock) { bool implicit_fb = snd_usb_endpoint_implicit_feedback_sink(ep); while (ep_state_running(ep)) { unsigned long flags; struct snd_usb_packet_info *packet; struct snd_urb_ctx *ctx = NULL; int err, i; spin_lock_irqsave(&ep->lock, flags); if ((!implicit_fb || ep->next_packet_queued > 0) && !list_empty(&ep->ready_playback_urbs)) { /* take URB out of FIFO */ ctx = list_first_entry(&ep->ready_playback_urbs, struct snd_urb_ctx, ready_list); list_del_init(&ctx->ready_list); if (implicit_fb) packet = next_packet_fifo_dequeue(ep); } spin_unlock_irqrestore(&ep->lock, flags); if (ctx == NULL) break; /* copy over the length information */ if (implicit_fb) { for (i = 0; i < packet->packets; i++) ctx->packet_size[i] = packet->packet_size[i]; } /* call the data handler to fill in playback data */ err = prepare_outbound_urb(ep, ctx, in_stream_lock); /* can be stopped during prepare callback */ if (unlikely(!ep_state_running(ep))) break; if (err < 0) { /* push back to ready list again for -EAGAIN */ if (err == -EAGAIN) { push_back_to_ready_list(ep, ctx); break; } if (!in_stream_lock) notify_xrun(ep); return -EPIPE; } if (!atomic_read(&ep->chip->shutdown)) err = usb_submit_urb(ctx->urb, GFP_ATOMIC); else err = -ENODEV; if (err < 0) { if (!atomic_read(&ep->chip->shutdown)) { usb_audio_err(ep->chip, "Unable to submit urb #%d: %d at %s\n", ctx->index, err, __func__); if (!in_stream_lock) notify_xrun(ep); } return -EPIPE; } set_bit(ctx->index, &ep->active_mask); atomic_inc(&ep->submitted_urbs); } return 0; } /* * complete callback for urbs */ static void snd_complete_urb(struct urb *urb) { struct snd_urb_ctx *ctx = urb->context; struct snd_usb_endpoint *ep = ctx->ep; int err; if (unlikely(urb->status == -ENOENT || /* unlinked */ urb->status == -ENODEV || /* device removed */ urb->status == -ECONNRESET || /* unlinked */ urb->status == -ESHUTDOWN)) /* device disabled */ goto exit_clear; /* device disconnected */ if (unlikely(atomic_read(&ep->chip->shutdown))) goto exit_clear; if (unlikely(!ep_state_running(ep))) goto exit_clear; if (usb_pipeout(ep->pipe)) { retire_outbound_urb(ep, ctx); /* can be stopped during retire callback */ if (unlikely(!ep_state_running(ep))) goto exit_clear; /* in low-latency and implicit-feedback modes, push back the * URB to ready list at first, then process as much as possible */ if (ep->lowlatency_playback || snd_usb_endpoint_implicit_feedback_sink(ep)) { push_back_to_ready_list(ep, ctx); clear_bit(ctx->index, &ep->active_mask); snd_usb_queue_pending_output_urbs(ep, false); /* decrement at last, and check xrun */ if (atomic_dec_and_test(&ep->submitted_urbs) && !snd_usb_endpoint_implicit_feedback_sink(ep)) notify_xrun(ep); return; } /* in non-lowlatency mode, no error handling for prepare */ prepare_outbound_urb(ep, ctx, false); /* can be stopped during prepare callback */ if (unlikely(!ep_state_running(ep))) goto exit_clear; } else { retire_inbound_urb(ep, ctx); /* can be stopped during retire callback */ if (unlikely(!ep_state_running(ep))) goto exit_clear; prepare_inbound_urb(ep, ctx); } if (!atomic_read(&ep->chip->shutdown)) err = usb_submit_urb(urb, GFP_ATOMIC); else err = -ENODEV; if (err == 0) return; if (!atomic_read(&ep->chip->shutdown)) { usb_audio_err(ep->chip, "cannot submit urb (err = %d)\n", err); notify_xrun(ep); } exit_clear: clear_bit(ctx->index, &ep->active_mask); atomic_dec(&ep->submitted_urbs); } /* * Find or create a refcount object for the given interface * * The objects are released altogether in snd_usb_endpoint_free_all() */ static struct snd_usb_iface_ref * iface_ref_find(struct snd_usb_audio *chip, int iface) { struct snd_usb_iface_ref *ip; list_for_each_entry(ip, &chip->iface_ref_list, list) if (ip->iface == iface) return ip; ip = kzalloc(sizeof(*ip), GFP_KERNEL); if (!ip) return NULL; ip->iface = iface; list_add_tail(&ip->list, &chip->iface_ref_list); return ip; } /* Similarly, a refcount object for clock */ static struct snd_usb_clock_ref * clock_ref_find(struct snd_usb_audio *chip, int clock) { struct snd_usb_clock_ref *ref; list_for_each_entry(ref, &chip->clock_ref_list, list) if (ref->clock == clock) return ref; ref = kzalloc(sizeof(*ref), GFP_KERNEL); if (!ref) return NULL; ref->clock = clock; atomic_set(&ref->locked, 0); list_add_tail(&ref->list, &chip->clock_ref_list); return ref; } /* * Get the existing endpoint object corresponding EP * Returns NULL if not present. */ struct snd_usb_endpoint * snd_usb_get_endpoint(struct snd_usb_audio *chip, int ep_num) { struct snd_usb_endpoint *ep; list_for_each_entry(ep, &chip->ep_list, list) { if (ep->ep_num == ep_num) return ep; } return NULL; } #define ep_type_name(type) \ (type == SND_USB_ENDPOINT_TYPE_DATA ? "data" : "sync") /** * snd_usb_add_endpoint: Add an endpoint to an USB audio chip * * @chip: The chip * @ep_num: The number of the endpoint to use * @type: SND_USB_ENDPOINT_TYPE_DATA or SND_USB_ENDPOINT_TYPE_SYNC * * If the requested endpoint has not been added to the given chip before, * a new instance is created. * * Returns zero on success or a negative error code. * * New endpoints will be added to chip->ep_list and freed by * calling snd_usb_endpoint_free_all(). * * For SND_USB_ENDPOINT_TYPE_SYNC, the caller needs to guarantee that * bNumEndpoints > 1 beforehand. */ int snd_usb_add_endpoint(struct snd_usb_audio *chip, int ep_num, int type) { struct snd_usb_endpoint *ep; bool is_playback; ep = snd_usb_get_endpoint(chip, ep_num); if (ep) return 0; usb_audio_dbg(chip, "Creating new %s endpoint #%x\n", ep_type_name(type), ep_num); ep = kzalloc(sizeof(*ep), GFP_KERNEL); if (!ep) return -ENOMEM; ep->chip = chip; spin_lock_init(&ep->lock); ep->type = type; ep->ep_num = ep_num; INIT_LIST_HEAD(&ep->ready_playback_urbs); atomic_set(&ep->submitted_urbs, 0); is_playback = ((ep_num & USB_ENDPOINT_DIR_MASK) == USB_DIR_OUT); ep_num &= USB_ENDPOINT_NUMBER_MASK; if (is_playback) ep->pipe = usb_sndisocpipe(chip->dev, ep_num); else ep->pipe = usb_rcvisocpipe(chip->dev, ep_num); list_add_tail(&ep->list, &chip->ep_list); return 0; } /* Set up syncinterval and maxsyncsize for a sync EP */ static void endpoint_set_syncinterval(struct snd_usb_audio *chip, struct snd_usb_endpoint *ep) { struct usb_host_interface *alts; struct usb_endpoint_descriptor *desc; alts = snd_usb_get_host_interface(chip, ep->iface, ep->altsetting); if (!alts) return; desc = get_endpoint(alts, ep->ep_idx); if (desc->bLength >= USB_DT_ENDPOINT_AUDIO_SIZE && desc->bRefresh >= 1 && desc->bRefresh <= 9) ep->syncinterval = desc->bRefresh; else if (snd_usb_get_speed(chip->dev) == USB_SPEED_FULL) ep->syncinterval = 1; else if (desc->bInterval >= 1 && desc->bInterval <= 16) ep->syncinterval = desc->bInterval - 1; else ep->syncinterval = 3; ep->syncmaxsize = le16_to_cpu(desc->wMaxPacketSize); } static bool endpoint_compatible(struct snd_usb_endpoint *ep, const struct audioformat *fp, const struct snd_pcm_hw_params *params) { if (!ep->opened) return false; if (ep->cur_audiofmt != fp) return false; if (ep->cur_rate != params_rate(params) || ep->cur_format != params_format(params) || ep->cur_period_frames != params_period_size(params) || ep->cur_buffer_periods != params_periods(params)) return false; return true; } /* * Check whether the given fp and hw params are compatible with the current * setup of the target EP for implicit feedback sync */ bool snd_usb_endpoint_compatible(struct snd_usb_audio *chip, struct snd_usb_endpoint *ep, const struct audioformat *fp, const struct snd_pcm_hw_params *params) { bool ret; mutex_lock(&chip->mutex); ret = endpoint_compatible(ep, fp, params); mutex_unlock(&chip->mutex); return ret; } /* * snd_usb_endpoint_open: Open the endpoint * * Called from hw_params to assign the endpoint to the substream. * It's reference-counted, and only the first opener is allowed to set up * arbitrary parameters. The later opener must be compatible with the * former opened parameters. * The endpoint needs to be closed via snd_usb_endpoint_close() later. * * Note that this function doesn't configure the endpoint. The substream * needs to set it up later via snd_usb_endpoint_set_params() and * snd_usb_endpoint_prepare(). */ struct snd_usb_endpoint * snd_usb_endpoint_open(struct snd_usb_audio *chip, const struct audioformat *fp, const struct snd_pcm_hw_params *params, bool is_sync_ep, bool fixed_rate) { struct snd_usb_endpoint *ep; int ep_num = is_sync_ep ? fp->sync_ep : fp->endpoint; mutex_lock(&chip->mutex); ep = snd_usb_get_endpoint(chip, ep_num); if (!ep) { usb_audio_err(chip, "Cannot find EP 0x%x to open\n", ep_num); goto unlock; } if (!ep->opened) { if (is_sync_ep) { ep->iface = fp->sync_iface; ep->altsetting = fp->sync_altsetting; ep->ep_idx = fp->sync_ep_idx; } else { ep->iface = fp->iface; ep->altsetting = fp->altsetting; ep->ep_idx = fp->ep_idx; } usb_audio_dbg(chip, "Open EP 0x%x, iface=%d:%d, idx=%d\n", ep_num, ep->iface, ep->altsetting, ep->ep_idx); ep->iface_ref = iface_ref_find(chip, ep->iface); if (!ep->iface_ref) { ep = NULL; goto unlock; } if (fp->protocol != UAC_VERSION_1) { ep->clock_ref = clock_ref_find(chip, fp->clock); if (!ep->clock_ref) { ep = NULL; goto unlock; } ep->clock_ref->opened++; } ep->cur_audiofmt = fp; ep->cur_channels = fp->channels; ep->cur_rate = params_rate(params); ep->cur_format = params_format(params); ep->cur_frame_bytes = snd_pcm_format_physical_width(ep->cur_format) * ep->cur_channels / 8; ep->cur_period_frames = params_period_size(params); ep->cur_period_bytes = ep->cur_period_frames * ep->cur_frame_bytes; ep->cur_buffer_periods = params_periods(params); if (ep->type == SND_USB_ENDPOINT_TYPE_SYNC) endpoint_set_syncinterval(chip, ep); ep->implicit_fb_sync = fp->implicit_fb; ep->need_setup = true; ep->need_prepare = true; ep->fixed_rate = fixed_rate; usb_audio_dbg(chip, " channels=%d, rate=%d, format=%s, period_bytes=%d, periods=%d, implicit_fb=%d\n", ep->cur_channels, ep->cur_rate, snd_pcm_format_name(ep->cur_format), ep->cur_period_bytes, ep->cur_buffer_periods, ep->implicit_fb_sync); } else { if (WARN_ON(!ep->iface_ref)) { ep = NULL; goto unlock; } if (!endpoint_compatible(ep, fp, params)) { usb_audio_err(chip, "Incompatible EP setup for 0x%x\n", ep_num); ep = NULL; goto unlock; } usb_audio_dbg(chip, "Reopened EP 0x%x (count %d)\n", ep_num, ep->opened); } if (!ep->iface_ref->opened++) ep->iface_ref->need_setup = true; ep->opened++; unlock: mutex_unlock(&chip->mutex); return ep; } /* * snd_usb_endpoint_set_sync: Link data and sync endpoints * * Pass NULL to sync_ep to unlink again */ void snd_usb_endpoint_set_sync(struct snd_usb_audio *chip, struct snd_usb_endpoint *data_ep, struct snd_usb_endpoint *sync_ep) { data_ep->sync_source = sync_ep; } /* * Set data endpoint callbacks and the assigned data stream * * Called at PCM trigger and cleanups. * Pass NULL to deactivate each callback. */ void snd_usb_endpoint_set_callback(struct snd_usb_endpoint *ep, int (*prepare)(struct snd_usb_substream *subs, struct urb *urb, bool in_stream_lock), void (*retire)(struct snd_usb_substream *subs, struct urb *urb), struct snd_usb_substream *data_subs) { ep->prepare_data_urb = prepare; ep->retire_data_urb = retire; if (data_subs) ep->lowlatency_playback = data_subs->lowlatency_playback; else ep->lowlatency_playback = false; WRITE_ONCE(ep->data_subs, data_subs); } static int endpoint_set_interface(struct snd_usb_audio *chip, struct snd_usb_endpoint *ep, bool set) { int altset = set ? ep->altsetting : 0; int err; int retries = 0; const int max_retries = 5; if (ep->iface_ref->altset == altset) return 0; /* already disconnected? */ if (unlikely(atomic_read(&chip->shutdown))) return -ENODEV; usb_audio_dbg(chip, "Setting usb interface %d:%d for EP 0x%x\n", ep->iface, altset, ep->ep_num); retry: err = usb_set_interface(chip->dev, ep->iface, altset); if (err < 0) { if (err == -EPROTO && ++retries <= max_retries) { msleep(5 * (1 << (retries - 1))); goto retry; } usb_audio_err_ratelimited( chip, "%d:%d: usb_set_interface failed (%d)\n", ep->iface, altset, err); return err; } if (chip->quirk_flags & QUIRK_FLAG_IFACE_DELAY) msleep(50); ep->iface_ref->altset = altset; return 0; } /* * snd_usb_endpoint_close: Close the endpoint * * Unreference the already opened endpoint via snd_usb_endpoint_open(). */ void snd_usb_endpoint_close(struct snd_usb_audio *chip, struct snd_usb_endpoint *ep) { mutex_lock(&chip->mutex); usb_audio_dbg(chip, "Closing EP 0x%x (count %d)\n", ep->ep_num, ep->opened); if (!--ep->iface_ref->opened && !(chip->quirk_flags & QUIRK_FLAG_IFACE_SKIP_CLOSE)) endpoint_set_interface(chip, ep, false); if (!--ep->opened) { if (ep->clock_ref) { if (!--ep->clock_ref->opened) ep->clock_ref->rate = 0; } ep->iface = 0; ep->altsetting = 0; ep->cur_audiofmt = NULL; ep->cur_rate = 0; ep->iface_ref = NULL; ep->clock_ref = NULL; usb_audio_dbg(chip, "EP 0x%x closed\n", ep->ep_num); } mutex_unlock(&chip->mutex); } /* Prepare for suspening EP, called from the main suspend handler */ void snd_usb_endpoint_suspend(struct snd_usb_endpoint *ep) { ep->need_prepare = true; if (ep->iface_ref) ep->iface_ref->need_setup = true; if (ep->clock_ref) ep->clock_ref->rate = 0; } /* * wait until all urbs are processed. */ static int wait_clear_urbs(struct snd_usb_endpoint *ep) { unsigned long end_time = jiffies + msecs_to_jiffies(1000); int alive; if (atomic_read(&ep->state) != EP_STATE_STOPPING) return 0; do { alive = atomic_read(&ep->submitted_urbs); if (!alive) break; schedule_timeout_uninterruptible(1); } while (time_before(jiffies, end_time)); if (alive) usb_audio_err(ep->chip, "timeout: still %d active urbs on EP #%x\n", alive, ep->ep_num); if (ep_state_update(ep, EP_STATE_STOPPING, EP_STATE_STOPPED)) { ep->sync_sink = NULL; snd_usb_endpoint_set_callback(ep, NULL, NULL, NULL); } return 0; } /* sync the pending stop operation; * this function itself doesn't trigger the stop operation */ void snd_usb_endpoint_sync_pending_stop(struct snd_usb_endpoint *ep) { if (ep) wait_clear_urbs(ep); } /* * Stop active urbs * * This function moves the EP to STOPPING state if it's being RUNNING. */ static int stop_urbs(struct snd_usb_endpoint *ep, bool force, bool keep_pending) { unsigned int i; unsigned long flags; if (!force && atomic_read(&ep->running)) return -EBUSY; if (!ep_state_update(ep, EP_STATE_RUNNING, EP_STATE_STOPPING)) return 0; spin_lock_irqsave(&ep->lock, flags); INIT_LIST_HEAD(&ep->ready_playback_urbs); ep->next_packet_head = 0; ep->next_packet_queued = 0; spin_unlock_irqrestore(&ep->lock, flags); if (keep_pending) return 0; for (i = 0; i < ep->nurbs; i++) { if (test_bit(i, &ep->active_mask)) { if (!test_and_set_bit(i, &ep->unlink_mask)) { struct urb *u = ep->urb[i].urb; usb_unlink_urb(u); } } } return 0; } /* * release an endpoint's urbs */ static int release_urbs(struct snd_usb_endpoint *ep, bool force) { int i, err; /* route incoming urbs to nirvana */ snd_usb_endpoint_set_callback(ep, NULL, NULL, NULL); /* stop and unlink urbs */ err = stop_urbs(ep, force, false); if (err) return err; wait_clear_urbs(ep); for (i = 0; i < ep->nurbs; i++) release_urb_ctx(&ep->urb[i]); usb_free_coherent(ep->chip->dev, SYNC_URBS * 4, ep->syncbuf, ep->sync_dma); ep->syncbuf = NULL; ep->nurbs = 0; return 0; } /* * configure a data endpoint */ static int data_ep_set_params(struct snd_usb_endpoint *ep) { struct snd_usb_audio *chip = ep->chip; unsigned int maxsize, minsize, packs_per_ms, max_packs_per_urb; unsigned int max_packs_per_period, urbs_per_period, urb_packs; unsigned int max_urbs, i; const struct audioformat *fmt = ep->cur_audiofmt; int frame_bits = ep->cur_frame_bytes * 8; int tx_length_quirk = (has_tx_length_quirk(chip) && usb_pipeout(ep->pipe)); usb_audio_dbg(chip, "Setting params for data EP 0x%x, pipe 0x%x\n", ep->ep_num, ep->pipe); if (ep->cur_format == SNDRV_PCM_FORMAT_DSD_U16_LE && fmt->dsd_dop) { /* * When operating in DSD DOP mode, the size of a sample frame * in hardware differs from the actual physical format width * because we need to make room for the DOP markers. */ frame_bits += ep->cur_channels << 3; } ep->datainterval = fmt->datainterval; ep->stride = frame_bits >> 3; switch (ep->cur_format) { case SNDRV_PCM_FORMAT_U8: ep->silence_value = 0x80; break; case SNDRV_PCM_FORMAT_DSD_U8: case SNDRV_PCM_FORMAT_DSD_U16_LE: case SNDRV_PCM_FORMAT_DSD_U32_LE: case SNDRV_PCM_FORMAT_DSD_U16_BE: case SNDRV_PCM_FORMAT_DSD_U32_BE: ep->silence_value = 0x69; break; default: ep->silence_value = 0; } /* assume max. frequency is 50% higher than nominal */ ep->freqmax = ep->freqn + (ep->freqn >> 1); /* Round up freqmax to nearest integer in order to calculate maximum * packet size, which must represent a whole number of frames. * This is accomplished by adding 0x0.ffff before converting the * Q16.16 format into integer. * In order to accurately calculate the maximum packet size when * the data interval is more than 1 (i.e. ep->datainterval > 0), * multiply by the data interval prior to rounding. For instance, * a freqmax of 41 kHz will result in a max packet size of 6 (5.125) * frames with a data interval of 1, but 11 (10.25) frames with a * data interval of 2. * (ep->freqmax << ep->datainterval overflows at 8.192 MHz for the * maximum datainterval value of 3, at USB full speed, higher for * USB high speed, noting that ep->freqmax is in units of * frames per packet in Q16.16 format.) */ maxsize = (((ep->freqmax << ep->datainterval) + 0xffff) >> 16) * (frame_bits >> 3); if (tx_length_quirk) maxsize += sizeof(__le32); /* Space for length descriptor */ /* but wMaxPacketSize might reduce this */ if (ep->maxpacksize && ep->maxpacksize < maxsize) { /* whatever fits into a max. size packet */ unsigned int data_maxsize = maxsize = ep->maxpacksize; if (tx_length_quirk) /* Need to remove the length descriptor to calc freq */ data_maxsize -= sizeof(__le32); ep->freqmax = (data_maxsize / (frame_bits >> 3)) << (16 - ep->datainterval); } if (ep->fill_max) ep->curpacksize = ep->maxpacksize; else ep->curpacksize = maxsize; if (snd_usb_get_speed(chip->dev) != USB_SPEED_FULL) { packs_per_ms = 8 >> ep->datainterval; max_packs_per_urb = MAX_PACKS_HS; } else { packs_per_ms = 1; max_packs_per_urb = MAX_PACKS; } if (ep->sync_source && !ep->implicit_fb_sync) max_packs_per_urb = min(max_packs_per_urb, 1U << ep->sync_source->syncinterval); max_packs_per_urb = max(1u, max_packs_per_urb >> ep->datainterval); /* * Capture endpoints need to use small URBs because there's no way * to tell in advance where the next period will end, and we don't * want the next URB to complete much after the period ends. * * Playback endpoints with implicit sync much use the same parameters * as their corresponding capture endpoint. */ if (usb_pipein(ep->pipe) || ep->implicit_fb_sync) { /* make capture URBs <= 1 ms and smaller than a period */ urb_packs = min(max_packs_per_urb, packs_per_ms); while (urb_packs > 1 && urb_packs * maxsize >= ep->cur_period_bytes) urb_packs >>= 1; ep->nurbs = MAX_URBS; /* * Playback endpoints without implicit sync are adjusted so that * a period fits as evenly as possible in the smallest number of * URBs. The total number of URBs is adjusted to the size of the * ALSA buffer, subject to the MAX_URBS and MAX_QUEUE limits. */ } else { /* determine how small a packet can be */ minsize = (ep->freqn >> (16 - ep->datainterval)) * (frame_bits >> 3); /* with sync from device, assume it can be 12% lower */ if (ep->sync_source) minsize -= minsize >> 3; minsize = max(minsize, 1u); /* how many packets will contain an entire ALSA period? */ max_packs_per_period = DIV_ROUND_UP(ep->cur_period_bytes, minsize); /* how many URBs will contain a period? */ urbs_per_period = DIV_ROUND_UP(max_packs_per_period, max_packs_per_urb); /* how many packets are needed in each URB? */ urb_packs = DIV_ROUND_UP(max_packs_per_period, urbs_per_period); /* limit the number of frames in a single URB */ ep->max_urb_frames = DIV_ROUND_UP(ep->cur_period_frames, urbs_per_period); /* try to use enough URBs to contain an entire ALSA buffer */ max_urbs = min((unsigned) MAX_URBS, MAX_QUEUE * packs_per_ms / urb_packs); ep->nurbs = min(max_urbs, urbs_per_period * ep->cur_buffer_periods); } /* allocate and initialize data urbs */ for (i = 0; i < ep->nurbs; i++) { struct snd_urb_ctx *u = &ep->urb[i]; u->index = i; u->ep = ep; u->packets = urb_packs; u->buffer_size = maxsize * u->packets; if (fmt->fmt_type == UAC_FORMAT_TYPE_II) u->packets++; /* for transfer delimiter */ u->urb = usb_alloc_urb(u->packets, GFP_KERNEL); if (!u->urb) goto out_of_memory; u->urb->transfer_buffer = usb_alloc_coherent(chip->dev, u->buffer_size, GFP_KERNEL, &u->urb->transfer_dma); if (!u->urb->transfer_buffer) goto out_of_memory; u->urb->pipe = ep->pipe; u->urb->transfer_flags = URB_NO_TRANSFER_DMA_MAP; u->urb->interval = 1 << ep->datainterval; u->urb->context = u; u->urb->complete = snd_complete_urb; INIT_LIST_HEAD(&u->ready_list); } return 0; out_of_memory: release_urbs(ep, false); return -ENOMEM; } /* * configure a sync endpoint */ static int sync_ep_set_params(struct snd_usb_endpoint *ep) { struct snd_usb_audio *chip = ep->chip; int i; usb_audio_dbg(chip, "Setting params for sync EP 0x%x, pipe 0x%x\n", ep->ep_num, ep->pipe); ep->syncbuf = usb_alloc_coherent(chip->dev, SYNC_URBS * 4, GFP_KERNEL, &ep->sync_dma); if (!ep->syncbuf) return -ENOMEM; ep->nurbs = SYNC_URBS; for (i = 0; i < SYNC_URBS; i++) { struct snd_urb_ctx *u = &ep->urb[i]; u->index = i; u->ep = ep; u->packets = 1; u->urb = usb_alloc_urb(1, GFP_KERNEL); if (!u->urb) goto out_of_memory; u->urb->transfer_buffer = ep->syncbuf + i * 4; u->urb->transfer_dma = ep->sync_dma + i * 4; u->urb->transfer_buffer_length = 4; u->urb->pipe = ep->pipe; u->urb->transfer_flags = URB_NO_TRANSFER_DMA_MAP; u->urb->number_of_packets = 1; u->urb->interval = 1 << ep->syncinterval; u->urb->context = u; u->urb->complete = snd_complete_urb; } return 0; out_of_memory: release_urbs(ep, false); return -ENOMEM; } /* update the rate of the referred clock; return the actual rate */ static int update_clock_ref_rate(struct snd_usb_audio *chip, struct snd_usb_endpoint *ep) { struct snd_usb_clock_ref *clock = ep->clock_ref; int rate = ep->cur_rate; if (!clock || clock->rate == rate) return rate; if (clock->rate) { if (atomic_read(&clock->locked)) return clock->rate; if (clock->rate != rate) { usb_audio_err(chip, "Mismatched sample rate %d vs %d for EP 0x%x\n", clock->rate, rate, ep->ep_num); return clock->rate; } } clock->rate = rate; clock->need_setup = true; return rate; } /* * snd_usb_endpoint_set_params: configure an snd_usb_endpoint * * It's called either from hw_params callback. * Determine the number of URBs to be used on this endpoint. * An endpoint must be configured before it can be started. * An endpoint that is already running can not be reconfigured. */ int snd_usb_endpoint_set_params(struct snd_usb_audio *chip, struct snd_usb_endpoint *ep) { const struct audioformat *fmt = ep->cur_audiofmt; int err = 0; mutex_lock(&chip->mutex); if (!ep->need_setup) goto unlock; /* release old buffers, if any */ err = release_urbs(ep, false); if (err < 0) goto unlock; ep->datainterval = fmt->datainterval; ep->maxpacksize = fmt->maxpacksize; ep->fill_max = !!(fmt->attributes & UAC_EP_CS_ATTR_FILL_MAX); if (snd_usb_get_speed(chip->dev) == USB_SPEED_FULL) { ep->freqn = get_usb_full_speed_rate(ep->cur_rate); ep->pps = 1000 >> ep->datainterval; } else { ep->freqn = get_usb_high_speed_rate(ep->cur_rate); ep->pps = 8000 >> ep->datainterval; } ep->sample_rem = ep->cur_rate % ep->pps; ep->packsize[0] = ep->cur_rate / ep->pps; ep->packsize[1] = (ep->cur_rate + (ep->pps - 1)) / ep->pps; /* calculate the frequency in 16.16 format */ ep->freqm = ep->freqn; ep->freqshift = INT_MIN; ep->phase = 0; switch (ep->type) { case SND_USB_ENDPOINT_TYPE_DATA: err = data_ep_set_params(ep); break; case SND_USB_ENDPOINT_TYPE_SYNC: err = sync_ep_set_params(ep); break; default: err = -EINVAL; } usb_audio_dbg(chip, "Set up %d URBS, ret=%d\n", ep->nurbs, err); if (err < 0) goto unlock; /* some unit conversions in runtime */ ep->maxframesize = ep->maxpacksize / ep->cur_frame_bytes; ep->curframesize = ep->curpacksize / ep->cur_frame_bytes; err = update_clock_ref_rate(chip, ep); if (err >= 0) { ep->need_setup = false; err = 0; } unlock: mutex_unlock(&chip->mutex); return err; } static int init_sample_rate(struct snd_usb_audio *chip, struct snd_usb_endpoint *ep) { struct snd_usb_clock_ref *clock = ep->clock_ref; int rate, err; rate = update_clock_ref_rate(chip, ep); if (rate < 0) return rate; if (clock && !clock->need_setup) return 0; if (!ep->fixed_rate) { err = snd_usb_init_sample_rate(chip, ep->cur_audiofmt, rate); if (err < 0) { if (clock) clock->rate = 0; /* reset rate */ return err; } } if (clock) clock->need_setup = false; return 0; } /* * snd_usb_endpoint_prepare: Prepare the endpoint * * This function sets up the EP to be fully usable state. * It's called either from prepare callback. * The function checks need_setup flag, and performs nothing unless needed, * so it's safe to call this multiple times. * * This returns zero if unchanged, 1 if the configuration has changed, * or a negative error code. */ int snd_usb_endpoint_prepare(struct snd_usb_audio *chip, struct snd_usb_endpoint *ep) { bool iface_first; int err = 0; mutex_lock(&chip->mutex); if (WARN_ON(!ep->iface_ref)) goto unlock; if (!ep->need_prepare) goto unlock; /* If the interface has been already set up, just set EP parameters */ if (!ep->iface_ref->need_setup) { /* sample rate setup of UAC1 is per endpoint, and we need * to update at each EP configuration */ if (ep->cur_audiofmt->protocol == UAC_VERSION_1) { err = init_sample_rate(chip, ep); if (err < 0) goto unlock; } goto done; } /* Need to deselect altsetting at first */ endpoint_set_interface(chip, ep, false); /* Some UAC1 devices (e.g. Yamaha THR10) need the host interface * to be set up before parameter setups */ iface_first = ep->cur_audiofmt->protocol == UAC_VERSION_1; /* Workaround for devices that require the interface setup at first like UAC1 */ if (chip->quirk_flags & QUIRK_FLAG_SET_IFACE_FIRST) iface_first = true; if (iface_first) { err = endpoint_set_interface(chip, ep, true); if (err < 0) goto unlock; } err = snd_usb_init_pitch(chip, ep->cur_audiofmt); if (err < 0) goto unlock; err = init_sample_rate(chip, ep); if (err < 0) goto unlock; err = snd_usb_select_mode_quirk(chip, ep->cur_audiofmt); if (err < 0) goto unlock; /* for UAC2/3, enable the interface altset here at last */ if (!iface_first) { err = endpoint_set_interface(chip, ep, true); if (err < 0) goto unlock; } ep->iface_ref->need_setup = false; done: ep->need_prepare = false; err = 1; unlock: mutex_unlock(&chip->mutex); return err; } EXPORT_SYMBOL_GPL(snd_usb_endpoint_prepare); /* get the current rate set to the given clock by any endpoint */ int snd_usb_endpoint_get_clock_rate(struct snd_usb_audio *chip, int clock) { struct snd_usb_clock_ref *ref; int rate = 0; if (!clock) return 0; mutex_lock(&chip->mutex); list_for_each_entry(ref, &chip->clock_ref_list, list) { if (ref->clock == clock) { rate = ref->rate; break; } } mutex_unlock(&chip->mutex); return rate; } /** * snd_usb_endpoint_start: start an snd_usb_endpoint * * @ep: the endpoint to start * * A call to this function will increment the running count of the endpoint. * In case it is not already running, the URBs for this endpoint will be * submitted. Otherwise, this function does nothing. * * Must be balanced to calls of snd_usb_endpoint_stop(). * * Returns an error if the URB submission failed, 0 in all other cases. */ int snd_usb_endpoint_start(struct snd_usb_endpoint *ep) { bool is_playback = usb_pipeout(ep->pipe); int err; unsigned int i; if (atomic_read(&ep->chip->shutdown)) return -EBADFD; if (ep->sync_source) WRITE_ONCE(ep->sync_source->sync_sink, ep); usb_audio_dbg(ep->chip, "Starting %s EP 0x%x (running %d)\n", ep_type_name(ep->type), ep->ep_num, atomic_read(&ep->running)); /* already running? */ if (atomic_inc_return(&ep->running) != 1) return 0; if (ep->clock_ref) atomic_inc(&ep->clock_ref->locked); ep->active_mask = 0; ep->unlink_mask = 0; ep->phase = 0; ep->sample_accum = 0; snd_usb_endpoint_start_quirk(ep); /* * If this endpoint has a data endpoint as implicit feedback source, * don't start the urbs here. Instead, mark them all as available, * wait for the record urbs to return and queue the playback urbs * from that context. */ if (!ep_state_update(ep, EP_STATE_STOPPED, EP_STATE_RUNNING)) goto __error; if (snd_usb_endpoint_implicit_feedback_sink(ep) && !(ep->chip->quirk_flags & QUIRK_FLAG_PLAYBACK_FIRST)) { usb_audio_dbg(ep->chip, "No URB submission due to implicit fb sync\n"); i = 0; goto fill_rest; } for (i = 0; i < ep->nurbs; i++) { struct urb *urb = ep->urb[i].urb; if (snd_BUG_ON(!urb)) goto __error; if (is_playback) err = prepare_outbound_urb(ep, urb->context, true); else err = prepare_inbound_urb(ep, urb->context); if (err < 0) { /* stop filling at applptr */ if (err == -EAGAIN) break; usb_audio_dbg(ep->chip, "EP 0x%x: failed to prepare urb: %d\n", ep->ep_num, err); goto __error; } if (!atomic_read(&ep->chip->shutdown)) err = usb_submit_urb(urb, GFP_ATOMIC); else err = -ENODEV; if (err < 0) { if (!atomic_read(&ep->chip->shutdown)) usb_audio_err(ep->chip, "cannot submit urb %d, error %d: %s\n", i, err, usb_error_string(err)); goto __error; } set_bit(i, &ep->active_mask); atomic_inc(&ep->submitted_urbs); } if (!i) { usb_audio_dbg(ep->chip, "XRUN at starting EP 0x%x\n", ep->ep_num); goto __error; } usb_audio_dbg(ep->chip, "%d URBs submitted for EP 0x%x\n", i, ep->ep_num); fill_rest: /* put the remaining URBs to ready list */ if (is_playback) { for (; i < ep->nurbs; i++) push_back_to_ready_list(ep, ep->urb + i); } return 0; __error: snd_usb_endpoint_stop(ep, false); return -EPIPE; } /** * snd_usb_endpoint_stop: stop an snd_usb_endpoint * * @ep: the endpoint to stop (may be NULL) * @keep_pending: keep in-flight URBs * * A call to this function will decrement the running count of the endpoint. * In case the last user has requested the endpoint stop, the URBs will * actually be deactivated. * * Must be balanced to calls of snd_usb_endpoint_start(). * * The caller needs to synchronize the pending stop operation via * snd_usb_endpoint_sync_pending_stop(). */ void snd_usb_endpoint_stop(struct snd_usb_endpoint *ep, bool keep_pending) { if (!ep) return; usb_audio_dbg(ep->chip, "Stopping %s EP 0x%x (running %d)\n", ep_type_name(ep->type), ep->ep_num, atomic_read(&ep->running)); if (snd_BUG_ON(!atomic_read(&ep->running))) return; if (!atomic_dec_return(&ep->running)) { if (ep->sync_source) WRITE_ONCE(ep->sync_source->sync_sink, NULL); stop_urbs(ep, false, keep_pending); if (ep->clock_ref) atomic_dec(&ep->clock_ref->locked); if (ep->chip->quirk_flags & QUIRK_FLAG_FORCE_IFACE_RESET && usb_pipeout(ep->pipe)) { ep->need_prepare = true; if (ep->iface_ref) ep->iface_ref->need_setup = true; } } } /** * snd_usb_endpoint_release: Tear down an snd_usb_endpoint * * @ep: the endpoint to release * * This function does not care for the endpoint's running count but will tear * down all the streaming URBs immediately. */ void snd_usb_endpoint_release(struct snd_usb_endpoint *ep) { release_urbs(ep, true); } /** * snd_usb_endpoint_free_all: Free the resources of an snd_usb_endpoint * @chip: The chip * * This free all endpoints and those resources */ void snd_usb_endpoint_free_all(struct snd_usb_audio *chip) { struct snd_usb_endpoint *ep, *en; struct snd_usb_iface_ref *ip, *in; struct snd_usb_clock_ref *cp, *cn; list_for_each_entry_safe(ep, en, &chip->ep_list, list) kfree(ep); list_for_each_entry_safe(ip, in, &chip->iface_ref_list, list) kfree(ip); list_for_each_entry_safe(cp, cn, &chip->clock_ref_list, list) kfree(cp); } /* * snd_usb_handle_sync_urb: parse an USB sync packet * * @ep: the endpoint to handle the packet * @sender: the sending endpoint * @urb: the received packet * * This function is called from the context of an endpoint that received * the packet and is used to let another endpoint object handle the payload. */ static void snd_usb_handle_sync_urb(struct snd_usb_endpoint *ep, struct snd_usb_endpoint *sender, const struct urb *urb) { int shift; unsigned int f; unsigned long flags; snd_BUG_ON(ep == sender); /* * In case the endpoint is operating in implicit feedback mode, prepare * a new outbound URB that has the same layout as the received packet * and add it to the list of pending urbs. queue_pending_output_urbs() * will take care of them later. */ if (snd_usb_endpoint_implicit_feedback_sink(ep) && atomic_read(&ep->running)) { /* implicit feedback case */ int i, bytes = 0; struct snd_urb_ctx *in_ctx; struct snd_usb_packet_info *out_packet; in_ctx = urb->context; /* Count overall packet size */ for (i = 0; i < in_ctx->packets; i++) if (urb->iso_frame_desc[i].status == 0) bytes += urb->iso_frame_desc[i].actual_length; /* * skip empty packets. At least M-Audio's Fast Track Ultra stops * streaming once it received a 0-byte OUT URB */ if (bytes == 0) return; spin_lock_irqsave(&ep->lock, flags); if (ep->next_packet_queued >= ARRAY_SIZE(ep->next_packet)) { spin_unlock_irqrestore(&ep->lock, flags); usb_audio_err(ep->chip, "next package FIFO overflow EP 0x%x\n", ep->ep_num); notify_xrun(ep); return; } out_packet = next_packet_fifo_enqueue(ep); /* * Iterate through the inbound packet and prepare the lengths * for the output packet. The OUT packet we are about to send * will have the same amount of payload bytes per stride as the * IN packet we just received. Since the actual size is scaled * by the stride, use the sender stride to calculate the length * in case the number of channels differ between the implicitly * fed-back endpoint and the synchronizing endpoint. */ out_packet->packets = in_ctx->packets; for (i = 0; i < in_ctx->packets; i++) { if (urb->iso_frame_desc[i].status == 0) out_packet->packet_size[i] = urb->iso_frame_desc[i].actual_length / sender->stride; else out_packet->packet_size[i] = 0; } spin_unlock_irqrestore(&ep->lock, flags); snd_usb_queue_pending_output_urbs(ep, false); return; } /* * process after playback sync complete * * Full speed devices report feedback values in 10.14 format as samples * per frame, high speed devices in 16.16 format as samples per * microframe. * * Because the Audio Class 1 spec was written before USB 2.0, many high * speed devices use a wrong interpretation, some others use an * entirely different format. * * Therefore, we cannot predict what format any particular device uses * and must detect it automatically. */ if (urb->iso_frame_desc[0].status != 0 || urb->iso_frame_desc[0].actual_length < 3) return; f = le32_to_cpup(urb->transfer_buffer); if (urb->iso_frame_desc[0].actual_length == 3) f &= 0x00ffffff; else f &= 0x0fffffff; if (f == 0) return; if (unlikely(sender->tenor_fb_quirk)) { /* * Devices based on Tenor 8802 chipsets (TEAC UD-H01 * and others) sometimes change the feedback value * by +/- 0x1.0000. */ if (f < ep->freqn - 0x8000) f += 0xf000; else if (f > ep->freqn + 0x8000) f -= 0xf000; } else if (unlikely(ep->freqshift == INT_MIN)) { /* * The first time we see a feedback value, determine its format * by shifting it left or right until it matches the nominal * frequency value. This assumes that the feedback does not * differ from the nominal value more than +50% or -25%. */ shift = 0; while (f < ep->freqn - ep->freqn / 4) { f <<= 1; shift++; } while (f > ep->freqn + ep->freqn / 2) { f >>= 1; shift--; } ep->freqshift = shift; } else if (ep->freqshift >= 0) f <<= ep->freqshift; else f >>= -ep->freqshift; if (likely(f >= ep->freqn - ep->freqn / 8 && f <= ep->freqmax)) { /* * If the frequency looks valid, set it. * This value is referred to in prepare_playback_urb(). */ spin_lock_irqsave(&ep->lock, flags); ep->freqm = f; spin_unlock_irqrestore(&ep->lock, flags); } else { /* * Out of range; maybe the shift value is wrong. * Reset it so that we autodetect again the next time. */ ep->freqshift = INT_MIN; } } |
| 2 1 3 3 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 | // SPDX-License-Identifier: GPL-2.0-or-later /* RomFS storage access routines * * Copyright © 2007 Red Hat, Inc. All Rights Reserved. * Written by David Howells (dhowells@redhat.com) */ #include <linux/fs.h> #include <linux/mtd/super.h> #include <linux/buffer_head.h> #include "internal.h" #if !defined(CONFIG_ROMFS_ON_MTD) && !defined(CONFIG_ROMFS_ON_BLOCK) #error no ROMFS backing store interface configured #endif #ifdef CONFIG_ROMFS_ON_MTD #define ROMFS_MTD_READ(sb, ...) mtd_read((sb)->s_mtd, ##__VA_ARGS__) /* * read data from an romfs image on an MTD device */ static int romfs_mtd_read(struct super_block *sb, unsigned long pos, void *buf, size_t buflen) { size_t rlen; int ret; ret = ROMFS_MTD_READ(sb, pos, buflen, &rlen, buf); return (ret < 0 || rlen != buflen) ? -EIO : 0; } /* * determine the length of a string in a romfs image on an MTD device */ static ssize_t romfs_mtd_strnlen(struct super_block *sb, unsigned long pos, size_t maxlen) { ssize_t n = 0; size_t segment; u_char buf[16], *p; size_t len; int ret; /* scan the string up to 16 bytes at a time */ while (maxlen > 0) { segment = min_t(size_t, maxlen, 16); ret = ROMFS_MTD_READ(sb, pos, segment, &len, buf); if (ret < 0) return ret; p = memchr(buf, 0, len); if (p) return n + (p - buf); maxlen -= len; pos += len; n += len; } return n; } /* * compare a string to one in a romfs image on MTD * - return 1 if matched, 0 if differ, -ve if error */ static int romfs_mtd_strcmp(struct super_block *sb, unsigned long pos, const char *str, size_t size) { u_char buf[17]; size_t len, segment; int ret; /* scan the string up to 16 bytes at a time, and attempt to grab the * trailing NUL whilst we're at it */ buf[0] = 0xff; while (size > 0) { segment = min_t(size_t, size + 1, 17); ret = ROMFS_MTD_READ(sb, pos, segment, &len, buf); if (ret < 0) return ret; len--; if (memcmp(buf, str, len) != 0) return 0; buf[0] = buf[len]; size -= len; pos += len; str += len; } /* check the trailing NUL was */ if (buf[0]) return 0; return 1; } #endif /* CONFIG_ROMFS_ON_MTD */ #ifdef CONFIG_ROMFS_ON_BLOCK /* * read data from an romfs image on a block device */ static int romfs_blk_read(struct super_block *sb, unsigned long pos, void *buf, size_t buflen) { struct buffer_head *bh; unsigned long offset; size_t segment; /* copy the string up to blocksize bytes at a time */ while (buflen > 0) { offset = pos & (ROMBSIZE - 1); segment = min_t(size_t, buflen, ROMBSIZE - offset); bh = sb_bread(sb, pos >> ROMBSBITS); if (!bh) return -EIO; memcpy(buf, bh->b_data + offset, segment); brelse(bh); buf += segment; buflen -= segment; pos += segment; } return 0; } /* * determine the length of a string in romfs on a block device */ static ssize_t romfs_blk_strnlen(struct super_block *sb, unsigned long pos, size_t limit) { struct buffer_head *bh; unsigned long offset; ssize_t n = 0; size_t segment; u_char *buf, *p; /* scan the string up to blocksize bytes at a time */ while (limit > 0) { offset = pos & (ROMBSIZE - 1); segment = min_t(size_t, limit, ROMBSIZE - offset); bh = sb_bread(sb, pos >> ROMBSBITS); if (!bh) return -EIO; buf = bh->b_data + offset; p = memchr(buf, 0, segment); brelse(bh); if (p) return n + (p - buf); limit -= segment; pos += segment; n += segment; } return n; } /* * compare a string to one in a romfs image on a block device * - return 1 if matched, 0 if differ, -ve if error */ static int romfs_blk_strcmp(struct super_block *sb, unsigned long pos, const char *str, size_t size) { struct buffer_head *bh; unsigned long offset; size_t segment; bool matched, terminated = false; /* compare string up to a block at a time */ while (size > 0) { offset = pos & (ROMBSIZE - 1); segment = min_t(size_t, size, ROMBSIZE - offset); bh = sb_bread(sb, pos >> ROMBSBITS); if (!bh) return -EIO; matched = (memcmp(bh->b_data + offset, str, segment) == 0); size -= segment; pos += segment; str += segment; if (matched && size == 0 && offset + segment < ROMBSIZE) { if (!bh->b_data[offset + segment]) terminated = true; else matched = false; } brelse(bh); if (!matched) return 0; } if (!terminated) { /* the terminating NUL must be on the first byte of the next * block */ BUG_ON((pos & (ROMBSIZE - 1)) != 0); bh = sb_bread(sb, pos >> ROMBSBITS); if (!bh) return -EIO; matched = !bh->b_data[0]; brelse(bh); if (!matched) return 0; } return 1; } #endif /* CONFIG_ROMFS_ON_BLOCK */ /* * read data from the romfs image */ int romfs_dev_read(struct super_block *sb, unsigned long pos, void *buf, size_t buflen) { size_t limit; limit = romfs_maxsize(sb); if (pos >= limit || buflen > limit - pos) return -EIO; #ifdef CONFIG_ROMFS_ON_MTD if (sb->s_mtd) return romfs_mtd_read(sb, pos, buf, buflen); #endif #ifdef CONFIG_ROMFS_ON_BLOCK if (sb->s_bdev) return romfs_blk_read(sb, pos, buf, buflen); #endif return -EIO; } /* * determine the length of a string in romfs */ ssize_t romfs_dev_strnlen(struct super_block *sb, unsigned long pos, size_t maxlen) { size_t limit; limit = romfs_maxsize(sb); if (pos >= limit) return -EIO; if (maxlen > limit - pos) maxlen = limit - pos; #ifdef CONFIG_ROMFS_ON_MTD if (sb->s_mtd) return romfs_mtd_strnlen(sb, pos, maxlen); #endif #ifdef CONFIG_ROMFS_ON_BLOCK if (sb->s_bdev) return romfs_blk_strnlen(sb, pos, maxlen); #endif return -EIO; } /* * compare a string to one in romfs * - the string to be compared to, str, may not be NUL-terminated; instead the * string is of the specified size * - return 1 if matched, 0 if differ, -ve if error */ int romfs_dev_strcmp(struct super_block *sb, unsigned long pos, const char *str, size_t size) { size_t limit; limit = romfs_maxsize(sb); if (pos >= limit) return -EIO; if (size > ROMFS_MAXFN) return -ENAMETOOLONG; if (size + 1 > limit - pos) return -EIO; #ifdef CONFIG_ROMFS_ON_MTD if (sb->s_mtd) return romfs_mtd_strcmp(sb, pos, str, size); #endif #ifdef CONFIG_ROMFS_ON_BLOCK if (sb->s_bdev) return romfs_blk_strcmp(sb, pos, str, size); #endif return -EIO; } |
| 1 2 1 1 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 | // SPDX-License-Identifier: GPL-2.0-or-later /* * E3C EC100 demodulator driver * * Copyright (C) 2009 Antti Palosaari <crope@iki.fi> */ #include <media/dvb_frontend.h> #include "ec100.h" struct ec100_state { struct i2c_adapter *i2c; struct dvb_frontend frontend; struct ec100_config config; u16 ber; }; /* write single register */ static int ec100_write_reg(struct ec100_state *state, u8 reg, u8 val) { int ret; u8 buf[2] = {reg, val}; struct i2c_msg msg[1] = { { .addr = state->config.demod_address, .flags = 0, .len = sizeof(buf), .buf = buf, } }; ret = i2c_transfer(state->i2c, msg, 1); if (ret == 1) { ret = 0; } else { dev_warn(&state->i2c->dev, "%s: i2c wr failed=%d reg=%02x\n", KBUILD_MODNAME, ret, reg); ret = -EREMOTEIO; } return ret; } /* read single register */ static int ec100_read_reg(struct ec100_state *state, u8 reg, u8 *val) { int ret; struct i2c_msg msg[2] = { { .addr = state->config.demod_address, .flags = 0, .len = 1, .buf = ® }, { .addr = state->config.demod_address, .flags = I2C_M_RD, .len = 1, .buf = val } }; ret = i2c_transfer(state->i2c, msg, 2); if (ret == 2) { ret = 0; } else { dev_warn(&state->i2c->dev, "%s: i2c rd failed=%d reg=%02x\n", KBUILD_MODNAME, ret, reg); ret = -EREMOTEIO; } return ret; } static int ec100_set_frontend(struct dvb_frontend *fe) { struct dtv_frontend_properties *c = &fe->dtv_property_cache; struct ec100_state *state = fe->demodulator_priv; int ret; u8 tmp, tmp2; dev_dbg(&state->i2c->dev, "%s: frequency=%d bandwidth_hz=%d\n", __func__, c->frequency, c->bandwidth_hz); /* program tuner */ if (fe->ops.tuner_ops.set_params) fe->ops.tuner_ops.set_params(fe); ret = ec100_write_reg(state, 0x04, 0x06); if (ret) goto error; ret = ec100_write_reg(state, 0x67, 0x58); if (ret) goto error; ret = ec100_write_reg(state, 0x05, 0x18); if (ret) goto error; /* reg/bw | 6 | 7 | 8 -------+------+------+------ A 0x1b | 0xa1 | 0xe7 | 0x2c A 0x1c | 0x55 | 0x63 | 0x72 -------+------+------+------ B 0x1b | 0xb7 | 0x00 | 0x49 B 0x1c | 0x55 | 0x64 | 0x72 */ switch (c->bandwidth_hz) { case 6000000: tmp = 0xb7; tmp2 = 0x55; break; case 7000000: tmp = 0x00; tmp2 = 0x64; break; case 8000000: default: tmp = 0x49; tmp2 = 0x72; } ret = ec100_write_reg(state, 0x1b, tmp); if (ret) goto error; ret = ec100_write_reg(state, 0x1c, tmp2); if (ret) goto error; ret = ec100_write_reg(state, 0x0c, 0xbb); /* if freq */ if (ret) goto error; ret = ec100_write_reg(state, 0x0d, 0x31); /* if freq */ if (ret) goto error; ret = ec100_write_reg(state, 0x08, 0x24); if (ret) goto error; ret = ec100_write_reg(state, 0x00, 0x00); /* go */ if (ret) goto error; ret = ec100_write_reg(state, 0x00, 0x20); /* go */ if (ret) goto error; return ret; error: dev_dbg(&state->i2c->dev, "%s: failed=%d\n", __func__, ret); return ret; } static int ec100_get_tune_settings(struct dvb_frontend *fe, struct dvb_frontend_tune_settings *fesettings) { fesettings->min_delay_ms = 300; fesettings->step_size = 0; fesettings->max_drift = 0; return 0; } static int ec100_read_status(struct dvb_frontend *fe, enum fe_status *status) { struct ec100_state *state = fe->demodulator_priv; int ret; u8 tmp; *status = 0; ret = ec100_read_reg(state, 0x42, &tmp); if (ret) goto error; if (tmp & 0x80) { /* bit7 set - have lock */ *status |= FE_HAS_SIGNAL | FE_HAS_CARRIER | FE_HAS_VITERBI | FE_HAS_SYNC | FE_HAS_LOCK; } else { ret = ec100_read_reg(state, 0x01, &tmp); if (ret) goto error; if (tmp & 0x10) { /* bit4 set - have signal */ *status |= FE_HAS_SIGNAL; if (!(tmp & 0x01)) { /* bit0 clear - have ~valid signal */ *status |= FE_HAS_CARRIER | FE_HAS_VITERBI; } } } return ret; error: dev_dbg(&state->i2c->dev, "%s: failed=%d\n", __func__, ret); return ret; } static int ec100_read_ber(struct dvb_frontend *fe, u32 *ber) { struct ec100_state *state = fe->demodulator_priv; int ret; u8 tmp, tmp2; u16 ber2; *ber = 0; ret = ec100_read_reg(state, 0x65, &tmp); if (ret) goto error; ret = ec100_read_reg(state, 0x66, &tmp2); if (ret) goto error; ber2 = (tmp2 << 8) | tmp; /* if counter overflow or clear */ if (ber2 < state->ber) *ber = ber2; else *ber = ber2 - state->ber; state->ber = ber2; return ret; error: dev_dbg(&state->i2c->dev, "%s: failed=%d\n", __func__, ret); return ret; } static int ec100_read_signal_strength(struct dvb_frontend *fe, u16 *strength) { struct ec100_state *state = fe->demodulator_priv; int ret; u8 tmp; ret = ec100_read_reg(state, 0x24, &tmp); if (ret) { *strength = 0; goto error; } *strength = ((tmp << 8) | tmp); return ret; error: dev_dbg(&state->i2c->dev, "%s: failed=%d\n", __func__, ret); return ret; } static int ec100_read_snr(struct dvb_frontend *fe, u16 *snr) { *snr = 0; return 0; } static int ec100_read_ucblocks(struct dvb_frontend *fe, u32 *ucblocks) { *ucblocks = 0; return 0; } static void ec100_release(struct dvb_frontend *fe) { struct ec100_state *state = fe->demodulator_priv; kfree(state); } static const struct dvb_frontend_ops ec100_ops; struct dvb_frontend *ec100_attach(const struct ec100_config *config, struct i2c_adapter *i2c) { int ret; struct ec100_state *state = NULL; u8 tmp; /* allocate memory for the internal state */ state = kzalloc(sizeof(struct ec100_state), GFP_KERNEL); if (state == NULL) goto error; /* setup the state */ state->i2c = i2c; memcpy(&state->config, config, sizeof(struct ec100_config)); /* check if the demod is there */ ret = ec100_read_reg(state, 0x33, &tmp); if (ret || tmp != 0x0b) goto error; /* create dvb_frontend */ memcpy(&state->frontend.ops, &ec100_ops, sizeof(struct dvb_frontend_ops)); state->frontend.demodulator_priv = state; return &state->frontend; error: kfree(state); return NULL; } EXPORT_SYMBOL_GPL(ec100_attach); static const struct dvb_frontend_ops ec100_ops = { .delsys = { SYS_DVBT }, .info = { .name = "E3C EC100 DVB-T", .caps = FE_CAN_FEC_1_2 | FE_CAN_FEC_2_3 | FE_CAN_FEC_3_4 | FE_CAN_FEC_5_6 | FE_CAN_FEC_7_8 | FE_CAN_FEC_AUTO | FE_CAN_QPSK | FE_CAN_QAM_16 | FE_CAN_QAM_64 | FE_CAN_QAM_AUTO | FE_CAN_TRANSMISSION_MODE_AUTO | FE_CAN_GUARD_INTERVAL_AUTO | FE_CAN_HIERARCHY_AUTO | FE_CAN_MUTE_TS }, .release = ec100_release, .set_frontend = ec100_set_frontend, .get_tune_settings = ec100_get_tune_settings, .read_status = ec100_read_status, .read_ber = ec100_read_ber, .read_signal_strength = ec100_read_signal_strength, .read_snr = ec100_read_snr, .read_ucblocks = ec100_read_ucblocks, }; MODULE_AUTHOR("Antti Palosaari <crope@iki.fi>"); MODULE_DESCRIPTION("E3C EC100 DVB-T demodulator driver"); MODULE_LICENSE("GPL"); |
| 1 46 1 1 1 1 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 | // SPDX-License-Identifier: GPL-2.0+ /* * Fast-charge control for Apple "MFi" devices * * Copyright (C) 2019 Bastien Nocera <hadess@hadess.net> */ /* Standard include files */ #include <linux/module.h> #include <linux/power_supply.h> #include <linux/slab.h> #include <linux/usb.h> MODULE_AUTHOR("Bastien Nocera <hadess@hadess.net>"); MODULE_DESCRIPTION("Fast-charge control for Apple \"MFi\" devices"); MODULE_LICENSE("GPL"); #define TRICKLE_CURRENT_MA 0 #define FAST_CURRENT_MA 2500 #define APPLE_VENDOR_ID 0x05ac /* Apple */ /* The product ID is defined as starting with 0x12nn, as per the * "Choosing an Apple Device USB Configuration" section in * release R9 (2012) of the "MFi Accessory Hardware Specification" * * To distinguish an Apple device, a USB host can check the device * descriptor of attached USB devices for the following fields: * ■ Vendor ID: 0x05AC * ■ Product ID: 0x12nn * * Those checks will be done in .match() and .probe(). */ static const struct usb_device_id mfi_fc_id_table[] = { { .idVendor = APPLE_VENDOR_ID, .match_flags = USB_DEVICE_ID_MATCH_VENDOR }, {}, }; MODULE_DEVICE_TABLE(usb, mfi_fc_id_table); /* Driver-local specific stuff */ struct mfi_device { struct usb_device *udev; struct power_supply *battery; int charge_type; }; static int apple_mfi_fc_set_charge_type(struct mfi_device *mfi, const union power_supply_propval *val) { int current_ma; int retval; __u8 request_type; if (mfi->charge_type == val->intval) { dev_dbg(&mfi->udev->dev, "charge type %d already set\n", mfi->charge_type); return 0; } switch (val->intval) { case POWER_SUPPLY_CHARGE_TYPE_TRICKLE: current_ma = TRICKLE_CURRENT_MA; break; case POWER_SUPPLY_CHARGE_TYPE_FAST: current_ma = FAST_CURRENT_MA; break; default: return -EINVAL; } request_type = USB_DIR_OUT | USB_TYPE_VENDOR | USB_RECIP_DEVICE; retval = usb_control_msg(mfi->udev, usb_sndctrlpipe(mfi->udev, 0), 0x40, /* Vendor‐defined power request */ request_type, current_ma, /* wValue, current offset */ current_ma, /* wIndex, current offset */ NULL, 0, USB_CTRL_GET_TIMEOUT); if (retval) { dev_dbg(&mfi->udev->dev, "retval = %d\n", retval); return retval; } mfi->charge_type = val->intval; return 0; } static int apple_mfi_fc_get_property(struct power_supply *psy, enum power_supply_property psp, union power_supply_propval *val) { struct mfi_device *mfi = power_supply_get_drvdata(psy); dev_dbg(&mfi->udev->dev, "prop: %d\n", psp); switch (psp) { case POWER_SUPPLY_PROP_CHARGE_TYPE: val->intval = mfi->charge_type; break; case POWER_SUPPLY_PROP_SCOPE: val->intval = POWER_SUPPLY_SCOPE_DEVICE; break; default: return -ENODATA; } return 0; } static int apple_mfi_fc_set_property(struct power_supply *psy, enum power_supply_property psp, const union power_supply_propval *val) { struct mfi_device *mfi = power_supply_get_drvdata(psy); int ret; dev_dbg(&mfi->udev->dev, "prop: %d\n", psp); ret = pm_runtime_get_sync(&mfi->udev->dev); if (ret < 0) { pm_runtime_put_noidle(&mfi->udev->dev); return ret; } switch (psp) { case POWER_SUPPLY_PROP_CHARGE_TYPE: ret = apple_mfi_fc_set_charge_type(mfi, val); break; default: ret = -EINVAL; } pm_runtime_mark_last_busy(&mfi->udev->dev); pm_runtime_put_autosuspend(&mfi->udev->dev); return ret; } static int apple_mfi_fc_property_is_writeable(struct power_supply *psy, enum power_supply_property psp) { switch (psp) { case POWER_SUPPLY_PROP_CHARGE_TYPE: return 1; default: return 0; } } static enum power_supply_property apple_mfi_fc_properties[] = { POWER_SUPPLY_PROP_CHARGE_TYPE, POWER_SUPPLY_PROP_SCOPE }; static const struct power_supply_desc apple_mfi_fc_desc = { .name = "apple_mfi_fastcharge", .type = POWER_SUPPLY_TYPE_BATTERY, .properties = apple_mfi_fc_properties, .num_properties = ARRAY_SIZE(apple_mfi_fc_properties), .get_property = apple_mfi_fc_get_property, .set_property = apple_mfi_fc_set_property, .property_is_writeable = apple_mfi_fc_property_is_writeable }; static bool mfi_fc_match(struct usb_device *udev) { int idProduct; idProduct = le16_to_cpu(udev->descriptor.idProduct); /* See comment above mfi_fc_id_table[] */ return (idProduct >= 0x1200 && idProduct <= 0x12ff); } static int mfi_fc_probe(struct usb_device *udev) { struct power_supply_config battery_cfg = {}; struct mfi_device *mfi = NULL; int err; if (!mfi_fc_match(udev)) return -ENODEV; mfi = kzalloc(sizeof(struct mfi_device), GFP_KERNEL); if (!mfi) return -ENOMEM; battery_cfg.drv_data = mfi; mfi->charge_type = POWER_SUPPLY_CHARGE_TYPE_TRICKLE; mfi->battery = power_supply_register(&udev->dev, &apple_mfi_fc_desc, &battery_cfg); if (IS_ERR(mfi->battery)) { dev_err(&udev->dev, "Can't register battery\n"); err = PTR_ERR(mfi->battery); kfree(mfi); return err; } mfi->udev = usb_get_dev(udev); dev_set_drvdata(&udev->dev, mfi); return 0; } static void mfi_fc_disconnect(struct usb_device *udev) { struct mfi_device *mfi; mfi = dev_get_drvdata(&udev->dev); if (mfi->battery) power_supply_unregister(mfi->battery); dev_set_drvdata(&udev->dev, NULL); usb_put_dev(mfi->udev); kfree(mfi); } static struct usb_device_driver mfi_fc_driver = { .name = "apple-mfi-fastcharge", .probe = mfi_fc_probe, .disconnect = mfi_fc_disconnect, .id_table = mfi_fc_id_table, .match = mfi_fc_match, .generic_subclass = 1, }; static int __init mfi_fc_driver_init(void) { return usb_register_device_driver(&mfi_fc_driver, THIS_MODULE); } static void __exit mfi_fc_driver_exit(void) { usb_deregister_device_driver(&mfi_fc_driver); } module_init(mfi_fc_driver_init); module_exit(mfi_fc_driver_exit); |
| 298 5 293 57 58 57 57 57 57 57 60 15 58 56 15 2 12 57 56 11 2 9 2 9 1 9 11 56 57 73 1 70 3 59 17 63 23 10 10 2 2 55 55 54 5 50 55 54 49 6 72 16 44 13 3 11 5 60 6 6 1 55 54 1 54 55 55 55 1 54 5 5 2 5 3 44 47 55 55 2 50 46 2 2 54 55 55 50 50 73 1 72 17 1 2 1 1 3 52 29 29 27 46 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080 1081 1082 1083 1084 1085 1086 1087 1088 1089 1090 1091 1092 1093 1094 1095 1096 1097 1098 1099 1100 1101 1102 1103 1104 1105 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125 1126 1127 1128 1129 1130 1131 1132 1133 1134 1135 1136 1137 1138 1139 1140 1141 1142 1143 1144 1145 1146 1147 1148 1149 1150 1151 1152 1153 1154 1155 1156 1157 1158 1159 1160 1161 1162 1163 1164 1165 1166 1167 1168 1169 1170 1171 1172 1173 1174 1175 1176 1177 1178 1179 1180 1181 1182 1183 1184 1185 1186 1187 1188 1189 1190 1191 1192 1193 1194 1195 1196 1197 1198 1199 1200 1201 1202 1203 1204 1205 1206 1207 1208 1209 1210 1211 1212 1213 1214 1215 1216 1217 1218 1219 1220 1221 1222 1223 1224 1225 1226 1227 1228 1229 1230 1231 1232 1233 1234 1235 1236 1237 1238 1239 1240 1241 1242 1243 1244 1245 1246 1247 1248 1249 1250 1251 1252 1253 1254 1255 1256 1257 1258 1259 1260 1261 1262 1263 1264 1265 1266 1267 1268 1269 1270 1271 1272 1273 1274 1275 1276 1277 1278 1279 1280 1281 1282 1283 1284 1285 1286 1287 1288 1289 1290 1291 1292 1293 1294 1295 1296 1297 1298 1299 1300 1301 1302 1303 1304 1305 1306 1307 1308 1309 1310 1311 1312 1313 1314 1315 1316 1317 1318 1319 1320 1321 1322 1323 1324 1325 1326 1327 1328 1329 1330 1331 1332 1333 1334 1335 1336 1337 1338 1339 1340 1341 1342 1343 1344 1345 1346 1347 1348 1349 1350 1351 1352 1353 1354 1355 1356 1357 1358 1359 1360 1361 1362 1363 1364 1365 1366 1367 1368 1369 1370 1371 1372 1373 1374 1375 1376 1377 1378 1379 1380 1381 1382 1383 1384 1385 1386 1387 1388 1389 1390 1391 1392 1393 1394 1395 1396 1397 1398 1399 1400 1401 1402 1403 1404 1405 1406 1407 1408 1409 1410 1411 1412 1413 1414 1415 1416 1417 1418 1419 1420 1421 1422 1423 1424 1425 1426 1427 1428 1429 1430 1431 1432 1433 1434 1435 1436 1437 1438 1439 1440 1441 1442 1443 1444 1445 1446 1447 1448 1449 1450 1451 1452 1453 1454 1455 1456 1457 1458 1459 1460 1461 1462 1463 1464 1465 1466 1467 1468 1469 1470 1471 1472 1473 1474 1475 1476 1477 1478 1479 1480 1481 1482 1483 1484 1485 1486 1487 1488 1489 1490 1491 1492 1493 1494 1495 1496 1497 1498 1499 1500 1501 1502 1503 1504 1505 1506 1507 1508 1509 1510 1511 1512 1513 1514 1515 1516 1517 1518 1519 1520 1521 1522 1523 1524 1525 1526 1527 1528 1529 1530 1531 1532 1533 1534 1535 1536 1537 1538 1539 1540 1541 1542 1543 1544 1545 1546 1547 1548 1549 1550 1551 1552 1553 1554 1555 1556 1557 1558 1559 1560 1561 1562 1563 1564 1565 1566 1567 1568 1569 1570 1571 1572 1573 1574 1575 1576 1577 1578 1579 1580 1581 1582 1583 1584 1585 1586 1587 1588 1589 1590 1591 1592 1593 1594 1595 1596 1597 1598 1599 1600 1601 1602 1603 1604 1605 1606 1607 1608 1609 1610 1611 1612 1613 1614 1615 1616 1617 1618 1619 1620 1621 1622 1623 1624 1625 1626 1627 1628 1629 1630 1631 1632 1633 1634 1635 1636 1637 1638 1639 1640 1641 1642 1643 1644 1645 1646 1647 1648 1649 1650 1651 1652 1653 1654 1655 1656 1657 1658 1659 1660 1661 1662 1663 1664 1665 1666 1667 1668 1669 1670 1671 1672 1673 1674 1675 1676 1677 1678 1679 1680 1681 1682 1683 1684 1685 1686 1687 1688 1689 1690 1691 1692 1693 1694 1695 1696 1697 1698 1699 1700 1701 1702 1703 1704 1705 1706 1707 1708 1709 1710 1711 1712 1713 1714 1715 1716 1717 1718 1719 1720 1721 1722 1723 1724 1725 1726 1727 1728 1729 1730 1731 1732 1733 1734 1735 1736 1737 1738 1739 1740 1741 1742 1743 1744 1745 1746 1747 1748 1749 1750 1751 1752 1753 1754 1755 1756 1757 1758 1759 1760 1761 1762 1763 1764 1765 1766 1767 1768 1769 1770 1771 1772 1773 1774 1775 1776 1777 1778 1779 1780 1781 1782 1783 1784 1785 1786 1787 1788 1789 1790 1791 1792 1793 1794 1795 1796 1797 1798 1799 1800 1801 1802 1803 1804 1805 1806 1807 1808 1809 1810 1811 1812 1813 1814 1815 1816 1817 1818 1819 1820 1821 1822 1823 1824 1825 1826 1827 1828 1829 1830 1831 1832 1833 1834 1835 1836 1837 1838 1839 1840 1841 1842 1843 1844 1845 1846 1847 1848 1849 1850 1851 1852 1853 1854 1855 1856 1857 1858 1859 1860 1861 1862 1863 1864 1865 1866 1867 1868 1869 1870 1871 1872 1873 1874 1875 1876 1877 1878 1879 1880 1881 1882 1883 1884 1885 1886 1887 1888 1889 1890 1891 1892 1893 1894 1895 1896 1897 1898 1899 1900 1901 1902 1903 1904 1905 1906 1907 1908 1909 1910 1911 1912 1913 1914 1915 1916 1917 1918 1919 1920 1921 1922 1923 1924 1925 1926 1927 1928 1929 1930 1931 1932 1933 1934 1935 1936 1937 1938 1939 1940 1941 1942 1943 1944 1945 1946 1947 1948 1949 1950 1951 1952 1953 1954 1955 1956 1957 1958 1959 1960 1961 1962 1963 1964 1965 1966 1967 1968 1969 1970 1971 1972 1973 1974 1975 1976 1977 1978 1979 1980 1981 1982 1983 1984 1985 1986 1987 1988 1989 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020 2021 2022 2023 2024 2025 2026 2027 2028 2029 2030 2031 2032 2033 2034 2035 2036 2037 2038 2039 2040 2041 2042 2043 2044 2045 2046 2047 2048 2049 2050 2051 2052 2053 2054 2055 2056 2057 2058 2059 2060 2061 2062 2063 2064 2065 2066 2067 2068 2069 2070 2071 2072 2073 2074 | // SPDX-License-Identifier: GPL-2.0-only /* * linux/fs/exec.c * * Copyright (C) 1991, 1992 Linus Torvalds */ /* * #!-checking implemented by tytso. */ /* * Demand-loading implemented 01.12.91 - no need to read anything but * the header into memory. The inode of the executable is put into * "current->executable", and page faults do the actual loading. Clean. * * Once more I can proudly say that linux stood up to being changed: it * was less than 2 hours work to get demand-loading completely implemented. * * Demand loading changed July 1993 by Eric Youngdale. Use mmap instead, * current->executable is only used by the procfs. This allows a dispatch * table to check for several different types of binary formats. We keep * trying until we recognize the file or we run out of supported binary * formats. */ #include <linux/kernel_read_file.h> #include <linux/slab.h> #include <linux/file.h> #include <linux/fdtable.h> #include <linux/mm.h> #include <linux/stat.h> #include <linux/fcntl.h> #include <linux/swap.h> #include <linux/string.h> #include <linux/init.h> #include <linux/sched/mm.h> #include <linux/sched/coredump.h> #include <linux/sched/signal.h> #include <linux/sched/numa_balancing.h> #include <linux/sched/task.h> #include <linux/pagemap.h> #include <linux/perf_event.h> #include <linux/highmem.h> #include <linux/spinlock.h> #include <linux/key.h> #include <linux/personality.h> #include <linux/binfmts.h> #include <linux/utsname.h> #include <linux/pid_namespace.h> #include <linux/module.h> #include <linux/namei.h> #include <linux/mount.h> #include <linux/security.h> #include <linux/syscalls.h> #include <linux/tsacct_kern.h> #include <linux/cn_proc.h> #include <linux/audit.h> #include <linux/kmod.h> #include <linux/fsnotify.h> #include <linux/fs_struct.h> #include <linux/oom.h> #include <linux/compat.h> #include <linux/vmalloc.h> #include <linux/io_uring.h> #include <linux/syscall_user_dispatch.h> #include <linux/coredump.h> #include <linux/time_namespace.h> #include <linux/user_events.h> #include <linux/rseq.h> #include <linux/ksm.h> #include <linux/uaccess.h> #include <asm/mmu_context.h> #include <asm/tlb.h> #include <trace/events/task.h> #include "internal.h" #include <trace/events/sched.h> /* For vma exec functions. */ #include "../mm/internal.h" static int bprm_creds_from_file(struct linux_binprm *bprm); int suid_dumpable = 0; static LIST_HEAD(formats); static DEFINE_RWLOCK(binfmt_lock); void __register_binfmt(struct linux_binfmt * fmt, int insert) { write_lock(&binfmt_lock); insert ? list_add(&fmt->lh, &formats) : list_add_tail(&fmt->lh, &formats); write_unlock(&binfmt_lock); } EXPORT_SYMBOL(__register_binfmt); void unregister_binfmt(struct linux_binfmt * fmt) { write_lock(&binfmt_lock); list_del(&fmt->lh); write_unlock(&binfmt_lock); } EXPORT_SYMBOL(unregister_binfmt); static inline void put_binfmt(struct linux_binfmt * fmt) { module_put(fmt->module); } bool path_noexec(const struct path *path) { return (path->mnt->mnt_flags & MNT_NOEXEC) || (path->mnt->mnt_sb->s_iflags & SB_I_NOEXEC); } #ifdef CONFIG_MMU /* * The nascent bprm->mm is not visible until exec_mmap() but it can * use a lot of memory, account these pages in current->mm temporary * for oom_badness()->get_mm_rss(). Once exec succeeds or fails, we * change the counter back via acct_arg_size(0). */ static void acct_arg_size(struct linux_binprm *bprm, unsigned long pages) { struct mm_struct *mm = current->mm; long diff = (long)(pages - bprm->vma_pages); if (!mm || !diff) return; bprm->vma_pages = pages; add_mm_counter(mm, MM_ANONPAGES, diff); } static struct page *get_arg_page(struct linux_binprm *bprm, unsigned long pos, int write) { struct page *page; struct vm_area_struct *vma = bprm->vma; struct mm_struct *mm = bprm->mm; int ret; /* * Avoid relying on expanding the stack down in GUP (which * does not work for STACK_GROWSUP anyway), and just do it * ahead of time. */ if (!mmap_read_lock_maybe_expand(mm, vma, pos, write)) return NULL; /* * We are doing an exec(). 'current' is the process * doing the exec and 'mm' is the new process's mm. */ ret = get_user_pages_remote(mm, pos, 1, write ? FOLL_WRITE : 0, &page, NULL); mmap_read_unlock(mm); if (ret <= 0) return NULL; if (write) acct_arg_size(bprm, vma_pages(vma)); return page; } static void put_arg_page(struct page *page) { put_page(page); } static void free_arg_pages(struct linux_binprm *bprm) { } static void flush_arg_page(struct linux_binprm *bprm, unsigned long pos, struct page *page) { flush_cache_page(bprm->vma, pos, page_to_pfn(page)); } static bool valid_arg_len(struct linux_binprm *bprm, long len) { return len <= MAX_ARG_STRLEN; } #else static inline void acct_arg_size(struct linux_binprm *bprm, unsigned long pages) { } static struct page *get_arg_page(struct linux_binprm *bprm, unsigned long pos, int write) { struct page *page; page = bprm->page[pos / PAGE_SIZE]; if (!page && write) { page = alloc_page(GFP_HIGHUSER|__GFP_ZERO); if (!page) return NULL; bprm->page[pos / PAGE_SIZE] = page; } return page; } static void put_arg_page(struct page *page) { } static void free_arg_page(struct linux_binprm *bprm, int i) { if (bprm->page[i]) { __free_page(bprm->page[i]); bprm->page[i] = NULL; } } static void free_arg_pages(struct linux_binprm *bprm) { int i; for (i = 0; i < MAX_ARG_PAGES; i++) free_arg_page(bprm, i); } static void flush_arg_page(struct linux_binprm *bprm, unsigned long pos, struct page *page) { } static bool valid_arg_len(struct linux_binprm *bprm, long len) { return len <= bprm->p; } #endif /* CONFIG_MMU */ /* * Create a new mm_struct and populate it with a temporary stack * vm_area_struct. We don't have enough context at this point to set the stack * flags, permissions, and offset, so we use temporary values. We'll update * them later in setup_arg_pages(). */ static int bprm_mm_init(struct linux_binprm *bprm) { int err; struct mm_struct *mm = NULL; bprm->mm = mm = mm_alloc(); err = -ENOMEM; if (!mm) goto err; /* Save current stack limit for all calculations made during exec. */ task_lock(current->group_leader); bprm->rlim_stack = current->signal->rlim[RLIMIT_STACK]; task_unlock(current->group_leader); #ifndef CONFIG_MMU bprm->p = PAGE_SIZE * MAX_ARG_PAGES - sizeof(void *); #else err = create_init_stack_vma(bprm->mm, &bprm->vma, &bprm->p); if (err) goto err; #endif return 0; err: if (mm) { bprm->mm = NULL; mmdrop(mm); } return err; } struct user_arg_ptr { #ifdef CONFIG_COMPAT bool is_compat; #endif union { const char __user *const __user *native; #ifdef CONFIG_COMPAT const compat_uptr_t __user *compat; #endif } ptr; }; static const char __user *get_user_arg_ptr(struct user_arg_ptr argv, int nr) { const char __user *native; #ifdef CONFIG_COMPAT if (unlikely(argv.is_compat)) { compat_uptr_t compat; if (get_user(compat, argv.ptr.compat + nr)) return ERR_PTR(-EFAULT); return compat_ptr(compat); } #endif if (get_user(native, argv.ptr.native + nr)) return ERR_PTR(-EFAULT); return native; } /* * count() counts the number of strings in array ARGV. */ static int count(struct user_arg_ptr argv, int max) { int i = 0; if (argv.ptr.native != NULL) { for (;;) { const char __user *p = get_user_arg_ptr(argv, i); if (!p) break; if (IS_ERR(p)) return -EFAULT; if (i >= max) return -E2BIG; ++i; if (fatal_signal_pending(current)) return -ERESTARTNOHAND; cond_resched(); } } return i; } static int count_strings_kernel(const char *const *argv) { int i; if (!argv) return 0; for (i = 0; argv[i]; ++i) { if (i >= MAX_ARG_STRINGS) return -E2BIG; if (fatal_signal_pending(current)) return -ERESTARTNOHAND; cond_resched(); } return i; } static inline int bprm_set_stack_limit(struct linux_binprm *bprm, unsigned long limit) { #ifdef CONFIG_MMU /* Avoid a pathological bprm->p. */ if (bprm->p < limit) return -E2BIG; bprm->argmin = bprm->p - limit; #endif return 0; } static inline bool bprm_hit_stack_limit(struct linux_binprm *bprm) { #ifdef CONFIG_MMU return bprm->p < bprm->argmin; #else return false; #endif } /* * Calculate bprm->argmin from: * - _STK_LIM * - ARG_MAX * - bprm->rlim_stack.rlim_cur * - bprm->argc * - bprm->envc * - bprm->p */ static int bprm_stack_limits(struct linux_binprm *bprm) { unsigned long limit, ptr_size; /* * Limit to 1/4 of the max stack size or 3/4 of _STK_LIM * (whichever is smaller) for the argv+env strings. * This ensures that: * - the remaining binfmt code will not run out of stack space, * - the program will have a reasonable amount of stack left * to work from. */ limit = _STK_LIM / 4 * 3; limit = min(limit, bprm->rlim_stack.rlim_cur / 4); /* * We've historically supported up to 32 pages (ARG_MAX) * of argument strings even with small stacks */ limit = max_t(unsigned long, limit, ARG_MAX); /* Reject totally pathological counts. */ if (bprm->argc < 0 || bprm->envc < 0) return -E2BIG; /* * We must account for the size of all the argv and envp pointers to * the argv and envp strings, since they will also take up space in * the stack. They aren't stored until much later when we can't * signal to the parent that the child has run out of stack space. * Instead, calculate it here so it's possible to fail gracefully. * * In the case of argc = 0, make sure there is space for adding a * empty string (which will bump argc to 1), to ensure confused * userspace programs don't start processing from argv[1], thinking * argc can never be 0, to keep them from walking envp by accident. * See do_execveat_common(). */ if (check_add_overflow(max(bprm->argc, 1), bprm->envc, &ptr_size) || check_mul_overflow(ptr_size, sizeof(void *), &ptr_size)) return -E2BIG; if (limit <= ptr_size) return -E2BIG; limit -= ptr_size; return bprm_set_stack_limit(bprm, limit); } /* * 'copy_strings()' copies argument/environment strings from the old * processes's memory to the new process's stack. The call to get_user_pages() * ensures the destination page is created and not swapped out. */ static int copy_strings(int argc, struct user_arg_ptr argv, struct linux_binprm *bprm) { struct page *kmapped_page = NULL; char *kaddr = NULL; unsigned long kpos = 0; int ret; while (argc-- > 0) { const char __user *str; int len; unsigned long pos; ret = -EFAULT; str = get_user_arg_ptr(argv, argc); if (IS_ERR(str)) goto out; len = strnlen_user(str, MAX_ARG_STRLEN); if (!len) goto out; ret = -E2BIG; if (!valid_arg_len(bprm, len)) goto out; /* We're going to work our way backwards. */ pos = bprm->p; str += len; bprm->p -= len; if (bprm_hit_stack_limit(bprm)) goto out; while (len > 0) { int offset, bytes_to_copy; if (fatal_signal_pending(current)) { ret = -ERESTARTNOHAND; goto out; } cond_resched(); offset = pos % PAGE_SIZE; if (offset == 0) offset = PAGE_SIZE; bytes_to_copy = offset; if (bytes_to_copy > len) bytes_to_copy = len; offset -= bytes_to_copy; pos -= bytes_to_copy; str -= bytes_to_copy; len -= bytes_to_copy; if (!kmapped_page || kpos != (pos & PAGE_MASK)) { struct page *page; page = get_arg_page(bprm, pos, 1); if (!page) { ret = -E2BIG; goto out; } if (kmapped_page) { flush_dcache_page(kmapped_page); kunmap_local(kaddr); put_arg_page(kmapped_page); } kmapped_page = page; kaddr = kmap_local_page(kmapped_page); kpos = pos & PAGE_MASK; flush_arg_page(bprm, kpos, kmapped_page); } if (copy_from_user(kaddr+offset, str, bytes_to_copy)) { ret = -EFAULT; goto out; } } } ret = 0; out: if (kmapped_page) { flush_dcache_page(kmapped_page); kunmap_local(kaddr); put_arg_page(kmapped_page); } return ret; } /* * Copy and argument/environment string from the kernel to the processes stack. */ int copy_string_kernel(const char *arg, struct linux_binprm *bprm) { int len = strnlen(arg, MAX_ARG_STRLEN) + 1 /* terminating NUL */; unsigned long pos = bprm->p; if (len == 0) return -EFAULT; if (!valid_arg_len(bprm, len)) return -E2BIG; /* We're going to work our way backwards. */ arg += len; bprm->p -= len; if (bprm_hit_stack_limit(bprm)) return -E2BIG; while (len > 0) { unsigned int bytes_to_copy = min_t(unsigned int, len, min_not_zero(offset_in_page(pos), PAGE_SIZE)); struct page *page; pos -= bytes_to_copy; arg -= bytes_to_copy; len -= bytes_to_copy; page = get_arg_page(bprm, pos, 1); if (!page) return -E2BIG; flush_arg_page(bprm, pos & PAGE_MASK, page); memcpy_to_page(page, offset_in_page(pos), arg, bytes_to_copy); put_arg_page(page); } return 0; } EXPORT_SYMBOL(copy_string_kernel); static int copy_strings_kernel(int argc, const char *const *argv, struct linux_binprm *bprm) { while (argc-- > 0) { int ret = copy_string_kernel(argv[argc], bprm); if (ret < 0) return ret; if (fatal_signal_pending(current)) return -ERESTARTNOHAND; cond_resched(); } return 0; } #ifdef CONFIG_MMU /* * Finalizes the stack vm_area_struct. The flags and permissions are updated, * the stack is optionally relocated, and some extra space is added. */ int setup_arg_pages(struct linux_binprm *bprm, unsigned long stack_top, int executable_stack) { unsigned long ret; unsigned long stack_shift; struct mm_struct *mm = current->mm; struct vm_area_struct *vma = bprm->vma; struct vm_area_struct *prev = NULL; unsigned long vm_flags; unsigned long stack_base; unsigned long stack_size; unsigned long stack_expand; unsigned long rlim_stack; struct mmu_gather tlb; struct vma_iterator vmi; #ifdef CONFIG_STACK_GROWSUP /* Limit stack size */ stack_base = bprm->rlim_stack.rlim_max; stack_base = calc_max_stack_size(stack_base); /* Add space for stack randomization. */ if (current->flags & PF_RANDOMIZE) stack_base += (STACK_RND_MASK << PAGE_SHIFT); /* Make sure we didn't let the argument array grow too large. */ if (vma->vm_end - vma->vm_start > stack_base) return -ENOMEM; stack_base = PAGE_ALIGN(stack_top - stack_base); stack_shift = vma->vm_start - stack_base; mm->arg_start = bprm->p - stack_shift; bprm->p = vma->vm_end - stack_shift; #else stack_top = arch_align_stack(stack_top); stack_top = PAGE_ALIGN(stack_top); if (unlikely(stack_top < mmap_min_addr) || unlikely(vma->vm_end - vma->vm_start >= stack_top - mmap_min_addr)) return -ENOMEM; stack_shift = vma->vm_end - stack_top; bprm->p -= stack_shift; mm->arg_start = bprm->p; #endif bprm->exec -= stack_shift; if (mmap_write_lock_killable(mm)) return -EINTR; vm_flags = VM_STACK_FLAGS; /* * Adjust stack execute permissions; explicitly enable for * EXSTACK_ENABLE_X, disable for EXSTACK_DISABLE_X and leave alone * (arch default) otherwise. */ if (unlikely(executable_stack == EXSTACK_ENABLE_X)) vm_flags |= VM_EXEC; else if (executable_stack == EXSTACK_DISABLE_X) vm_flags &= ~VM_EXEC; vm_flags |= mm->def_flags; vm_flags |= VM_STACK_INCOMPLETE_SETUP; vma_iter_init(&vmi, mm, vma->vm_start); tlb_gather_mmu(&tlb, mm); ret = mprotect_fixup(&vmi, &tlb, vma, &prev, vma->vm_start, vma->vm_end, vm_flags); tlb_finish_mmu(&tlb); if (ret) goto out_unlock; BUG_ON(prev != vma); if (unlikely(vm_flags & VM_EXEC)) { pr_warn_once("process '%pD4' started with executable stack\n", bprm->file); } /* Move stack pages down in memory. */ if (stack_shift) { /* * During bprm_mm_init(), we create a temporary stack at STACK_TOP_MAX. Once * the binfmt code determines where the new stack should reside, we shift it to * its final location. */ ret = relocate_vma_down(vma, stack_shift); if (ret) goto out_unlock; } /* mprotect_fixup is overkill to remove the temporary stack flags */ vm_flags_clear(vma, VM_STACK_INCOMPLETE_SETUP); stack_expand = 131072UL; /* randomly 32*4k (or 2*64k) pages */ stack_size = vma->vm_end - vma->vm_start; /* * Align this down to a page boundary as expand_stack * will align it up. */ rlim_stack = bprm->rlim_stack.rlim_cur & PAGE_MASK; stack_expand = min(rlim_stack, stack_size + stack_expand); #ifdef CONFIG_STACK_GROWSUP stack_base = vma->vm_start + stack_expand; #else stack_base = vma->vm_end - stack_expand; #endif current->mm->start_stack = bprm->p; ret = expand_stack_locked(vma, stack_base); if (ret) ret = -EFAULT; out_unlock: mmap_write_unlock(mm); return ret; } EXPORT_SYMBOL(setup_arg_pages); #else /* * Transfer the program arguments and environment from the holding pages * onto the stack. The provided stack pointer is adjusted accordingly. */ int transfer_args_to_stack(struct linux_binprm *bprm, unsigned long *sp_location) { unsigned long index, stop, sp; int ret = 0; stop = bprm->p >> PAGE_SHIFT; sp = *sp_location; for (index = MAX_ARG_PAGES - 1; index >= stop; index--) { unsigned int offset = index == stop ? bprm->p & ~PAGE_MASK : 0; char *src = kmap_local_page(bprm->page[index]) + offset; sp -= PAGE_SIZE - offset; if (copy_to_user((void *) sp, src, PAGE_SIZE - offset) != 0) ret = -EFAULT; kunmap_local(src); if (ret) goto out; } bprm->exec += *sp_location - MAX_ARG_PAGES * PAGE_SIZE; *sp_location = sp; out: return ret; } EXPORT_SYMBOL(transfer_args_to_stack); #endif /* CONFIG_MMU */ /* * On success, caller must call do_close_execat() on the returned * struct file to close it. */ static struct file *do_open_execat(int fd, struct filename *name, int flags) { int err; struct file *file __free(fput) = NULL; struct open_flags open_exec_flags = { .open_flag = O_LARGEFILE | O_RDONLY | __FMODE_EXEC, .acc_mode = MAY_EXEC, .intent = LOOKUP_OPEN, .lookup_flags = LOOKUP_FOLLOW, }; if ((flags & ~(AT_SYMLINK_NOFOLLOW | AT_EMPTY_PATH | AT_EXECVE_CHECK)) != 0) return ERR_PTR(-EINVAL); if (flags & AT_SYMLINK_NOFOLLOW) open_exec_flags.lookup_flags &= ~LOOKUP_FOLLOW; if (flags & AT_EMPTY_PATH) open_exec_flags.lookup_flags |= LOOKUP_EMPTY; file = do_filp_open(fd, name, &open_exec_flags); if (IS_ERR(file)) return file; /* * In the past the regular type check was here. It moved to may_open() in * 633fb6ac3980 ("exec: move S_ISREG() check earlier"). Since then it is * an invariant that all non-regular files error out before we get here. */ if (WARN_ON_ONCE(!S_ISREG(file_inode(file)->i_mode)) || path_noexec(&file->f_path)) return ERR_PTR(-EACCES); err = exe_file_deny_write_access(file); if (err) return ERR_PTR(err); return no_free_ptr(file); } /** * open_exec - Open a path name for execution * * @name: path name to open with the intent of executing it. * * Returns ERR_PTR on failure or allocated struct file on success. * * As this is a wrapper for the internal do_open_execat(), callers * must call exe_file_allow_write_access() before fput() on release. Also see * do_close_execat(). */ struct file *open_exec(const char *name) { struct filename *filename = getname_kernel(name); struct file *f = ERR_CAST(filename); if (!IS_ERR(filename)) { f = do_open_execat(AT_FDCWD, filename, 0); putname(filename); } return f; } EXPORT_SYMBOL(open_exec); #if defined(CONFIG_BINFMT_FLAT) || defined(CONFIG_BINFMT_ELF_FDPIC) ssize_t read_code(struct file *file, unsigned long addr, loff_t pos, size_t len) { ssize_t res = vfs_read(file, (void __user *)addr, len, &pos); if (res > 0) flush_icache_user_range(addr, addr + len); return res; } EXPORT_SYMBOL(read_code); #endif /* * Maps the mm_struct mm into the current task struct. * On success, this function returns with exec_update_lock * held for writing. */ static int exec_mmap(struct mm_struct *mm) { struct task_struct *tsk; struct mm_struct *old_mm, *active_mm; int ret; /* Notify parent that we're no longer interested in the old VM */ tsk = current; old_mm = current->mm; exec_mm_release(tsk, old_mm); ret = down_write_killable(&tsk->signal->exec_update_lock); if (ret) return ret; if (old_mm) { /* * If there is a pending fatal signal perhaps a signal * whose default action is to create a coredump get * out and die instead of going through with the exec. */ ret = mmap_read_lock_killable(old_mm); if (ret) { up_write(&tsk->signal->exec_update_lock); return ret; } } task_lock(tsk); membarrier_exec_mmap(mm); local_irq_disable(); active_mm = tsk->active_mm; tsk->active_mm = mm; tsk->mm = mm; mm_init_cid(mm, tsk); /* * This prevents preemption while active_mm is being loaded and * it and mm are being updated, which could cause problems for * lazy tlb mm refcounting when these are updated by context * switches. Not all architectures can handle irqs off over * activate_mm yet. */ if (!IS_ENABLED(CONFIG_ARCH_WANT_IRQS_OFF_ACTIVATE_MM)) local_irq_enable(); activate_mm(active_mm, mm); if (IS_ENABLED(CONFIG_ARCH_WANT_IRQS_OFF_ACTIVATE_MM)) local_irq_enable(); lru_gen_add_mm(mm); task_unlock(tsk); lru_gen_use_mm(mm); if (old_mm) { mmap_read_unlock(old_mm); BUG_ON(active_mm != old_mm); setmax_mm_hiwater_rss(&tsk->signal->maxrss, old_mm); mm_update_next_owner(old_mm); mmput(old_mm); return 0; } mmdrop_lazy_tlb(active_mm); return 0; } static int de_thread(struct task_struct *tsk) { struct signal_struct *sig = tsk->signal; struct sighand_struct *oldsighand = tsk->sighand; spinlock_t *lock = &oldsighand->siglock; if (thread_group_empty(tsk)) goto no_thread_group; /* * Kill all other threads in the thread group. */ spin_lock_irq(lock); if ((sig->flags & SIGNAL_GROUP_EXIT) || sig->group_exec_task) { /* * Another group action in progress, just * return so that the signal is processed. */ spin_unlock_irq(lock); return -EAGAIN; } sig->group_exec_task = tsk; sig->notify_count = zap_other_threads(tsk); if (!thread_group_leader(tsk)) sig->notify_count--; while (sig->notify_count) { __set_current_state(TASK_KILLABLE); spin_unlock_irq(lock); schedule(); if (__fatal_signal_pending(tsk)) goto killed; spin_lock_irq(lock); } spin_unlock_irq(lock); /* * At this point all other threads have exited, all we have to * do is to wait for the thread group leader to become inactive, * and to assume its PID: */ if (!thread_group_leader(tsk)) { struct task_struct *leader = tsk->group_leader; for (;;) { cgroup_threadgroup_change_begin(tsk); write_lock_irq(&tasklist_lock); /* * Do this under tasklist_lock to ensure that * exit_notify() can't miss ->group_exec_task */ sig->notify_count = -1; if (likely(leader->exit_state)) break; __set_current_state(TASK_KILLABLE); write_unlock_irq(&tasklist_lock); cgroup_threadgroup_change_end(tsk); schedule(); if (__fatal_signal_pending(tsk)) goto killed; } /* * The only record we have of the real-time age of a * process, regardless of execs it's done, is start_time. * All the past CPU time is accumulated in signal_struct * from sister threads now dead. But in this non-leader * exec, nothing survives from the original leader thread, * whose birth marks the true age of this process now. * When we take on its identity by switching to its PID, we * also take its birthdate (always earlier than our own). */ tsk->start_time = leader->start_time; tsk->start_boottime = leader->start_boottime; BUG_ON(!same_thread_group(leader, tsk)); /* * An exec() starts a new thread group with the * TGID of the previous thread group. Rehash the * two threads with a switched PID, and release * the former thread group leader: */ /* Become a process group leader with the old leader's pid. * The old leader becomes a thread of the this thread group. */ exchange_tids(tsk, leader); transfer_pid(leader, tsk, PIDTYPE_TGID); transfer_pid(leader, tsk, PIDTYPE_PGID); transfer_pid(leader, tsk, PIDTYPE_SID); list_replace_rcu(&leader->tasks, &tsk->tasks); list_replace_init(&leader->sibling, &tsk->sibling); tsk->group_leader = tsk; leader->group_leader = tsk; tsk->exit_signal = SIGCHLD; leader->exit_signal = -1; BUG_ON(leader->exit_state != EXIT_ZOMBIE); leader->exit_state = EXIT_DEAD; /* * We are going to release_task()->ptrace_unlink() silently, * the tracer can sleep in do_wait(). EXIT_DEAD guarantees * the tracer won't block again waiting for this thread. */ if (unlikely(leader->ptrace)) __wake_up_parent(leader, leader->parent); write_unlock_irq(&tasklist_lock); cgroup_threadgroup_change_end(tsk); release_task(leader); } sig->group_exec_task = NULL; sig->notify_count = 0; no_thread_group: /* we have changed execution domain */ tsk->exit_signal = SIGCHLD; BUG_ON(!thread_group_leader(tsk)); return 0; killed: /* protects against exit_notify() and __exit_signal() */ read_lock(&tasklist_lock); sig->group_exec_task = NULL; sig->notify_count = 0; read_unlock(&tasklist_lock); return -EAGAIN; } /* * This function makes sure the current process has its own signal table, * so that flush_signal_handlers can later reset the handlers without * disturbing other processes. (Other processes might share the signal * table via the CLONE_SIGHAND option to clone().) */ static int unshare_sighand(struct task_struct *me) { struct sighand_struct *oldsighand = me->sighand; if (refcount_read(&oldsighand->count) != 1) { struct sighand_struct *newsighand; /* * This ->sighand is shared with the CLONE_SIGHAND * but not CLONE_THREAD task, switch to the new one. */ newsighand = kmem_cache_alloc(sighand_cachep, GFP_KERNEL); if (!newsighand) return -ENOMEM; refcount_set(&newsighand->count, 1); write_lock_irq(&tasklist_lock); spin_lock(&oldsighand->siglock); memcpy(newsighand->action, oldsighand->action, sizeof(newsighand->action)); rcu_assign_pointer(me->sighand, newsighand); spin_unlock(&oldsighand->siglock); write_unlock_irq(&tasklist_lock); __cleanup_sighand(oldsighand); } return 0; } /* * This is unlocked -- the string will always be NUL-terminated, but * may show overlapping contents if racing concurrent reads. */ void __set_task_comm(struct task_struct *tsk, const char *buf, bool exec) { size_t len = min(strlen(buf), sizeof(tsk->comm) - 1); trace_task_rename(tsk, buf); memcpy(tsk->comm, buf, len); memset(&tsk->comm[len], 0, sizeof(tsk->comm) - len); perf_event_comm(tsk, exec); } /* * Calling this is the point of no return. None of the failures will be * seen by userspace since either the process is already taking a fatal * signal (via de_thread() or coredump), or will have SEGV raised * (after exec_mmap()) by search_binary_handler (see below). */ int begin_new_exec(struct linux_binprm * bprm) { struct task_struct *me = current; int retval; /* Once we are committed compute the creds */ retval = bprm_creds_from_file(bprm); if (retval) return retval; /* * This tracepoint marks the point before flushing the old exec where * the current task is still unchanged, but errors are fatal (point of * no return). The later "sched_process_exec" tracepoint is called after * the current task has successfully switched to the new exec. */ trace_sched_prepare_exec(current, bprm); /* * Ensure all future errors are fatal. */ bprm->point_of_no_return = true; /* Make this the only thread in the thread group */ retval = de_thread(me); if (retval) goto out; /* see the comment in check_unsafe_exec() */ current->fs->in_exec = 0; /* * Cancel any io_uring activity across execve */ io_uring_task_cancel(); /* Ensure the files table is not shared. */ retval = unshare_files(); if (retval) goto out; /* * Must be called _before_ exec_mmap() as bprm->mm is * not visible until then. Doing it here also ensures * we don't race against replace_mm_exe_file(). */ retval = set_mm_exe_file(bprm->mm, bprm->file); if (retval) goto out; /* If the binary is not readable then enforce mm->dumpable=0 */ would_dump(bprm, bprm->file); if (bprm->have_execfd) would_dump(bprm, bprm->executable); /* * Release all of the old mmap stuff */ acct_arg_size(bprm, 0); retval = exec_mmap(bprm->mm); if (retval) goto out; bprm->mm = NULL; retval = exec_task_namespaces(); if (retval) goto out_unlock; #ifdef CONFIG_POSIX_TIMERS spin_lock_irq(&me->sighand->siglock); posix_cpu_timers_exit(me); spin_unlock_irq(&me->sighand->siglock); exit_itimers(me); flush_itimer_signals(); #endif /* * Make the signal table private. */ retval = unshare_sighand(me); if (retval) goto out_unlock; me->flags &= ~(PF_RANDOMIZE | PF_FORKNOEXEC | PF_NOFREEZE | PF_NO_SETAFFINITY); flush_thread(); me->personality &= ~bprm->per_clear; clear_syscall_work_syscall_user_dispatch(me); /* * We have to apply CLOEXEC before we change whether the process is * dumpable (in setup_new_exec) to avoid a race with a process in userspace * trying to access the should-be-closed file descriptors of a process * undergoing exec(2). */ do_close_on_exec(me->files); if (bprm->secureexec) { /* Make sure parent cannot signal privileged process. */ me->pdeath_signal = 0; /* * For secureexec, reset the stack limit to sane default to * avoid bad behavior from the prior rlimits. This has to * happen before arch_pick_mmap_layout(), which examines * RLIMIT_STACK, but after the point of no return to avoid * needing to clean up the change on failure. */ if (bprm->rlim_stack.rlim_cur > _STK_LIM) bprm->rlim_stack.rlim_cur = _STK_LIM; } me->sas_ss_sp = me->sas_ss_size = 0; /* * Figure out dumpability. Note that this checking only of current * is wrong, but userspace depends on it. This should be testing * bprm->secureexec instead. */ if (bprm->interp_flags & BINPRM_FLAGS_ENFORCE_NONDUMP || !(uid_eq(current_euid(), current_uid()) && gid_eq(current_egid(), current_gid()))) set_dumpable(current->mm, suid_dumpable); else set_dumpable(current->mm, SUID_DUMP_USER); perf_event_exec(); /* * If the original filename was empty, alloc_bprm() made up a path * that will probably not be useful to admins running ps or similar. * Let's fix it up to be something reasonable. */ if (bprm->comm_from_dentry) { /* * Hold RCU lock to keep the name from being freed behind our back. * Use acquire semantics to make sure the terminating NUL from * __d_alloc() is seen. * * Note, we're deliberately sloppy here. We don't need to care about * detecting a concurrent rename and just want a terminated name. */ rcu_read_lock(); __set_task_comm(me, smp_load_acquire(&bprm->file->f_path.dentry->d_name.name), true); rcu_read_unlock(); } else { __set_task_comm(me, kbasename(bprm->filename), true); } /* An exec changes our domain. We are no longer part of the thread group */ WRITE_ONCE(me->self_exec_id, me->self_exec_id + 1); flush_signal_handlers(me, 0); retval = set_cred_ucounts(bprm->cred); if (retval < 0) goto out_unlock; /* * install the new credentials for this executable */ security_bprm_committing_creds(bprm); commit_creds(bprm->cred); bprm->cred = NULL; /* * Disable monitoring for regular users * when executing setuid binaries. Must * wait until new credentials are committed * by commit_creds() above */ if (get_dumpable(me->mm) != SUID_DUMP_USER) perf_event_exit_task(me); /* * cred_guard_mutex must be held at least to this point to prevent * ptrace_attach() from altering our determination of the task's * credentials; any time after this it may be unlocked. */ security_bprm_committed_creds(bprm); /* Pass the opened binary to the interpreter. */ if (bprm->have_execfd) { retval = get_unused_fd_flags(0); if (retval < 0) goto out_unlock; fd_install(retval, bprm->executable); bprm->executable = NULL; bprm->execfd = retval; } return 0; out_unlock: up_write(&me->signal->exec_update_lock); if (!bprm->cred) mutex_unlock(&me->signal->cred_guard_mutex); out: return retval; } EXPORT_SYMBOL(begin_new_exec); void would_dump(struct linux_binprm *bprm, struct file *file) { struct inode *inode = file_inode(file); struct mnt_idmap *idmap = file_mnt_idmap(file); if (inode_permission(idmap, inode, MAY_READ) < 0) { struct user_namespace *old, *user_ns; bprm->interp_flags |= BINPRM_FLAGS_ENFORCE_NONDUMP; /* Ensure mm->user_ns contains the executable */ user_ns = old = bprm->mm->user_ns; while ((user_ns != &init_user_ns) && !privileged_wrt_inode_uidgid(user_ns, idmap, inode)) user_ns = user_ns->parent; if (old != user_ns) { bprm->mm->user_ns = get_user_ns(user_ns); put_user_ns(old); } } } EXPORT_SYMBOL(would_dump); void setup_new_exec(struct linux_binprm * bprm) { /* Setup things that can depend upon the personality */ struct task_struct *me = current; arch_pick_mmap_layout(me->mm, &bprm->rlim_stack); arch_setup_new_exec(); /* Set the new mm task size. We have to do that late because it may * depend on TIF_32BIT which is only updated in flush_thread() on * some architectures like powerpc */ me->mm->task_size = TASK_SIZE; up_write(&me->signal->exec_update_lock); mutex_unlock(&me->signal->cred_guard_mutex); } EXPORT_SYMBOL(setup_new_exec); /* Runs immediately before start_thread() takes over. */ void finalize_exec(struct linux_binprm *bprm) { /* Store any stack rlimit changes before starting thread. */ task_lock(current->group_leader); current->signal->rlim[RLIMIT_STACK] = bprm->rlim_stack; task_unlock(current->group_leader); } EXPORT_SYMBOL(finalize_exec); /* * Prepare credentials and lock ->cred_guard_mutex. * setup_new_exec() commits the new creds and drops the lock. * Or, if exec fails before, free_bprm() should release ->cred * and unlock. */ static int prepare_bprm_creds(struct linux_binprm *bprm) { if (mutex_lock_interruptible(¤t->signal->cred_guard_mutex)) return -ERESTARTNOINTR; bprm->cred = prepare_exec_creds(); if (likely(bprm->cred)) return 0; mutex_unlock(¤t->signal->cred_guard_mutex); return -ENOMEM; } /* Matches do_open_execat() */ static void do_close_execat(struct file *file) { if (!file) return; exe_file_allow_write_access(file); fput(file); } static void free_bprm(struct linux_binprm *bprm) { if (bprm->mm) { acct_arg_size(bprm, 0); mmput(bprm->mm); } free_arg_pages(bprm); if (bprm->cred) { /* in case exec fails before de_thread() succeeds */ current->fs->in_exec = 0; mutex_unlock(¤t->signal->cred_guard_mutex); abort_creds(bprm->cred); } do_close_execat(bprm->file); if (bprm->executable) fput(bprm->executable); /* If a binfmt changed the interp, free it. */ if (bprm->interp != bprm->filename) kfree(bprm->interp); kfree(bprm->fdpath); kfree(bprm); } static struct linux_binprm *alloc_bprm(int fd, struct filename *filename, int flags) { struct linux_binprm *bprm; struct file *file; int retval = -ENOMEM; file = do_open_execat(fd, filename, flags); if (IS_ERR(file)) return ERR_CAST(file); bprm = kzalloc(sizeof(*bprm), GFP_KERNEL); if (!bprm) { do_close_execat(file); return ERR_PTR(-ENOMEM); } bprm->file = file; if (fd == AT_FDCWD || filename->name[0] == '/') { bprm->filename = filename->name; } else { if (filename->name[0] == '\0') { bprm->fdpath = kasprintf(GFP_KERNEL, "/dev/fd/%d", fd); bprm->comm_from_dentry = 1; } else { bprm->fdpath = kasprintf(GFP_KERNEL, "/dev/fd/%d/%s", fd, filename->name); } if (!bprm->fdpath) goto out_free; /* * Record that a name derived from an O_CLOEXEC fd will be * inaccessible after exec. This allows the code in exec to * choose to fail when the executable is not mmaped into the * interpreter and an open file descriptor is not passed to * the interpreter. This makes for a better user experience * than having the interpreter start and then immediately fail * when it finds the executable is inaccessible. */ if (get_close_on_exec(fd)) bprm->interp_flags |= BINPRM_FLAGS_PATH_INACCESSIBLE; bprm->filename = bprm->fdpath; } bprm->interp = bprm->filename; /* * At this point, security_file_open() has already been called (with * __FMODE_EXEC) and access control checks for AT_EXECVE_CHECK will * stop just after the security_bprm_creds_for_exec() call in * bprm_execve(). Indeed, the kernel should not try to parse the * content of the file with exec_binprm() nor change the calling * thread, which means that the following security functions will not * be called: * - security_bprm_check() * - security_bprm_creds_from_file() * - security_bprm_committing_creds() * - security_bprm_committed_creds() */ bprm->is_check = !!(flags & AT_EXECVE_CHECK); retval = bprm_mm_init(bprm); if (!retval) return bprm; out_free: free_bprm(bprm); return ERR_PTR(retval); } int bprm_change_interp(const char *interp, struct linux_binprm *bprm) { /* If a binfmt changed the interp, free it first. */ if (bprm->interp != bprm->filename) kfree(bprm->interp); bprm->interp = kstrdup(interp, GFP_KERNEL); if (!bprm->interp) return -ENOMEM; return 0; } EXPORT_SYMBOL(bprm_change_interp); /* * determine how safe it is to execute the proposed program * - the caller must hold ->cred_guard_mutex to protect against * PTRACE_ATTACH or seccomp thread-sync */ static void check_unsafe_exec(struct linux_binprm *bprm) { struct task_struct *p = current, *t; unsigned n_fs; if (p->ptrace) bprm->unsafe |= LSM_UNSAFE_PTRACE; /* * This isn't strictly necessary, but it makes it harder for LSMs to * mess up. */ if (task_no_new_privs(current)) bprm->unsafe |= LSM_UNSAFE_NO_NEW_PRIVS; /* * If another task is sharing our fs, we cannot safely * suid exec because the differently privileged task * will be able to manipulate the current directory, etc. * It would be nice to force an unshare instead... * * Otherwise we set fs->in_exec = 1 to deny clone(CLONE_FS) * from another sub-thread until de_thread() succeeds, this * state is protected by cred_guard_mutex we hold. */ n_fs = 1; spin_lock(&p->fs->lock); rcu_read_lock(); for_other_threads(p, t) { if (t->fs == p->fs) n_fs++; } rcu_read_unlock(); /* "users" and "in_exec" locked for copy_fs() */ if (p->fs->users > n_fs) bprm->unsafe |= LSM_UNSAFE_SHARE; else p->fs->in_exec = 1; spin_unlock(&p->fs->lock); } static void bprm_fill_uid(struct linux_binprm *bprm, struct file *file) { /* Handle suid and sgid on files */ struct mnt_idmap *idmap; struct inode *inode = file_inode(file); unsigned int mode; vfsuid_t vfsuid; vfsgid_t vfsgid; int err; if (!mnt_may_suid(file->f_path.mnt)) return; if (task_no_new_privs(current)) return; mode = READ_ONCE(inode->i_mode); if (!(mode & (S_ISUID|S_ISGID))) return; idmap = file_mnt_idmap(file); /* Be careful if suid/sgid is set */ inode_lock(inode); /* Atomically reload and check mode/uid/gid now that lock held. */ mode = inode->i_mode; vfsuid = i_uid_into_vfsuid(idmap, inode); vfsgid = i_gid_into_vfsgid(idmap, inode); err = inode_permission(idmap, inode, MAY_EXEC); inode_unlock(inode); /* Did the exec bit vanish out from under us? Give up. */ if (err) return; /* We ignore suid/sgid if there are no mappings for them in the ns */ if (!vfsuid_has_mapping(bprm->cred->user_ns, vfsuid) || !vfsgid_has_mapping(bprm->cred->user_ns, vfsgid)) return; if (mode & S_ISUID) { bprm->per_clear |= PER_CLEAR_ON_SETID; bprm->cred->euid = vfsuid_into_kuid(vfsuid); } if ((mode & (S_ISGID | S_IXGRP)) == (S_ISGID | S_IXGRP)) { bprm->per_clear |= PER_CLEAR_ON_SETID; bprm->cred->egid = vfsgid_into_kgid(vfsgid); } } /* * Compute brpm->cred based upon the final binary. */ static int bprm_creds_from_file(struct linux_binprm *bprm) { /* Compute creds based on which file? */ struct file *file = bprm->execfd_creds ? bprm->executable : bprm->file; bprm_fill_uid(bprm, file); return security_bprm_creds_from_file(bprm, file); } /* * Fill the binprm structure from the inode. * Read the first BINPRM_BUF_SIZE bytes * * This may be called multiple times for binary chains (scripts for example). */ static int prepare_binprm(struct linux_binprm *bprm) { loff_t pos = 0; memset(bprm->buf, 0, BINPRM_BUF_SIZE); return kernel_read(bprm->file, bprm->buf, BINPRM_BUF_SIZE, &pos); } /* * Arguments are '\0' separated strings found at the location bprm->p * points to; chop off the first by relocating brpm->p to right after * the first '\0' encountered. */ int remove_arg_zero(struct linux_binprm *bprm) { unsigned long offset; char *kaddr; struct page *page; if (!bprm->argc) return 0; do { offset = bprm->p & ~PAGE_MASK; page = get_arg_page(bprm, bprm->p, 0); if (!page) return -EFAULT; kaddr = kmap_local_page(page); for (; offset < PAGE_SIZE && kaddr[offset]; offset++, bprm->p++) ; kunmap_local(kaddr); put_arg_page(page); } while (offset == PAGE_SIZE); bprm->p++; bprm->argc--; return 0; } EXPORT_SYMBOL(remove_arg_zero); /* * cycle the list of binary formats handler, until one recognizes the image */ static int search_binary_handler(struct linux_binprm *bprm) { struct linux_binfmt *fmt; int retval; retval = prepare_binprm(bprm); if (retval < 0) return retval; retval = security_bprm_check(bprm); if (retval) return retval; read_lock(&binfmt_lock); list_for_each_entry(fmt, &formats, lh) { if (!try_module_get(fmt->module)) continue; read_unlock(&binfmt_lock); retval = fmt->load_binary(bprm); read_lock(&binfmt_lock); put_binfmt(fmt); if (bprm->point_of_no_return || (retval != -ENOEXEC)) { read_unlock(&binfmt_lock); return retval; } } read_unlock(&binfmt_lock); return -ENOEXEC; } /* binfmt handlers will call back into begin_new_exec() on success. */ static int exec_binprm(struct linux_binprm *bprm) { pid_t old_pid, old_vpid; int ret, depth; /* Need to fetch pid before load_binary changes it */ old_pid = current->pid; rcu_read_lock(); old_vpid = task_pid_nr_ns(current, task_active_pid_ns(current->parent)); rcu_read_unlock(); /* This allows 4 levels of binfmt rewrites before failing hard. */ for (depth = 0;; depth++) { struct file *exec; if (depth > 5) return -ELOOP; ret = search_binary_handler(bprm); if (ret < 0) return ret; if (!bprm->interpreter) break; exec = bprm->file; bprm->file = bprm->interpreter; bprm->interpreter = NULL; exe_file_allow_write_access(exec); if (unlikely(bprm->have_execfd)) { if (bprm->executable) { fput(exec); return -ENOEXEC; } bprm->executable = exec; } else fput(exec); } audit_bprm(bprm); trace_sched_process_exec(current, old_pid, bprm); ptrace_event(PTRACE_EVENT_EXEC, old_vpid); proc_exec_connector(current); return 0; } static int bprm_execve(struct linux_binprm *bprm) { int retval; retval = prepare_bprm_creds(bprm); if (retval) return retval; /* * Check for unsafe execution states before exec_binprm(), which * will call back into begin_new_exec(), into bprm_creds_from_file(), * where setuid-ness is evaluated. */ check_unsafe_exec(bprm); current->in_execve = 1; sched_mm_cid_before_execve(current); sched_exec(); /* Set the unchanging part of bprm->cred */ retval = security_bprm_creds_for_exec(bprm); if (retval || bprm->is_check) goto out; retval = exec_binprm(bprm); if (retval < 0) goto out; sched_mm_cid_after_execve(current); rseq_execve(current); /* execve succeeded */ current->in_execve = 0; user_events_execve(current); acct_update_integrals(current); task_numa_free(current, false); return retval; out: /* * If past the point of no return ensure the code never * returns to the userspace process. Use an existing fatal * signal if present otherwise terminate the process with * SIGSEGV. */ if (bprm->point_of_no_return && !fatal_signal_pending(current)) force_fatal_sig(SIGSEGV); sched_mm_cid_after_execve(current); rseq_set_notify_resume(current); current->in_execve = 0; return retval; } static int do_execveat_common(int fd, struct filename *filename, struct user_arg_ptr argv, struct user_arg_ptr envp, int flags) { struct linux_binprm *bprm; int retval; if (IS_ERR(filename)) return PTR_ERR(filename); /* * We move the actual failure in case of RLIMIT_NPROC excess from * set*uid() to execve() because too many poorly written programs * don't check setuid() return code. Here we additionally recheck * whether NPROC limit is still exceeded. */ if ((current->flags & PF_NPROC_EXCEEDED) && is_rlimit_overlimit(current_ucounts(), UCOUNT_RLIMIT_NPROC, rlimit(RLIMIT_NPROC))) { retval = -EAGAIN; goto out_ret; } /* We're below the limit (still or again), so we don't want to make * further execve() calls fail. */ current->flags &= ~PF_NPROC_EXCEEDED; bprm = alloc_bprm(fd, filename, flags); if (IS_ERR(bprm)) { retval = PTR_ERR(bprm); goto out_ret; } retval = count(argv, MAX_ARG_STRINGS); if (retval < 0) goto out_free; bprm->argc = retval; retval = count(envp, MAX_ARG_STRINGS); if (retval < 0) goto out_free; bprm->envc = retval; retval = bprm_stack_limits(bprm); if (retval < 0) goto out_free; retval = copy_string_kernel(bprm->filename, bprm); if (retval < 0) goto out_free; bprm->exec = bprm->p; retval = copy_strings(bprm->envc, envp, bprm); if (retval < 0) goto out_free; retval = copy_strings(bprm->argc, argv, bprm); if (retval < 0) goto out_free; /* * When argv is empty, add an empty string ("") as argv[0] to * ensure confused userspace programs that start processing * from argv[1] won't end up walking envp. See also * bprm_stack_limits(). */ if (bprm->argc == 0) { retval = copy_string_kernel("", bprm); if (retval < 0) goto out_free; bprm->argc = 1; pr_warn_once("process '%s' launched '%s' with NULL argv: empty string added\n", current->comm, bprm->filename); } retval = bprm_execve(bprm); out_free: free_bprm(bprm); out_ret: putname(filename); return retval; } int kernel_execve(const char *kernel_filename, const char *const *argv, const char *const *envp) { struct filename *filename; struct linux_binprm *bprm; int fd = AT_FDCWD; int retval; /* It is non-sense for kernel threads to call execve */ if (WARN_ON_ONCE(current->flags & PF_KTHREAD)) return -EINVAL; filename = getname_kernel(kernel_filename); if (IS_ERR(filename)) return PTR_ERR(filename); bprm = alloc_bprm(fd, filename, 0); if (IS_ERR(bprm)) { retval = PTR_ERR(bprm); goto out_ret; } retval = count_strings_kernel(argv); if (WARN_ON_ONCE(retval == 0)) retval = -EINVAL; if (retval < 0) goto out_free; bprm->argc = retval; retval = count_strings_kernel(envp); if (retval < 0) goto out_free; bprm->envc = retval; retval = bprm_stack_limits(bprm); if (retval < 0) goto out_free; retval = copy_string_kernel(bprm->filename, bprm); if (retval < 0) goto out_free; bprm->exec = bprm->p; retval = copy_strings_kernel(bprm->envc, envp, bprm); if (retval < 0) goto out_free; retval = copy_strings_kernel(bprm->argc, argv, bprm); if (retval < 0) goto out_free; retval = bprm_execve(bprm); out_free: free_bprm(bprm); out_ret: putname(filename); return retval; } static int do_execve(struct filename *filename, const char __user *const __user *__argv, const char __user *const __user *__envp) { struct user_arg_ptr argv = { .ptr.native = __argv }; struct user_arg_ptr envp = { .ptr.native = __envp }; return do_execveat_common(AT_FDCWD, filename, argv, envp, 0); } static int do_execveat(int fd, struct filename *filename, const char __user *const __user *__argv, const char __user *const __user *__envp, int flags) { struct user_arg_ptr argv = { .ptr.native = __argv }; struct user_arg_ptr envp = { .ptr.native = __envp }; return do_execveat_common(fd, filename, argv, envp, flags); } #ifdef CONFIG_COMPAT static int compat_do_execve(struct filename *filename, const compat_uptr_t __user *__argv, const compat_uptr_t __user *__envp) { struct user_arg_ptr argv = { .is_compat = true, .ptr.compat = __argv, }; struct user_arg_ptr envp = { .is_compat = true, .ptr.compat = __envp, }; return do_execveat_common(AT_FDCWD, filename, argv, envp, 0); } static int compat_do_execveat(int fd, struct filename *filename, const compat_uptr_t __user *__argv, const compat_uptr_t __user *__envp, int flags) { struct user_arg_ptr argv = { .is_compat = true, .ptr.compat = __argv, }; struct user_arg_ptr envp = { .is_compat = true, .ptr.compat = __envp, }; return do_execveat_common(fd, filename, argv, envp, flags); } #endif void set_binfmt(struct linux_binfmt *new) { struct mm_struct *mm = current->mm; if (mm->binfmt) module_put(mm->binfmt->module); mm->binfmt = new; if (new) __module_get(new->module); } EXPORT_SYMBOL(set_binfmt); /* * set_dumpable stores three-value SUID_DUMP_* into mm->flags. */ void set_dumpable(struct mm_struct *mm, int value) { if (WARN_ON((unsigned)value > SUID_DUMP_ROOT)) return; set_mask_bits(&mm->flags, MMF_DUMPABLE_MASK, value); } SYSCALL_DEFINE3(execve, const char __user *, filename, const char __user *const __user *, argv, const char __user *const __user *, envp) { return do_execve(getname(filename), argv, envp); } SYSCALL_DEFINE5(execveat, int, fd, const char __user *, filename, const char __user *const __user *, argv, const char __user *const __user *, envp, int, flags) { return do_execveat(fd, getname_uflags(filename, flags), argv, envp, flags); } #ifdef CONFIG_COMPAT COMPAT_SYSCALL_DEFINE3(execve, const char __user *, filename, const compat_uptr_t __user *, argv, const compat_uptr_t __user *, envp) { return compat_do_execve(getname(filename), argv, envp); } COMPAT_SYSCALL_DEFINE5(execveat, int, fd, const char __user *, filename, const compat_uptr_t __user *, argv, const compat_uptr_t __user *, envp, int, flags) { return compat_do_execveat(fd, getname_uflags(filename, flags), argv, envp, flags); } #endif #ifdef CONFIG_SYSCTL static int proc_dointvec_minmax_coredump(const struct ctl_table *table, int write, void *buffer, size_t *lenp, loff_t *ppos) { int error = proc_dointvec_minmax(table, write, buffer, lenp, ppos); if (!error) validate_coredump_safety(); return error; } static const struct ctl_table fs_exec_sysctls[] = { { .procname = "suid_dumpable", .data = &suid_dumpable, .maxlen = sizeof(int), .mode = 0644, .proc_handler = proc_dointvec_minmax_coredump, .extra1 = SYSCTL_ZERO, .extra2 = SYSCTL_TWO, }, }; static int __init init_fs_exec_sysctls(void) { register_sysctl_init("fs", fs_exec_sysctls); return 0; } fs_initcall(init_fs_exec_sysctls); #endif /* CONFIG_SYSCTL */ #ifdef CONFIG_EXEC_KUNIT_TEST #include "tests/exec_kunit.c" #endif |
| 2 2 2 2 3 3 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 | // SPDX-License-Identifier: GPL-2.0-only /* * Line 6 Linux USB driver * * Copyright (C) 2004-2010 Markus Grabner (line6@grabner-graz.at) * Emil Myhrman (emil.myhrman@gmail.com) */ #include <linux/wait.h> #include <linux/usb.h> #include <linux/slab.h> #include <linux/module.h> #include <linux/leds.h> #include <sound/core.h> #include <sound/control.h> #include "capture.h" #include "driver.h" #include "playback.h" enum line6_device_type { LINE6_GUITARPORT, LINE6_PODSTUDIO_GX, LINE6_PODSTUDIO_UX1, LINE6_PODSTUDIO_UX2, LINE6_TONEPORT_GX, LINE6_TONEPORT_UX1, LINE6_TONEPORT_UX2, }; struct usb_line6_toneport; struct toneport_led { struct led_classdev dev; char name[64]; struct usb_line6_toneport *toneport; bool registered; }; struct usb_line6_toneport { /* Generic Line 6 USB data */ struct usb_line6 line6; /* Source selector */ int source; /* Serial number of device */ u32 serial_number; /* Firmware version (x 100) */ u8 firmware_version; /* Device type */ enum line6_device_type type; /* LED instances */ struct toneport_led leds[2]; }; #define line6_to_toneport(x) container_of(x, struct usb_line6_toneport, line6) static int toneport_send_cmd(struct usb_device *usbdev, int cmd1, int cmd2); #define TONEPORT_PCM_DELAY 1 static const struct snd_ratden toneport_ratden = { .num_min = 44100, .num_max = 44100, .num_step = 1, .den = 1 }; static struct line6_pcm_properties toneport_pcm_properties = { .playback_hw = { .info = (SNDRV_PCM_INFO_MMAP | SNDRV_PCM_INFO_INTERLEAVED | SNDRV_PCM_INFO_BLOCK_TRANSFER | SNDRV_PCM_INFO_MMAP_VALID | SNDRV_PCM_INFO_PAUSE | SNDRV_PCM_INFO_SYNC_START), .formats = SNDRV_PCM_FMTBIT_S16_LE, .rates = SNDRV_PCM_RATE_KNOT, .rate_min = 44100, .rate_max = 44100, .channels_min = 2, .channels_max = 2, .buffer_bytes_max = 60000, .period_bytes_min = 64, .period_bytes_max = 8192, .periods_min = 1, .periods_max = 1024}, .capture_hw = { .info = (SNDRV_PCM_INFO_MMAP | SNDRV_PCM_INFO_INTERLEAVED | SNDRV_PCM_INFO_BLOCK_TRANSFER | SNDRV_PCM_INFO_MMAP_VALID | SNDRV_PCM_INFO_SYNC_START), .formats = SNDRV_PCM_FMTBIT_S16_LE, .rates = SNDRV_PCM_RATE_KNOT, .rate_min = 44100, .rate_max = 44100, .channels_min = 2, .channels_max = 2, .buffer_bytes_max = 60000, .period_bytes_min = 64, .period_bytes_max = 8192, .periods_min = 1, .periods_max = 1024}, .rates = { .nrats = 1, .rats = &toneport_ratden}, .bytes_per_channel = 2 }; static const struct { const char *name; int code; } toneport_source_info[] = { {"Microphone", 0x0a01}, {"Line", 0x0801}, {"Instrument", 0x0b01}, {"Inst & Mic", 0x0901} }; static int toneport_send_cmd(struct usb_device *usbdev, int cmd1, int cmd2) { int ret; ret = usb_control_msg_send(usbdev, 0, 0x67, USB_TYPE_VENDOR | USB_RECIP_DEVICE | USB_DIR_OUT, cmd1, cmd2, NULL, 0, LINE6_TIMEOUT, GFP_KERNEL); if (ret) { dev_err(&usbdev->dev, "send failed (error %d)\n", ret); return ret; } return 0; } /* monitor info callback */ static int snd_toneport_monitor_info(struct snd_kcontrol *kcontrol, struct snd_ctl_elem_info *uinfo) { uinfo->type = SNDRV_CTL_ELEM_TYPE_INTEGER; uinfo->count = 1; uinfo->value.integer.min = 0; uinfo->value.integer.max = 256; return 0; } /* monitor get callback */ static int snd_toneport_monitor_get(struct snd_kcontrol *kcontrol, struct snd_ctl_elem_value *ucontrol) { struct snd_line6_pcm *line6pcm = snd_kcontrol_chip(kcontrol); ucontrol->value.integer.value[0] = line6pcm->volume_monitor; return 0; } /* monitor put callback */ static int snd_toneport_monitor_put(struct snd_kcontrol *kcontrol, struct snd_ctl_elem_value *ucontrol) { struct snd_line6_pcm *line6pcm = snd_kcontrol_chip(kcontrol); int err; if (ucontrol->value.integer.value[0] == line6pcm->volume_monitor) return 0; line6pcm->volume_monitor = ucontrol->value.integer.value[0]; if (line6pcm->volume_monitor > 0) { err = line6_pcm_acquire(line6pcm, LINE6_STREAM_MONITOR, true); if (err < 0) { line6pcm->volume_monitor = 0; line6_pcm_release(line6pcm, LINE6_STREAM_MONITOR); return err; } } else { line6_pcm_release(line6pcm, LINE6_STREAM_MONITOR); } return 1; } /* source info callback */ static int snd_toneport_source_info(struct snd_kcontrol *kcontrol, struct snd_ctl_elem_info *uinfo) { const int size = ARRAY_SIZE(toneport_source_info); uinfo->type = SNDRV_CTL_ELEM_TYPE_ENUMERATED; uinfo->count = 1; uinfo->value.enumerated.items = size; if (uinfo->value.enumerated.item >= size) uinfo->value.enumerated.item = size - 1; strcpy(uinfo->value.enumerated.name, toneport_source_info[uinfo->value.enumerated.item].name); return 0; } /* source get callback */ static int snd_toneport_source_get(struct snd_kcontrol *kcontrol, struct snd_ctl_elem_value *ucontrol) { struct snd_line6_pcm *line6pcm = snd_kcontrol_chip(kcontrol); struct usb_line6_toneport *toneport = line6_to_toneport(line6pcm->line6); ucontrol->value.enumerated.item[0] = toneport->source; return 0; } /* source put callback */ static int snd_toneport_source_put(struct snd_kcontrol *kcontrol, struct snd_ctl_elem_value *ucontrol) { struct snd_line6_pcm *line6pcm = snd_kcontrol_chip(kcontrol); struct usb_line6_toneport *toneport = line6_to_toneport(line6pcm->line6); unsigned int source; source = ucontrol->value.enumerated.item[0]; if (source >= ARRAY_SIZE(toneport_source_info)) return -EINVAL; if (source == toneport->source) return 0; toneport->source = source; toneport_send_cmd(toneport->line6.usbdev, toneport_source_info[source].code, 0x0000); return 1; } static void toneport_startup(struct usb_line6 *line6) { line6_pcm_acquire(line6->line6pcm, LINE6_STREAM_MONITOR, true); } /* control definition */ static const struct snd_kcontrol_new toneport_control_monitor = { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, .name = "Monitor Playback Volume", .index = 0, .access = SNDRV_CTL_ELEM_ACCESS_READWRITE, .info = snd_toneport_monitor_info, .get = snd_toneport_monitor_get, .put = snd_toneport_monitor_put }; /* source selector definition */ static const struct snd_kcontrol_new toneport_control_source = { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, .name = "PCM Capture Source", .index = 0, .access = SNDRV_CTL_ELEM_ACCESS_READWRITE, .info = snd_toneport_source_info, .get = snd_toneport_source_get, .put = snd_toneport_source_put }; /* For the led on Guitarport. Brightness goes from 0x00 to 0x26. Set a value above this to have led blink. (void cmd_0x02(byte red, byte green) */ static bool toneport_has_led(struct usb_line6_toneport *toneport) { switch (toneport->type) { case LINE6_GUITARPORT: case LINE6_TONEPORT_GX: /* add your device here if you are missing support for the LEDs */ return true; default: return false; } } static const char * const toneport_led_colors[2] = { "red", "green" }; static const int toneport_led_init_vals[2] = { 0x00, 0x26 }; static void toneport_update_led(struct usb_line6_toneport *toneport) { toneport_send_cmd(toneport->line6.usbdev, (toneport->leds[0].dev.brightness << 8) | 0x0002, toneport->leds[1].dev.brightness); } static void toneport_led_brightness_set(struct led_classdev *led_cdev, enum led_brightness brightness) { struct toneport_led *leds = container_of(led_cdev, struct toneport_led, dev); toneport_update_led(leds->toneport); } static int toneport_init_leds(struct usb_line6_toneport *toneport) { struct device *dev = &toneport->line6.usbdev->dev; int i, err; for (i = 0; i < 2; i++) { struct toneport_led *led = &toneport->leds[i]; struct led_classdev *leddev = &led->dev; led->toneport = toneport; snprintf(led->name, sizeof(led->name), "%s::%s", dev_name(dev), toneport_led_colors[i]); leddev->name = led->name; leddev->brightness = toneport_led_init_vals[i]; leddev->max_brightness = 0x26; leddev->brightness_set = toneport_led_brightness_set; err = led_classdev_register(dev, leddev); if (err) return err; led->registered = true; } return 0; } static void toneport_remove_leds(struct usb_line6_toneport *toneport) { struct toneport_led *led; int i; for (i = 0; i < 2; i++) { led = &toneport->leds[i]; if (!led->registered) break; led_classdev_unregister(&led->dev); led->registered = false; } } static bool toneport_has_source_select(struct usb_line6_toneport *toneport) { switch (toneport->type) { case LINE6_TONEPORT_UX1: case LINE6_TONEPORT_UX2: case LINE6_PODSTUDIO_UX1: case LINE6_PODSTUDIO_UX2: return true; default: return false; } } /* Setup Toneport device. */ static int toneport_setup(struct usb_line6_toneport *toneport) { u32 *ticks; struct usb_line6 *line6 = &toneport->line6; struct usb_device *usbdev = line6->usbdev; ticks = kmalloc(sizeof(*ticks), GFP_KERNEL); if (!ticks) return -ENOMEM; /* sync time on device with host: */ /* note: 32-bit timestamps overflow in year 2106 */ *ticks = (u32)ktime_get_real_seconds(); line6_write_data(line6, 0x80c6, ticks, 4); kfree(ticks); /* enable device: */ toneport_send_cmd(usbdev, 0x0301, 0x0000); /* initialize source select: */ if (toneport_has_source_select(toneport)) toneport_send_cmd(usbdev, toneport_source_info[toneport->source].code, 0x0000); if (toneport_has_led(toneport)) toneport_update_led(toneport); schedule_delayed_work(&toneport->line6.startup_work, secs_to_jiffies(TONEPORT_PCM_DELAY)); return 0; } /* Toneport device disconnected. */ static void line6_toneport_disconnect(struct usb_line6 *line6) { struct usb_line6_toneport *toneport = line6_to_toneport(line6); if (toneport_has_led(toneport)) toneport_remove_leds(toneport); } /* Try to init Toneport device. */ static int toneport_init(struct usb_line6 *line6, const struct usb_device_id *id) { int err; struct usb_line6_toneport *toneport = line6_to_toneport(line6); toneport->type = id->driver_info; line6->disconnect = line6_toneport_disconnect; line6->startup = toneport_startup; /* initialize PCM subsystem: */ err = line6_init_pcm(line6, &toneport_pcm_properties); if (err < 0) return err; /* register monitor control: */ err = snd_ctl_add(line6->card, snd_ctl_new1(&toneport_control_monitor, line6->line6pcm)); if (err < 0) return err; /* register source select control: */ if (toneport_has_source_select(toneport)) { err = snd_ctl_add(line6->card, snd_ctl_new1(&toneport_control_source, line6->line6pcm)); if (err < 0) return err; } line6_read_serial_number(line6, &toneport->serial_number); line6_read_data(line6, 0x80c2, &toneport->firmware_version, 1); if (toneport_has_led(toneport)) { err = toneport_init_leds(toneport); if (err < 0) return err; } err = toneport_setup(toneport); if (err) return err; /* register audio system: */ return snd_card_register(line6->card); } #ifdef CONFIG_PM /* Resume Toneport device after reset. */ static int toneport_reset_resume(struct usb_interface *interface) { int err; err = toneport_setup(usb_get_intfdata(interface)); if (err) return err; return line6_resume(interface); } #endif #define LINE6_DEVICE(prod) USB_DEVICE(0x0e41, prod) #define LINE6_IF_NUM(prod, n) USB_DEVICE_INTERFACE_NUMBER(0x0e41, prod, n) /* table of devices that work with this driver */ static const struct usb_device_id toneport_id_table[] = { { LINE6_DEVICE(0x4750), .driver_info = LINE6_GUITARPORT }, { LINE6_DEVICE(0x4153), .driver_info = LINE6_PODSTUDIO_GX }, { LINE6_DEVICE(0x4150), .driver_info = LINE6_PODSTUDIO_UX1 }, { LINE6_IF_NUM(0x4151, 0), .driver_info = LINE6_PODSTUDIO_UX2 }, { LINE6_DEVICE(0x4147), .driver_info = LINE6_TONEPORT_GX }, { LINE6_DEVICE(0x4141), .driver_info = LINE6_TONEPORT_UX1 }, { LINE6_IF_NUM(0x4142, 0), .driver_info = LINE6_TONEPORT_UX2 }, {} }; MODULE_DEVICE_TABLE(usb, toneport_id_table); static const struct line6_properties toneport_properties_table[] = { [LINE6_GUITARPORT] = { .id = "GuitarPort", .name = "GuitarPort", .capabilities = LINE6_CAP_PCM, .altsetting = 2, /* 1..4 seem to be ok */ /* no control channel */ .ep_audio_r = 0x82, .ep_audio_w = 0x01, }, [LINE6_PODSTUDIO_GX] = { .id = "PODStudioGX", .name = "POD Studio GX", .capabilities = LINE6_CAP_PCM, .altsetting = 2, /* 1..4 seem to be ok */ /* no control channel */ .ep_audio_r = 0x82, .ep_audio_w = 0x01, }, [LINE6_PODSTUDIO_UX1] = { .id = "PODStudioUX1", .name = "POD Studio UX1", .capabilities = LINE6_CAP_PCM, .altsetting = 2, /* 1..4 seem to be ok */ /* no control channel */ .ep_audio_r = 0x82, .ep_audio_w = 0x01, }, [LINE6_PODSTUDIO_UX2] = { .id = "PODStudioUX2", .name = "POD Studio UX2", .capabilities = LINE6_CAP_PCM, .altsetting = 2, /* defaults to 44.1kHz, 16-bit */ /* no control channel */ .ep_audio_r = 0x82, .ep_audio_w = 0x01, }, [LINE6_TONEPORT_GX] = { .id = "TonePortGX", .name = "TonePort GX", .capabilities = LINE6_CAP_PCM, .altsetting = 2, /* 1..4 seem to be ok */ /* no control channel */ .ep_audio_r = 0x82, .ep_audio_w = 0x01, }, [LINE6_TONEPORT_UX1] = { .id = "TonePortUX1", .name = "TonePort UX1", .capabilities = LINE6_CAP_PCM, .altsetting = 2, /* 1..4 seem to be ok */ /* no control channel */ .ep_audio_r = 0x82, .ep_audio_w = 0x01, }, [LINE6_TONEPORT_UX2] = { .id = "TonePortUX2", .name = "TonePort UX2", .capabilities = LINE6_CAP_PCM, .altsetting = 2, /* defaults to 44.1kHz, 16-bit */ /* no control channel */ .ep_audio_r = 0x82, .ep_audio_w = 0x01, }, }; /* Probe USB device. */ static int toneport_probe(struct usb_interface *interface, const struct usb_device_id *id) { return line6_probe(interface, id, "Line6-TonePort", &toneport_properties_table[id->driver_info], toneport_init, sizeof(struct usb_line6_toneport)); } static struct usb_driver toneport_driver = { .name = KBUILD_MODNAME, .probe = toneport_probe, .disconnect = line6_disconnect, #ifdef CONFIG_PM .suspend = line6_suspend, .resume = line6_resume, .reset_resume = toneport_reset_resume, #endif .id_table = toneport_id_table, }; module_usb_driver(toneport_driver); MODULE_DESCRIPTION("TonePort USB driver"); MODULE_LICENSE("GPL"); |
| 1 1 1 1 1 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080 1081 1082 1083 1084 1085 1086 1087 1088 1089 1090 1091 1092 1093 1094 1095 1096 1097 1098 1099 1100 1101 1102 1103 1104 1105 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125 1126 1127 1128 1129 1130 1131 1132 1133 1134 1135 1136 1137 1138 1139 1140 1141 1142 1143 1144 1145 1146 1147 1148 1149 1150 1151 1152 1153 1154 1155 1156 1157 1158 1159 1160 1161 1162 1163 1164 1165 1166 1167 1168 1169 1170 1171 1172 1173 1174 1175 1176 1177 1178 1179 1180 1181 1182 1183 1184 1185 1186 1187 1188 1189 1190 1191 1192 1193 1194 1195 1196 1197 1198 1199 1200 1201 1202 1203 1204 1205 1206 1207 1208 1209 1210 1211 1212 1213 1214 1215 1216 1217 1218 1219 1220 1221 1222 1223 1224 1225 1226 1227 1228 1229 1230 1231 1232 1233 1234 1235 1236 1237 1238 1239 1240 1241 1242 1243 1244 1245 1246 1247 1248 1249 1250 1251 1252 1253 1254 1255 1256 1257 1258 1259 1260 1261 1262 1263 1264 1265 1266 1267 1268 1269 1270 1271 1272 1273 1274 1275 1276 1277 1278 1279 1280 1281 1282 1283 1284 1285 1286 1287 1288 1289 1290 1291 1292 1293 1294 1295 1296 1297 1298 1299 1300 1301 1302 1303 1304 1305 1306 1307 1308 1309 1310 1311 1312 1313 1314 1315 1316 1317 1318 1319 1320 1321 1322 1323 1324 1325 1326 1327 1328 1329 1330 1331 1332 1333 1334 1335 1336 1337 1338 1339 1340 1341 1342 1343 1344 1345 1346 1347 1348 1349 1350 1351 1352 1353 1354 1355 1356 1357 1358 1359 1360 1361 1362 1363 1364 1365 1366 1367 1368 1369 1370 1371 1372 1373 1374 1375 1376 1377 1378 1379 1380 1381 1382 1383 1384 1385 1386 1387 1388 1389 1390 1391 1392 1393 1394 1395 1396 1397 1398 1399 1400 1401 1402 1403 1404 1405 1406 1407 1408 1409 1410 1411 1412 1413 1414 1415 1416 1417 1418 1419 1420 1421 1422 1423 1424 1425 1426 1427 1428 1429 1430 1431 1432 1433 1434 1435 1436 1437 1438 1439 1440 1441 1442 1443 1444 1445 1446 1447 1448 1449 1450 1451 1452 1453 1454 1455 1456 1457 1458 1459 1460 1461 1462 1463 1464 1465 1466 1467 1468 1469 1470 1471 1472 1473 1474 1475 1476 1477 1478 1479 1480 1481 1482 1483 1484 1485 1486 1487 1488 1489 1490 1491 1492 1493 1494 1495 1496 1497 1498 1499 1500 1501 1502 1503 1504 1505 1506 1507 1508 1509 1510 1511 1512 1513 1514 1515 1516 1517 1518 1519 1520 1521 1522 1523 1524 1525 1526 1527 1528 1529 1530 1531 1532 1533 1534 1535 1536 1537 1538 1539 1540 1541 1542 1543 1544 1545 1546 1547 1548 1549 1550 1551 1552 1553 1554 1555 1556 1557 1558 1559 1560 1561 1562 1563 1564 1565 1566 1567 1568 1569 1570 1571 1572 1573 1574 1575 1576 1577 1578 1579 1580 1581 1582 1583 1584 1585 1586 1587 1588 1589 1590 1591 1592 1593 1594 1595 1596 1597 1598 1599 1600 1601 1602 1603 1604 1605 1606 1607 1608 1609 1610 1611 1612 1613 1614 1615 1616 1617 1618 1619 1620 1621 1622 1623 1624 1625 1626 1627 1628 1629 1630 1631 1632 1633 1634 1635 1636 1637 1638 1639 1640 1641 1642 1643 1644 1645 1646 1647 1648 1649 1650 1651 1652 1653 1654 1655 1656 1657 1658 1659 1660 1661 1662 1663 1664 1665 1666 1667 1668 1669 1670 1671 1672 1673 1674 1675 1676 1677 1678 1679 1680 1681 1682 1683 1684 1685 1686 1687 1688 1689 1690 1691 1692 1693 1694 1695 1696 1697 1698 1699 1700 1701 1702 1703 1704 1705 1706 1707 1708 1709 1710 1711 1712 1713 1714 1715 1716 1717 1718 1719 1720 1721 1722 1723 1724 1725 1726 1727 1728 1729 1730 1731 1732 1733 1734 1735 1736 1737 1738 1739 1740 1741 1742 1743 1744 1745 1746 1747 1748 1749 1750 1751 1752 1753 1754 1755 1756 1757 1758 1759 1760 1761 1762 1763 1764 1765 1766 1767 1768 1769 1770 1771 1772 1773 1774 1775 1776 1777 1778 1779 1780 1781 1782 1783 1784 1785 1786 1787 1788 1789 1790 1791 1792 1793 1794 1795 1796 1797 1798 1799 1800 1801 1802 1803 1804 1805 1806 1807 1808 1809 1810 1811 1812 1813 1814 1815 1816 1817 1818 1819 1820 1821 1822 1823 1824 1825 1826 1827 1828 1829 1830 1831 1832 1833 1834 1835 1836 1837 1838 1839 1840 1841 1842 1843 1844 1845 1846 1847 1848 1849 1850 1851 1852 1853 1854 1855 1856 1857 1858 1859 1860 1861 1862 1863 1864 1865 1866 1867 1868 1869 1870 1871 1872 1873 1874 1875 1876 1877 1878 1879 1880 1881 1882 1883 1884 1885 1886 1887 1888 1889 1890 1891 1892 1893 1894 1895 1896 1897 1898 1899 1900 1901 1902 1903 1904 1905 1906 1907 1908 1909 1910 1911 1912 1913 1914 1915 1916 1917 1918 1919 1920 1921 1922 1923 1924 1925 1926 1927 1928 1929 1930 1931 1932 1933 1934 1935 1936 1937 1938 1939 1940 1941 1942 1943 1944 1945 1946 1947 1948 1949 1950 1951 1952 1953 1954 1955 1956 1957 1958 1959 1960 1961 1962 1963 1964 1965 1966 1967 1968 1969 1970 1971 1972 1973 1974 1975 1976 1977 1978 1979 1980 1981 1982 1983 1984 1985 1986 1987 1988 1989 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020 2021 2022 2023 2024 2025 2026 2027 2028 2029 2030 2031 2032 2033 2034 2035 2036 2037 2038 2039 2040 2041 2042 2043 2044 2045 2046 2047 2048 2049 2050 2051 2052 2053 2054 2055 2056 2057 2058 2059 2060 2061 2062 2063 2064 2065 2066 2067 2068 2069 2070 2071 2072 2073 2074 2075 2076 2077 2078 2079 2080 2081 2082 2083 2084 2085 2086 2087 2088 2089 2090 2091 2092 2093 2094 2095 2096 2097 2098 2099 2100 2101 2102 2103 2104 2105 2106 2107 2108 2109 2110 2111 2112 2113 2114 2115 2116 2117 2118 2119 2120 2121 2122 2123 2124 2125 2126 2127 2128 2129 2130 2131 2132 2133 2134 2135 2136 2137 2138 2139 2140 2141 2142 2143 2144 2145 2146 2147 2148 2149 2150 | // SPDX-License-Identifier: GPL-2.0+ /* * inode.c -- user mode filesystem api for usb gadget controllers * * Copyright (C) 2003-2004 David Brownell * Copyright (C) 2003 Agilent Technologies */ /* #define VERBOSE_DEBUG */ #include <linux/init.h> #include <linux/module.h> #include <linux/fs.h> #include <linux/fs_context.h> #include <linux/pagemap.h> #include <linux/uts.h> #include <linux/wait.h> #include <linux/compiler.h> #include <linux/uaccess.h> #include <linux/sched.h> #include <linux/slab.h> #include <linux/string_choices.h> #include <linux/poll.h> #include <linux/kthread.h> #include <linux/aio.h> #include <linux/uio.h> #include <linux/refcount.h> #include <linux/delay.h> #include <linux/device.h> #include <linux/moduleparam.h> #include <linux/usb/gadgetfs.h> #include <linux/usb/gadget.h> #include <linux/usb/composite.h> /* for USB_GADGET_DELAYED_STATUS */ /* Undef helpers from linux/usb/composite.h as gadgetfs redefines them */ #undef DBG #undef ERROR #undef INFO /* * The gadgetfs API maps each endpoint to a file descriptor so that you * can use standard synchronous read/write calls for I/O. There's some * O_NONBLOCK and O_ASYNC/FASYNC style i/o support. Example usermode * drivers show how this works in practice. You can also use AIO to * eliminate I/O gaps between requests, to help when streaming data. * * Key parts that must be USB-specific are protocols defining how the * read/write operations relate to the hardware state machines. There * are two types of files. One type is for the device, implementing ep0. * The other type is for each IN or OUT endpoint. In both cases, the * user mode driver must configure the hardware before using it. * * - First, dev_config() is called when /dev/gadget/$CHIP is configured * (by writing configuration and device descriptors). Afterwards it * may serve as a source of device events, used to handle all control * requests other than basic enumeration. * * - Then, after a SET_CONFIGURATION control request, ep_config() is * called when each /dev/gadget/ep* file is configured (by writing * endpoint descriptors). Afterwards these files are used to write() * IN data or to read() OUT data. To halt the endpoint, a "wrong * direction" request is issued (like reading an IN endpoint). * * Unlike "usbfs" the only ioctl()s are for things that are rare, and maybe * not possible on all hardware. For example, precise fault handling with * respect to data left in endpoint fifos after aborted operations; or * selective clearing of endpoint halts, to implement SET_INTERFACE. */ #define DRIVER_DESC "USB Gadget filesystem" #define DRIVER_VERSION "24 Aug 2004" static const char driver_desc [] = DRIVER_DESC; static const char shortname [] = "gadgetfs"; MODULE_DESCRIPTION (DRIVER_DESC); MODULE_AUTHOR ("David Brownell"); MODULE_LICENSE ("GPL"); static int ep_open(struct inode *, struct file *); /*----------------------------------------------------------------------*/ #define GADGETFS_MAGIC 0xaee71ee7 /* /dev/gadget/$CHIP represents ep0 and the whole device */ enum ep0_state { /* DISABLED is the initial state. */ STATE_DEV_DISABLED = 0, /* Only one open() of /dev/gadget/$CHIP; only one file tracks * ep0/device i/o modes and binding to the controller. Driver * must always write descriptors to initialize the device, then * the device becomes UNCONNECTED until enumeration. */ STATE_DEV_OPENED, /* From then on, ep0 fd is in either of two basic modes: * - (UN)CONNECTED: read usb_gadgetfs_event(s) from it * - SETUP: read/write will transfer control data and succeed; * or if "wrong direction", performs protocol stall */ STATE_DEV_UNCONNECTED, STATE_DEV_CONNECTED, STATE_DEV_SETUP, /* UNBOUND means the driver closed ep0, so the device won't be * accessible again (DEV_DISABLED) until all fds are closed. */ STATE_DEV_UNBOUND, }; /* enough for the whole queue: most events invalidate others */ #define N_EVENT 5 #define RBUF_SIZE 256 struct dev_data { spinlock_t lock; refcount_t count; int udc_usage; enum ep0_state state; /* P: lock */ struct usb_gadgetfs_event event [N_EVENT]; unsigned ev_next; struct fasync_struct *fasync; u8 current_config; /* drivers reading ep0 MUST handle control requests (SETUP) * reported that way; else the host will time out. */ unsigned usermode_setup : 1, setup_in : 1, setup_can_stall : 1, setup_out_ready : 1, setup_out_error : 1, setup_abort : 1, gadget_registered : 1; unsigned setup_wLength; /* the rest is basically write-once */ struct usb_config_descriptor *config, *hs_config; struct usb_device_descriptor *dev; struct usb_request *req; struct usb_gadget *gadget; struct list_head epfiles; void *buf; wait_queue_head_t wait; struct super_block *sb; struct dentry *dentry; /* except this scratch i/o buffer for ep0 */ u8 rbuf[RBUF_SIZE]; }; static inline void get_dev (struct dev_data *data) { refcount_inc (&data->count); } static void put_dev (struct dev_data *data) { if (likely (!refcount_dec_and_test (&data->count))) return; /* needs no more cleanup */ BUG_ON (waitqueue_active (&data->wait)); kfree (data); } static struct dev_data *dev_new (void) { struct dev_data *dev; dev = kzalloc(sizeof(*dev), GFP_KERNEL); if (!dev) return NULL; dev->state = STATE_DEV_DISABLED; refcount_set (&dev->count, 1); spin_lock_init (&dev->lock); INIT_LIST_HEAD (&dev->epfiles); init_waitqueue_head (&dev->wait); return dev; } /*----------------------------------------------------------------------*/ /* other /dev/gadget/$ENDPOINT files represent endpoints */ enum ep_state { STATE_EP_DISABLED = 0, STATE_EP_READY, STATE_EP_ENABLED, STATE_EP_UNBOUND, }; struct ep_data { struct mutex lock; enum ep_state state; refcount_t count; struct dev_data *dev; /* must hold dev->lock before accessing ep or req */ struct usb_ep *ep; struct usb_request *req; ssize_t status; char name [16]; struct usb_endpoint_descriptor desc, hs_desc; struct list_head epfiles; wait_queue_head_t wait; struct dentry *dentry; }; static inline void get_ep (struct ep_data *data) { refcount_inc (&data->count); } static void put_ep (struct ep_data *data) { if (likely (!refcount_dec_and_test (&data->count))) return; put_dev (data->dev); /* needs no more cleanup */ BUG_ON (!list_empty (&data->epfiles)); BUG_ON (waitqueue_active (&data->wait)); kfree (data); } /*----------------------------------------------------------------------*/ /* most "how to use the hardware" policy choices are in userspace: * mapping endpoint roles (which the driver needs) to the capabilities * which the usb controller has. most of those capabilities are exposed * implicitly, starting with the driver name and then endpoint names. */ static const char *CHIP; static DEFINE_MUTEX(sb_mutex); /* Serialize superblock operations */ /*----------------------------------------------------------------------*/ /* NOTE: don't use dev_printk calls before binding to the gadget * at the end of ep0 configuration, or after unbind. */ /* too wordy: dev_printk(level , &(d)->gadget->dev , fmt , ## args) */ #define xprintk(d,level,fmt,args...) \ printk(level "%s: " fmt , shortname , ## args) #ifdef DEBUG #define DBG(dev,fmt,args...) \ xprintk(dev , KERN_DEBUG , fmt , ## args) #else #define DBG(dev,fmt,args...) \ do { } while (0) #endif /* DEBUG */ #ifdef VERBOSE_DEBUG #define VDEBUG DBG #else #define VDEBUG(dev,fmt,args...) \ do { } while (0) #endif /* DEBUG */ #define ERROR(dev,fmt,args...) \ xprintk(dev , KERN_ERR , fmt , ## args) #define INFO(dev,fmt,args...) \ xprintk(dev , KERN_INFO , fmt , ## args) /*----------------------------------------------------------------------*/ /* SYNCHRONOUS ENDPOINT OPERATIONS (bulk/intr/iso) * * After opening, configure non-control endpoints. Then use normal * stream read() and write() requests; and maybe ioctl() to get more * precise FIFO status when recovering from cancellation. */ static void epio_complete (struct usb_ep *ep, struct usb_request *req) { struct ep_data *epdata = ep->driver_data; if (!req->context) return; if (req->status) epdata->status = req->status; else epdata->status = req->actual; complete ((struct completion *)req->context); } /* tasklock endpoint, returning when it's connected. * still need dev->lock to use epdata->ep. */ static int get_ready_ep (unsigned f_flags, struct ep_data *epdata, bool is_write) { int val; if (f_flags & O_NONBLOCK) { if (!mutex_trylock(&epdata->lock)) goto nonblock; if (epdata->state != STATE_EP_ENABLED && (!is_write || epdata->state != STATE_EP_READY)) { mutex_unlock(&epdata->lock); nonblock: val = -EAGAIN; } else val = 0; return val; } val = mutex_lock_interruptible(&epdata->lock); if (val < 0) return val; switch (epdata->state) { case STATE_EP_ENABLED: return 0; case STATE_EP_READY: /* not configured yet */ if (is_write) return 0; fallthrough; case STATE_EP_UNBOUND: /* clean disconnect */ break; // case STATE_EP_DISABLED: /* "can't happen" */ default: /* error! */ pr_debug ("%s: ep %p not available, state %d\n", shortname, epdata, epdata->state); } mutex_unlock(&epdata->lock); return -ENODEV; } static ssize_t ep_io (struct ep_data *epdata, void *buf, unsigned len) { DECLARE_COMPLETION_ONSTACK (done); int value; spin_lock_irq (&epdata->dev->lock); if (likely (epdata->ep != NULL)) { struct usb_request *req = epdata->req; req->context = &done; req->complete = epio_complete; req->buf = buf; req->length = len; value = usb_ep_queue (epdata->ep, req, GFP_ATOMIC); } else value = -ENODEV; spin_unlock_irq (&epdata->dev->lock); if (likely (value == 0)) { value = wait_for_completion_interruptible(&done); if (value != 0) { spin_lock_irq (&epdata->dev->lock); if (likely (epdata->ep != NULL)) { DBG (epdata->dev, "%s i/o interrupted\n", epdata->name); usb_ep_dequeue (epdata->ep, epdata->req); spin_unlock_irq (&epdata->dev->lock); wait_for_completion(&done); if (epdata->status == -ECONNRESET) epdata->status = -EINTR; } else { spin_unlock_irq (&epdata->dev->lock); DBG (epdata->dev, "endpoint gone\n"); wait_for_completion(&done); epdata->status = -ENODEV; } } return epdata->status; } return value; } static int ep_release (struct inode *inode, struct file *fd) { struct ep_data *data = fd->private_data; int value; value = mutex_lock_interruptible(&data->lock); if (value < 0) return value; /* clean up if this can be reopened */ if (data->state != STATE_EP_UNBOUND) { data->state = STATE_EP_DISABLED; data->desc.bDescriptorType = 0; data->hs_desc.bDescriptorType = 0; usb_ep_disable(data->ep); } mutex_unlock(&data->lock); put_ep (data); return 0; } static long ep_ioctl(struct file *fd, unsigned code, unsigned long value) { struct ep_data *data = fd->private_data; int status; if ((status = get_ready_ep (fd->f_flags, data, false)) < 0) return status; spin_lock_irq (&data->dev->lock); if (likely (data->ep != NULL)) { switch (code) { case GADGETFS_FIFO_STATUS: status = usb_ep_fifo_status (data->ep); break; case GADGETFS_FIFO_FLUSH: usb_ep_fifo_flush (data->ep); break; case GADGETFS_CLEAR_HALT: status = usb_ep_clear_halt (data->ep); break; default: status = -ENOTTY; } } else status = -ENODEV; spin_unlock_irq (&data->dev->lock); mutex_unlock(&data->lock); return status; } /*----------------------------------------------------------------------*/ /* ASYNCHRONOUS ENDPOINT I/O OPERATIONS (bulk/intr/iso) */ struct kiocb_priv { struct usb_request *req; struct ep_data *epdata; struct kiocb *iocb; struct mm_struct *mm; struct work_struct work; void *buf; struct iov_iter to; const void *to_free; unsigned actual; }; static int ep_aio_cancel(struct kiocb *iocb) { struct kiocb_priv *priv = iocb->private; struct ep_data *epdata; int value; local_irq_disable(); epdata = priv->epdata; // spin_lock(&epdata->dev->lock); if (likely(epdata && epdata->ep && priv->req)) value = usb_ep_dequeue (epdata->ep, priv->req); else value = -EINVAL; // spin_unlock(&epdata->dev->lock); local_irq_enable(); return value; } static void ep_user_copy_worker(struct work_struct *work) { struct kiocb_priv *priv = container_of(work, struct kiocb_priv, work); struct mm_struct *mm = priv->mm; struct kiocb *iocb = priv->iocb; size_t ret; kthread_use_mm(mm); ret = copy_to_iter(priv->buf, priv->actual, &priv->to); kthread_unuse_mm(mm); if (!ret) ret = -EFAULT; /* completing the iocb can drop the ctx and mm, don't touch mm after */ iocb->ki_complete(iocb, ret); kfree(priv->buf); kfree(priv->to_free); kfree(priv); } static void ep_aio_complete(struct usb_ep *ep, struct usb_request *req) { struct kiocb *iocb = req->context; struct kiocb_priv *priv = iocb->private; struct ep_data *epdata = priv->epdata; /* lock against disconnect (and ideally, cancel) */ spin_lock(&epdata->dev->lock); priv->req = NULL; priv->epdata = NULL; /* if this was a write or a read returning no data then we * don't need to copy anything to userspace, so we can * complete the aio request immediately. */ if (priv->to_free == NULL || unlikely(req->actual == 0)) { kfree(req->buf); kfree(priv->to_free); kfree(priv); iocb->private = NULL; iocb->ki_complete(iocb, req->actual ? req->actual : (long)req->status); } else { /* ep_copy_to_user() won't report both; we hide some faults */ if (unlikely(0 != req->status)) DBG(epdata->dev, "%s fault %d len %d\n", ep->name, req->status, req->actual); priv->buf = req->buf; priv->actual = req->actual; INIT_WORK(&priv->work, ep_user_copy_worker); schedule_work(&priv->work); } usb_ep_free_request(ep, req); spin_unlock(&epdata->dev->lock); put_ep(epdata); } static ssize_t ep_aio(struct kiocb *iocb, struct kiocb_priv *priv, struct ep_data *epdata, char *buf, size_t len) { struct usb_request *req; ssize_t value; iocb->private = priv; priv->iocb = iocb; kiocb_set_cancel_fn(iocb, ep_aio_cancel); get_ep(epdata); priv->epdata = epdata; priv->actual = 0; priv->mm = current->mm; /* mm teardown waits for iocbs in exit_aio() */ /* each kiocb is coupled to one usb_request, but we can't * allocate or submit those if the host disconnected. */ spin_lock_irq(&epdata->dev->lock); value = -ENODEV; if (unlikely(epdata->ep == NULL)) goto fail; req = usb_ep_alloc_request(epdata->ep, GFP_ATOMIC); value = -ENOMEM; if (unlikely(!req)) goto fail; priv->req = req; req->buf = buf; req->length = len; req->complete = ep_aio_complete; req->context = iocb; value = usb_ep_queue(epdata->ep, req, GFP_ATOMIC); if (unlikely(0 != value)) { usb_ep_free_request(epdata->ep, req); goto fail; } spin_unlock_irq(&epdata->dev->lock); return -EIOCBQUEUED; fail: spin_unlock_irq(&epdata->dev->lock); kfree(priv->to_free); kfree(priv); put_ep(epdata); return value; } static ssize_t ep_read_iter(struct kiocb *iocb, struct iov_iter *to) { struct file *file = iocb->ki_filp; struct ep_data *epdata = file->private_data; size_t len = iov_iter_count(to); ssize_t value; char *buf; if ((value = get_ready_ep(file->f_flags, epdata, false)) < 0) return value; /* halt any endpoint by doing a "wrong direction" i/o call */ if (usb_endpoint_dir_in(&epdata->desc)) { if (usb_endpoint_xfer_isoc(&epdata->desc) || !is_sync_kiocb(iocb)) { mutex_unlock(&epdata->lock); return -EINVAL; } DBG (epdata->dev, "%s halt\n", epdata->name); spin_lock_irq(&epdata->dev->lock); if (likely(epdata->ep != NULL)) usb_ep_set_halt(epdata->ep); spin_unlock_irq(&epdata->dev->lock); mutex_unlock(&epdata->lock); return -EBADMSG; } buf = kmalloc(len, GFP_KERNEL); if (unlikely(!buf)) { mutex_unlock(&epdata->lock); return -ENOMEM; } if (is_sync_kiocb(iocb)) { value = ep_io(epdata, buf, len); if (value >= 0 && (copy_to_iter(buf, value, to) != value)) value = -EFAULT; } else { struct kiocb_priv *priv = kzalloc(sizeof *priv, GFP_KERNEL); value = -ENOMEM; if (!priv) goto fail; priv->to_free = dup_iter(&priv->to, to, GFP_KERNEL); if (!iter_is_ubuf(&priv->to) && !priv->to_free) { kfree(priv); goto fail; } value = ep_aio(iocb, priv, epdata, buf, len); if (value == -EIOCBQUEUED) buf = NULL; } fail: kfree(buf); mutex_unlock(&epdata->lock); return value; } static ssize_t ep_config(struct ep_data *, const char *, size_t); static ssize_t ep_write_iter(struct kiocb *iocb, struct iov_iter *from) { struct file *file = iocb->ki_filp; struct ep_data *epdata = file->private_data; size_t len = iov_iter_count(from); bool configured; ssize_t value; char *buf; if ((value = get_ready_ep(file->f_flags, epdata, true)) < 0) return value; configured = epdata->state == STATE_EP_ENABLED; /* halt any endpoint by doing a "wrong direction" i/o call */ if (configured && !usb_endpoint_dir_in(&epdata->desc)) { if (usb_endpoint_xfer_isoc(&epdata->desc) || !is_sync_kiocb(iocb)) { mutex_unlock(&epdata->lock); return -EINVAL; } DBG (epdata->dev, "%s halt\n", epdata->name); spin_lock_irq(&epdata->dev->lock); if (likely(epdata->ep != NULL)) usb_ep_set_halt(epdata->ep); spin_unlock_irq(&epdata->dev->lock); mutex_unlock(&epdata->lock); return -EBADMSG; } buf = kmalloc(len, GFP_KERNEL); if (unlikely(!buf)) { mutex_unlock(&epdata->lock); return -ENOMEM; } if (unlikely(!copy_from_iter_full(buf, len, from))) { value = -EFAULT; goto out; } if (unlikely(!configured)) { value = ep_config(epdata, buf, len); } else if (is_sync_kiocb(iocb)) { value = ep_io(epdata, buf, len); } else { struct kiocb_priv *priv = kzalloc(sizeof *priv, GFP_KERNEL); value = -ENOMEM; if (priv) { value = ep_aio(iocb, priv, epdata, buf, len); if (value == -EIOCBQUEUED) buf = NULL; } } out: kfree(buf); mutex_unlock(&epdata->lock); return value; } /*----------------------------------------------------------------------*/ /* used after endpoint configuration */ static const struct file_operations ep_io_operations = { .owner = THIS_MODULE, .open = ep_open, .release = ep_release, .unlocked_ioctl = ep_ioctl, .read_iter = ep_read_iter, .write_iter = ep_write_iter, }; /* ENDPOINT INITIALIZATION * * fd = open ("/dev/gadget/$ENDPOINT", O_RDWR) * status = write (fd, descriptors, sizeof descriptors) * * That write establishes the endpoint configuration, configuring * the controller to process bulk, interrupt, or isochronous transfers * at the right maxpacket size, and so on. * * The descriptors are message type 1, identified by a host order u32 * at the beginning of what's written. Descriptor order is: full/low * speed descriptor, then optional high speed descriptor. */ static ssize_t ep_config (struct ep_data *data, const char *buf, size_t len) { struct usb_ep *ep; u32 tag; int value, length = len; if (data->state != STATE_EP_READY) { value = -EL2HLT; goto fail; } value = len; if (len < USB_DT_ENDPOINT_SIZE + 4) goto fail0; /* we might need to change message format someday */ memcpy(&tag, buf, 4); if (tag != 1) { DBG(data->dev, "config %s, bad tag %d\n", data->name, tag); goto fail0; } buf += 4; len -= 4; /* NOTE: audio endpoint extensions not accepted here; * just don't include the extra bytes. */ /* full/low speed descriptor, then high speed */ memcpy(&data->desc, buf, USB_DT_ENDPOINT_SIZE); if (data->desc.bLength != USB_DT_ENDPOINT_SIZE || data->desc.bDescriptorType != USB_DT_ENDPOINT) goto fail0; if (len != USB_DT_ENDPOINT_SIZE) { if (len != 2 * USB_DT_ENDPOINT_SIZE) goto fail0; memcpy(&data->hs_desc, buf + USB_DT_ENDPOINT_SIZE, USB_DT_ENDPOINT_SIZE); if (data->hs_desc.bLength != USB_DT_ENDPOINT_SIZE || data->hs_desc.bDescriptorType != USB_DT_ENDPOINT) { DBG(data->dev, "config %s, bad hs length or type\n", data->name); goto fail0; } } spin_lock_irq (&data->dev->lock); if (data->dev->state == STATE_DEV_UNBOUND) { value = -ENOENT; goto gone; } else { ep = data->ep; if (ep == NULL) { value = -ENODEV; goto gone; } } switch (data->dev->gadget->speed) { case USB_SPEED_LOW: case USB_SPEED_FULL: ep->desc = &data->desc; break; case USB_SPEED_HIGH: /* fails if caller didn't provide that descriptor... */ ep->desc = &data->hs_desc; break; default: DBG(data->dev, "unconnected, %s init abandoned\n", data->name); value = -EINVAL; goto gone; } value = usb_ep_enable(ep); if (value == 0) { data->state = STATE_EP_ENABLED; value = length; } gone: spin_unlock_irq (&data->dev->lock); if (value < 0) { fail: data->desc.bDescriptorType = 0; data->hs_desc.bDescriptorType = 0; } return value; fail0: value = -EINVAL; goto fail; } static int ep_open (struct inode *inode, struct file *fd) { struct ep_data *data = inode->i_private; int value = -EBUSY; if (mutex_lock_interruptible(&data->lock) != 0) return -EINTR; spin_lock_irq (&data->dev->lock); if (data->dev->state == STATE_DEV_UNBOUND) value = -ENOENT; else if (data->state == STATE_EP_DISABLED) { value = 0; data->state = STATE_EP_READY; get_ep (data); fd->private_data = data; VDEBUG (data->dev, "%s ready\n", data->name); } else DBG (data->dev, "%s state %d\n", data->name, data->state); spin_unlock_irq (&data->dev->lock); mutex_unlock(&data->lock); return value; } /*----------------------------------------------------------------------*/ /* EP0 IMPLEMENTATION can be partly in userspace. * * Drivers that use this facility receive various events, including * control requests the kernel doesn't handle. Drivers that don't * use this facility may be too simple-minded for real applications. */ static inline void ep0_readable (struct dev_data *dev) { wake_up (&dev->wait); kill_fasync (&dev->fasync, SIGIO, POLL_IN); } static void clean_req (struct usb_ep *ep, struct usb_request *req) { struct dev_data *dev = ep->driver_data; if (req->buf != dev->rbuf) { kfree(req->buf); req->buf = dev->rbuf; } req->complete = epio_complete; dev->setup_out_ready = 0; } static void ep0_complete (struct usb_ep *ep, struct usb_request *req) { struct dev_data *dev = ep->driver_data; unsigned long flags; int free = 1; /* for control OUT, data must still get to userspace */ spin_lock_irqsave(&dev->lock, flags); if (!dev->setup_in) { dev->setup_out_error = (req->status != 0); if (!dev->setup_out_error) free = 0; dev->setup_out_ready = 1; ep0_readable (dev); } /* clean up as appropriate */ if (free && req->buf != &dev->rbuf) clean_req (ep, req); req->complete = epio_complete; spin_unlock_irqrestore(&dev->lock, flags); } static int setup_req (struct usb_ep *ep, struct usb_request *req, u16 len) { struct dev_data *dev = ep->driver_data; if (dev->setup_out_ready) { DBG (dev, "ep0 request busy!\n"); return -EBUSY; } if (len > sizeof (dev->rbuf)) req->buf = kmalloc(len, GFP_ATOMIC); if (req->buf == NULL) { req->buf = dev->rbuf; return -ENOMEM; } req->complete = ep0_complete; req->length = len; req->zero = 0; return 0; } static ssize_t ep0_read (struct file *fd, char __user *buf, size_t len, loff_t *ptr) { struct dev_data *dev = fd->private_data; ssize_t retval; enum ep0_state state; spin_lock_irq (&dev->lock); if (dev->state <= STATE_DEV_OPENED) { retval = -EINVAL; goto done; } /* report fd mode change before acting on it */ if (dev->setup_abort) { dev->setup_abort = 0; retval = -EIDRM; goto done; } /* control DATA stage */ if ((state = dev->state) == STATE_DEV_SETUP) { if (dev->setup_in) { /* stall IN */ VDEBUG(dev, "ep0in stall\n"); (void) usb_ep_set_halt (dev->gadget->ep0); retval = -EL2HLT; dev->state = STATE_DEV_CONNECTED; } else if (len == 0) { /* ack SET_CONFIGURATION etc */ struct usb_ep *ep = dev->gadget->ep0; struct usb_request *req = dev->req; if ((retval = setup_req (ep, req, 0)) == 0) { ++dev->udc_usage; spin_unlock_irq (&dev->lock); retval = usb_ep_queue (ep, req, GFP_KERNEL); spin_lock_irq (&dev->lock); --dev->udc_usage; } dev->state = STATE_DEV_CONNECTED; /* assume that was SET_CONFIGURATION */ if (dev->current_config) { unsigned power; if (gadget_is_dualspeed(dev->gadget) && (dev->gadget->speed == USB_SPEED_HIGH)) power = dev->hs_config->bMaxPower; else power = dev->config->bMaxPower; usb_gadget_vbus_draw(dev->gadget, 2 * power); } } else { /* collect OUT data */ if ((fd->f_flags & O_NONBLOCK) != 0 && !dev->setup_out_ready) { retval = -EAGAIN; goto done; } spin_unlock_irq (&dev->lock); retval = wait_event_interruptible (dev->wait, dev->setup_out_ready != 0); /* FIXME state could change from under us */ spin_lock_irq (&dev->lock); if (retval) goto done; if (dev->state != STATE_DEV_SETUP) { retval = -ECANCELED; goto done; } dev->state = STATE_DEV_CONNECTED; if (dev->setup_out_error) retval = -EIO; else { len = min (len, (size_t)dev->req->actual); ++dev->udc_usage; spin_unlock_irq(&dev->lock); if (copy_to_user (buf, dev->req->buf, len)) retval = -EFAULT; else retval = len; spin_lock_irq(&dev->lock); --dev->udc_usage; clean_req (dev->gadget->ep0, dev->req); /* NOTE userspace can't yet choose to stall */ } } goto done; } /* else normal: return event data */ if (len < sizeof dev->event [0]) { retval = -EINVAL; goto done; } len -= len % sizeof (struct usb_gadgetfs_event); dev->usermode_setup = 1; scan: /* return queued events right away */ if (dev->ev_next != 0) { unsigned i, n; n = len / sizeof (struct usb_gadgetfs_event); if (dev->ev_next < n) n = dev->ev_next; /* ep0 i/o has special semantics during STATE_DEV_SETUP */ for (i = 0; i < n; i++) { if (dev->event [i].type == GADGETFS_SETUP) { dev->state = STATE_DEV_SETUP; n = i + 1; break; } } spin_unlock_irq (&dev->lock); len = n * sizeof (struct usb_gadgetfs_event); if (copy_to_user (buf, &dev->event, len)) retval = -EFAULT; else retval = len; if (len > 0) { /* NOTE this doesn't guard against broken drivers; * concurrent ep0 readers may lose events. */ spin_lock_irq (&dev->lock); if (dev->ev_next > n) { memmove(&dev->event[0], &dev->event[n], sizeof (struct usb_gadgetfs_event) * (dev->ev_next - n)); } dev->ev_next -= n; spin_unlock_irq (&dev->lock); } return retval; } if (fd->f_flags & O_NONBLOCK) { retval = -EAGAIN; goto done; } switch (state) { default: DBG (dev, "fail %s, state %d\n", __func__, state); retval = -ESRCH; break; case STATE_DEV_UNCONNECTED: case STATE_DEV_CONNECTED: spin_unlock_irq (&dev->lock); DBG (dev, "%s wait\n", __func__); /* wait for events */ retval = wait_event_interruptible (dev->wait, dev->ev_next != 0); if (retval < 0) return retval; spin_lock_irq (&dev->lock); goto scan; } done: spin_unlock_irq (&dev->lock); return retval; } static struct usb_gadgetfs_event * next_event (struct dev_data *dev, enum usb_gadgetfs_event_type type) { struct usb_gadgetfs_event *event; unsigned i; switch (type) { /* these events purge the queue */ case GADGETFS_DISCONNECT: if (dev->state == STATE_DEV_SETUP) dev->setup_abort = 1; fallthrough; case GADGETFS_CONNECT: dev->ev_next = 0; break; case GADGETFS_SETUP: /* previous request timed out */ case GADGETFS_SUSPEND: /* same effect */ /* these events can't be repeated */ for (i = 0; i != dev->ev_next; i++) { if (dev->event [i].type != type) continue; DBG(dev, "discard old event[%d] %d\n", i, type); dev->ev_next--; if (i == dev->ev_next) break; /* indices start at zero, for simplicity */ memmove (&dev->event [i], &dev->event [i + 1], sizeof (struct usb_gadgetfs_event) * (dev->ev_next - i)); } break; default: BUG (); } VDEBUG(dev, "event[%d] = %d\n", dev->ev_next, type); event = &dev->event [dev->ev_next++]; BUG_ON (dev->ev_next > N_EVENT); memset (event, 0, sizeof *event); event->type = type; return event; } static ssize_t ep0_write (struct file *fd, const char __user *buf, size_t len, loff_t *ptr) { struct dev_data *dev = fd->private_data; ssize_t retval = -ESRCH; /* report fd mode change before acting on it */ if (dev->setup_abort) { dev->setup_abort = 0; retval = -EIDRM; /* data and/or status stage for control request */ } else if (dev->state == STATE_DEV_SETUP) { len = min_t(size_t, len, dev->setup_wLength); if (dev->setup_in) { retval = setup_req (dev->gadget->ep0, dev->req, len); if (retval == 0) { dev->state = STATE_DEV_CONNECTED; ++dev->udc_usage; spin_unlock_irq (&dev->lock); if (copy_from_user (dev->req->buf, buf, len)) retval = -EFAULT; else { if (len < dev->setup_wLength) dev->req->zero = 1; retval = usb_ep_queue ( dev->gadget->ep0, dev->req, GFP_KERNEL); } spin_lock_irq(&dev->lock); --dev->udc_usage; if (retval < 0) { clean_req (dev->gadget->ep0, dev->req); } else retval = len; return retval; } /* can stall some OUT transfers */ } else if (dev->setup_can_stall) { VDEBUG(dev, "ep0out stall\n"); (void) usb_ep_set_halt (dev->gadget->ep0); retval = -EL2HLT; dev->state = STATE_DEV_CONNECTED; } else { DBG(dev, "bogus ep0out stall!\n"); } } else DBG (dev, "fail %s, state %d\n", __func__, dev->state); return retval; } static int ep0_fasync (int f, struct file *fd, int on) { struct dev_data *dev = fd->private_data; // caller must F_SETOWN before signal delivery happens VDEBUG(dev, "%s %s\n", __func__, str_on_off(on)); return fasync_helper (f, fd, on, &dev->fasync); } static struct usb_gadget_driver gadgetfs_driver; static int dev_release (struct inode *inode, struct file *fd) { struct dev_data *dev = fd->private_data; /* closing ep0 === shutdown all */ if (dev->gadget_registered) { usb_gadget_unregister_driver (&gadgetfs_driver); dev->gadget_registered = false; } /* at this point "good" hardware has disconnected the * device from USB; the host won't see it any more. * alternatively, all host requests will time out. */ kfree (dev->buf); dev->buf = NULL; /* other endpoints were all decoupled from this device */ spin_lock_irq(&dev->lock); dev->state = STATE_DEV_DISABLED; spin_unlock_irq(&dev->lock); put_dev (dev); return 0; } static __poll_t ep0_poll (struct file *fd, poll_table *wait) { struct dev_data *dev = fd->private_data; __poll_t mask = 0; if (dev->state <= STATE_DEV_OPENED) return DEFAULT_POLLMASK; poll_wait(fd, &dev->wait, wait); spin_lock_irq(&dev->lock); /* report fd mode change before acting on it */ if (dev->setup_abort) { dev->setup_abort = 0; mask = EPOLLHUP; goto out; } if (dev->state == STATE_DEV_SETUP) { if (dev->setup_in || dev->setup_can_stall) mask = EPOLLOUT; } else { if (dev->ev_next != 0) mask = EPOLLIN; } out: spin_unlock_irq(&dev->lock); return mask; } static long gadget_dev_ioctl (struct file *fd, unsigned code, unsigned long value) { struct dev_data *dev = fd->private_data; struct usb_gadget *gadget = dev->gadget; long ret = -ENOTTY; spin_lock_irq(&dev->lock); if (dev->state == STATE_DEV_OPENED || dev->state == STATE_DEV_UNBOUND) { /* Not bound to a UDC */ } else if (gadget->ops->ioctl) { ++dev->udc_usage; spin_unlock_irq(&dev->lock); ret = gadget->ops->ioctl (gadget, code, value); spin_lock_irq(&dev->lock); --dev->udc_usage; } spin_unlock_irq(&dev->lock); return ret; } /*----------------------------------------------------------------------*/ /* The in-kernel gadget driver handles most ep0 issues, in particular * enumerating the single configuration (as provided from user space). * * Unrecognized ep0 requests may be handled in user space. */ static void make_qualifier (struct dev_data *dev) { struct usb_qualifier_descriptor qual; struct usb_device_descriptor *desc; qual.bLength = sizeof qual; qual.bDescriptorType = USB_DT_DEVICE_QUALIFIER; qual.bcdUSB = cpu_to_le16 (0x0200); desc = dev->dev; qual.bDeviceClass = desc->bDeviceClass; qual.bDeviceSubClass = desc->bDeviceSubClass; qual.bDeviceProtocol = desc->bDeviceProtocol; /* assumes ep0 uses the same value for both speeds ... */ qual.bMaxPacketSize0 = dev->gadget->ep0->maxpacket; qual.bNumConfigurations = 1; qual.bRESERVED = 0; memcpy (dev->rbuf, &qual, sizeof qual); } static int config_buf (struct dev_data *dev, u8 type, unsigned index) { int len; int hs = 0; /* only one configuration */ if (index > 0) return -EINVAL; if (gadget_is_dualspeed(dev->gadget)) { hs = (dev->gadget->speed == USB_SPEED_HIGH); if (type == USB_DT_OTHER_SPEED_CONFIG) hs = !hs; } if (hs) { dev->req->buf = dev->hs_config; len = le16_to_cpu(dev->hs_config->wTotalLength); } else { dev->req->buf = dev->config; len = le16_to_cpu(dev->config->wTotalLength); } ((u8 *)dev->req->buf) [1] = type; return len; } static int gadgetfs_setup (struct usb_gadget *gadget, const struct usb_ctrlrequest *ctrl) { struct dev_data *dev = get_gadget_data (gadget); struct usb_request *req = dev->req; int value = -EOPNOTSUPP; struct usb_gadgetfs_event *event; u16 w_value = le16_to_cpu(ctrl->wValue); u16 w_length = le16_to_cpu(ctrl->wLength); if (w_length > RBUF_SIZE) { if (ctrl->bRequestType & USB_DIR_IN) { /* Cast away the const, we are going to overwrite on purpose. */ __le16 *temp = (__le16 *)&ctrl->wLength; *temp = cpu_to_le16(RBUF_SIZE); w_length = RBUF_SIZE; } else { return value; } } spin_lock (&dev->lock); dev->setup_abort = 0; if (dev->state == STATE_DEV_UNCONNECTED) { if (gadget_is_dualspeed(gadget) && gadget->speed == USB_SPEED_HIGH && dev->hs_config == NULL) { spin_unlock(&dev->lock); ERROR (dev, "no high speed config??\n"); return -EINVAL; } dev->state = STATE_DEV_CONNECTED; INFO (dev, "connected\n"); event = next_event (dev, GADGETFS_CONNECT); event->u.speed = gadget->speed; ep0_readable (dev); /* host may have given up waiting for response. we can miss control * requests handled lower down (device/endpoint status and features); * then ep0_{read,write} will report the wrong status. controller * driver will have aborted pending i/o. */ } else if (dev->state == STATE_DEV_SETUP) dev->setup_abort = 1; req->buf = dev->rbuf; req->context = NULL; switch (ctrl->bRequest) { case USB_REQ_GET_DESCRIPTOR: if (ctrl->bRequestType != USB_DIR_IN) goto unrecognized; switch (w_value >> 8) { case USB_DT_DEVICE: value = min (w_length, (u16) sizeof *dev->dev); dev->dev->bMaxPacketSize0 = dev->gadget->ep0->maxpacket; req->buf = dev->dev; break; case USB_DT_DEVICE_QUALIFIER: if (!dev->hs_config) break; value = min (w_length, (u16) sizeof (struct usb_qualifier_descriptor)); make_qualifier (dev); break; case USB_DT_OTHER_SPEED_CONFIG: case USB_DT_CONFIG: value = config_buf (dev, w_value >> 8, w_value & 0xff); if (value >= 0) value = min (w_length, (u16) value); break; case USB_DT_STRING: goto unrecognized; default: // all others are errors break; } break; /* currently one config, two speeds */ case USB_REQ_SET_CONFIGURATION: if (ctrl->bRequestType != 0) goto unrecognized; if (0 == (u8) w_value) { value = 0; dev->current_config = 0; usb_gadget_vbus_draw(gadget, 8 /* mA */ ); // user mode expected to disable endpoints } else { u8 config, power; if (gadget_is_dualspeed(gadget) && gadget->speed == USB_SPEED_HIGH) { config = dev->hs_config->bConfigurationValue; power = dev->hs_config->bMaxPower; } else { config = dev->config->bConfigurationValue; power = dev->config->bMaxPower; } if (config == (u8) w_value) { value = 0; dev->current_config = config; usb_gadget_vbus_draw(gadget, 2 * power); } } /* report SET_CONFIGURATION like any other control request, * except that usermode may not stall this. the next * request mustn't be allowed start until this finishes: * endpoints and threads set up, etc. * * NOTE: older PXA hardware (before PXA 255: without UDCCFR) * has bad/racey automagic that prevents synchronizing here. * even kernel mode drivers often miss them. */ if (value == 0) { INFO (dev, "configuration #%d\n", dev->current_config); usb_gadget_set_state(gadget, USB_STATE_CONFIGURED); if (dev->usermode_setup) { dev->setup_can_stall = 0; goto delegate; } } break; #ifndef CONFIG_USB_PXA25X /* PXA automagically handles this request too */ case USB_REQ_GET_CONFIGURATION: if (ctrl->bRequestType != 0x80) goto unrecognized; *(u8 *)req->buf = dev->current_config; value = min (w_length, (u16) 1); break; #endif default: unrecognized: VDEBUG (dev, "%s req%02x.%02x v%04x i%04x l%d\n", dev->usermode_setup ? "delegate" : "fail", ctrl->bRequestType, ctrl->bRequest, w_value, le16_to_cpu(ctrl->wIndex), w_length); /* if there's an ep0 reader, don't stall */ if (dev->usermode_setup) { dev->setup_can_stall = 1; delegate: dev->setup_in = (ctrl->bRequestType & USB_DIR_IN) ? 1 : 0; dev->setup_wLength = w_length; dev->setup_out_ready = 0; dev->setup_out_error = 0; /* read DATA stage for OUT right away */ if (unlikely (!dev->setup_in && w_length)) { value = setup_req (gadget->ep0, dev->req, w_length); if (value < 0) break; ++dev->udc_usage; spin_unlock (&dev->lock); value = usb_ep_queue (gadget->ep0, dev->req, GFP_KERNEL); spin_lock (&dev->lock); --dev->udc_usage; if (value < 0) { clean_req (gadget->ep0, dev->req); break; } /* we can't currently stall these */ dev->setup_can_stall = 0; } /* state changes when reader collects event */ event = next_event (dev, GADGETFS_SETUP); event->u.setup = *ctrl; ep0_readable (dev); spin_unlock (&dev->lock); /* * Return USB_GADGET_DELAYED_STATUS as a workaround to * stop some UDC drivers (e.g. dwc3) from automatically * proceeding with the status stage for 0-length * transfers. * Should be removed once all UDC drivers are fixed to * always delay the status stage until a response is * queued to EP0. */ return w_length == 0 ? USB_GADGET_DELAYED_STATUS : 0; } } /* proceed with data transfer and status phases? */ if (value >= 0 && dev->state != STATE_DEV_SETUP) { req->length = value; req->zero = value < w_length; ++dev->udc_usage; spin_unlock (&dev->lock); value = usb_ep_queue (gadget->ep0, req, GFP_KERNEL); spin_lock(&dev->lock); --dev->udc_usage; spin_unlock(&dev->lock); if (value < 0) { DBG (dev, "ep_queue --> %d\n", value); req->status = 0; } return value; } /* device stalls when value < 0 */ spin_unlock (&dev->lock); return value; } static void destroy_ep_files (struct dev_data *dev) { DBG (dev, "%s %d\n", __func__, dev->state); /* dev->state must prevent interference */ spin_lock_irq (&dev->lock); while (!list_empty(&dev->epfiles)) { struct ep_data *ep; struct inode *parent; struct dentry *dentry; /* break link to FS */ ep = list_first_entry (&dev->epfiles, struct ep_data, epfiles); list_del_init (&ep->epfiles); spin_unlock_irq (&dev->lock); dentry = ep->dentry; ep->dentry = NULL; parent = d_inode(dentry->d_parent); /* break link to controller */ mutex_lock(&ep->lock); if (ep->state == STATE_EP_ENABLED) (void) usb_ep_disable (ep->ep); ep->state = STATE_EP_UNBOUND; usb_ep_free_request (ep->ep, ep->req); ep->ep = NULL; mutex_unlock(&ep->lock); wake_up (&ep->wait); put_ep (ep); /* break link to dcache */ inode_lock(parent); d_delete (dentry); dput (dentry); inode_unlock(parent); spin_lock_irq (&dev->lock); } spin_unlock_irq (&dev->lock); } static struct dentry * gadgetfs_create_file (struct super_block *sb, char const *name, void *data, const struct file_operations *fops); static int activate_ep_files (struct dev_data *dev) { struct usb_ep *ep; struct ep_data *data; gadget_for_each_ep (ep, dev->gadget) { data = kzalloc(sizeof(*data), GFP_KERNEL); if (!data) goto enomem0; data->state = STATE_EP_DISABLED; mutex_init(&data->lock); init_waitqueue_head (&data->wait); strscpy(data->name, ep->name); refcount_set (&data->count, 1); data->dev = dev; get_dev (dev); data->ep = ep; ep->driver_data = data; data->req = usb_ep_alloc_request (ep, GFP_KERNEL); if (!data->req) goto enomem1; data->dentry = gadgetfs_create_file (dev->sb, data->name, data, &ep_io_operations); if (!data->dentry) goto enomem2; list_add_tail (&data->epfiles, &dev->epfiles); } return 0; enomem2: usb_ep_free_request (ep, data->req); enomem1: put_dev (dev); kfree (data); enomem0: DBG (dev, "%s enomem\n", __func__); destroy_ep_files (dev); return -ENOMEM; } static void gadgetfs_unbind (struct usb_gadget *gadget) { struct dev_data *dev = get_gadget_data (gadget); DBG (dev, "%s\n", __func__); spin_lock_irq (&dev->lock); dev->state = STATE_DEV_UNBOUND; while (dev->udc_usage > 0) { spin_unlock_irq(&dev->lock); usleep_range(1000, 2000); spin_lock_irq(&dev->lock); } spin_unlock_irq (&dev->lock); destroy_ep_files (dev); gadget->ep0->driver_data = NULL; set_gadget_data (gadget, NULL); /* we've already been disconnected ... no i/o is active */ if (dev->req) usb_ep_free_request (gadget->ep0, dev->req); DBG (dev, "%s done\n", __func__); put_dev (dev); } static struct dev_data *the_device; static int gadgetfs_bind(struct usb_gadget *gadget, struct usb_gadget_driver *driver) { struct dev_data *dev = the_device; if (!dev) return -ESRCH; if (0 != strcmp (CHIP, gadget->name)) { pr_err("%s expected %s controller not %s\n", shortname, CHIP, gadget->name); return -ENODEV; } set_gadget_data (gadget, dev); dev->gadget = gadget; gadget->ep0->driver_data = dev; /* preallocate control response and buffer */ dev->req = usb_ep_alloc_request (gadget->ep0, GFP_KERNEL); if (!dev->req) goto enomem; dev->req->context = NULL; dev->req->complete = epio_complete; if (activate_ep_files (dev) < 0) goto enomem; INFO (dev, "bound to %s driver\n", gadget->name); spin_lock_irq(&dev->lock); dev->state = STATE_DEV_UNCONNECTED; spin_unlock_irq(&dev->lock); get_dev (dev); return 0; enomem: gadgetfs_unbind (gadget); return -ENOMEM; } static void gadgetfs_disconnect (struct usb_gadget *gadget) { struct dev_data *dev = get_gadget_data (gadget); unsigned long flags; spin_lock_irqsave (&dev->lock, flags); if (dev->state == STATE_DEV_UNCONNECTED) goto exit; dev->state = STATE_DEV_UNCONNECTED; INFO (dev, "disconnected\n"); next_event (dev, GADGETFS_DISCONNECT); ep0_readable (dev); exit: spin_unlock_irqrestore (&dev->lock, flags); } static void gadgetfs_suspend (struct usb_gadget *gadget) { struct dev_data *dev = get_gadget_data (gadget); unsigned long flags; INFO (dev, "suspended from state %d\n", dev->state); spin_lock_irqsave(&dev->lock, flags); switch (dev->state) { case STATE_DEV_SETUP: // VERY odd... host died?? case STATE_DEV_CONNECTED: case STATE_DEV_UNCONNECTED: next_event (dev, GADGETFS_SUSPEND); ep0_readable (dev); fallthrough; default: break; } spin_unlock_irqrestore(&dev->lock, flags); } static struct usb_gadget_driver gadgetfs_driver = { .function = (char *) driver_desc, .bind = gadgetfs_bind, .unbind = gadgetfs_unbind, .setup = gadgetfs_setup, .reset = gadgetfs_disconnect, .disconnect = gadgetfs_disconnect, .suspend = gadgetfs_suspend, .driver = { .name = shortname, }, }; /*----------------------------------------------------------------------*/ /* DEVICE INITIALIZATION * * fd = open ("/dev/gadget/$CHIP", O_RDWR) * status = write (fd, descriptors, sizeof descriptors) * * That write establishes the device configuration, so the kernel can * bind to the controller ... guaranteeing it can handle enumeration * at all necessary speeds. Descriptor order is: * * . message tag (u32, host order) ... for now, must be zero; it * would change to support features like multi-config devices * . full/low speed config ... all wTotalLength bytes (with interface, * class, altsetting, endpoint, and other descriptors) * . high speed config ... all descriptors, for high speed operation; * this one's optional except for high-speed hardware * . device descriptor * * Endpoints are not yet enabled. Drivers must wait until device * configuration and interface altsetting changes create * the need to configure (or unconfigure) them. * * After initialization, the device stays active for as long as that * $CHIP file is open. Events must then be read from that descriptor, * such as configuration notifications. */ static int is_valid_config(struct usb_config_descriptor *config, unsigned int total) { return config->bDescriptorType == USB_DT_CONFIG && config->bLength == USB_DT_CONFIG_SIZE && total >= USB_DT_CONFIG_SIZE && config->bConfigurationValue != 0 && (config->bmAttributes & USB_CONFIG_ATT_ONE) != 0 && (config->bmAttributes & USB_CONFIG_ATT_WAKEUP) == 0; /* FIXME if gadget->is_otg, _must_ include an otg descriptor */ /* FIXME check lengths: walk to end */ } static ssize_t dev_config (struct file *fd, const char __user *buf, size_t len, loff_t *ptr) { struct dev_data *dev = fd->private_data; ssize_t value, length = len; unsigned total; u32 tag; char *kbuf; spin_lock_irq(&dev->lock); if (dev->state > STATE_DEV_OPENED) { value = ep0_write(fd, buf, len, ptr); spin_unlock_irq(&dev->lock); return value; } spin_unlock_irq(&dev->lock); if ((len < (USB_DT_CONFIG_SIZE + USB_DT_DEVICE_SIZE + 4)) || (len > PAGE_SIZE * 4)) return -EINVAL; /* we might need to change message format someday */ if (copy_from_user (&tag, buf, 4)) return -EFAULT; if (tag != 0) return -EINVAL; buf += 4; length -= 4; kbuf = memdup_user(buf, length); if (IS_ERR(kbuf)) return PTR_ERR(kbuf); spin_lock_irq (&dev->lock); value = -EINVAL; if (dev->buf) { spin_unlock_irq(&dev->lock); kfree(kbuf); return value; } dev->buf = kbuf; /* full or low speed config */ dev->config = (void *) kbuf; total = le16_to_cpu(dev->config->wTotalLength); if (!is_valid_config(dev->config, total) || total > length - USB_DT_DEVICE_SIZE) goto fail; kbuf += total; length -= total; /* optional high speed config */ if (kbuf [1] == USB_DT_CONFIG) { dev->hs_config = (void *) kbuf; total = le16_to_cpu(dev->hs_config->wTotalLength); if (!is_valid_config(dev->hs_config, total) || total > length - USB_DT_DEVICE_SIZE) goto fail; kbuf += total; length -= total; } else { dev->hs_config = NULL; } /* could support multiple configs, using another encoding! */ /* device descriptor (tweaked for paranoia) */ if (length != USB_DT_DEVICE_SIZE) goto fail; dev->dev = (void *)kbuf; if (dev->dev->bLength != USB_DT_DEVICE_SIZE || dev->dev->bDescriptorType != USB_DT_DEVICE || dev->dev->bNumConfigurations != 1) goto fail; dev->dev->bcdUSB = cpu_to_le16 (0x0200); /* triggers gadgetfs_bind(); then we can enumerate. */ spin_unlock_irq (&dev->lock); if (dev->hs_config) gadgetfs_driver.max_speed = USB_SPEED_HIGH; else gadgetfs_driver.max_speed = USB_SPEED_FULL; value = usb_gadget_register_driver(&gadgetfs_driver); if (value != 0) { spin_lock_irq(&dev->lock); goto fail; } else { /* at this point "good" hardware has for the first time * let the USB the host see us. alternatively, if users * unplug/replug that will clear all the error state. * * note: everything running before here was guaranteed * to choke driver model style diagnostics. from here * on, they can work ... except in cleanup paths that * kick in after the ep0 descriptor is closed. */ value = len; dev->gadget_registered = true; } return value; fail: dev->config = NULL; dev->hs_config = NULL; dev->dev = NULL; spin_unlock_irq (&dev->lock); pr_debug ("%s: %s fail %zd, %p\n", shortname, __func__, value, dev); kfree (dev->buf); dev->buf = NULL; return value; } static int gadget_dev_open (struct inode *inode, struct file *fd) { struct dev_data *dev = inode->i_private; int value = -EBUSY; spin_lock_irq(&dev->lock); if (dev->state == STATE_DEV_DISABLED) { dev->ev_next = 0; dev->state = STATE_DEV_OPENED; fd->private_data = dev; get_dev (dev); value = 0; } spin_unlock_irq(&dev->lock); return value; } static const struct file_operations ep0_operations = { .open = gadget_dev_open, .read = ep0_read, .write = dev_config, .fasync = ep0_fasync, .poll = ep0_poll, .unlocked_ioctl = gadget_dev_ioctl, .release = dev_release, }; /*----------------------------------------------------------------------*/ /* FILESYSTEM AND SUPERBLOCK OPERATIONS * * Mounting the filesystem creates a controller file, used first for * device configuration then later for event monitoring. */ /* FIXME PAM etc could set this security policy without mount options * if epfiles inherited ownership and permissons from ep0 ... */ static unsigned default_uid; static unsigned default_gid; static unsigned default_perm = S_IRUSR | S_IWUSR; module_param (default_uid, uint, 0644); module_param (default_gid, uint, 0644); module_param (default_perm, uint, 0644); static struct inode * gadgetfs_make_inode (struct super_block *sb, void *data, const struct file_operations *fops, int mode) { struct inode *inode = new_inode (sb); if (inode) { inode->i_ino = get_next_ino(); inode->i_mode = mode; inode->i_uid = make_kuid(&init_user_ns, default_uid); inode->i_gid = make_kgid(&init_user_ns, default_gid); simple_inode_init_ts(inode); inode->i_private = data; inode->i_fop = fops; } return inode; } /* creates in fs root directory, so non-renamable and non-linkable. * so inode and dentry are paired, until device reconfig. */ static struct dentry * gadgetfs_create_file (struct super_block *sb, char const *name, void *data, const struct file_operations *fops) { struct dentry *dentry; struct inode *inode; dentry = d_alloc_name(sb->s_root, name); if (!dentry) return NULL; inode = gadgetfs_make_inode (sb, data, fops, S_IFREG | (default_perm & S_IRWXUGO)); if (!inode) { dput(dentry); return NULL; } d_add (dentry, inode); return dentry; } static const struct super_operations gadget_fs_operations = { .statfs = simple_statfs, .drop_inode = generic_delete_inode, }; static int gadgetfs_fill_super (struct super_block *sb, struct fs_context *fc) { struct inode *inode; struct dev_data *dev; int rc; mutex_lock(&sb_mutex); if (the_device) { rc = -ESRCH; goto Done; } CHIP = usb_get_gadget_udc_name(); if (!CHIP) { rc = -ENODEV; goto Done; } /* superblock */ sb->s_blocksize = PAGE_SIZE; sb->s_blocksize_bits = PAGE_SHIFT; sb->s_magic = GADGETFS_MAGIC; sb->s_op = &gadget_fs_operations; sb->s_time_gran = 1; /* root inode */ inode = gadgetfs_make_inode (sb, NULL, &simple_dir_operations, S_IFDIR | S_IRUGO | S_IXUGO); if (!inode) goto Enomem; inode->i_op = &simple_dir_inode_operations; if (!(sb->s_root = d_make_root (inode))) goto Enomem; /* the ep0 file is named after the controller we expect; * user mode code can use it for sanity checks, like we do. */ dev = dev_new (); if (!dev) goto Enomem; dev->sb = sb; dev->dentry = gadgetfs_create_file(sb, CHIP, dev, &ep0_operations); if (!dev->dentry) { put_dev(dev); goto Enomem; } /* other endpoint files are available after hardware setup, * from binding to a controller. */ the_device = dev; rc = 0; goto Done; Enomem: kfree(CHIP); CHIP = NULL; rc = -ENOMEM; Done: mutex_unlock(&sb_mutex); return rc; } /* "mount -t gadgetfs path /dev/gadget" ends up here */ static int gadgetfs_get_tree(struct fs_context *fc) { return get_tree_single(fc, gadgetfs_fill_super); } static const struct fs_context_operations gadgetfs_context_ops = { .get_tree = gadgetfs_get_tree, }; static int gadgetfs_init_fs_context(struct fs_context *fc) { fc->ops = &gadgetfs_context_ops; return 0; } static void gadgetfs_kill_sb (struct super_block *sb) { mutex_lock(&sb_mutex); kill_litter_super (sb); if (the_device) { put_dev (the_device); the_device = NULL; } kfree(CHIP); CHIP = NULL; mutex_unlock(&sb_mutex); } /*----------------------------------------------------------------------*/ static struct file_system_type gadgetfs_type = { .owner = THIS_MODULE, .name = shortname, .init_fs_context = gadgetfs_init_fs_context, .kill_sb = gadgetfs_kill_sb, }; MODULE_ALIAS_FS("gadgetfs"); /*----------------------------------------------------------------------*/ static int __init gadgetfs_init (void) { int status; status = register_filesystem (&gadgetfs_type); if (status == 0) pr_info ("%s: %s, version " DRIVER_VERSION "\n", shortname, driver_desc); return status; } module_init (gadgetfs_init); static void __exit gadgetfs_cleanup (void) { pr_debug ("unregister %s\n", shortname); unregister_filesystem (&gadgetfs_type); } module_exit (gadgetfs_cleanup); |
| 3 1 1 1 8 8 2 6 3 2 1 1 1 1 5 1 1 3 3 3 3 3 2 1 1 1 3 1 1 1 2 1 4 6 1 5 4 1 1 1 3 20 9 11 2 2 2 2 2 6 2 8 8 21 21 21 21 21 21 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080 1081 1082 1083 1084 1085 1086 1087 1088 1089 1090 1091 1092 1093 1094 1095 1096 1097 1098 1099 1100 1101 1102 1103 1104 1105 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125 1126 1127 1128 1129 1130 1131 1132 1133 1134 1135 1136 1137 1138 1139 1140 1141 1142 1143 1144 1145 1146 1147 1148 1149 1150 1151 1152 1153 1154 1155 1156 1157 1158 1159 1160 1161 1162 1163 1164 1165 1166 1167 1168 1169 1170 1171 1172 1173 1174 1175 1176 1177 1178 1179 1180 1181 1182 1183 1184 1185 1186 1187 1188 1189 1190 1191 1192 1193 1194 1195 1196 1197 1198 1199 1200 1201 1202 1203 1204 1205 1206 1207 1208 1209 1210 1211 1212 1213 1214 1215 1216 1217 1218 1219 1220 1221 1222 1223 1224 1225 1226 1227 1228 1229 1230 1231 1232 1233 1234 1235 1236 1237 1238 1239 1240 1241 1242 1243 1244 1245 1246 1247 1248 1249 1250 1251 1252 1253 1254 1255 1256 1257 1258 1259 1260 1261 1262 1263 1264 1265 1266 1267 1268 1269 1270 1271 1272 1273 1274 1275 1276 1277 1278 1279 1280 1281 1282 1283 1284 1285 1286 1287 1288 1289 1290 1291 1292 1293 1294 1295 1296 1297 1298 1299 1300 1301 1302 1303 1304 1305 1306 1307 1308 1309 1310 1311 1312 1313 1314 1315 1316 1317 1318 1319 1320 1321 1322 1323 1324 1325 1326 1327 1328 1329 1330 1331 1332 1333 1334 1335 1336 1337 1338 1339 1340 1341 1342 1343 1344 1345 1346 1347 1348 1349 1350 1351 1352 1353 1354 1355 1356 1357 1358 1359 1360 1361 1362 1363 1364 1365 1366 1367 1368 1369 1370 1371 1372 1373 1374 1375 1376 1377 1378 1379 1380 1381 1382 1383 1384 1385 1386 1387 1388 1389 1390 1391 1392 1393 1394 1395 1396 1397 1398 1399 1400 1401 1402 1403 1404 1405 1406 1407 1408 1409 1410 1411 1412 1413 1414 1415 1416 1417 1418 1419 1420 1421 1422 1423 1424 1425 1426 1427 1428 1429 1430 1431 1432 1433 1434 1435 1436 1437 1438 1439 1440 1441 1442 1443 1444 1445 1446 1447 1448 1449 1450 1451 1452 1453 1454 1455 1456 1457 1458 1459 1460 1461 1462 1463 1464 1465 1466 1467 1468 1469 1470 1471 1472 1473 1474 1475 1476 1477 1478 1479 1480 1481 1482 1483 1484 1485 1486 1487 1488 1489 1490 1491 1492 1493 1494 1495 1496 1497 1498 1499 1500 1501 1502 1503 1504 1505 1506 1507 1508 1509 1510 1511 1512 1513 1514 1515 1516 1517 1518 1519 1520 1521 1522 1523 1524 1525 1526 1527 1528 1529 1530 1531 1532 1533 1534 1535 1536 1537 1538 1539 1540 1541 1542 1543 1544 1545 1546 1547 1548 1549 1550 1551 1552 1553 1554 1555 1556 1557 1558 1559 1560 1561 1562 1563 1564 1565 1566 1567 1568 1569 1570 1571 1572 1573 1574 1575 1576 1577 1578 1579 1580 1581 1582 1583 1584 1585 1586 1587 1588 1589 1590 1591 1592 1593 1594 1595 1596 1597 1598 1599 1600 1601 1602 1603 1604 1605 1606 1607 1608 1609 1610 1611 1612 1613 1614 1615 1616 1617 1618 1619 1620 1621 1622 1623 1624 1625 1626 1627 1628 1629 1630 1631 1632 1633 1634 1635 1636 1637 1638 1639 1640 1641 1642 1643 1644 1645 1646 1647 1648 1649 1650 1651 1652 1653 1654 1655 1656 1657 1658 1659 1660 1661 1662 1663 1664 1665 1666 1667 1668 1669 1670 1671 1672 1673 1674 1675 1676 1677 1678 1679 1680 1681 1682 1683 1684 1685 1686 1687 1688 1689 1690 1691 1692 1693 1694 1695 1696 1697 1698 1699 1700 1701 1702 1703 1704 1705 1706 1707 1708 1709 1710 1711 1712 1713 1714 1715 1716 1717 1718 1719 1720 1721 1722 1723 1724 1725 1726 1727 1728 1729 1730 1731 1732 1733 1734 1735 1736 1737 1738 1739 1740 1741 1742 1743 1744 1745 1746 1747 | // SPDX-License-Identifier: GPL-2.0-only /* * linux/fs/nfs/fs_context.c * * Copyright (C) 1992 Rick Sladkey * Conversion to new mount api Copyright (C) David Howells * * NFS mount handling. * * Split from fs/nfs/super.c by David Howells <dhowells@redhat.com> */ #include <linux/compat.h> #include <linux/module.h> #include <linux/fs.h> #include <linux/fs_context.h> #include <linux/fs_parser.h> #include <linux/nfs_fs.h> #include <linux/nfs_mount.h> #include <linux/nfs4_mount.h> #include <net/handshake.h> #include "nfs.h" #include "internal.h" #include "nfstrace.h" #define NFSDBG_FACILITY NFSDBG_MOUNT #if IS_ENABLED(CONFIG_NFS_V3) #define NFS_DEFAULT_VERSION 3 #else #define NFS_DEFAULT_VERSION 2 #endif #define NFS_MAX_CONNECTIONS 16 enum nfs_param { Opt_ac, Opt_acdirmax, Opt_acdirmin, Opt_acl, Opt_acregmax, Opt_acregmin, Opt_actimeo, Opt_addr, Opt_bg, Opt_bsize, Opt_clientaddr, Opt_cto, Opt_alignwrite, Opt_fatal_neterrors, Opt_fg, Opt_fscache, Opt_fscache_flag, Opt_hard, Opt_intr, Opt_local_lock, Opt_lock, Opt_lookupcache, Opt_migration, Opt_minorversion, Opt_mountaddr, Opt_mounthost, Opt_mountport, Opt_mountproto, Opt_mountvers, Opt_namelen, Opt_nconnect, Opt_max_connect, Opt_port, Opt_posix, Opt_proto, Opt_rdirplus, Opt_rdirplus_none, Opt_rdirplus_force, Opt_rdma, Opt_resvport, Opt_retrans, Opt_retry, Opt_rsize, Opt_sec, Opt_sharecache, Opt_sloppy, Opt_soft, Opt_softerr, Opt_softreval, Opt_source, Opt_tcp, Opt_timeo, Opt_trunkdiscovery, Opt_udp, Opt_v, Opt_vers, Opt_wsize, Opt_write, Opt_xprtsec, }; enum { Opt_fatal_neterrors_default, Opt_fatal_neterrors_enetunreach, Opt_fatal_neterrors_none, }; static const struct constant_table nfs_param_enums_fatal_neterrors[] = { { "default", Opt_fatal_neterrors_default }, { "ENETDOWN:ENETUNREACH", Opt_fatal_neterrors_enetunreach }, { "ENETUNREACH:ENETDOWN", Opt_fatal_neterrors_enetunreach }, { "none", Opt_fatal_neterrors_none }, {} }; enum { Opt_local_lock_all, Opt_local_lock_flock, Opt_local_lock_none, Opt_local_lock_posix, }; static const struct constant_table nfs_param_enums_local_lock[] = { { "all", Opt_local_lock_all }, { "flock", Opt_local_lock_flock }, { "posix", Opt_local_lock_posix }, { "none", Opt_local_lock_none }, {} }; enum { Opt_lookupcache_all, Opt_lookupcache_none, Opt_lookupcache_positive, }; static const struct constant_table nfs_param_enums_lookupcache[] = { { "all", Opt_lookupcache_all }, { "none", Opt_lookupcache_none }, { "pos", Opt_lookupcache_positive }, { "positive", Opt_lookupcache_positive }, {} }; enum { Opt_write_lazy, Opt_write_eager, Opt_write_wait, }; static const struct constant_table nfs_param_enums_write[] = { { "lazy", Opt_write_lazy }, { "eager", Opt_write_eager }, { "wait", Opt_write_wait }, {} }; static const struct fs_parameter_spec nfs_fs_parameters[] = { fsparam_flag_no("ac", Opt_ac), fsparam_u32 ("acdirmax", Opt_acdirmax), fsparam_u32 ("acdirmin", Opt_acdirmin), fsparam_flag_no("acl", Opt_acl), fsparam_u32 ("acregmax", Opt_acregmax), fsparam_u32 ("acregmin", Opt_acregmin), fsparam_u32 ("actimeo", Opt_actimeo), fsparam_string("addr", Opt_addr), fsparam_flag ("bg", Opt_bg), fsparam_u32 ("bsize", Opt_bsize), fsparam_string("clientaddr", Opt_clientaddr), fsparam_flag_no("cto", Opt_cto), fsparam_flag_no("alignwrite", Opt_alignwrite), fsparam_enum("fatal_neterrors", Opt_fatal_neterrors, nfs_param_enums_fatal_neterrors), fsparam_flag ("fg", Opt_fg), fsparam_flag_no("fsc", Opt_fscache_flag), fsparam_string("fsc", Opt_fscache), fsparam_flag ("hard", Opt_hard), __fsparam(NULL, "intr", Opt_intr, fs_param_neg_with_no|fs_param_deprecated, NULL), fsparam_enum ("local_lock", Opt_local_lock, nfs_param_enums_local_lock), fsparam_flag_no("lock", Opt_lock), fsparam_enum ("lookupcache", Opt_lookupcache, nfs_param_enums_lookupcache), fsparam_flag_no("migration", Opt_migration), fsparam_u32 ("minorversion", Opt_minorversion), fsparam_string("mountaddr", Opt_mountaddr), fsparam_string("mounthost", Opt_mounthost), fsparam_u32 ("mountport", Opt_mountport), fsparam_string("mountproto", Opt_mountproto), fsparam_u32 ("mountvers", Opt_mountvers), fsparam_u32 ("namlen", Opt_namelen), fsparam_u32 ("nconnect", Opt_nconnect), fsparam_u32 ("max_connect", Opt_max_connect), fsparam_string("nfsvers", Opt_vers), fsparam_u32 ("port", Opt_port), fsparam_flag_no("posix", Opt_posix), fsparam_string("proto", Opt_proto), fsparam_flag_no("rdirplus", Opt_rdirplus), // rdirplus|nordirplus fsparam_string("rdirplus", Opt_rdirplus), // rdirplus=... fsparam_flag ("rdma", Opt_rdma), fsparam_flag_no("resvport", Opt_resvport), fsparam_u32 ("retrans", Opt_retrans), fsparam_string("retry", Opt_retry), fsparam_u32 ("rsize", Opt_rsize), fsparam_string("sec", Opt_sec), fsparam_flag_no("sharecache", Opt_sharecache), fsparam_flag ("sloppy", Opt_sloppy), fsparam_flag ("soft", Opt_soft), fsparam_flag ("softerr", Opt_softerr), fsparam_flag ("softreval", Opt_softreval), fsparam_string("source", Opt_source), fsparam_flag ("tcp", Opt_tcp), fsparam_u32 ("timeo", Opt_timeo), fsparam_flag_no("trunkdiscovery", Opt_trunkdiscovery), fsparam_flag ("udp", Opt_udp), fsparam_flag ("v2", Opt_v), fsparam_flag ("v3", Opt_v), fsparam_flag ("v4", Opt_v), fsparam_flag ("v4.0", Opt_v), fsparam_flag ("v4.1", Opt_v), fsparam_flag ("v4.2", Opt_v), fsparam_string("vers", Opt_vers), fsparam_enum ("write", Opt_write, nfs_param_enums_write), fsparam_u32 ("wsize", Opt_wsize), fsparam_string("xprtsec", Opt_xprtsec), {} }; enum { Opt_vers_2, Opt_vers_3, Opt_vers_4, Opt_vers_4_0, Opt_vers_4_1, Opt_vers_4_2, }; static const struct constant_table nfs_vers_tokens[] = { { "2", Opt_vers_2 }, { "3", Opt_vers_3 }, { "4", Opt_vers_4 }, { "4.0", Opt_vers_4_0 }, { "4.1", Opt_vers_4_1 }, { "4.2", Opt_vers_4_2 }, {} }; enum { Opt_xprt_rdma, Opt_xprt_rdma6, Opt_xprt_tcp, Opt_xprt_tcp6, Opt_xprt_udp, Opt_xprt_udp6, nr__Opt_xprt }; static const struct constant_table nfs_xprt_protocol_tokens[] = { { "rdma", Opt_xprt_rdma }, { "rdma6", Opt_xprt_rdma6 }, { "tcp", Opt_xprt_tcp }, { "tcp6", Opt_xprt_tcp6 }, { "udp", Opt_xprt_udp }, { "udp6", Opt_xprt_udp6 }, {} }; enum { Opt_sec_krb5, Opt_sec_krb5i, Opt_sec_krb5p, Opt_sec_lkey, Opt_sec_lkeyi, Opt_sec_lkeyp, Opt_sec_none, Opt_sec_spkm, Opt_sec_spkmi, Opt_sec_spkmp, Opt_sec_sys, nr__Opt_sec }; static const struct constant_table nfs_secflavor_tokens[] = { { "krb5", Opt_sec_krb5 }, { "krb5i", Opt_sec_krb5i }, { "krb5p", Opt_sec_krb5p }, { "lkey", Opt_sec_lkey }, { "lkeyi", Opt_sec_lkeyi }, { "lkeyp", Opt_sec_lkeyp }, { "none", Opt_sec_none }, { "null", Opt_sec_none }, { "spkm3", Opt_sec_spkm }, { "spkm3i", Opt_sec_spkmi }, { "spkm3p", Opt_sec_spkmp }, { "sys", Opt_sec_sys }, {} }; enum { Opt_xprtsec_none, Opt_xprtsec_tls, Opt_xprtsec_mtls, nr__Opt_xprtsec }; static const struct constant_table nfs_xprtsec_policies[] = { { "none", Opt_xprtsec_none }, { "tls", Opt_xprtsec_tls }, { "mtls", Opt_xprtsec_mtls }, {} }; static const struct constant_table nfs_rdirplus_tokens[] = { { "none", Opt_rdirplus_none }, { "force", Opt_rdirplus_force }, {} }; /* * Sanity-check a server address provided by the mount command. * * Address family must be initialized, and address must not be * the ANY address for that family. */ static int nfs_verify_server_address(struct sockaddr_storage *addr) { switch (addr->ss_family) { case AF_INET: { struct sockaddr_in *sa = (struct sockaddr_in *)addr; return sa->sin_addr.s_addr != htonl(INADDR_ANY); } case AF_INET6: { struct in6_addr *sa = &((struct sockaddr_in6 *)addr)->sin6_addr; return !ipv6_addr_any(sa); } } return 0; } #ifdef CONFIG_NFS_DISABLE_UDP_SUPPORT static bool nfs_server_transport_udp_invalid(const struct nfs_fs_context *ctx) { return true; } #else static bool nfs_server_transport_udp_invalid(const struct nfs_fs_context *ctx) { if (ctx->version == 4) return true; return false; } #endif /* * Sanity check the NFS transport protocol. */ static int nfs_validate_transport_protocol(struct fs_context *fc, struct nfs_fs_context *ctx) { switch (ctx->nfs_server.protocol) { case XPRT_TRANSPORT_UDP: if (nfs_server_transport_udp_invalid(ctx)) goto out_invalid_transport_udp; break; case XPRT_TRANSPORT_TCP: case XPRT_TRANSPORT_RDMA: break; default: ctx->nfs_server.protocol = XPRT_TRANSPORT_TCP; } if (ctx->xprtsec.policy != RPC_XPRTSEC_NONE) switch (ctx->nfs_server.protocol) { case XPRT_TRANSPORT_TCP: ctx->nfs_server.protocol = XPRT_TRANSPORT_TCP_TLS; break; default: goto out_invalid_xprtsec_policy; } return 0; out_invalid_transport_udp: return nfs_invalf(fc, "NFS: Unsupported transport protocol udp"); out_invalid_xprtsec_policy: return nfs_invalf(fc, "NFS: Transport does not support xprtsec"); } /* * For text based NFSv2/v3 mounts, the mount protocol transport default * settings should depend upon the specified NFS transport. */ static void nfs_set_mount_transport_protocol(struct nfs_fs_context *ctx) { if (ctx->mount_server.protocol == XPRT_TRANSPORT_UDP || ctx->mount_server.protocol == XPRT_TRANSPORT_TCP) return; switch (ctx->nfs_server.protocol) { case XPRT_TRANSPORT_UDP: ctx->mount_server.protocol = XPRT_TRANSPORT_UDP; break; case XPRT_TRANSPORT_TCP: case XPRT_TRANSPORT_RDMA: ctx->mount_server.protocol = XPRT_TRANSPORT_TCP; } } /* * Add 'flavor' to 'auth_info' if not already present. * Returns true if 'flavor' ends up in the list, false otherwise */ static int nfs_auth_info_add(struct fs_context *fc, struct nfs_auth_info *auth_info, rpc_authflavor_t flavor) { unsigned int i; unsigned int max_flavor_len = ARRAY_SIZE(auth_info->flavors); /* make sure this flavor isn't already in the list */ for (i = 0; i < auth_info->flavor_len; i++) { if (flavor == auth_info->flavors[i]) return 0; } if (auth_info->flavor_len + 1 >= max_flavor_len) return nfs_invalf(fc, "NFS: too many sec= flavors"); auth_info->flavors[auth_info->flavor_len++] = flavor; return 0; } /* * Parse the value of the 'sec=' option. */ static int nfs_parse_security_flavors(struct fs_context *fc, struct fs_parameter *param) { struct nfs_fs_context *ctx = nfs_fc2context(fc); rpc_authflavor_t pseudoflavor; char *string = param->string, *p; int ret; trace_nfs_mount_assign(param->key, string); while ((p = strsep(&string, ":")) != NULL) { if (!*p) continue; switch (lookup_constant(nfs_secflavor_tokens, p, -1)) { case Opt_sec_none: pseudoflavor = RPC_AUTH_NULL; break; case Opt_sec_sys: pseudoflavor = RPC_AUTH_UNIX; break; case Opt_sec_krb5: pseudoflavor = RPC_AUTH_GSS_KRB5; break; case Opt_sec_krb5i: pseudoflavor = RPC_AUTH_GSS_KRB5I; break; case Opt_sec_krb5p: pseudoflavor = RPC_AUTH_GSS_KRB5P; break; case Opt_sec_lkey: pseudoflavor = RPC_AUTH_GSS_LKEY; break; case Opt_sec_lkeyi: pseudoflavor = RPC_AUTH_GSS_LKEYI; break; case Opt_sec_lkeyp: pseudoflavor = RPC_AUTH_GSS_LKEYP; break; case Opt_sec_spkm: pseudoflavor = RPC_AUTH_GSS_SPKM; break; case Opt_sec_spkmi: pseudoflavor = RPC_AUTH_GSS_SPKMI; break; case Opt_sec_spkmp: pseudoflavor = RPC_AUTH_GSS_SPKMP; break; default: return nfs_invalf(fc, "NFS: sec=%s option not recognized", p); } ret = nfs_auth_info_add(fc, &ctx->auth_info, pseudoflavor); if (ret < 0) return ret; } return 0; } static int nfs_parse_xprtsec_policy(struct fs_context *fc, struct fs_parameter *param) { struct nfs_fs_context *ctx = nfs_fc2context(fc); trace_nfs_mount_assign(param->key, param->string); switch (lookup_constant(nfs_xprtsec_policies, param->string, -1)) { case Opt_xprtsec_none: ctx->xprtsec.policy = RPC_XPRTSEC_NONE; break; case Opt_xprtsec_tls: ctx->xprtsec.policy = RPC_XPRTSEC_TLS_ANON; break; case Opt_xprtsec_mtls: ctx->xprtsec.policy = RPC_XPRTSEC_TLS_X509; break; default: return nfs_invalf(fc, "NFS: Unrecognized transport security policy"); } return 0; } static int nfs_parse_version_string(struct fs_context *fc, const char *string) { struct nfs_fs_context *ctx = nfs_fc2context(fc); ctx->flags &= ~NFS_MOUNT_VER3; switch (lookup_constant(nfs_vers_tokens, string, -1)) { case Opt_vers_2: ctx->version = 2; break; case Opt_vers_3: ctx->flags |= NFS_MOUNT_VER3; ctx->version = 3; break; case Opt_vers_4: /* Backward compatibility option. In future, * the mount program should always supply * a NFSv4 minor version number. */ ctx->version = 4; break; case Opt_vers_4_0: ctx->version = 4; ctx->minorversion = 0; break; case Opt_vers_4_1: ctx->version = 4; ctx->minorversion = 1; break; case Opt_vers_4_2: ctx->version = 4; ctx->minorversion = 2; break; default: return nfs_invalf(fc, "NFS: Unsupported NFS version"); } return 0; } /* * Parse a single mount parameter. */ static int nfs_fs_context_parse_param(struct fs_context *fc, struct fs_parameter *param) { struct fs_parse_result result; struct nfs_fs_context *ctx = nfs_fc2context(fc); unsigned short protofamily, mountfamily; unsigned int len; int ret, opt; trace_nfs_mount_option(param); opt = fs_parse(fc, nfs_fs_parameters, param, &result); if (opt < 0) return (opt == -ENOPARAM && ctx->sloppy) ? 1 : opt; if (fc->security) ctx->has_sec_mnt_opts = 1; switch (opt) { case Opt_source: if (fc->source) return nfs_invalf(fc, "NFS: Multiple sources not supported"); fc->source = param->string; param->string = NULL; break; /* * boolean options: foo/nofoo */ case Opt_soft: ctx->flags |= NFS_MOUNT_SOFT; ctx->flags &= ~NFS_MOUNT_SOFTERR; break; case Opt_softerr: ctx->flags |= NFS_MOUNT_SOFTERR | NFS_MOUNT_SOFTREVAL; ctx->flags &= ~NFS_MOUNT_SOFT; break; case Opt_hard: ctx->flags &= ~(NFS_MOUNT_SOFT | NFS_MOUNT_SOFTERR | NFS_MOUNT_SOFTREVAL); break; case Opt_softreval: if (result.negated) ctx->flags &= ~NFS_MOUNT_SOFTREVAL; else ctx->flags |= NFS_MOUNT_SOFTREVAL; break; case Opt_posix: if (result.negated) ctx->flags &= ~NFS_MOUNT_POSIX; else ctx->flags |= NFS_MOUNT_POSIX; break; case Opt_cto: if (result.negated) ctx->flags |= NFS_MOUNT_NOCTO; else ctx->flags &= ~NFS_MOUNT_NOCTO; break; case Opt_trunkdiscovery: if (result.negated) ctx->flags &= ~NFS_MOUNT_TRUNK_DISCOVERY; else ctx->flags |= NFS_MOUNT_TRUNK_DISCOVERY; break; case Opt_alignwrite: if (result.negated) ctx->flags |= NFS_MOUNT_NO_ALIGNWRITE; else ctx->flags &= ~NFS_MOUNT_NO_ALIGNWRITE; break; case Opt_ac: if (result.negated) ctx->flags |= NFS_MOUNT_NOAC; else ctx->flags &= ~NFS_MOUNT_NOAC; break; case Opt_lock: if (result.negated) { ctx->lock_status = NFS_LOCK_NOLOCK; ctx->flags |= NFS_MOUNT_NONLM; ctx->flags |= (NFS_MOUNT_LOCAL_FLOCK | NFS_MOUNT_LOCAL_FCNTL); } else { ctx->lock_status = NFS_LOCK_LOCK; ctx->flags &= ~NFS_MOUNT_NONLM; ctx->flags &= ~(NFS_MOUNT_LOCAL_FLOCK | NFS_MOUNT_LOCAL_FCNTL); } break; case Opt_udp: ctx->flags &= ~NFS_MOUNT_TCP; ctx->nfs_server.protocol = XPRT_TRANSPORT_UDP; break; case Opt_tcp: case Opt_rdma: ctx->flags |= NFS_MOUNT_TCP; /* for side protocols */ ret = xprt_find_transport_ident(param->key); if (ret < 0) goto out_bad_transport; ctx->nfs_server.protocol = ret; break; case Opt_acl: if (result.negated) ctx->flags |= NFS_MOUNT_NOACL; else ctx->flags &= ~NFS_MOUNT_NOACL; break; case Opt_rdirplus: if (result.negated) { ctx->flags &= ~NFS_MOUNT_FORCE_RDIRPLUS; ctx->flags |= NFS_MOUNT_NORDIRPLUS; } else if (!param->string) { ctx->flags &= ~(NFS_MOUNT_NORDIRPLUS | NFS_MOUNT_FORCE_RDIRPLUS); } else { switch (lookup_constant(nfs_rdirplus_tokens, param->string, -1)) { case Opt_rdirplus_none: ctx->flags &= ~NFS_MOUNT_FORCE_RDIRPLUS; ctx->flags |= NFS_MOUNT_NORDIRPLUS; break; case Opt_rdirplus_force: ctx->flags &= ~NFS_MOUNT_NORDIRPLUS; ctx->flags |= NFS_MOUNT_FORCE_RDIRPLUS; break; default: goto out_invalid_value; } } break; case Opt_sharecache: if (result.negated) ctx->flags |= NFS_MOUNT_UNSHARED; else ctx->flags &= ~NFS_MOUNT_UNSHARED; break; case Opt_resvport: if (result.negated) ctx->flags |= NFS_MOUNT_NORESVPORT; else ctx->flags &= ~NFS_MOUNT_NORESVPORT; break; case Opt_fscache_flag: if (result.negated) ctx->options &= ~NFS_OPTION_FSCACHE; else ctx->options |= NFS_OPTION_FSCACHE; kfree(ctx->fscache_uniq); ctx->fscache_uniq = NULL; break; case Opt_fscache: trace_nfs_mount_assign(param->key, param->string); ctx->options |= NFS_OPTION_FSCACHE; kfree(ctx->fscache_uniq); ctx->fscache_uniq = param->string; param->string = NULL; break; case Opt_migration: if (result.negated) ctx->options &= ~NFS_OPTION_MIGRATION; else ctx->options |= NFS_OPTION_MIGRATION; break; /* * options that take numeric values */ case Opt_port: if (result.uint_32 > USHRT_MAX) goto out_of_bounds; ctx->nfs_server.port = result.uint_32; break; case Opt_rsize: ctx->rsize = result.uint_32; break; case Opt_wsize: ctx->wsize = result.uint_32; break; case Opt_bsize: ctx->bsize = result.uint_32; break; case Opt_timeo: if (result.uint_32 < 1 || result.uint_32 > INT_MAX) goto out_of_bounds; ctx->timeo = result.uint_32; break; case Opt_retrans: if (result.uint_32 > INT_MAX) goto out_of_bounds; ctx->retrans = result.uint_32; break; case Opt_acregmin: ctx->acregmin = result.uint_32; break; case Opt_acregmax: ctx->acregmax = result.uint_32; break; case Opt_acdirmin: ctx->acdirmin = result.uint_32; break; case Opt_acdirmax: ctx->acdirmax = result.uint_32; break; case Opt_actimeo: ctx->acregmin = result.uint_32; ctx->acregmax = result.uint_32; ctx->acdirmin = result.uint_32; ctx->acdirmax = result.uint_32; break; case Opt_namelen: ctx->namlen = result.uint_32; break; case Opt_mountport: if (result.uint_32 > USHRT_MAX) goto out_of_bounds; ctx->mount_server.port = result.uint_32; break; case Opt_mountvers: if (result.uint_32 < NFS_MNT_VERSION || result.uint_32 > NFS_MNT3_VERSION) goto out_of_bounds; ctx->mount_server.version = result.uint_32; break; case Opt_minorversion: if (result.uint_32 > NFS4_MAX_MINOR_VERSION) goto out_of_bounds; ctx->minorversion = result.uint_32; break; /* * options that take text values */ case Opt_v: ret = nfs_parse_version_string(fc, param->key + 1); if (ret < 0) return ret; break; case Opt_vers: if (!param->string) goto out_invalid_value; trace_nfs_mount_assign(param->key, param->string); ret = nfs_parse_version_string(fc, param->string); if (ret < 0) return ret; break; case Opt_sec: ret = nfs_parse_security_flavors(fc, param); if (ret < 0) return ret; break; case Opt_xprtsec: ret = nfs_parse_xprtsec_policy(fc, param); if (ret < 0) return ret; break; case Opt_proto: if (!param->string) goto out_invalid_value; trace_nfs_mount_assign(param->key, param->string); protofamily = AF_INET; switch (lookup_constant(nfs_xprt_protocol_tokens, param->string, -1)) { case Opt_xprt_udp6: protofamily = AF_INET6; fallthrough; case Opt_xprt_udp: ctx->flags &= ~NFS_MOUNT_TCP; ctx->nfs_server.protocol = XPRT_TRANSPORT_UDP; break; case Opt_xprt_tcp6: protofamily = AF_INET6; fallthrough; case Opt_xprt_tcp: ctx->flags |= NFS_MOUNT_TCP; ctx->nfs_server.protocol = XPRT_TRANSPORT_TCP; break; case Opt_xprt_rdma6: protofamily = AF_INET6; fallthrough; case Opt_xprt_rdma: /* vector side protocols to TCP */ ctx->flags |= NFS_MOUNT_TCP; ret = xprt_find_transport_ident(param->string); if (ret < 0) goto out_bad_transport; ctx->nfs_server.protocol = ret; break; default: goto out_bad_transport; } ctx->protofamily = protofamily; break; case Opt_mountproto: if (!param->string) goto out_invalid_value; trace_nfs_mount_assign(param->key, param->string); mountfamily = AF_INET; switch (lookup_constant(nfs_xprt_protocol_tokens, param->string, -1)) { case Opt_xprt_udp6: mountfamily = AF_INET6; fallthrough; case Opt_xprt_udp: ctx->mount_server.protocol = XPRT_TRANSPORT_UDP; break; case Opt_xprt_tcp6: mountfamily = AF_INET6; fallthrough; case Opt_xprt_tcp: ctx->mount_server.protocol = XPRT_TRANSPORT_TCP; break; case Opt_xprt_rdma: /* not used for side protocols */ default: goto out_bad_transport; } ctx->mountfamily = mountfamily; break; case Opt_addr: trace_nfs_mount_assign(param->key, param->string); len = rpc_pton(fc->net_ns, param->string, param->size, &ctx->nfs_server.address, sizeof(ctx->nfs_server._address)); if (len == 0) goto out_invalid_address; ctx->nfs_server.addrlen = len; break; case Opt_clientaddr: trace_nfs_mount_assign(param->key, param->string); kfree(ctx->client_address); ctx->client_address = param->string; param->string = NULL; break; case Opt_mounthost: trace_nfs_mount_assign(param->key, param->string); kfree(ctx->mount_server.hostname); ctx->mount_server.hostname = param->string; param->string = NULL; break; case Opt_mountaddr: trace_nfs_mount_assign(param->key, param->string); len = rpc_pton(fc->net_ns, param->string, param->size, &ctx->mount_server.address, sizeof(ctx->mount_server._address)); if (len == 0) goto out_invalid_address; ctx->mount_server.addrlen = len; break; case Opt_nconnect: trace_nfs_mount_assign(param->key, param->string); if (result.uint_32 < 1 || result.uint_32 > NFS_MAX_CONNECTIONS) goto out_of_bounds; ctx->nfs_server.nconnect = result.uint_32; break; case Opt_max_connect: trace_nfs_mount_assign(param->key, param->string); if (result.uint_32 < 1 || result.uint_32 > NFS_MAX_TRANSPORTS) goto out_of_bounds; ctx->nfs_server.max_connect = result.uint_32; break; case Opt_fatal_neterrors: trace_nfs_mount_assign(param->key, param->string); switch (result.uint_32) { case Opt_fatal_neterrors_default: if (fc->net_ns != &init_net) ctx->flags |= NFS_MOUNT_NETUNREACH_FATAL; else ctx->flags &= ~NFS_MOUNT_NETUNREACH_FATAL; break; case Opt_fatal_neterrors_enetunreach: ctx->flags |= NFS_MOUNT_NETUNREACH_FATAL; break; case Opt_fatal_neterrors_none: ctx->flags &= ~NFS_MOUNT_NETUNREACH_FATAL; break; default: goto out_invalid_value; } break; case Opt_lookupcache: trace_nfs_mount_assign(param->key, param->string); switch (result.uint_32) { case Opt_lookupcache_all: ctx->flags &= ~(NFS_MOUNT_LOOKUP_CACHE_NONEG|NFS_MOUNT_LOOKUP_CACHE_NONE); break; case Opt_lookupcache_positive: ctx->flags &= ~NFS_MOUNT_LOOKUP_CACHE_NONE; ctx->flags |= NFS_MOUNT_LOOKUP_CACHE_NONEG; break; case Opt_lookupcache_none: ctx->flags |= NFS_MOUNT_LOOKUP_CACHE_NONEG|NFS_MOUNT_LOOKUP_CACHE_NONE; break; default: goto out_invalid_value; } break; case Opt_local_lock: trace_nfs_mount_assign(param->key, param->string); switch (result.uint_32) { case Opt_local_lock_all: ctx->flags |= (NFS_MOUNT_LOCAL_FLOCK | NFS_MOUNT_LOCAL_FCNTL); break; case Opt_local_lock_flock: ctx->flags |= NFS_MOUNT_LOCAL_FLOCK; break; case Opt_local_lock_posix: ctx->flags |= NFS_MOUNT_LOCAL_FCNTL; break; case Opt_local_lock_none: ctx->flags &= ~(NFS_MOUNT_LOCAL_FLOCK | NFS_MOUNT_LOCAL_FCNTL); break; default: goto out_invalid_value; } break; case Opt_write: trace_nfs_mount_assign(param->key, param->string); switch (result.uint_32) { case Opt_write_lazy: ctx->flags &= ~(NFS_MOUNT_WRITE_EAGER | NFS_MOUNT_WRITE_WAIT); break; case Opt_write_eager: ctx->flags |= NFS_MOUNT_WRITE_EAGER; ctx->flags &= ~NFS_MOUNT_WRITE_WAIT; break; case Opt_write_wait: ctx->flags |= NFS_MOUNT_WRITE_EAGER | NFS_MOUNT_WRITE_WAIT; break; default: goto out_invalid_value; } break; /* * Special options */ case Opt_sloppy: ctx->sloppy = true; break; } return 0; out_invalid_value: return nfs_invalf(fc, "NFS: Bad mount option value specified"); out_invalid_address: return nfs_invalf(fc, "NFS: Bad IP address specified"); out_of_bounds: return nfs_invalf(fc, "NFS: Value for '%s' out of range", param->key); out_bad_transport: return nfs_invalf(fc, "NFS: Unrecognized transport protocol"); } /* * Split fc->source into "hostname:export_path". * * The leftmost colon demarks the split between the server's hostname * and the export path. If the hostname starts with a left square * bracket, then it may contain colons. * * Note: caller frees hostname and export path, even on error. */ static int nfs_parse_source(struct fs_context *fc, size_t maxnamlen, size_t maxpathlen) { struct nfs_fs_context *ctx = nfs_fc2context(fc); const char *dev_name = fc->source; size_t len; const char *end; if (unlikely(!dev_name || !*dev_name)) return -EINVAL; /* Is the host name protected with square brakcets? */ if (*dev_name == '[') { end = strchr(++dev_name, ']'); if (end == NULL || end[1] != ':') goto out_bad_devname; len = end - dev_name; end++; } else { const char *comma; end = strchr(dev_name, ':'); if (end == NULL) goto out_bad_devname; len = end - dev_name; /* kill possible hostname list: not supported */ comma = memchr(dev_name, ',', len); if (comma) len = comma - dev_name; } if (len > maxnamlen) goto out_hostname; kfree(ctx->nfs_server.hostname); /* N.B. caller will free nfs_server.hostname in all cases */ ctx->nfs_server.hostname = kmemdup_nul(dev_name, len, GFP_KERNEL); if (!ctx->nfs_server.hostname) goto out_nomem; len = strlen(++end); if (len > maxpathlen) goto out_path; ctx->nfs_server.export_path = kmemdup_nul(end, len, GFP_KERNEL); if (!ctx->nfs_server.export_path) goto out_nomem; trace_nfs_mount_path(ctx->nfs_server.export_path); return 0; out_bad_devname: return nfs_invalf(fc, "NFS: device name not in host:path format"); out_nomem: nfs_errorf(fc, "NFS: not enough memory to parse device name"); return -ENOMEM; out_hostname: nfs_errorf(fc, "NFS: server hostname too long"); return -ENAMETOOLONG; out_path: nfs_errorf(fc, "NFS: export pathname too long"); return -ENAMETOOLONG; } static inline bool is_remount_fc(struct fs_context *fc) { return fc->root != NULL; } /* * Parse monolithic NFS2/NFS3 mount data * - fills in the mount root filehandle * * For option strings, user space handles the following behaviors: * * + DNS: mapping server host name to IP address ("addr=" option) * * + failure mode: how to behave if a mount request can't be handled * immediately ("fg/bg" option) * * + retry: how often to retry a mount request ("retry=" option) * * + breaking back: trying proto=udp after proto=tcp, v2 after v3, * mountproto=tcp after mountproto=udp, and so on */ static int nfs23_parse_monolithic(struct fs_context *fc, struct nfs_mount_data *data) { struct nfs_fs_context *ctx = nfs_fc2context(fc); struct nfs_fh *mntfh = ctx->mntfh; struct sockaddr_storage *sap = &ctx->nfs_server._address; int extra_flags = NFS_MOUNT_LEGACY_INTERFACE; int ret; if (data == NULL) goto out_no_data; ctx->version = NFS_DEFAULT_VERSION; switch (data->version) { case 1: data->namlen = 0; fallthrough; case 2: data->bsize = 0; fallthrough; case 3: if (data->flags & NFS_MOUNT_VER3) goto out_no_v3; data->root.size = NFS2_FHSIZE; memcpy(data->root.data, data->old_root.data, NFS2_FHSIZE); /* Turn off security negotiation */ extra_flags |= NFS_MOUNT_SECFLAVOUR; fallthrough; case 4: if (data->flags & NFS_MOUNT_SECFLAVOUR) goto out_no_sec; fallthrough; case 5: memset(data->context, 0, sizeof(data->context)); fallthrough; case 6: if (data->flags & NFS_MOUNT_VER3) { if (data->root.size > NFS3_FHSIZE || data->root.size == 0) goto out_invalid_fh; mntfh->size = data->root.size; ctx->version = 3; } else { mntfh->size = NFS2_FHSIZE; ctx->version = 2; } memcpy(mntfh->data, data->root.data, mntfh->size); if (mntfh->size < sizeof(mntfh->data)) memset(mntfh->data + mntfh->size, 0, sizeof(mntfh->data) - mntfh->size); /* * for proto == XPRT_TRANSPORT_UDP, which is what uses * to_exponential, implying shift: limit the shift value * to BITS_PER_LONG (majortimeo is unsigned long) */ if (!(data->flags & NFS_MOUNT_TCP)) /* this will be UDP */ if (data->retrans >= 64) /* shift value is too large */ goto out_invalid_data; /* * Translate to nfs_fs_context, which nfs_fill_super * can deal with. */ ctx->flags = data->flags & NFS_MOUNT_FLAGMASK; ctx->flags |= extra_flags; ctx->rsize = data->rsize; ctx->wsize = data->wsize; ctx->timeo = data->timeo; ctx->retrans = data->retrans; ctx->acregmin = data->acregmin; ctx->acregmax = data->acregmax; ctx->acdirmin = data->acdirmin; ctx->acdirmax = data->acdirmax; ctx->need_mount = false; if (!is_remount_fc(fc)) { memcpy(sap, &data->addr, sizeof(data->addr)); ctx->nfs_server.addrlen = sizeof(data->addr); ctx->nfs_server.port = ntohs(data->addr.sin_port); } if (sap->ss_family != AF_INET || !nfs_verify_server_address(sap)) goto out_no_address; if (!(data->flags & NFS_MOUNT_TCP)) ctx->nfs_server.protocol = XPRT_TRANSPORT_UDP; /* N.B. caller will free nfs_server.hostname in all cases */ ctx->nfs_server.hostname = kstrdup(data->hostname, GFP_KERNEL); if (!ctx->nfs_server.hostname) goto out_nomem; ctx->namlen |