
1"""Read/Write Videos (and images) using PyAV. 

2 

3.. note:: 

4 To use this plugin you need to have `PyAV <https://pyav.org/docs/stable/>`_ 

5 installed:: 

6 

7 pip install av 

8 

9This plugin wraps pyAV, a pythonic binding for the FFMPEG library. It is similar 

10to our FFMPEG plugin, has improved performance, features a robust interface, and 

11aims to supersede the FFMPEG plugin in the future. 

12 

13 

14Methods 

15------- 

16.. note:: 

17 Check the respective function for a list of supported kwargs and detailed 

18 documentation. 

19 

20.. autosummary:: 

21 :toctree: 

22 

23 PyAVPlugin.read 

24 PyAVPlugin.iter 

25 PyAVPlugin.write 

26 PyAVPlugin.properties 

27 PyAVPlugin.metadata 

28 

29Additional methods available inside the :func:`imopen <imageio.v3.imopen>` 

30context: 

31 

32.. autosummary:: 

33 :toctree: 

34 

35 PyAVPlugin.init_video_stream 

36 PyAVPlugin.write_frame 

37 PyAVPlugin.set_video_filter 

38 PyAVPlugin.container_metadata 

39 PyAVPlugin.video_stream_metadata 

40 

41Advanced API 

42------------ 

43 

44In addition to the default ImageIO v3 API this plugin exposes custom functions 

45that are specific to reading/writing video and its metadata. These are available 

46inside the :func:`imopen <imageio.v3.imopen>` context and allow fine-grained 

47control over how the video is processed. The functions are documented above and 

48below you can find a usage example:: 

49 

50 import imageio.v3 as iio 

51 

52 with iio.imopen("test.mp4", "w", plugin="pyav") as file: 

53 file.init_video_stream("libx264") 

54 file.container_metadata["comment"] = "This video was created using ImageIO." 

55 

56 for _ in range(5): 

57 for frame in iio.imiter("imageio:newtonscradle.gif"): 

58 file.write_frame(frame) 

59 

60 meta = iio.immeta("test.mp4", plugin="pyav") 

61 assert meta["comment"] == "This video was created using ImageIO." 

62 

63 

64 

65Pixel Formats (Colorspaces) 

66--------------------------- 

67 

68By default, this plugin converts the video into 8-bit RGB (called ``rgb24`` in 

69ffmpeg). This is a useful behavior for many use-cases, but sometimes you may 

70want to use the video's native colorspace or you may wish to convert the video 

71into an entirely different colorspace. This is controlled using the ``format`` 

72kwarg. You can use ``format=None`` to leave the image in its native colorspace 

73or specify any colorspace supported by FFMPEG as long as it is stridable, i.e., 

74as long as it can be represented by a single numpy array. Some useful choices 

75include: 

76 

77- rgb24 (default; 8-bit RGB) 

78- rgb48le (16-bit lower-endian RGB) 

79- bgr24 (8-bit BGR; openCVs default colorspace) 

80- gray (8-bit grayscale) 

81- yuv444p (8-bit channel-first YUV) 

82 

83Further, FFMPEG maintains a list of available formats, albeit not as part of the 

84narrative docs. It can be `found here 

85<https://ffmpeg.org/doxygen/trunk/pixfmt_8h_source.html>`_ (warning: C source 

86code). 
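
For example, to convert to grayscale while decoding, or to keep the encoded
colorspace (a sketch; ``some_video.mp4`` is a hypothetical file whose native
format is stridable)::

    import imageio.v3 as iio

    # convert to 8-bit grayscale while decoding
    gray = iio.imread("imageio:cockatoo.mp4", plugin="pyav", format="gray")

    # keep the encoded colorspace; raises if the native format (e.g.,
    # subsampled yuv420p) can't be expressed as a strided array
    native = iio.imread("some_video.mp4", plugin="pyav", format=None)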


Filters
-------

On top of providing basic read/write functionality, this plugin allows you to
use the full collection of `video filters available in FFMPEG
<https://ffmpeg.org/ffmpeg-filters.html#Video-Filters>`_. This means that you
can apply extensive pre-processing to your video before retrieving it as a numpy
array or apply extensive post-processing before you encode your data.

Filters come in two forms: sequences or graphs. Filter sequences are, as the
name suggests, sequences of filters that are applied one after the other. They
are specified using the ``filter_sequence`` kwarg. Filter graphs, on the other
hand, come in the form of a directed graph and are specified using the
``filter_graph`` kwarg.

.. note::
    All filters are either sequences or graphs. If all you want is to apply a
    single filter, you can do this by specifying a filter sequence with a single
    entry.

A ``filter_sequence`` is a list of filters, each defined through a 2-element
tuple of the form ``(filter_name, filter_parameters)``. The first element of the
tuple is the name of the filter. The second element is the filter parameters,
which can be given either as a string or a dict. The string matches the same
format that you would use when specifying the filter using the ffmpeg
command-line tool and the dict has entries of the form ``parameter: value``. For
example::

    import imageio.v3 as iio

    # using a filter_parameters str
    img1 = iio.imread(
        "imageio:cockatoo.mp4",
        plugin="pyav",
        filter_sequence=[
            ("rotate", "45*PI/180")
        ]
    )

    # using a filter_parameters dict
    img2 = iio.imread(
        "imageio:cockatoo.mp4",
        plugin="pyav",
        filter_sequence=[
            ("rotate", {"angle": "45*PI/180", "fillcolor": "AliceBlue"})
        ]
    )

A ``filter_graph``, on the other hand, is specified using a ``(nodes, edges)``
tuple. It is best explained using an example::

    img = iio.imread(
        "imageio:cockatoo.mp4",
        plugin="pyav",
        filter_graph=(
            {
                "split": ("split", ""),
                "scale_overlay": ("scale", "512:-1"),
                "overlay": ("overlay", "x=25:y=25:enable='between(t,1,8)'"),
            },
            [
                ("video_in", "split", 0, 0),
                ("split", "overlay", 0, 0),
                ("split", "scale_overlay", 1, 0),
                ("scale_overlay", "overlay", 0, 1),
                ("overlay", "video_out", 0, 0),
            ]
        )
    )

The above transforms the video to have a picture-in-picture of itself in the top
left corner. As you can see, nodes are specified using a dict which has names as
its keys and filter tuples as values; the same tuples as the ones used when
defining a filter sequence. Edges are a list of 4-tuples of the form
``(node_out, node_in, output_idx, input_idx)`` and specify which two filters are
connected and which inputs/outputs should be used for this.

Further, there are two special nodes in a filter graph: ``video_in`` and
``video_out``, which represent the graph's input and output respectively. These
names cannot be chosen for other nodes (those nodes would simply be
overwritten), and for a graph to be valid there must be a path from the input to
the output and all nodes in the graph must be connected.

While most graphs are quite simple, they can become very complex and we
recommend that you read through the `FFMPEG documentation
<https://ffmpeg.org/ffmpeg-filters.html#Filtergraph-description>`_ and their
examples to better understand how to use them.

"""


from fractions import Fraction
from math import ceil
from typing import Any, Dict, Generator, List, Optional, Tuple, Union

import av
import av.filter
import numpy as np
from av.codec.context import Flags
from numpy.lib.stride_tricks import as_strided

from ..core import Request
from ..core.request import URI_BYTES, InitializationError, IOMode
from ..core.v3_plugin_api import ImageProperties, PluginV3



def _format_to_dtype(format: av.VideoFormat) -> np.dtype:
    """Convert a pyAV video format into a numpy dtype"""

    if len(format.components) == 0:
        # fake format
        raise ValueError(
            f"Can't determine dtype from format `{format.name}`. It has no channels."
        )

    endian = ">" if format.is_big_endian else "<"
    dtype = "f" if "f32" in format.name else "u"
    bits_per_channel = [x.bits for x in format.components]
    n_bytes = str(int(ceil(bits_per_channel[0] / 8)))

    return np.dtype(endian + dtype + n_bytes)
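

# For reference, a few example mappings produced by ``_format_to_dtype``
# (a sketch; the pixel format names come from FFMPEG's pixel format list):
#   rgb24     -> np.dtype("<u1")  (three 8-bit channels, one byte each)
#   rgb48le   -> np.dtype("<u2")  (16-bit little-endian channels)
#   grayf32be -> np.dtype(">f4")  (32-bit big-endian float gray)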



def _get_frame_shape(frame: av.VideoFrame) -> Tuple[int, ...]:
    """Compute the frame's array shape

    Parameters
    ----------
    frame : av.VideoFrame
        A frame for which the resulting shape should be computed.

    Returns
    -------
    shape : Tuple[int, ...]
        A tuple describing the shape of the image data in the frame.

    """

    widths = [component.width for component in frame.format.components]
    heights = [component.height for component in frame.format.components]
    bits = np.array([component.bits for component in frame.format.components])
    line_sizes = [plane.line_size for plane in frame.planes]

    subsampled_width = widths[:-1] != widths[1:]
    subsampled_height = heights[:-1] != heights[1:]
    unaligned_components = np.any(bits % 8 != 0) or (line_sizes[:-1] != line_sizes[1:])
    if subsampled_width or subsampled_height or unaligned_components:
        raise IOError(
            f"{frame.format.name} can't be expressed as a strided array. "
            "Use `format=` to select a format to convert into."
        )

    shape = [frame.height, frame.width]

    # ffmpeg doesn't have a notion of channel-first or channel-last formats.
    # Instead, it stores frames in one or more planes which contain individual
    # components of a pixel depending on the pixel format. For channel-first
    # formats each component lives on a separate plane (n_planes) and for
    # channel-last formats all components are packed on a single plane
    # (n_channels).
    n_planes = max([component.plane for component in frame.format.components]) + 1
    if n_planes > 1:
        shape = [n_planes] + shape

    channels_per_plane = [0] * n_planes
    for component in frame.format.components:
        channels_per_plane[component.plane] += 1
    n_channels = max(channels_per_plane)

    if n_channels > 1:
        shape = shape + [n_channels]

    return tuple(shape)



def _get_frame_type(picture_type: int) -> str:
    """Return a human-readable name for the provided picture type

    Parameters
    ----------
    picture_type : int
        The picture type extracted from Frame.pict_type

    Returns
    -------
    picture_name : str
        A human-readable name of the picture type

    """

    if not isinstance(picture_type, int):
        # old pyAV versions send an enum, not an int
        return picture_type.name

    picture_types = [
        "NONE",
        "I",
        "P",
        "B",
        "S",
        "SI",
        "SP",
        "BI",
    ]

    return picture_types[picture_type]



class PyAVPlugin(PluginV3):
    """Support for pyAV as backend.

    Parameters
    ----------
    request : iio.Request
        A request object that represents the user's intent. It provides a
        standard interface to access the various ImageResources and serves
        them to the plugin as a file object (or file). Check the docs for
        details.
    container : str
        Only used during ``iio_mode="w"``! If not None, overwrite the default
        container format chosen by pyav.
    kwargs : Any
        Additional kwargs are forwarded to PyAV's constructor.

    """


    def __init__(self, request: Request, *, container: str = None, **kwargs) -> None:
        """Initialize a new Plugin Instance.

        See Plugin's docstring for detailed documentation.

        Notes
        -----
        The implementation here stores the request as a local variable that is
        exposed using a @property below. If you inherit from PluginV3, remember
        to call ``super().__init__(request)``.

        """

        super().__init__(request)

        self._container = None
        self._video_stream = None
        self._video_filter = None

        if request.mode.io_mode == IOMode.read:
            self._next_idx = 0
            try:
                if request._uri_type == 5:  # 5 is the value of URI_HTTP
                    # pyav should read from HTTP by itself. This enables reading
                    # HTTP-based streams like DASH. Note that handling streams
                    # like this is temporary until the new request object gets
                    # implemented.
                    self._container = av.open(request.raw_uri, **kwargs)
                else:
                    self._container = av.open(request.get_file(), **kwargs)
                self._video_stream = self._container.streams.video[0]
                self._decoder = self._container.decode(video=0)
            except av.FFmpegError:
                if isinstance(request.raw_uri, bytes):
                    msg = "PyAV does not support these `<bytes>`"
                else:
                    msg = f"PyAV does not support `{request.raw_uri}`"
                raise InitializationError(msg) from None
        else:
            self.frames_written = 0
            file_handle = self.request.get_file()
            filename = getattr(file_handle, "name", None)
            extension = self.request.extension or self.request.format_hint
            if extension is None:
                raise InitializationError("Can't determine output container to use.")

            # hacky, but beats running our own format selection logic
            # (since av_guess_format is not exposed)
            try:
                setattr(file_handle, "name", filename or "tmp" + extension)
            except AttributeError:
                pass  # read-only, nothing we can do

            try:
                self._container = av.open(
                    file_handle, mode="w", format=container, **kwargs
                )
            except ValueError:
                raise InitializationError(
                    f"PyAV cannot write to `{self.request.raw_uri}`"
                )


    # ---------------------
    # Standard V3 Interface
    # ---------------------


    def read(
        self,
        *,
        index: int = ...,
        format: str = "rgb24",
        filter_sequence: List[Tuple[str, Union[str, dict]]] = None,
        filter_graph: Tuple[dict, List] = None,
        constant_framerate: bool = None,
        thread_count: int = 0,
        thread_type: str = None,
    ) -> np.ndarray:
        """Read frames from the video.

        If ``index`` is an integer, this function reads the index-th frame from
        the file. If ``index`` is ... (Ellipsis), this function reads all frames
        from the video, stacks them along the first dimension, and returns a
        batch of frames.

        Parameters
        ----------
        index : int
            The index of the frame to read, e.g. ``index=5`` reads the 5th
            frame. If ``...``, read all the frames in the video and stack them
            along a new, prepended, batch dimension.
        format : str
            Set the returned colorspace. If not None (default: rgb24), convert
            the data into the given format before returning it. If ``None``,
            return the data in the encoded format if it can be expressed as a
            strided array; otherwise raise an Exception.
        filter_sequence : List[Tuple[str, Union[str, dict]]]
            If not None, apply the given sequence of FFmpeg filters to each
            ndimage. Check the (module-level) plugin docs for details and
            examples.
        filter_graph : Tuple[dict, List]
            If not None, apply the given graph of FFmpeg filters to each
            ndimage. The graph is given as a ``(nodes, edges)`` tuple: ``nodes``
            is a dict of named filter tuples, and ``edges`` is a list of
            connections between those nodes. Check the (module-level) plugin
            docs for details and examples.
        constant_framerate : bool
            If True assume the video's framerate is constant. This allows for
            faster seeking inside the file. If False, the video is reset before
            each read and searched from the beginning. If None (default), this
            value will be read from the container format.
        thread_count : int
            How many threads to use when decoding a frame. The default is 0,
            which will set the number using ffmpeg's default, which is based on
            the codec, number of available cores, threading model, and other
            considerations.
        thread_type : str
            The threading model to be used. One of

            - ``"SLICE"``: threads assemble parts of the current frame
            - ``"FRAME"``: threads may assemble future frames
            - None (default): Uses ``"FRAME"`` if ``index=...`` and ffmpeg's
              default otherwise.

        Returns
        -------
        frame : np.ndarray
            A numpy array containing loaded frame data.

        Notes
        -----
        Accessing random frames repeatedly is costly (O(k), where k is the
        average distance between two keyframes). You should do so only sparingly
        if possible. In some cases, it can be faster to bulk-read the video (if
        it fits into memory) and to then access the returned ndarray randomly.

        The current implementation may cause problems for b-frames, i.e.,
        bidirectionally predicted pictures. We lack test videos to write unit
        tests for this case.

        Reading from an index other than ``...``, i.e. reading a single frame,
        currently doesn't support filters that introduce delays.

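        Examples
        --------
        A usage sketch (``imageio:cockatoo.mp4`` ships with ImageIO's example
        resources)::

            import imageio.v3 as iio

            with iio.imopen("imageio:cockatoo.mp4", "r", plugin="pyav") as file:
                single = file.read(index=5)  # seek to and decode the 5th frame
                video = file.read()          # decode and stack all frames
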

        """

        if index is ...:
            props = self.properties(format=format)
            uses_filter = (
                self._video_filter is not None
                or filter_graph is not None
                or filter_sequence is not None
            )

            self._container.seek(0)
            if not uses_filter and props.shape[0] != 0:
                frames = np.empty(props.shape, dtype=props.dtype)
                for idx, frame in enumerate(
                    self.iter(
                        format=format,
                        filter_sequence=filter_sequence,
                        filter_graph=filter_graph,
                        thread_count=thread_count,
                        thread_type=thread_type or "FRAME",
                    )
                ):
                    frames[idx] = frame
            else:
                frames = np.stack(
                    [
                        x
                        for x in self.iter(
                            format=format,
                            filter_sequence=filter_sequence,
                            filter_graph=filter_graph,
                            thread_count=thread_count,
                            thread_type=thread_type or "FRAME",
                        )
                    ]
                )

            # reset stream container, because the threading model can't change
            # after first access
            self._video_stream = self._container.streams.video[0]

            return frames

        if thread_type is not None and not (
            self._video_stream.thread_type == thread_type
            or self._video_stream.thread_type.name == thread_type
        ):
            self._video_stream.thread_type = thread_type

        if (
            thread_count != 0
            and thread_count != self._video_stream.codec_context.thread_count
        ):
            # in FFMPEG thread_count == 0 means use the default count, which we
            # change to mean don't change the thread count.
            self._video_stream.codec_context.thread_count = thread_count

        if constant_framerate is None:
            # "variable_fps" is now a flag (the handle got removed). Full list at
            # https://pyav.org/docs/stable/api/container.html#module-av.format
            variable_fps = bool(self._container.format.flags & 0x400)
            constant_framerate = not variable_fps

        # note: cheap for contiguous incremental reads
        self._seek(index, constant_framerate=constant_framerate)
        desired_frame = next(self._decoder)
        self._next_idx += 1

        self.set_video_filter(filter_sequence, filter_graph)
        if self._video_filter is not None:
            desired_frame = self._video_filter.send(desired_frame)

        return self._unpack_frame(desired_frame, format=format)


    def iter(
        self,
        *,
        format: str = "rgb24",
        filter_sequence: List[Tuple[str, Union[str, dict]]] = None,
        filter_graph: Tuple[dict, List] = None,
        thread_count: int = 0,
        thread_type: str = None,
    ) -> np.ndarray:
        """Yield frames from the video.

        Parameters
        ----------
        format : str
            Set the returned colorspace. If not None (default: rgb24), convert
            the data into the given format before returning it. If ``None``,
            return the data in the encoded format if it can be expressed as a
            strided array; otherwise raise an Exception.
        filter_sequence : List[Tuple[str, Union[str, dict]]]
            If not None, apply the given sequence of FFmpeg filters to each
            ndimage. Check the (module-level) plugin docs for details and
            examples.
        filter_graph : Tuple[dict, List]
            If not None, apply the given graph of FFmpeg filters to each
            ndimage. The graph is given as a ``(nodes, edges)`` tuple: ``nodes``
            is a dict of named filter tuples, and ``edges`` is a list of
            connections between those nodes. Check the (module-level) plugin
            docs for details and examples.
        thread_count : int
            How many threads to use when decoding a frame. The default is 0,
            which will set the number using ffmpeg's default, which is based on
            the codec, number of available cores, threading model, and other
            considerations.
        thread_type : str
            The threading model to be used. One of

            - ``"SLICE"`` (default): threads assemble parts of the current frame
            - ``"FRAME"``: threads may assemble future frames (faster for bulk
              reading)

        Yields
        ------
        frame : np.ndarray
            A (decoded) video frame.

        """


        self._video_stream.thread_type = thread_type or "SLICE"
        self._video_stream.codec_context.thread_count = thread_count

        self.set_video_filter(filter_sequence, filter_graph)

        for frame in self._decoder:
            self._next_idx += 1

            if self._video_filter is not None:
                try:
                    frame = self._video_filter.send(frame)
                except StopIteration:
                    break

            if frame is None:
                continue

            yield self._unpack_frame(frame, format=format)

        if self._video_filter is not None:
            for frame in self._video_filter:
                yield self._unpack_frame(frame, format=format)


    def write(
        self,
        ndimage: Union[np.ndarray, List[np.ndarray]],
        *,
        codec: str = None,
        is_batch: bool = True,
        fps: int = 24,
        in_pixel_format: str = "rgb24",
        out_pixel_format: str = None,
        filter_sequence: List[Tuple[str, Union[str, dict]]] = None,
        filter_graph: Tuple[dict, List] = None,
    ) -> Optional[bytes]:
        """Save a ndimage as a video.

        Given a batch of frames (stacked along the first axis) or a list of
        frames, encode them and add the result to the ImageResource.

        Parameters
        ----------
        ndimage : ArrayLike, List[ArrayLike]
            The ndimage to encode and write to the ImageResource.
        codec : str
            The codec to use when encoding frames. Only needed on first write
            and ignored on subsequent writes.
        is_batch : bool
            If True (default), the ndimage is a batch of images, otherwise it is
            a single image. This parameter has no effect on lists of ndimages.
        fps : int
            The resulting video's frames per second.
        in_pixel_format : str
            The pixel format of the incoming ndarray. Defaults to "rgb24" and
            can be any stridable pix_fmt supported by FFmpeg.
        out_pixel_format : str
            The pixel format to use while encoding frames. If None (default)
            use the codec's default.
        filter_sequence : List[Tuple[str, Union[str, dict]]]
            If not None, apply the given sequence of FFmpeg filters to each
            ndimage. Check the (module-level) plugin docs for details and
            examples.
        filter_graph : Tuple[dict, List]
            If not None, apply the given graph of FFmpeg filters to each
            ndimage. The graph is given as a ``(nodes, edges)`` tuple: ``nodes``
            is a dict of named filter tuples, and ``edges`` is a list of
            connections between those nodes. Check the (module-level) plugin
            docs for details and examples.

        Returns
        -------
        encoded_image : bytes or None
            If the chosen ImageResource is the special target ``"<bytes>"`` then
            write will return a byte string containing the encoded image data.
            Otherwise, it returns None.

        Notes
        -----
        When writing ``<bytes>``, the video is finalized immediately after the
        first write call and calling write multiple times to append frames is
        not possible.

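        Examples
        --------
        A usage sketch via the v3 API, which forwards its kwargs to this
        method (``frames`` is a made-up example array)::

            import imageio.v3 as iio
            import numpy as np

            frames = np.random.randint(0, 255, (24, 64, 64, 3), dtype=np.uint8)
            iio.imwrite("out.mp4", frames, plugin="pyav", codec="libx264", fps=24)
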

        """

        if isinstance(ndimage, list):
            # frame shapes must agree for video
            if any(f.shape != ndimage[0].shape for f in ndimage):
                raise ValueError("All frames should have the same shape")
        elif not is_batch:
            ndimage = np.asarray(ndimage)[None, ...]
        else:
            ndimage = np.asarray(ndimage)

        if self._video_stream is None:
            self.init_video_stream(codec, fps=fps, pixel_format=out_pixel_format)

        self.set_video_filter(filter_sequence, filter_graph)

        for img in ndimage:
            self.write_frame(img, pixel_format=in_pixel_format)

        if self.request._uri_type == URI_BYTES:
            # bytes are immutable, so we have to flush immediately
            # and can't support appending
            self._flush_writer()
            self._container.close()

            return self.request.get_file().getvalue()


    def properties(self, index: int = ..., *, format: str = "rgb24") -> ImageProperties:
        """Standardized ndimage metadata.

        Parameters
        ----------
        index : int
            The index of the ndimage for which to return properties. If ``...``
            (Ellipsis, default), return the properties for the resulting batch
            of frames.
        format : str
            If not None (default: rgb24), convert the data into the given
            format before returning it. If None, return the data in the encoded
            format if that can be expressed as a strided array; otherwise raise
            an Exception.

        Returns
        -------
        properties : ImageProperties
            A dataclass filled with standardized image metadata.

        Notes
        -----
        This function is efficient and won't process any pixel data.

        The provided metadata does not include modifications by any filters
        (through ``filter_sequence`` or ``filter_graph``).

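        Examples
        --------
        A usage sketch using the v3 convenience API::

            import imageio.v3 as iio

            props = iio.improps("imageio:cockatoo.mp4", plugin="pyav")
            # for the default rgb24 format this is a
            # (n_frames, height, width, 3) uint8 batch
            print(props.shape, props.dtype)
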

        """

        video_width = self._video_stream.codec_context.width
        video_height = self._video_stream.codec_context.height
        pix_format = format or self._video_stream.codec_context.pix_fmt
        frame_template = av.VideoFrame(video_width, video_height, pix_format)

        shape = _get_frame_shape(frame_template)
        if index is ...:
            n_frames = self._video_stream.frames
            shape = (n_frames,) + shape

        return ImageProperties(
            shape=tuple(shape),
            dtype=_format_to_dtype(frame_template.format),
            n_images=shape[0] if index is ... else None,
            is_batch=index is ...,
        )


    def metadata(
        self,
        index: int = ...,
        exclude_applied: bool = True,
        constant_framerate: bool = None,
    ) -> Dict[str, Any]:
        """Format-specific metadata.

        Returns a dictionary filled with metadata that is either stored in the
        container, the video stream, or the frame's side-data.

        Parameters
        ----------
        index : int
            If ... (Ellipsis, default) return global metadata (the metadata
            stored in the container and video stream). If not ..., return the
            side data stored in the frame at the given index.
        exclude_applied : bool
            Currently, this parameter has no effect. It exists for compliance
            with the ImageIO v3 API.
        constant_framerate : bool
            If True assume the video's framerate is constant. This allows for
            faster seeking inside the file. If False, the video is reset before
            each read and searched from the beginning. If None (default), this
            value will be read from the container format.

        Returns
        -------
        metadata : dict
            A dictionary filled with format-specific metadata fields and their
            values.

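        Examples
        --------
        A usage sketch using the v3 convenience API::

            import imageio.v3 as iio

            # container- and stream-level metadata
            meta = iio.immeta("imageio:cockatoo.mp4", plugin="pyav")

            # frame-level metadata (here: the 5th frame)
            frame_meta = iio.immeta("imageio:cockatoo.mp4", plugin="pyav", index=5)
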

        """

        metadata = dict()

        if index is ...:
            # useful flags defined on the container and/or video stream
            metadata.update(
                {
                    "video_format": self._video_stream.codec_context.pix_fmt,
                    "codec": self._video_stream.codec.name,
                    "long_codec": self._video_stream.codec.long_name,
                    "profile": self._video_stream.profile,
                    "fps": float(self._video_stream.guessed_rate),
                }
            )
            if self._video_stream.duration is not None:
                duration = float(
                    self._video_stream.duration * self._video_stream.time_base
                )
                metadata.update({"duration": duration})

            metadata.update(self.container_metadata)
            metadata.update(self.video_stream_metadata)
            return metadata

        if constant_framerate is None:
            # "variable_fps" is now a flag (the handle got removed). Full list at
            # https://pyav.org/docs/stable/api/container.html#module-av.format
            variable_fps = bool(self._container.format.flags & 0x400)
            constant_framerate = not variable_fps

        self._seek(index, constant_framerate=constant_framerate)
        desired_frame = next(self._decoder)
        self._next_idx += 1

        # useful flags defined on the frame
        metadata.update(
            {
                "key_frame": bool(desired_frame.key_frame),
                "time": desired_frame.time,
                "interlaced_frame": bool(desired_frame.interlaced_frame),
                "frame_type": _get_frame_type(desired_frame.pict_type),
            }
        )

        # side data
        metadata.update(
            {item.type.name: bytes(item) for item in desired_frame.side_data}
        )

        return metadata


    def close(self) -> None:
        """Close the Video."""

        is_write = self.request.mode.io_mode == IOMode.write
        if is_write and self._video_stream is not None:
            self._flush_writer()

        if self._video_stream is not None:
            self._video_stream = None

        if self._container is not None:
            self._container.close()

        self.request.finish()

    def __enter__(self) -> "PyAVPlugin":
        return super().__enter__()


    # ------------------------------
    # Add-on Interface inside imopen
    # ------------------------------


    def init_video_stream(
        self,
        codec: str,
        *,
        fps: float = 24,
        pixel_format: str = None,
        max_keyframe_interval: int = None,
        force_keyframes: bool = None,
    ) -> None:
        """Initialize a new video stream.

        This function adds a new video stream to the ImageResource using the
        selected encoder (codec), framerate, and colorspace.

        Parameters
        ----------
        codec : str
            The codec to use, e.g. ``"h264"`` or ``"vp9"``.
        fps : float
            The desired framerate of the video stream (frames per second).
        pixel_format : str
            The pixel format to use while encoding frames. If None (default)
            use the codec's default.
        max_keyframe_interval : int
            The maximum distance between two intra frames (I-frames). Also
            known as GOP size. If unspecified, use the codec's default. Note
            that not every I-frame is a keyframe; see the notes for details.
        force_keyframes : bool
            If True, limit inter-frame dependencies to frames within the
            current keyframe interval (GOP), i.e., force every I-frame to be a
            keyframe. If unspecified, use the codec's default.

        Notes
        -----
        You can usually leave ``max_keyframe_interval`` and ``force_keyframes``
        at their default values, unless you try to generate seek-optimized
        video or have a similar specialist use-case. In this case,
        ``force_keyframes`` controls the ability to seek to *every* I-frame,
        and ``max_keyframe_interval`` controls how close to a random frame you
        can seek. Low values allow more fine-grained seeking at the expense of
        file size (and thus I/O performance).

883 """ 

884 

885 fps = Fraction.from_float(fps) 

886 stream = self._container.add_stream(codec, fps) 

887 stream.time_base = Fraction(1 / fps).limit_denominator(int(2**16 - 1)) 

888 if pixel_format is not None: 

889 stream.pix_fmt = pixel_format 

890 if max_keyframe_interval is not None: 

891 stream.gop_size = max_keyframe_interval 

892 if force_keyframes is not None: 

893 if force_keyframes: 

894 stream.codec_context.flags |= Flags.closed_gop 

895 else: 

896 stream.codec_context.flags &= ~Flags.closed_gop 

897 

898 self._video_stream = stream 


    def write_frame(self, frame: np.ndarray, *, pixel_format: str = "rgb24") -> None:
        """Add a frame to the video stream.

        This function appends a new frame to the video. It assumes that the
        stream has previously been initialized, i.e., ``init_video_stream`` has
        to be called before calling this function for the write to succeed.

        Parameters
        ----------
        frame : np.ndarray
            The image to be appended/written to the video stream.
        pixel_format : str
            The colorspace (pixel format) of the incoming frame.

        Notes
        -----
        Frames may be held in a buffer, e.g., by the filter pipeline used during
        writing or by FFMPEG to batch them prior to encoding. Make sure to
        ``.close()`` the plugin or to use a context manager to ensure that all
        frames are written to the ImageResource.

        """


        # manual packing of ndarray into frame
        # (this should live in pyAV, but it doesn't support all the formats we
        # want and PRs there are slow)
        pixel_format = av.VideoFormat(pixel_format)
        img_dtype = _format_to_dtype(pixel_format)
        width = frame.shape[2 if pixel_format.is_planar else 1]
        height = frame.shape[1 if pixel_format.is_planar else 0]
        av_frame = av.VideoFrame(width, height, pixel_format.name)
        if pixel_format.is_planar:
            for idx, plane in enumerate(av_frame.planes):
                plane_array = np.frombuffer(plane, dtype=img_dtype)
                plane_array = as_strided(
                    plane_array,
                    shape=(plane.height, plane.width),
                    strides=(plane.line_size, img_dtype.itemsize),
                )
                plane_array[...] = frame[idx]
        else:
            if pixel_format.name.startswith("bayer_"):
                # ffmpeg doesn't describe bayer formats correctly
                # see https://github.com/imageio/imageio/issues/761#issuecomment-1059318851
                # and following for details.
                n_channels = 1
            else:
                n_channels = len(pixel_format.components)

            plane = av_frame.planes[0]
            plane_shape = (plane.height, plane.width)
            plane_strides = (plane.line_size, n_channels * img_dtype.itemsize)
            if n_channels > 1:
                plane_shape += (n_channels,)
                plane_strides += (img_dtype.itemsize,)

            plane_array = as_strided(
                np.frombuffer(plane, dtype=img_dtype),
                shape=plane_shape,
                strides=plane_strides,
            )
            plane_array[...] = frame

        stream = self._video_stream
        av_frame.time_base = stream.codec_context.time_base
        av_frame.pts = self.frames_written
        self.frames_written += 1

        if self._video_filter is not None:
            av_frame = self._video_filter.send(av_frame)
            if av_frame is None:
                return

        if stream.frames == 0:
            stream.width = av_frame.width
            stream.height = av_frame.height

        for packet in stream.encode(av_frame):
            self._container.mux(packet)


    def set_video_filter(
        self,
        filter_sequence: List[Tuple[str, Union[str, dict]]] = None,
        filter_graph: Tuple[dict, List] = None,
    ) -> None:
        """Set the filter(s) to use.

        This function creates a new FFMPEG filter graph to use when reading or
        writing video. In the case of reading, frames are passed through the
        filter graph before being returned and, in the case of writing, frames
        are passed through the filter before being written to the video.

        Parameters
        ----------
        filter_sequence : List[Tuple[str, Union[str, dict]]]
            If not None, apply the given sequence of FFmpeg filters to each
            ndimage. Check the (module-level) plugin docs for details and
            examples.
        filter_graph : Tuple[dict, List]
            If not None, apply the given graph of FFmpeg filters to each
            ndimage. The graph is given as a ``(nodes, edges)`` tuple: ``nodes``
            is a dict of named filter tuples, and ``edges`` is a list of
            connections between those nodes. Check the (module-level) plugin
            docs for details and examples.

        Notes
        -----
        Changing a filter graph that has lag during reading or writing will
        currently cause frames in the filter queue to be lost.

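        Examples
        --------
        A usage sketch for writing (reusing the filter syntax from the
        module-level docs)::

            import imageio.v3 as iio

            with iio.imopen("test.mp4", "w", plugin="pyav") as file:
                file.init_video_stream("libx264")
                file.set_video_filter(filter_sequence=[("scale", "512:-1")])
                for frame in iio.imiter("imageio:cockatoo.mp4", plugin="pyav"):
                    file.write_frame(frame)
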

        """

        if filter_sequence is None and filter_graph is None:
            self._video_filter = None
            return

        if filter_sequence is None:
            filter_sequence = list()

        node_descriptors: Dict[str, Tuple[str, Union[str, Dict]]]
        edges: List[Tuple[str, str, int, int]]
        if filter_graph is None:
            node_descriptors, edges = dict(), [("video_in", "video_out", 0, 0)]
        else:
            node_descriptors, edges = filter_graph

        graph = av.filter.Graph()

        previous_node = graph.add_buffer(template=self._video_stream)
        for filter_name, argument in filter_sequence:
            if isinstance(argument, str):
                current_node = graph.add(filter_name, argument)
            else:
                current_node = graph.add(filter_name, **argument)
            previous_node.link_to(current_node)
            previous_node = current_node

        nodes = dict()
        nodes["video_in"] = previous_node
        nodes["video_out"] = graph.add("buffersink")
        for name, (filter_name, arguments) in node_descriptors.items():
            if isinstance(arguments, str):
                nodes[name] = graph.add(filter_name, arguments)
            else:
                nodes[name] = graph.add(filter_name, **arguments)

        for from_node, to_node, out_idx, in_idx in edges:
            nodes[from_node].link_to(nodes[to_node], out_idx, in_idx)

        graph.configure()

        def video_filter():
            # this starts a co-routine
            # send frames using graph.send()
            frame = yield None

            # send and receive frames in "parallel"
            while frame is not None:
                graph.push(frame)
                try:
                    frame = yield graph.pull()
                except av.error.BlockingIOError:
                    # filter has lag and needs more frames
                    frame = yield None
                except av.error.EOFError:
                    break

            try:
                # send EOF in av>=9.0
                graph.push(None)
            except ValueError:  # pragma: no cover
                # handle av<9.0
                pass

            # all frames have been sent, empty the filter
            while True:
                try:
                    yield graph.pull()
                except av.error.EOFError:
                    break  # EOF
                except av.error.BlockingIOError:  # pragma: no cover
                    # handle av<9.0
                    break

        self._video_filter = video_filter()
        self._video_filter.send(None)


    @property
    def container_metadata(self):
        """Container-specific metadata.

        A dictionary containing metadata stored at the container level.

        """
        return self._container.metadata

    @property
    def video_stream_metadata(self):
        """Stream-specific metadata.

        A dictionary containing metadata stored at the stream level.

        """
        return self._video_stream.metadata

    # -------------------------------
    # Internals and private functions
    # -------------------------------


    def _unpack_frame(self, frame: av.VideoFrame, *, format: str = None) -> np.ndarray:
        """Convert an av.VideoFrame into a ndarray

        Parameters
        ----------
        frame : av.VideoFrame
            The frame to unpack.
        format : str
            If not None, convert the frame to the given format before unpacking.

        """

        if format is not None:
            frame = frame.reformat(format=format)

        dtype = _format_to_dtype(frame.format)
        shape = _get_frame_shape(frame)

        planes = list()
        for idx in range(len(frame.planes)):
            n_channels = sum(
                [
                    x.bits // (dtype.itemsize * 8)
                    for x in frame.format.components
                    if x.plane == idx
                ]
            )
            av_plane = frame.planes[idx]
            plane_shape = (av_plane.height, av_plane.width)
            plane_strides = (av_plane.line_size, n_channels * dtype.itemsize)
            if n_channels > 1:
                plane_shape += (n_channels,)
                plane_strides += (dtype.itemsize,)

            np_plane = as_strided(
                np.frombuffer(av_plane, dtype=dtype),
                shape=plane_shape,
                strides=plane_strides,
            )
            planes.append(np_plane)

        if len(planes) > 1:
            # Note: the planes *should* exist inside a contiguous memory block
            # somewhere inside av.Frame; however, pyAV does not appear to expose
            # this, so we are forced to copy the planes individually instead of
            # wrapping them :(
            out = np.concatenate(planes).reshape(shape)
        else:
            out = planes[0]

        return out


    def _seek(self, index, *, constant_framerate: bool = True) -> None:
        """Seek to the frame at the given index."""

        if index == self._next_idx:
            return  # fast path :)

        # we must decode at least once before we seek; otherwise the
        # returned frames become corrupt.
        if self._next_idx == 0:
            next(self._decoder)
            self._next_idx += 1

        if index == self._next_idx:
            return  # fast path :)

        # this branch is removed until we find a way to efficiently find the
        # next keyframe. keeping this as a reminder
        # if self._next_idx < index and index < self._next_keyframe_idx:
        #     frames_to_yield = index - self._next_idx
        if not constant_framerate and index > self._next_idx:
            frames_to_yield = index - self._next_idx
        elif not constant_framerate:
            # seeking backwards and we can't link idx and pts
            self._container.seek(0)
            self._decoder = self._container.decode(video=0)
            self._next_idx = 0

            frames_to_yield = index
        else:
            # we know that the time between consecutive frames is constant,
            # hence we can link index and pts

            # how many pts lie between two frames
            sec_delta = 1 / self._video_stream.guessed_rate
            pts_delta = sec_delta / self._video_stream.time_base

            index_pts = int(index * pts_delta)

            # this only seeks to the nearest (preceding) keyframe
            self._container.seek(index_pts, stream=self._video_stream)
            self._decoder = self._container.decode(video=0)

            # this may be made faster if we could get the keyframe's time
            # without decoding it
            keyframe = next(self._decoder)
            keyframe_time = keyframe.pts * keyframe.time_base
            keyframe_pts = int(keyframe_time / self._video_stream.time_base)
            keyframe_index = keyframe_pts // pts_delta

            self._container.seek(index_pts, stream=self._video_stream)
            self._next_idx = keyframe_index

            frames_to_yield = index - keyframe_index

        for _ in range(frames_to_yield):
            next(self._decoder)
            self._next_idx += 1


    def _flush_writer(self):
        """Flush the filter and encoder

        This will reset the filter to `None` and send EoF to the encoder,
        i.e., after calling it, no more frames may be written.

        """

        stream = self._video_stream

        if self._video_filter is not None:
            # flush the filter
            for av_frame in self._video_filter:
                if stream.frames == 0:
                    stream.width = av_frame.width
                    stream.height = av_frame.height
                for packet in stream.encode(av_frame):
                    self._container.mux(packet)
            self._video_filter = None

        # flush the stream
        for packet in stream.encode():
            self._container.mux(packet)
        self._video_stream = None