1"""Read/Write Videos (and images) using PyAV. 

2 

3.. note:: 

4 To use this plugin you need to have `PyAV <https://pyav.org/docs/stable/>`_ 

5 installed:: 

6 

7 pip install av 

8 

9This plugin wraps pyAV, a pythonic binding for the FFMPEG library. It is similar 

10to our FFMPEG plugin, has improved performance, features a robust interface, and 

11aims to supersede the FFMPEG plugin in the future. 

12 

13 

14Methods 

15------- 

16.. note:: 

17 Check the respective function for a list of supported kwargs and detailed 

18 documentation. 

19 

20.. autosummary:: 

21 :toctree: 

22 

23 PyAVPlugin.read 

24 PyAVPlugin.iter 

25 PyAVPlugin.write 

26 PyAVPlugin.properties 

27 PyAVPlugin.metadata 

28 

29Additional methods available inside the :func:`imopen <imageio.v3.imopen>` 

30context: 

31 

32.. autosummary:: 

33 :toctree: 

34 

35 PyAVPlugin.init_video_stream 

36 PyAVPlugin.write_frame 

37 PyAVPlugin.set_video_filter 

38 PyAVPlugin.container_metadata 

39 PyAVPlugin.video_stream_metadata 

40 

Advanced API
------------

In addition to the default ImageIO v3 API this plugin exposes custom functions
that are specific to reading/writing video and its metadata. These are available
inside the :func:`imopen <imageio.v3.imopen>` context and allow fine-grained
control over how the video is processed. The functions are documented above, and
below you can find a usage example::

    import imageio.v3 as iio

    with iio.imopen("test.mp4", "w", plugin="pyav") as file:
        file.init_video_stream("libx264")
        file.container_metadata["comment"] = "This video was created using ImageIO."

        for _ in range(5):
            for frame in iio.imiter("imageio:newtonscradle.gif"):
                file.write_frame(frame)

    meta = iio.immeta("test.mp4", plugin="pyav")
    assert meta["comment"] == "This video was created using ImageIO."



Pixel Formats (Colorspaces)
---------------------------

By default, this plugin converts the video into 8-bit RGB (called ``rgb24`` in
ffmpeg). This is a useful behavior for many use-cases, but sometimes you may
want to use the video's native colorspace or you may wish to convert the video
into an entirely different colorspace. This is controlled using the ``format``
kwarg. You can use ``format=None`` to leave the image in its native colorspace
or specify any colorspace supported by FFMPEG as long as it is stridable, i.e.,
as long as it can be represented by a single numpy array. Some useful choices
include:

- rgb24 (default; 8-bit RGB)
- rgb48le (16-bit little-endian RGB)
- bgr24 (8-bit BGR; OpenCV's default colorspace)
- gray (8-bit grayscale)
- yuv444p (8-bit channel-first YUV)

Further, FFMPEG maintains a list of available formats, albeit not as part of the
narrative docs. It can be `found here
<https://ffmpeg.org/doxygen/trunk/pixfmt_8h_source.html>`_ (warning: C source
code).
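
For example, a minimal sketch of selecting a non-default colorspace via the
``format`` kwarg (using the bundled ``imageio:cockatoo.mp4`` sample; subsampled
formats like ``yuv420p`` are not stridable and would raise instead)::

    import imageio.v3 as iio

    # decode as 8-bit grayscale instead of the default rgb24
    gray_frames = iio.imread("imageio:cockatoo.mp4", plugin="pyav", format="gray")

    # decode as channel-first YUV without chroma subsampling
    yuv_frames = iio.imread("imageio:cockatoo.mp4", plugin="pyav", format="yuv444p")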


Filters
-------

On top of providing basic read/write functionality, this plugin allows you to
use the full collection of `video filters available in FFMPEG
<https://ffmpeg.org/ffmpeg-filters.html#Video-Filters>`_. This means that you
can apply extensive preprocessing to your video before retrieving it as a numpy
array or apply extensive post-processing before you encode your data.

Filters come in two forms: sequences or graphs. Filter sequences are, as the
name suggests, sequences of filters that are applied one after the other. They
are specified using the ``filter_sequence`` kwarg. Filter graphs, on the other
hand, come in the form of a directed graph and are specified using the
``filter_graph`` kwarg.

.. note::
    All filters are either sequences or graphs. If all you want is to apply a
    single filter, you can do this by specifying a filter sequence with a single
    entry.

A ``filter_sequence`` is a list of filters, each defined through a 2-element
tuple of the form ``(filter_name, filter_parameters)``. The first element of the
tuple is the name of the filter. The second element contains the filter
parameters, which can be given either as a string or a dict. The string matches
the format that you would use when specifying the filter using the ffmpeg
command-line tool, and the dict has entries of the form ``parameter:value``. For
example::

    import imageio.v3 as iio

    # using a filter_parameters str
    img1 = iio.imread(
        "imageio:cockatoo.mp4",
        plugin="pyav",
        filter_sequence=[
            ("rotate", "45*PI/180")
        ]
    )

    # using a filter_parameters dict
    img2 = iio.imread(
        "imageio:cockatoo.mp4",
        plugin="pyav",
        filter_sequence=[
            ("rotate", {"angle": "45*PI/180", "fillcolor": "AliceBlue"})
        ]
    )

A ``filter_graph``, on the other hand, is specified using a ``(nodes, edges)``
tuple. It is best explained using an example::

    img = iio.imread(
        "imageio:cockatoo.mp4",
        plugin="pyav",
        filter_graph=(
            {
                "split": ("split", ""),
                "scale_overlay": ("scale", "512:-1"),
                "overlay": ("overlay", "x=25:y=25:enable='between(t,1,8)'"),
            },
            [
                ("video_in", "split", 0, 0),
                ("split", "overlay", 0, 0),
                ("split", "scale_overlay", 1, 0),
                ("scale_overlay", "overlay", 0, 1),
                ("overlay", "video_out", 0, 0),
            ]
        )
    )

The above transforms the video to have picture-in-picture of itself in the top
left corner. As you can see, nodes are specified using a dict which has names as
its keys and filter tuples as values; the same tuples as the ones used when
defining a filter sequence. Edges are a list of 4-tuples of the form
``(node_out, node_in, output_idx, input_idx)`` and specify which two filters are
connected and which inputs/outputs should be used for this.

Further, there are two special nodes in a filter graph: ``video_in`` and
``video_out``, which represent the graph's input and output respectively. These
names cannot be chosen for other nodes (those nodes would simply be
overwritten), and for a graph to be valid there must be a path from the input to
the output and all nodes in the graph must be connected.

While most graphs are quite simple, they can become very complex and we
recommend that you read through the `FFMPEG documentation
<https://ffmpeg.org/ffmpeg-filters.html#Filtergraph-description>`_ and their
examples to better understand how to use them.

"""

from fractions import Fraction
from math import ceil
from typing import Any, Dict, List, Optional, Tuple, Union, Generator

import av
import av.filter
import numpy as np
from numpy.lib.stride_tricks import as_strided

from ..core import Request
from ..core.request import URI_BYTES, InitializationError, IOMode
from ..core.v3_plugin_api import ImageProperties, PluginV3


def _format_to_dtype(format: av.VideoFormat) -> np.dtype:
    """Convert a pyAV video format into a numpy dtype"""

    if len(format.components) == 0:
        # fake format
        raise ValueError(
            f"Can't determine dtype from format `{format.name}`. It has no channels."
        )

    endian = ">" if format.is_big_endian else "<"
    dtype = "f" if "f32" in format.name else "u"
    bits_per_channel = [x.bits for x in format.components]
    n_bytes = str(int(ceil(bits_per_channel[0] / 8)))

    return np.dtype(endian + dtype + n_bytes)


def _get_frame_shape(frame: av.VideoFrame) -> Tuple[int, ...]:
    """Compute the frame's array shape

    Parameters
    ----------
    frame : av.VideoFrame
        A frame for which the resulting shape should be computed.

    Returns
    -------
    shape : Tuple[int, ...]
        A tuple describing the shape of the image data in the frame.

    """

    widths = [component.width for component in frame.format.components]
    heights = [component.height for component in frame.format.components]
    bits = np.array([component.bits for component in frame.format.components])
    line_sizes = [plane.line_size for plane in frame.planes]

    subsampled_width = widths[:-1] != widths[1:]
    subsampled_height = heights[:-1] != heights[1:]
    unaligned_components = np.any(bits % 8 != 0) or (line_sizes[:-1] != line_sizes[1:])
    if subsampled_width or subsampled_height or unaligned_components:
        raise IOError(
            f"{frame.format.name} can't be expressed as a strided array. "
            "Use `format=` to select a format to convert into."
        )

    shape = [frame.height, frame.width]

    # ffmpeg doesn't have a notion of channel-first or channel-last formats
    # instead it stores frames in one or more planes which contain individual
    # components of a pixel depending on the pixel format. For channel-first
    # formats each component lives on a separate plane (n_planes) and for
    # channel-last formats all components are packed on a single plane
    # (n_channels)
    n_planes = max([component.plane for component in frame.format.components]) + 1
    if n_planes > 1:
        shape = [n_planes] + shape

    channels_per_plane = [0] * n_planes
    for component in frame.format.components:
        channels_per_plane[component.plane] += 1
    n_channels = max(channels_per_plane)

    if n_channels > 1:
        shape = shape + [n_channels]

    return tuple(shape)


class PyAVPlugin(PluginV3):
    """Support for pyAV as backend.

    Parameters
    ----------
    request : iio.Request
        A request object that represents the user's intent. It provides a
        standard interface to access the various ImageResources and serves
        them to the plugin as a file object (or file). Check the docs for
        details.
    container : str
        Only used during `iio_mode="w"`! If not None, overwrite the default
        container format chosen by pyav.
    kwargs : Any
        Additional kwargs are forwarded to PyAV's constructor.

    """

    def __init__(self, request: Request, *, container: str = None, **kwargs) -> None:
        """Initialize a new Plugin Instance.

        See Plugin's docstring for detailed documentation.

        Notes
        -----
        The implementation here stores the request as a local variable that is
        exposed using a @property below. If you inherit from PluginV3, remember
        to call ``super().__init__(request)``.

        """

        super().__init__(request)

        self._container = None
        self._video_stream = None
        self._video_filter = None

        if request.mode.io_mode == IOMode.read:
            self._next_idx = 0
            try:
                if request._uri_type == 5:  # 5 is the value of URI_HTTP
                    # pyav should read from HTTP by itself. This enables reading
                    # HTTP-based streams like DASH. Note that solving streams
                    # like this is temporary until the new request object gets
                    # implemented.
                    self._container = av.open(request.raw_uri, **kwargs)
                else:
                    self._container = av.open(request.get_file(), **kwargs)
                self._video_stream = self._container.streams.video[0]
                self._decoder = self._container.decode(video=0)
            except av.AVError:
                if isinstance(request.raw_uri, bytes):
                    msg = "PyAV does not support these `<bytes>`"
                else:
                    msg = f"PyAV does not support `{request.raw_uri}`"
                raise InitializationError(msg) from None
        else:
            self.frames_written = 0
            file_handle = self.request.get_file()
            filename = getattr(file_handle, "name", None)
            extension = self.request.extension or self.request.format_hint
            if extension is None:
                raise InitializationError("Can't determine output container to use.")

            # hacky, but beats running our own format selection logic
            # (since av_guess_format is not exposed)
            try:
                setattr(file_handle, "name", filename or "tmp" + extension)
            except AttributeError:
                pass  # read-only, nothing we can do

            try:
                self._container = av.open(
                    file_handle, mode="w", format=container, **kwargs
                )
            except ValueError:
                raise InitializationError(
                    f"PyAV can not write to `{self.request.raw_uri}`"
                )

    # ---------------------
    # Standard V3 Interface
    # ---------------------

    def read(
        self,
        *,
        index: int = ...,
        format: str = "rgb24",
        filter_sequence: List[Tuple[str, Union[str, dict]]] = None,
        filter_graph: Tuple[dict, List] = None,
        constant_framerate: bool = None,
        thread_count: int = 0,
        thread_type: str = None,
    ) -> np.ndarray:

356 """Read frames from the video. 

357 

358 If ``index`` is an integer, this function reads the index-th frame from 

359 the file. If ``index`` is ... (Ellipsis), this function reads all frames 

360 from the video, stacks them along the first dimension, and returns a 

361 batch of frames. 

362 

363 Parameters 

364 ---------- 

365 index : int 

366 The index of the frame to read, e.g. ``index=5`` reads the 5th 

367 frame. If ``...``, read all the frames in the video and stack them 

368 along a new, prepended, batch dimension. 

369 format : str 

370 Set the returned colorspace. If not None (default: rgb24), convert 

371 the data into the given format before returning it. If ``None`` 

372 return the data in the encoded format if it can be expressed as a 

373 strided array; otherwise raise an Exception. 

374 filter_sequence : List[str, str, dict] 

375 If not None, apply the given sequence of FFmpeg filters to each 

376 ndimage. Check the (module-level) plugin docs for details and 

377 examples. 

378 filter_graph : (dict, List) 

379 If not None, apply the given graph of FFmpeg filters to each 

380 ndimage. The graph is given as a tuple of two dicts. The first dict 

381 contains a (named) set of nodes, and the second dict contains a set 

382 of edges between nodes of the previous dict. Check the (module-level) 

383 plugin docs for details and examples. 

384 constant_framerate : bool 

385 If True assume the video's framerate is constant. This allows for 

386 faster seeking inside the file. If False, the video is reset before 

387 each read and searched from the beginning. If None (default), this 

388 value will be read from the container format. 

389 thread_count : int 

390 How many threads to use when decoding a frame. The default is 0, 

391 which will set the number using ffmpeg's default, which is based on 

392 the codec, number of available cores, threadding model, and other 

393 considerations. 

394 thread_type : str 

395 The threading model to be used. One of 

396 

397 - `"SLICE"`: threads assemble parts of the current frame 

398 - `"FRAME"`: threads may assemble future frames 

399 - None (default): Uses ``"FRAME"`` if ``index=...`` and ffmpeg's 

400 default otherwise. 

401 

402 

403 Returns 

404 ------- 

405 frame : np.ndarray 

406 A numpy array containing loaded frame data. 

407 

408 Notes 

409 ----- 

410 Accessing random frames repeatedly is costly (O(k), where k is the 

411 average distance between two keyframes). You should do so only sparingly 

412 if possible. In some cases, it can be faster to bulk-read the video (if 

413 it fits into memory) and to then access the returned ndarray randomly. 

414 

415 The current implementation may cause problems for b-frames, i.e., 

416 bidirectionaly predicted pictures. I lack test videos to write unit 

417 tests for this case. 

418 

419 Reading from an index other than ``...``, i.e. reading a single frame, 

420 currently doesn't support filters that introduce delays. 

421 

422 """ 

423 

424 if index is ...: 

425 props = self.properties(format=format) 

426 uses_filter = ( 

427 self._video_filter is not None 

428 or filter_graph is not None 

429 or filter_sequence is not None 

430 ) 

431 

432 self._container.seek(0) 

433 if not uses_filter and props.shape[0] != 0: 

434 frames = np.empty(props.shape, dtype=props.dtype) 

435 for idx, frame in enumerate( 

436 self.iter( 

437 format=format, 

438 filter_sequence=filter_sequence, 

439 filter_graph=filter_graph, 

440 thread_count=thread_count, 

441 thread_type=thread_type or "FRAME", 

442 ) 

443 ): 

444 frames[idx] = frame 

445 else: 

446 frames = np.stack( 

447 [ 

448 x 

449 for x in self.iter( 

450 format=format, 

451 filter_sequence=filter_sequence, 

452 filter_graph=filter_graph, 

453 thread_count=thread_count, 

454 thread_type=thread_type or "FRAME", 

455 ) 

456 ] 

457 ) 

458 

459 # reset stream container, because threading model can't change after 

460 # first access 

461 self._video_stream.close() 

462 self._video_stream = self._container.streams.video[0] 

463 

464 return frames 

465 

466 if thread_type is not None and thread_type != self._video_stream.thread_type: 

467 self._video_stream.thread_type = thread_type 

468 if ( 

469 thread_count != 0 

470 and thread_count != self._video_stream.codec_context.thread_count 

471 ): 

472 # in FFMPEG thread_count == 0 means use the default count, which we 

473 # change to mean don't change the thread count. 

474 self._video_stream.codec_context.thread_count = thread_count 

475 

476 if constant_framerate is None: 

477 constant_framerate = not self._container.format.variable_fps 

478 

479 # note: cheap for contigous incremental reads 

480 self._seek(index, constant_framerate=constant_framerate) 

481 desired_frame = next(self._decoder) 

482 self._next_idx += 1 

483 

484 self.set_video_filter(filter_sequence, filter_graph) 

485 if self._video_filter is not None: 

486 desired_frame = self._video_filter.send(desired_frame) 

487 

488 return self._unpack_frame(desired_frame, format=format) 

    def iter(
        self,
        *,
        format: str = "rgb24",
        filter_sequence: List[Tuple[str, Union[str, dict]]] = None,
        filter_graph: Tuple[dict, List] = None,
        thread_count: int = 0,
        thread_type: str = None,
    ) -> Generator[np.ndarray, None, None]:
        """Yield frames from the video.

        Parameters
        ----------
        format : str
            Convert the data into the given format before returning it. If None,
            return the data in the encoded format if it can be expressed as a
            strided array; otherwise raise an Exception.
        filter_sequence : List[Tuple[str, Union[str, dict]]]
            If not None, apply the given sequence of FFmpeg filters to each
            ndimage. Check the (module-level) plugin docs for details and
            examples.
        filter_graph : (dict, List)
            If not None, apply the given graph of FFmpeg filters to each
            ndimage. The graph is given as a tuple of a dict and a list. The
            dict contains a (named) set of nodes, and the list contains the
            edges between those nodes. Check the (module-level) plugin docs for
            details and examples.
        thread_count : int
            How many threads to use when decoding a frame. The default is 0,
            which will set the number using ffmpeg's default, which is based on
            the codec, number of available cores, threading model, and other
            considerations.
        thread_type : str
            The threading model to be used. One of

            - `"SLICE"` (default): threads assemble parts of the current frame
            - `"FRAME"`: threads may assemble future frames (faster for bulk reading)

        Yields
        ------
        frame : np.ndarray
            A (decoded) video frame.
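
        Examples
        --------
        A small sketch of streaming frames one at a time (again using the
        bundled ``imageio:cockatoo.mp4`` sample)::

            import imageio.v3 as iio

            with iio.imopen("imageio:cockatoo.mp4", "r", plugin="pyav") as file:
                for frame in file.iter(format="gray"):
                    print(frame.shape)  # one 2D array per frame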


        """

        self._video_stream.thread_type = thread_type or "SLICE"
        self._video_stream.codec_context.thread_count = thread_count

        self.set_video_filter(filter_sequence, filter_graph)

        for frame in self._decoder:
            self._next_idx += 1

            if self._video_filter is not None:
                try:
                    frame = self._video_filter.send(frame)
                except StopIteration:
                    break

            if frame is None:
                continue

            yield self._unpack_frame(frame, format=format)

        if self._video_filter is not None:
            for frame in self._video_filter:
                yield self._unpack_frame(frame, format=format)

    def write(
        self,
        ndimage: Union[np.ndarray, List[np.ndarray]],
        *,
        codec: str = None,
        is_batch: bool = True,
        fps: int = 24,
        in_pixel_format: str = "rgb24",
        out_pixel_format: str = None,
        filter_sequence: List[Tuple[str, Union[str, dict]]] = None,
        filter_graph: Tuple[dict, List] = None,
    ) -> Optional[bytes]:

575 """Save a ndimage as a video. 

576 

577 Given a batch of frames (stacked along the first axis) or a list of 

578 frames, encode them and add the result to the ImageResource. 

579 

580 Parameters 

581 ---------- 

582 ndimage : ArrayLike, List[ArrayLike] 

583 The ndimage to encode and write to the ImageResource. 

584 codec : str 

585 The codec to use when encoding frames. Only needed on first write 

586 and ignored on subsequent writes. 

587 is_batch : bool 

588 If True (default), the ndimage is a batch of images, otherwise it is 

589 a single image. This parameter has no effect on lists of ndimages. 

590 fps : str 

591 The resulting videos frames per second. 

592 in_pixel_format : str 

593 The pixel format of the incoming ndarray. Defaults to "rgb24" and can 

594 be any stridable pix_fmt supported by FFmpeg. 

595 out_pixel_format : str 

596 The pixel format to use while encoding frames. If None (default) 

597 use the codec's default. 

598 filter_sequence : List[str, str, dict] 

599 If not None, apply the given sequence of FFmpeg filters to each 

600 ndimage. Check the (module-level) plugin docs for details and 

601 examples. 

602 filter_graph : (dict, List) 

603 If not None, apply the given graph of FFmpeg filters to each 

604 ndimage. The graph is given as a tuple of two dicts. The first dict 

605 contains a (named) set of nodes, and the second dict contains a set 

606 of edges between nodes of the previous dict. Check the (module-level) 

607 plugin docs for details and examples. 

608 

609 Returns 

610 ------- 

611 encoded_image : bytes or None 

612 If the chosen ImageResource is the special target ``"<bytes>"`` then 

613 write will return a byte string containing the encoded image data. 

614 Otherwise, it returns None. 

615 

616 Notes 

617 ----- 

618 When writing ``<bytes>``, the video is finalized immediately after the 

619 first write call and calling write multiple times to append frames is 

620 not possible. 
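
        Examples
        --------
        A minimal sketch that encodes random frames into an MP4 (``libx264``
        availability depends on your FFMPEG build)::

            import imageio.v3 as iio
            import numpy as np

            frames = np.random.randint(0, 255, (10, 64, 64, 3), dtype=np.uint8)
            iio.imwrite("test.mp4", frames, plugin="pyav", codec="libx264")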


        """

        if isinstance(ndimage, list):
            # frame shapes must agree for video
            if any(f.shape != ndimage[0].shape for f in ndimage):
                raise ValueError("All frames should have the same shape")
        elif not is_batch:
            ndimage = np.asarray(ndimage)[None, ...]
        else:
            ndimage = np.asarray(ndimage)

        if self._video_stream is None:
            self.init_video_stream(codec, fps=fps, pixel_format=out_pixel_format)

        self.set_video_filter(filter_sequence, filter_graph)

        for img in ndimage:
            self.write_frame(img, pixel_format=in_pixel_format)

        if self.request._uri_type == URI_BYTES:
            # bytes are immutable, so we have to flush immediately
            # and can't support appending
            self._flush_writer()
            self._container.close()

            return self.request.get_file().getvalue()

    def properties(self, index: int = ..., *, format: str = "rgb24") -> ImageProperties:
        """Standardized ndimage metadata.

        Parameters
        ----------
        index : int
            The index of the ndimage for which to return properties. If ``...``
            (Ellipsis, default), return the properties for the resulting batch
            of frames.
        format : str
            If not None (default: rgb24), convert the data into the given format
            before returning it. If None return the data in the encoded format
            if that can be expressed as a strided array; otherwise raise an
            Exception.

        Returns
        -------
        properties : ImageProperties
            A dataclass filled with standardized image metadata.

        Notes
        -----
        This function is efficient and won't process any pixel data.

        The provided metadata does not include modifications by any filters
        (through ``filter_sequence`` or ``filter_graph``).
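
        Examples
        --------
        A sketch using the convenience API::

            import imageio.v3 as iio

            props = iio.improps("imageio:cockatoo.mp4", plugin="pyav")
            print(props.shape, props.dtype)  # e.g. (n, h, w, 3) uint8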


        """

        video_width = self._video_stream.codec_context.width
        video_height = self._video_stream.codec_context.height
        pix_format = format or self._video_stream.codec_context.pix_fmt
        frame_template = av.VideoFrame(video_width, video_height, pix_format)

        shape = _get_frame_shape(frame_template)
        if index is ...:
            n_frames = self._video_stream.frames
            shape = (n_frames,) + shape

        return ImageProperties(
            shape=tuple(shape),
            dtype=_format_to_dtype(frame_template.format),
            n_images=shape[0] if index is ... else None,
            is_batch=index is ...,
        )

    def metadata(
        self,
        index: int = ...,
        exclude_applied: bool = True,
        constant_framerate: bool = None,
    ) -> Dict[str, Any]:
        """Format-specific metadata.

        Returns a dictionary filled with metadata that is either stored in the
        container, the video stream, or the frame's side-data.

        Parameters
        ----------
        index : int
            If ... (Ellipsis, default) return global metadata (the metadata
            stored in the container and video stream). If not ..., return the
            side data stored in the frame at the given index.
        exclude_applied : bool
            Currently, this parameter has no effect. It exists for compliance
            with the ImageIO v3 API.
        constant_framerate : bool
            If True assume the video's framerate is constant. This allows for
            faster seeking inside the file. If False, the video is reset before
            each read and searched from the beginning. If None (default), this
            value will be read from the container format.

        Returns
        -------
        metadata : dict
            A dictionary filled with format-specific metadata fields and their
            values.
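
        Examples
        --------
        For instance, via the convenience API (a sketch; the available fields
        depend on the container and codec)::

            import imageio.v3 as iio

            meta = iio.immeta("imageio:cockatoo.mp4", plugin="pyav")
            print(meta["codec"], meta["fps"])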


        """

        metadata = dict()

        if index is ...:
            # useful flags defined on the container and/or video stream
            metadata.update(
                {
                    "video_format": self._video_stream.codec_context.pix_fmt,
                    "codec": self._video_stream.codec.name,
                    "long_codec": self._video_stream.codec.long_name,
                    "profile": self._video_stream.profile,
                    "fps": float(self._video_stream.guessed_rate),
                }
            )
            if self._video_stream.duration is not None:
                duration = float(
                    self._video_stream.duration * self._video_stream.time_base
                )
                metadata.update({"duration": duration})

            metadata.update(self.container_metadata)
            metadata.update(self.video_stream_metadata)
            return metadata

        if constant_framerate is None:
            constant_framerate = not self._container.format.variable_fps

        self._seek(index, constant_framerate=constant_framerate)
        desired_frame = next(self._decoder)
        self._next_idx += 1

        # useful flags defined on the frame
        metadata.update(
            {
                "key_frame": bool(desired_frame.key_frame),
                "time": desired_frame.time,
                "interlaced_frame": bool(desired_frame.interlaced_frame),
                "frame_type": desired_frame.pict_type.name,
            }
        )

        # side data
        metadata.update(
            {item.type.name: item.to_bytes() for item in desired_frame.side_data}
        )

        return metadata

    def close(self) -> None:
        """Close the Video."""

        is_write = self.request.mode.io_mode == IOMode.write
        if is_write and self._video_stream is not None:
            self._flush_writer()

        if self._video_stream is not None:
            try:
                self._video_stream.close()
            except ValueError:
                pass  # stream already closed

        if self._container is not None:
            self._container.close()

        self.request.finish()

    def __enter__(self) -> "PyAVPlugin":
        return super().__enter__()

    # ------------------------------
    # Add-on Interface inside imopen
    # ------------------------------

    def init_video_stream(
        self,
        codec: str,
        *,
        fps: float = 24,
        pixel_format: str = None,
        max_keyframe_interval: int = None,
        force_keyframes: bool = None,
    ) -> None:

810 """Initialize a new video stream. 

811 

812 This function adds a new video stream to the ImageResource using the 

813 selected encoder (codec), framerate, and colorspace. 

814 

815 Parameters 

816 ---------- 

817 codec : str 

818 The codec to use, e.g. ``"libx264"`` or ``"vp9"``. 

819 fps : float 

820 The desired framerate of the video stream (frames per second). 

821 pixel_format : str 

822 The pixel format to use while encoding frames. If None (default) use 

823 the codec's default. 

824 max_keyframe_interval : int 

825 The maximum distance between two intra frames (I-frames). Also known 

826 as GOP size. If unspecified use the codec's default. Note that not 

827 every I-frame is a keyframe; see the notes for details. 

828 force_keyframes : bool 

829 If True, limit inter frames dependency to frames within the current 

830 keyframe interval (GOP), i.e., force every I-frame to be a keyframe. 

831 If unspecified, use the codec's default. 

832 

833 Notes 

834 ----- 

835 You can usually leave ``max_keyframe_interval`` and ``force_keyframes`` 

836 at their default values, unless you try to generate seek-optimized video 

837 or have a similar specialist use-case. In this case, ``force_keyframes`` 

838 controls the ability to seek to _every_ I-frame, and 

839 ``max_keyframe_interval`` controls how close to a random frame you can 

840 seek. Low values allow more fine-grained seek at the expense of 

841 file-size (and thus I/O performance). 
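
        Examples
        --------
        A short sketch of a seek-friendly stream (the parameter values are
        illustrative only)::

            import imageio.v3 as iio

            with iio.imopen("test.mp4", "w", plugin="pyav") as file:
                file.init_video_stream(
                    "libx264",
                    fps=30,
                    max_keyframe_interval=15,  # an I-frame at least every 15 frames
                    force_keyframes=True,      # make every I-frame seekable
                )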


        """

        stream = self._container.add_stream(codec, fps)
        stream.time_base = Fraction(1 / fps).limit_denominator(int(2**16 - 1))
        if pixel_format is not None:
            stream.pix_fmt = pixel_format
        if max_keyframe_interval is not None:
            stream.gop_size = max_keyframe_interval
        if force_keyframes is not None:
            stream.closed_gop = force_keyframes

        self._video_stream = stream

    def write_frame(self, frame: np.ndarray, *, pixel_format: str = "rgb24") -> None:
        """Add a frame to the video stream.

        This function appends a new frame to the video. It assumes that the
        stream has previously been initialized, i.e., ``init_video_stream`` has
        to be called before calling this function for the write to succeed.

        Parameters
        ----------
        frame : np.ndarray
            The image to be appended/written to the video stream.
        pixel_format : str
            The colorspace (pixel format) of the incoming frame.

        Notes
        -----
        Frames may be held in a buffer, e.g., by the filter pipeline used during
        writing or by FFMPEG to batch them prior to encoding. Make sure to
        ``.close()`` the plugin or to use a context manager to ensure that all
        frames are written to the ImageResource.
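
        Examples
        --------
        A sketch of writing planar YUV input inside an existing ``imopen``
        context whose video stream has been initialized (``file`` is assumed);
        for ``yuv444p`` the expected shape is ``(3, height, width)``::

            import numpy as np

            yuv = np.zeros((3, 64, 64), dtype=np.uint8)  # one constant-color frame
            file.write_frame(yuv, pixel_format="yuv444p")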


        """

        # manual packing of ndarray into frame
        # (this should live in pyAV, but it doesn't support all the formats we
        # want and PRs there are slow)
        pixel_format = av.VideoFormat(pixel_format)
        img_dtype = _format_to_dtype(pixel_format)
        width = frame.shape[2 if pixel_format.is_planar else 1]
        height = frame.shape[1 if pixel_format.is_planar else 0]
        av_frame = av.VideoFrame(width, height, pixel_format.name)
        if pixel_format.is_planar:
            for idx, plane in enumerate(av_frame.planes):
                plane_array = np.frombuffer(plane, dtype=img_dtype)
                plane_array = as_strided(
                    plane_array,
                    shape=(plane.height, plane.width),
                    strides=(plane.line_size, img_dtype.itemsize),
                )
                plane_array[...] = frame[idx]
        else:
            if pixel_format.name.startswith("bayer_"):
                # ffmpeg doesn't describe bayer formats correctly
                # see https://github.com/imageio/imageio/issues/761#issuecomment-1059318851
                # and following for details.
                n_channels = 1
            else:
                n_channels = len(pixel_format.components)

            plane = av_frame.planes[0]
            plane_shape = (plane.height, plane.width)
            plane_strides = (plane.line_size, n_channels * img_dtype.itemsize)
            if n_channels > 1:
                plane_shape += (n_channels,)
                plane_strides += (img_dtype.itemsize,)

            plane_array = as_strided(
                np.frombuffer(plane, dtype=img_dtype),
                shape=plane_shape,
                strides=plane_strides,
            )
            plane_array[...] = frame

        stream = self._video_stream
        av_frame.time_base = stream.codec_context.time_base
        av_frame.pts = self.frames_written
        self.frames_written += 1

        if self._video_filter is not None:
            av_frame = self._video_filter.send(av_frame)
            if av_frame is None:
                return

        if stream.frames == 0:
            stream.width = av_frame.width
            stream.height = av_frame.height

        for packet in stream.encode(av_frame):
            self._container.mux(packet)

    def set_video_filter(
        self,
        filter_sequence: List[Tuple[str, Union[str, dict]]] = None,
        filter_graph: Tuple[dict, List] = None,
    ) -> None:
        """Set the filter(s) to use.

        This function creates a new FFMPEG filter graph to use when reading or
        writing video. In the case of reading, frames are passed through the
        filter graph before being returned and, in the case of writing, frames
        are passed through the filter before being written to the video.

        Parameters
        ----------
        filter_sequence : List[Tuple[str, Union[str, dict]]]
            If not None, apply the given sequence of FFmpeg filters to each
            ndimage. Check the (module-level) plugin docs for details and
            examples.
        filter_graph : (dict, List)
            If not None, apply the given graph of FFmpeg filters to each
            ndimage. The graph is given as a tuple of a dict and a list. The
            dict contains a (named) set of nodes, and the list contains the
            edges between those nodes. Check the (module-level) plugin docs for
            details and examples.

        Notes
        -----
        Changing a filter graph with lag during reading or writing will
        currently cause frames in the filter queue to be lost.
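
        Examples
        --------
        A sketch of attaching a downscaling filter while writing (``frames`` is
        assumed to be an iterable of ndimages)::

            import imageio.v3 as iio

            with iio.imopen("test.mp4", "w", plugin="pyav") as file:
                file.init_video_stream("libx264")
                file.set_video_filter(filter_sequence=[("scale", "320:-1")])
                for frame in frames:
                    file.write_frame(frame)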


        """

        if filter_sequence is None and filter_graph is None:
            self._video_filter = None
            return

        if filter_sequence is None:
            filter_sequence = list()

        node_descriptors: Dict[str, Tuple[str, Union[str, Dict]]]
        edges: List[Tuple[str, str, int, int]]
        if filter_graph is None:
            node_descriptors, edges = dict(), [("video_in", "video_out", 0, 0)]
        else:
            node_descriptors, edges = filter_graph

        graph = av.filter.Graph()

        previous_node = graph.add_buffer(template=self._video_stream)
        for filter_name, argument in filter_sequence:
            if isinstance(argument, str):
                current_node = graph.add(filter_name, argument)
            else:
                current_node = graph.add(filter_name, **argument)
            previous_node.link_to(current_node)
            previous_node = current_node

        nodes = dict()
        nodes["video_in"] = previous_node
        nodes["video_out"] = graph.add("buffersink")
        for name, (filter_name, arguments) in node_descriptors.items():
            if isinstance(arguments, str):
                nodes[name] = graph.add(filter_name, arguments)
            else:
                nodes[name] = graph.add(filter_name, **arguments)

        for from_node, to_node, out_idx, in_idx in edges:
            nodes[from_node].link_to(nodes[to_node], out_idx, in_idx)

        graph.configure()

        def video_filter():
            # this starts a co-routine;
            # send frames into it using the generator's ``send()`` method
            frame = yield None

            # send and receive frames in "parallel"
            while frame is not None:
                graph.push(frame)
                try:
                    frame = yield graph.pull()
                except av.error.BlockingIOError:
                    # filter has lag and needs more frames
                    frame = yield None
                except av.error.EOFError:
                    break

            try:
                # send EOF in av>=9.0
                graph.push(None)
            except ValueError:  # pragma: no cover
                # handle av<9.0
                pass

            # all frames have been sent, empty the filter
            while True:
                try:
                    yield graph.pull()
                except av.error.EOFError:
                    break  # EOF
                except av.error.BlockingIOError:  # pragma: no cover
                    # handle av<9.0
                    break

        self._video_filter = video_filter()
        self._video_filter.send(None)

    @property
    def container_metadata(self):
        """Container-specific metadata.

        A dictionary containing metadata stored at the container level.

        """
        return self._container.metadata

    @property
    def video_stream_metadata(self):
        """Stream-specific metadata.

        A dictionary containing metadata stored at the stream level.

        """
        return self._video_stream.metadata

    # -------------------------------
    # Internals and private functions
    # -------------------------------

    def _unpack_frame(self, frame: av.VideoFrame, *, format: str = None) -> np.ndarray:
        """Convert an av.VideoFrame into an ndarray

        Parameters
        ----------
        frame : av.VideoFrame
            The frame to unpack.
        format : str
            If not None, convert the frame to the given format before unpacking.

        """

        if format is not None:
            frame = frame.reformat(format=format)

        dtype = _format_to_dtype(frame.format)
        shape = _get_frame_shape(frame)

        planes = list()
        for idx in range(len(frame.planes)):
            n_channels = sum(
                [
                    x.bits // (dtype.itemsize * 8)
                    for x in frame.format.components
                    if x.plane == idx
                ]
            )
            av_plane = frame.planes[idx]
            plane_shape = (av_plane.height, av_plane.width)
            plane_strides = (av_plane.line_size, n_channels * dtype.itemsize)
            if n_channels > 1:
                plane_shape += (n_channels,)
                plane_strides += (dtype.itemsize,)

            np_plane = as_strided(
                np.frombuffer(av_plane, dtype=dtype),
                shape=plane_shape,
                strides=plane_strides,
            )
            planes.append(np_plane)

        if len(planes) > 1:
            # Note: the planes *should* exist inside a contiguous memory block
            # somewhere inside av.Frame; however, pyAV does not appear to expose
            # this, so we are forced to copy the planes individually instead of
            # wrapping them :(
            out = np.concatenate(planes).reshape(shape)
        else:
            out = planes[0]

        return out

    def _seek(self, index, *, constant_framerate: bool = True) -> None:
        """Seek to the frame at the given index."""

        if index == self._next_idx:
            return  # fast path :)

        # we must decode at least once before we seek; otherwise the
        # returned frames become corrupt.
        if self._next_idx == 0:
            next(self._decoder)
            self._next_idx += 1

        if index == self._next_idx:
            return  # fast path :)

        # remove this branch until I find a way to efficiently find the next
        # keyframe. keeping this as a reminder
        # if self._next_idx < index and index < self._next_keyframe_idx:
        #     frames_to_yield = index - self._next_idx
        if not constant_framerate and index > self._next_idx:
            frames_to_yield = index - self._next_idx
        elif not constant_framerate:
            # seek backwards and can't link idx and pts
            self._container.seek(0)
            self._decoder = self._container.decode(video=0)
            self._next_idx = 0

            frames_to_yield = index
        else:
            # we know that the time between consecutive frames is constant
            # hence we can link index and pts

            # how many pts lie between two frames
            sec_delta = 1 / self._video_stream.guessed_rate
            pts_delta = sec_delta / self._video_stream.time_base

            index_pts = int(index * pts_delta)

            # this only seeks to the closest (preceding) keyframe
            self._container.seek(index_pts, stream=self._video_stream)
            self._decoder = self._container.decode(video=0)

            # this may be made faster if we could get the keyframe's time without
            # decoding it
            keyframe = next(self._decoder)
            keyframe_time = keyframe.pts * keyframe.time_base
            keyframe_pts = int(keyframe_time / self._video_stream.time_base)
            keyframe_index = keyframe_pts // pts_delta

            self._container.seek(index_pts, stream=self._video_stream)
            self._next_idx = keyframe_index

            frames_to_yield = index - keyframe_index

        for _ in range(frames_to_yield):
            next(self._decoder)
            self._next_idx += 1

    def _flush_writer(self):
        """Flush the filter and encoder

        This will reset the filter to `None` and send EOF to the encoder,
        i.e., after calling, no more frames may be written.

        """

        stream = self._video_stream

        if self._video_filter is not None:
            # drain the filter pipeline
            for av_frame in self._video_filter:
                if stream.frames == 0:
                    stream.width = av_frame.width
                    stream.height = av_frame.height
                for packet in stream.encode(av_frame):
                    self._container.mux(packet)
            self._video_filter = None

        # flush stream
        for packet in stream.encode():
            self._container.mux(packet)
        self._video_stream = None