1"""Read/Write Videos (and images) using PyAV.
2
3.. note::
4 To use this plugin you need to have `PyAV <https://pyav.org/docs/stable/>`_
5 installed::
6
7 pip install av
8
9This plugin wraps pyAV, a pythonic binding for the FFMPEG library. It is similar
10to our FFMPEG plugin, has improved performance, features a robust interface, and
11aims to supersede the FFMPEG plugin in the future.


Methods
-------
.. note::
    Check the respective function for a list of supported kwargs and detailed
    documentation.

.. autosummary::
    :toctree:

    PyAVPlugin.read
    PyAVPlugin.iter
    PyAVPlugin.write
    PyAVPlugin.properties
    PyAVPlugin.metadata

Additional methods available inside the :func:`imopen <imageio.v3.imopen>`
context:

.. autosummary::
    :toctree:

    PyAVPlugin.init_video_stream
    PyAVPlugin.write_frame
    PyAVPlugin.set_video_filter
    PyAVPlugin.container_metadata
    PyAVPlugin.video_stream_metadata

Advanced API
------------

In addition to the default ImageIO v3 API this plugin exposes custom functions
that are specific to reading/writing video and its metadata. These are available
inside the :func:`imopen <imageio.v3.imopen>` context and allow fine-grained
control over how the video is processed. The functions are documented above; a
usage example follows below::

    import imageio.v3 as iio

    with iio.imopen("test.mp4", "w", plugin="pyav") as file:
        file.init_video_stream("libx264")
        file.container_metadata["comment"] = "This video was created using ImageIO."

        for _ in range(5):
            for frame in iio.imiter("imageio:newtonscradle.gif"):
                file.write_frame(frame)

    meta = iio.immeta("test.mp4", plugin="pyav")
    assert meta["comment"] == "This video was created using ImageIO."


Pixel Formats (Colorspaces)
---------------------------

By default, this plugin converts the video into 8-bit RGB (called ``rgb24`` in
ffmpeg). This is a useful behavior for many use-cases, but sometimes you may
want to use the video's native colorspace or you may wish to convert the video
into an entirely different colorspace. This is controlled using the ``format``
kwarg. You can use ``format=None`` to leave the image in its native colorspace
or specify any colorspace supported by FFMPEG as long as it is stridable, i.e.,
as long as it can be represented by a single numpy array. Some useful choices
include:

- rgb24 (default; 8-bit RGB)
- rgb48le (16-bit little-endian RGB)
- bgr24 (8-bit BGR; OpenCV's default colorspace)
- gray (8-bit grayscale)
- yuv444p (8-bit channel-first YUV)

Further, FFMPEG maintains a list of available formats, albeit not as part of the
narrative docs. It can be `found here
<https://ffmpeg.org/doxygen/trunk/pixfmt_8h_source.html>`_ (warning: C source
code).
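
For example, to read a video in 8-bit grayscale instead of the default
``rgb24``::

    import imageio.v3 as iio

    gray_frames = iio.imread("imageio:cockatoo.mp4", plugin="pyav", format="gray")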

Filters
-------

On top of providing basic read/write functionality, this plugin allows you to
use the full collection of `video filters available in FFMPEG
<https://ffmpeg.org/ffmpeg-filters.html#Video-Filters>`_. This means that you
can apply extensive preprocessing to your video before retrieving it as a numpy
array, or apply extensive post-processing before you encode your data.

Filters come in two forms: sequences or graphs. Filter sequences are, as the
name suggests, sequences of filters that are applied one after the other. They
are specified using the ``filter_sequence`` kwarg. Filter graphs, on the other
hand, come in the form of a directed graph and are specified using the
``filter_graph`` kwarg.

.. note::
    All filters are either sequences or graphs. If all you want is to apply a
    single filter, you can do this by specifying a filter sequence with a single
    entry.

A ``filter_sequence`` is a list of filters, each defined through a 2-element
tuple of the form ``(filter_name, filter_parameters)``. The first element of the
tuple is the name of the filter. The second element is the filter's parameters,
which can be given either as a string or a dict. The string matches the format
that you would use when specifying the filter via the ffmpeg command-line tool,
and the dict has entries of the form ``parameter: value``. For example::

    import imageio.v3 as iio

    # using a filter_parameters str
    img1 = iio.imread(
        "imageio:cockatoo.mp4",
        plugin="pyav",
        filter_sequence=[
            ("rotate", "45*PI/180")
        ]
    )

    # using a filter_parameters dict
    img2 = iio.imread(
        "imageio:cockatoo.mp4",
        plugin="pyav",
        filter_sequence=[
            ("rotate", {"angle": "45*PI/180", "fillcolor": "AliceBlue"})
        ]
    )

A ``filter_graph``, on the other hand, is specified using a ``(nodes, edges)``
tuple. It is best explained using an example::

    img = iio.imread(
        "imageio:cockatoo.mp4",
        plugin="pyav",
        filter_graph=(
            {
                "split": ("split", ""),
                "scale_overlay": ("scale", "512:-1"),
                "overlay": ("overlay", "x=25:y=25:enable='between(t,1,8)'"),
            },
            [
                ("video_in", "split", 0, 0),
                ("split", "overlay", 0, 0),
                ("split", "scale_overlay", 1, 0),
                ("scale_overlay", "overlay", 0, 1),
                ("overlay", "video_out", 0, 0),
            ]
        )
    )

The above transforms the video to have a picture-in-picture of itself in the
top left corner. As you can see, nodes are specified using a dict which has
names as its keys and filter tuples as values; these are the same tuples as the
ones used when defining a filter sequence. Edges are a list of 4-tuples of the
form ``(node_out, node_in, output_idx, input_idx)`` and specify which two
filters are connected and which inputs/outputs should be used for this.

Further, there are two special nodes in a filter graph: ``video_in`` and
``video_out``, which represent the graph's input and output respectively. These
names cannot be used for other nodes (such nodes would simply be overwritten).
For a graph to be valid, there must be a path from the input to the output, and
all nodes in the graph must be connected.
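
The simplest valid graph passes the input straight to the output. Incidentally,
this is also the graph the plugin constructs internally when only a
``filter_sequence`` is given::

    img = iio.imread(
        "imageio:cockatoo.mp4",
        plugin="pyav",
        filter_graph=({}, [("video_in", "video_out", 0, 0)])
    )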

While most graphs are quite simple, they can become very complex, and we
recommend that you read through the `FFMPEG documentation
<https://ffmpeg.org/ffmpeg-filters.html#Filtergraph-description>`_ and its
examples to better understand how to use them.

"""

from fractions import Fraction
from math import ceil
from typing import Any, Dict, List, Optional, Tuple, Union

import av
import av.filter
import numpy as np
from numpy.lib.stride_tricks import as_strided

from ..core import Request
from ..core.request import URI_BYTES, InitializationError, IOMode
from ..core.v3_plugin_api import ImageProperties, PluginV3


def _format_to_dtype(format: av.VideoFormat) -> np.dtype:
    """Convert a pyAV video format into a numpy dtype"""

    if len(format.components) == 0:
        # fake format
        raise ValueError(
            f"Can't determine dtype from format `{format.name}`. It has no channels."
        )

    endian = ">" if format.is_big_endian else "<"
    dtype = "f" if "f32" in format.name else "u"
    bits_per_channel = [x.bits for x in format.components]
    n_bytes = str(int(ceil(bits_per_channel[0] / 8)))

    return np.dtype(endian + dtype + n_bytes)


def _get_frame_shape(frame: av.VideoFrame) -> Tuple[int, ...]:
    """Compute the frame's array shape

    Parameters
    ----------
    frame : av.VideoFrame
        A frame for which the resulting shape should be computed.

    Returns
    -------
    shape : Tuple[int, ...]
        A tuple describing the shape of the image data in the frame.

    """

    widths = [component.width for component in frame.format.components]
    heights = [component.height for component in frame.format.components]
    bits = np.array([component.bits for component in frame.format.components])
    line_sizes = [plane.line_size for plane in frame.planes]

    subsampled_width = widths[:-1] != widths[1:]
    subsampled_height = heights[:-1] != heights[1:]
    unaligned_components = np.any(bits % 8 != 0) or (line_sizes[:-1] != line_sizes[1:])
    if subsampled_width or subsampled_height or unaligned_components:
        raise IOError(
            f"{frame.format.name} can't be expressed as a strided array. "
            "Use `format=` to select a format to convert into."
        )

    shape = [frame.height, frame.width]

    # ffmpeg doesn't have a notion of channel-first or channel-last formats
    # instead it stores frames in one or more planes which contain individual
    # components of a pixel depending on the pixel format. For channel-first
    # formats each component lives on a separate plane (n_planes) and for
    # channel-last formats all components are packed on a single plane
    # (n_channels)
    n_planes = max([component.plane for component in frame.format.components]) + 1
    if n_planes > 1:
        shape = [n_planes] + shape

    channels_per_plane = [0] * n_planes
    for component in frame.format.components:
        channels_per_plane[component.plane] += 1
    n_channels = max(channels_per_plane)

    if n_channels > 1:
        shape = shape + [n_channels]

    return tuple(shape)


class PyAVPlugin(PluginV3):
    """Support for pyAV as backend.

    Parameters
    ----------
    request : iio.Request
        A request object that represents the user's intent. It provides a
        standard interface to access the various ImageResources and serves
        them to the plugin as a file object (or file). Check the docs for
        details.
    container : str
        Only used during ``iio_mode="w"``! If not None, overwrite the default
        container format chosen by pyav.
    kwargs : Any
        Additional kwargs are forwarded to PyAV's constructor.

    """

    def __init__(self, request: Request, *, container: str = None, **kwargs) -> None:
        """Initialize a new Plugin Instance.

        See Plugin's docstring for detailed documentation.

        Notes
        -----
        The implementation here stores the request as a local variable that is
        exposed using a @property below. If you inherit from PluginV3, remember
        to call ``super().__init__(request)``.

        """

        super().__init__(request)

        self._container = None
        self._video_stream = None
        self._video_filter = None

        if request.mode.io_mode == IOMode.read:
            self._next_idx = 0
            try:
                if request._uri_type == 5:  # 5 is the value of URI_HTTP
                    # pyav should read from HTTP by itself. This enables reading
                    # HTTP-based streams like DASH. Note that solving streams
                    # like this is temporary until the new request object gets
                    # implemented.
                    self._container = av.open(request.raw_uri, **kwargs)
                else:
                    self._container = av.open(request.get_file(), **kwargs)
                self._video_stream = self._container.streams.video[0]
                self._decoder = self._container.decode(video=0)
            except av.AVError:
                if isinstance(request.raw_uri, bytes):
                    msg = "PyAV does not support these `<bytes>`"
                else:
                    msg = f"PyAV does not support `{request.raw_uri}`"
                raise InitializationError(msg) from None
        else:
            self.frames_written = 0
            file_handle = self.request.get_file()
            filename = getattr(file_handle, "name", None)
            extension = self.request.extension or self.request.format_hint
            if extension is None:
                raise InitializationError("Can't determine output container to use.")

            # hacky, but beats running our own format selection logic
            # (since av_guess_format is not exposed)
            try:
                setattr(file_handle, "name", filename or "tmp" + extension)
            except AttributeError:
                pass  # read-only, nothing we can do

            try:
                self._container = av.open(
                    file_handle, mode="w", format=container, **kwargs
                )
            except ValueError:
                raise InitializationError(
                    f"PyAV can not write to `{self.request.raw_uri}`"
                )

    # ---------------------
    # Standard V3 Interface
    # ---------------------

    def read(
        self,
        *,
        index: int = ...,
        format: str = "rgb24",
        filter_sequence: List[Tuple[str, Union[str, dict]]] = None,
        filter_graph: Tuple[dict, List] = None,
        constant_framerate: bool = None,
        thread_count: int = 0,
        thread_type: str = None,
    ) -> np.ndarray:
356 """Read frames from the video.
357
358 If ``index`` is an integer, this function reads the index-th frame from
359 the file. If ``index`` is ... (Ellipsis), this function reads all frames
360 from the video, stacks them along the first dimension, and returns a
361 batch of frames.
362
363 Parameters
364 ----------
365 index : int
366 The index of the frame to read, e.g. ``index=5`` reads the 5th
367 frame. If ``...``, read all the frames in the video and stack them
368 along a new, prepended, batch dimension.
369 format : str
370 Set the returned colorspace. If not None (default: rgb24), convert
371 the data into the given format before returning it. If ``None``
372 return the data in the encoded format if it can be expressed as a
373 strided array; otherwise raise an Exception.
374 filter_sequence : List[str, str, dict]
375 If not None, apply the given sequence of FFmpeg filters to each
376 ndimage. Check the (module-level) plugin docs for details and
377 examples.
378 filter_graph : (dict, List)
379 If not None, apply the given graph of FFmpeg filters to each
380 ndimage. The graph is given as a tuple of two dicts. The first dict
381 contains a (named) set of nodes, and the second dict contains a set
382 of edges between nodes of the previous dict. Check the (module-level)
383 plugin docs for details and examples.
384 constant_framerate : bool
385 If True assume the video's framerate is constant. This allows for
386 faster seeking inside the file. If False, the video is reset before
387 each read and searched from the beginning. If None (default), this
388 value will be read from the container format.
389 thread_count : int
390 How many threads to use when decoding a frame. The default is 0,
391 which will set the number using ffmpeg's default, which is based on
392 the codec, number of available cores, threadding model, and other
393 considerations.
394 thread_type : str
395 The threading model to be used. One of
396
397 - `"SLICE"`: threads assemble parts of the current frame
398 - `"FRAME"`: threads may assemble future frames
399 - None (default): Uses ``"FRAME"`` if ``index=...`` and ffmpeg's
400 default otherwise.
401
402
403 Returns
404 -------
405 frame : np.ndarray
406 A numpy array containing loaded frame data.
407
408 Notes
409 -----
410 Accessing random frames repeatedly is costly (O(k), where k is the
411 average distance between two keyframes). You should do so only sparingly
412 if possible. In some cases, it can be faster to bulk-read the video (if
413 it fits into memory) and to then access the returned ndarray randomly.
414
415 The current implementation may cause problems for b-frames, i.e.,
416 bidirectionaly predicted pictures. I lack test videos to write unit
417 tests for this case.
418
419 Reading from an index other than ``...``, i.e. reading a single frame,
420 currently doesn't support filters that introduce delays.
421
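        Examples
        --------
        Read a single frame or the full video (a minimal sketch)::

            import imageio.v3 as iio

            frame = iio.imread("imageio:cockatoo.mp4", plugin="pyav", index=5)
            all_frames = iio.imread("imageio:cockatoo.mp4", plugin="pyav", index=...)
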
422 """
423
424 if index is ...:
425 props = self.properties(format=format)
426 uses_filter = (
427 self._video_filter is not None
428 or filter_graph is not None
429 or filter_sequence is not None
430 )
431
432 self._container.seek(0)
433 if not uses_filter and props.shape[0] != 0:
434 frames = np.empty(props.shape, dtype=props.dtype)
435 for idx, frame in enumerate(
436 self.iter(
437 format=format,
438 filter_sequence=filter_sequence,
439 filter_graph=filter_graph,
440 thread_count=thread_count,
441 thread_type=thread_type or "FRAME",
442 )
443 ):
444 frames[idx] = frame
445 else:
446 frames = np.stack(
447 [
448 x
449 for x in self.iter(
450 format=format,
451 filter_sequence=filter_sequence,
452 filter_graph=filter_graph,
453 thread_count=thread_count,
454 thread_type=thread_type or "FRAME",
455 )
456 ]
457 )
458
459 # reset stream container, because threading model can't change after
460 # first access
461 self._video_stream.close()
462 self._video_stream = self._container.streams.video[0]
463
464 return frames
465
466 if thread_type is not None and thread_type != self._video_stream.thread_type:
467 self._video_stream.thread_type = thread_type
468 if (
469 thread_count != 0
470 and thread_count != self._video_stream.codec_context.thread_count
471 ):
472 # in FFMPEG thread_count == 0 means use the default count, which we
473 # change to mean don't change the thread count.
474 self._video_stream.codec_context.thread_count = thread_count
475
476 if constant_framerate is None:
477 constant_framerate = not self._container.format.variable_fps
478
479 # note: cheap for contigous incremental reads
480 self._seek(index, constant_framerate=constant_framerate)
481 desired_frame = next(self._decoder)
482 self._next_idx += 1
483
484 self.set_video_filter(filter_sequence, filter_graph)
485 if self._video_filter is not None:
486 desired_frame = self._video_filter.send(desired_frame)
487
488 return self._unpack_frame(desired_frame, format=format)

    def iter(
        self,
        *,
        format: str = "rgb24",
        filter_sequence: List[Tuple[str, Union[str, dict]]] = None,
        filter_graph: Tuple[dict, List] = None,
        thread_count: int = 0,
        thread_type: str = None,
    ) -> np.ndarray:
499 """Yield frames from the video.
500
501 Parameters
502 ----------
503 frame : np.ndarray
504 A numpy array containing loaded frame data.
505 format : str
506 Convert the data into the given format before returning it. If None,
507 return the data in the encoded format if it can be expressed as a
508 strided array; otherwise raise an Exception.
509 filter_sequence : List[str, str, dict]
510 Set the returned colorspace. If not None (default: rgb24), convert
511 the data into the given format before returning it. If ``None``
512 return the data in the encoded format if it can be expressed as a
513 strided array; otherwise raise an Exception.
514 filter_graph : (dict, List)
515 If not None, apply the given graph of FFmpeg filters to each
516 ndimage. The graph is given as a tuple of two dicts. The first dict
517 contains a (named) set of nodes, and the second dict contains a set
518 of edges between nodes of the previous dict. Check the (module-level)
519 plugin docs for details and examples.
520 thread_count : int
521 How many threads to use when decoding a frame. The default is 0,
522 which will set the number using ffmpeg's default, which is based on
523 the codec, number of available cores, threadding model, and other
524 considerations.
525 thread_type : str
526 The threading model to be used. One of
527
528 - `"SLICE"` (default): threads assemble parts of the current frame
529 - `"FRAME"`: threads may assemble future frames (faster for bulk reading)
530
531
532 Yields
533 ------
534 frame : np.ndarray
535 A (decoded) video frame.
536
537
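        Examples
        --------
        Iterate over all frames of a video (a minimal sketch)::

            import imageio.v3 as iio

            for frame in iio.imiter("imageio:cockatoo.mp4", plugin="pyav"):
                print(frame.shape)
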
538 """
539
540 self._video_stream.thread_type = thread_type or "SLICE"
541 self._video_stream.codec_context.thread_count = thread_count
542
543 self.set_video_filter(filter_sequence, filter_graph)
544
545 for frame in self._decoder:
546 self._next_idx += 1
547
548 if self._video_filter is not None:
549 try:
550 frame = self._video_filter.send(frame)
551 except StopIteration:
552 break
553
554 if frame is None:
555 continue
556
557 yield self._unpack_frame(frame, format=format)
558
559 if self._video_filter is not None:
560 for frame in self._video_filter:
561 yield self._unpack_frame(frame, format=format)

    def write(
        self,
        ndimage: Union[np.ndarray, List[np.ndarray]],
        *,
        codec: str = None,
        is_batch: bool = True,
        fps: int = 24,
        in_pixel_format: str = "rgb24",
        out_pixel_format: str = None,
        filter_sequence: List[Tuple[str, Union[str, dict]]] = None,
        filter_graph: Tuple[dict, List] = None,
    ) -> Optional[bytes]:
575 """Save a ndimage as a video.
576
577 Given a batch of frames (stacked along the first axis) or a list of
578 frames, encode them and add the result to the ImageResource.
579
580 Parameters
581 ----------
582 ndimage : ArrayLike, List[ArrayLike]
583 The ndimage to encode and write to the ImageResource.
584 codec : str
585 The codec to use when encoding frames. Only needed on first write
586 and ignored on subsequent writes.
587 is_batch : bool
588 If True (default), the ndimage is a batch of images, otherwise it is
589 a single image. This parameter has no effect on lists of ndimages.
590 fps : str
591 The resulting videos frames per second.
592 in_pixel_format : str
593 The pixel format of the incoming ndarray. Defaults to "rgb24" and can
594 be any stridable pix_fmt supported by FFmpeg.
595 out_pixel_format : str
596 The pixel format to use while encoding frames. If None (default)
597 use the codec's default.
598 filter_sequence : List[str, str, dict]
599 If not None, apply the given sequence of FFmpeg filters to each
600 ndimage. Check the (module-level) plugin docs for details and
601 examples.
602 filter_graph : (dict, List)
603 If not None, apply the given graph of FFmpeg filters to each
604 ndimage. The graph is given as a tuple of two dicts. The first dict
605 contains a (named) set of nodes, and the second dict contains a set
606 of edges between nodes of the previous dict. Check the (module-level)
607 plugin docs for details and examples.
608
609 Returns
610 -------
611 encoded_image : bytes or None
612 If the chosen ImageResource is the special target ``"<bytes>"`` then
613 write will return a byte string containing the encoded image data.
614 Otherwise, it returns None.
615
616 Notes
617 -----
618 When writing ``<bytes>``, the video is finalized immediately after the
619 first write call and calling write multiple times to append frames is
620 not possible.
621
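        Examples
        --------
        Write a stack of random frames as an MP4 (a minimal sketch; assumes
        your FFmpeg build provides the ``libx264`` encoder)::

            import imageio.v3 as iio
            import numpy as np

            frames = np.random.randint(0, 255, (10, 64, 64, 3), dtype=np.uint8)
            iio.imwrite("test.mp4", frames, plugin="pyav", codec="libx264")
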
622 """
623
624 if isinstance(ndimage, list):
625 # frames shapes must agree for video
626 if any(f.shape != ndimage[0].shape for f in ndimage):
627 raise ValueError("All frames should have the same shape")
628 elif not is_batch:
629 ndimage = np.asarray(ndimage)[None, ...]
630 else:
631 ndimage = np.asarray(ndimage)
632
633 if self._video_stream is None:
634 self.init_video_stream(codec, fps=fps, pixel_format=out_pixel_format)
635
636 self.set_video_filter(filter_sequence, filter_graph)
637
638 for img in ndimage:
639 self.write_frame(img, pixel_format=in_pixel_format)
640
641 if self.request._uri_type == URI_BYTES:
642 # bytes are immutuable, so we have to flush immediately
643 # and can't support appending
644 self._flush_writer()
645 self._container.close()
646
647 return self.request.get_file().getvalue()

    def properties(self, index: int = ..., *, format: str = "rgb24") -> ImageProperties:
        """Standardized ndimage metadata.

        Parameters
        ----------
        index : int
            The index of the ndimage for which to return properties. If ``...``
            (Ellipsis, default), return the properties for the resulting batch
            of frames.
        format : str
            If not None (default: rgb24), convert the data into the given format
            before returning it. If None, return the data in the encoded format
            if that can be expressed as a strided array; otherwise raise an
            Exception.

        Returns
        -------
        properties : ImageProperties
            A dataclass filled with standardized image metadata.

        Notes
        -----
        This function is efficient and won't process any pixel data.

        The provided metadata does not include modifications by any filters
        (through ``filter_sequence`` or ``filter_graph``).

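        Examples
        --------
        Inspect the shape and dtype of a video without decoding its pixel data
        (a minimal sketch using :func:`improps <imageio.v3.improps>`)::

            import imageio.v3 as iio

            props = iio.improps("imageio:cockatoo.mp4", plugin="pyav")
            print(props.shape, props.dtype)
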
676 """
677
678 video_width = self._video_stream.codec_context.width
679 video_height = self._video_stream.codec_context.height
680 pix_format = format or self._video_stream.codec_context.pix_fmt
681 frame_template = av.VideoFrame(video_width, video_height, pix_format)
682
683 shape = _get_frame_shape(frame_template)
684 if index is ...:
685 n_frames = self._video_stream.frames
686 shape = (n_frames,) + shape
687
688 return ImageProperties(
689 shape=tuple(shape),
690 dtype=_format_to_dtype(frame_template.format),
691 n_images=shape[0] if index is ... else None,
692 is_batch=index is ...,
693 )

    def metadata(
        self,
        index: int = ...,
        exclude_applied: bool = True,
        constant_framerate: bool = None,
    ) -> Dict[str, Any]:
        """Format-specific metadata.

        Returns a dictionary filled with metadata that is either stored in the
        container, the video stream, or the frame's side-data.

        Parameters
        ----------
        index : int
            If ... (Ellipsis, default) return global metadata (the metadata
            stored in the container and video stream). If not ..., return the
            side data stored in the frame at the given index.
        exclude_applied : bool
            Currently, this parameter has no effect. It exists for compliance
            with the ImageIO v3 API.
        constant_framerate : bool
            If True assume the video's framerate is constant. This allows for
            faster seeking inside the file. If False, the video is reset before
            each read and searched from the beginning. If None (default), this
            value will be read from the container format.

        Returns
        -------
        metadata : dict
            A dictionary filled with format-specific metadata fields and their
            values.

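        Examples
        --------
        Read the global metadata (a minimal sketch using
        :func:`immeta <imageio.v3.immeta>`)::

            import imageio.v3 as iio

            meta = iio.immeta("imageio:cockatoo.mp4", plugin="pyav")
            print(meta["codec"], meta["fps"])
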
727 """
728
729 metadata = dict()
730
731 if index is ...:
732 # useful flags defined on the container and/or video stream
733 metadata.update(
734 {
735 "video_format": self._video_stream.codec_context.pix_fmt,
736 "codec": self._video_stream.codec.name,
737 "long_codec": self._video_stream.codec.long_name,
738 "profile": self._video_stream.profile,
739 "fps": float(self._video_stream.guessed_rate),
740 }
741 )
742 if self._video_stream.duration is not None:
743 duration = float(
744 self._video_stream.duration * self._video_stream.time_base
745 )
746 metadata.update({"duration": duration})
747
748 metadata.update(self.container_metadata)
749 metadata.update(self.video_stream_metadata)
750 return metadata
751
752 if constant_framerate is None:
753 constant_framerate = not self._container.format.variable_fps
754
755 self._seek(index, constant_framerate=constant_framerate)
756 desired_frame = next(self._decoder)
757 self._next_idx += 1
758
759 # useful flags defined on the frame
760 metadata.update(
761 {
762 "key_frame": bool(desired_frame.key_frame),
763 "time": desired_frame.time,
764 "interlaced_frame": bool(desired_frame.interlaced_frame),
765 "frame_type": desired_frame.pict_type.name,
766 }
767 )
768
769 # side data
770 metadata.update(
771 {item.type.name: bytes(item) for item in desired_frame.side_data}
772 )
773
774 return metadata

    def close(self) -> None:
        """Close the Video."""

        is_write = self.request.mode.io_mode == IOMode.write
        if is_write and self._video_stream is not None:
            self._flush_writer()

        if self._video_stream is not None:
            try:
                self._video_stream.close()
            except ValueError:
                pass  # stream already closed

        if self._container is not None:
            self._container.close()

        self.request.finish()

    def __enter__(self) -> "PyAVPlugin":
        return super().__enter__()

    # ------------------------------
    # Add-on Interface inside imopen
    # ------------------------------

    def init_video_stream(
        self,
        codec: str,
        *,
        fps: float = 24,
        pixel_format: str = None,
        max_keyframe_interval: int = None,
        force_keyframes: bool = None,
    ) -> None:
        """Initialize a new video stream.

        This function adds a new video stream to the ImageResource using the
        selected encoder (codec), framerate, and colorspace.

        Parameters
        ----------
        codec : str
            The codec to use, e.g. ``"libx264"`` or ``"vp9"``.
        fps : float
            The desired framerate of the video stream (frames per second).
        pixel_format : str
            The pixel format to use while encoding frames. If None (default) use
            the codec's default.
        max_keyframe_interval : int
            The maximum distance between two intra frames (I-frames). Also known
            as GOP size. If unspecified use the codec's default. Note that not
            every I-frame is a keyframe; see the notes for details.
        force_keyframes : bool
            If True, limit inter frames dependency to frames within the current
            keyframe interval (GOP), i.e., force every I-frame to be a keyframe.
            If unspecified, use the codec's default.

        Notes
        -----
        You can usually leave ``max_keyframe_interval`` and ``force_keyframes``
        at their default values, unless you try to generate seek-optimized video
        or have a similar specialist use-case. In this case, ``force_keyframes``
        controls the ability to seek to *every* I-frame, and
        ``max_keyframe_interval`` controls how close to a random frame you can
        seek. Low values allow more fine-grained seeking at the expense of
        file-size (and thus I/O performance).

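        Examples
        --------
        A minimal sketch; the chosen fps and pixel format are assumptions for
        illustration and may be omitted::

            import imageio.v3 as iio

            with iio.imopen("test.mp4", "w", plugin="pyav") as file:
                file.init_video_stream("libx264", fps=30, pixel_format="yuv420p")
                for frame in iio.imiter("imageio:cockatoo.mp4", plugin="pyav"):
                    file.write_frame(frame)
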
843 """
844
845 fps = Fraction.from_float(fps)
846 stream = self._container.add_stream(codec, fps)
847 stream.time_base = Fraction(1 / fps).limit_denominator(int(2**16 - 1))
848 if pixel_format is not None:
849 stream.pix_fmt = pixel_format
850 if max_keyframe_interval is not None:
851 stream.gop_size = max_keyframe_interval
852 if force_keyframes is not None:
853 stream.closed_gop = force_keyframes
854
855 self._video_stream = stream

    def write_frame(self, frame: np.ndarray, *, pixel_format: str = "rgb24") -> None:
        """Add a frame to the video stream.

        This function appends a new frame to the video. It assumes that the
        stream has been initialized previously, i.e., ``init_video_stream`` has
        to be called before calling this function for the write to succeed.

        Parameters
        ----------
        frame : np.ndarray
            The image to be appended/written to the video stream.
        pixel_format : str
            The colorspace (pixel format) of the incoming frame.

        Notes
        -----
        Frames may be held in a buffer, e.g., by the filter pipeline used during
        writing or by FFMPEG to batch them prior to encoding. Make sure to
        ``.close()`` the plugin or to use a context manager to ensure that all
        frames are written to the ImageResource.

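        Examples
        --------
        A minimal sketch; assumes ``file`` is an open writer whose stream was
        initialized via ``init_video_stream`` (see above)::

            import numpy as np

            black_frame = np.zeros((480, 640, 3), dtype=np.uint8)
            file.write_frame(black_frame, pixel_format="rgb24")
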
878 """
879
880 # manual packing of ndarray into frame
881 # (this should live in pyAV, but it doesn't support all the formats we
882 # want and PRs there are slow)
883 pixel_format = av.VideoFormat(pixel_format)
884 img_dtype = _format_to_dtype(pixel_format)
885 width = frame.shape[2 if pixel_format.is_planar else 1]
886 height = frame.shape[1 if pixel_format.is_planar else 0]
887 av_frame = av.VideoFrame(width, height, pixel_format.name)
888 if pixel_format.is_planar:
889 for idx, plane in enumerate(av_frame.planes):
890 plane_array = np.frombuffer(plane, dtype=img_dtype)
891 plane_array = as_strided(
892 plane_array,
893 shape=(plane.height, plane.width),
894 strides=(plane.line_size, img_dtype.itemsize),
895 )
896 plane_array[...] = frame[idx]
897 else:
898 if pixel_format.name.startswith("bayer_"):
899 # ffmpeg doesn't describe bayer formats correctly
900 # see https://github.com/imageio/imageio/issues/761#issuecomment-1059318851
901 # and following for details.
902 n_channels = 1
903 else:
904 n_channels = len(pixel_format.components)
905
906 plane = av_frame.planes[0]
907 plane_shape = (plane.height, plane.width)
908 plane_strides = (plane.line_size, n_channels * img_dtype.itemsize)
909 if n_channels > 1:
910 plane_shape += (n_channels,)
911 plane_strides += (img_dtype.itemsize,)
912
913 plane_array = as_strided(
914 np.frombuffer(plane, dtype=img_dtype),
915 shape=plane_shape,
916 strides=plane_strides,
917 )
918 plane_array[...] = frame
919
920 stream = self._video_stream
921 av_frame.time_base = stream.codec_context.time_base
922 av_frame.pts = self.frames_written
923 self.frames_written += 1
924
925 if self._video_filter is not None:
926 av_frame = self._video_filter.send(av_frame)
927 if av_frame is None:
928 return
929
930 if stream.frames == 0:
931 stream.width = av_frame.width
932 stream.height = av_frame.height
933
934 for packet in stream.encode(av_frame):
935 self._container.mux(packet)

    def set_video_filter(
        self,
        filter_sequence: List[Tuple[str, Union[str, dict]]] = None,
        filter_graph: Tuple[dict, List] = None,
    ) -> None:
        """Set the filter(s) to use.

        This function creates a new FFMPEG filter graph to use when reading or
        writing video. In the case of reading, frames are passed through the
        filter graph before being returned and, in the case of writing, frames
        are passed through the filter before being written to the video.

        Parameters
        ----------
        filter_sequence : List[Tuple[str, Union[str, dict]]]
            If not None, apply the given sequence of FFmpeg filters to each
            ndimage. Check the (module-level) plugin docs for details and
            examples.
        filter_graph : (dict, List)
            If not None, apply the given graph of FFmpeg filters to each
            ndimage. The graph is given as a tuple of a dict and a list. The
            dict contains the (named) nodes, and the list contains the edges
            between those nodes. Check the (module-level) plugin docs for
            details and examples.

        Notes
        -----
        Changing a filter graph that has lag during reading or writing will
        currently cause the frames in the filter queue to be lost.

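        Examples
        --------
        A minimal sketch; downscale frames before encoding them::

            import imageio.v3 as iio

            with iio.imopen("test.mp4", "w", plugin="pyav") as file:
                file.init_video_stream("libx264")
                file.set_video_filter(filter_sequence=[("scale", "320:-1")])
                for frame in iio.imiter("imageio:cockatoo.mp4", plugin="pyav"):
                    file.write_frame(frame)
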
967 """
968
969 if filter_sequence is None and filter_graph is None:
970 self._video_filter = None
971 return
972
973 if filter_sequence is None:
974 filter_sequence = list()
975
976 node_descriptors: Dict[str, Tuple[str, Union[str, Dict]]]
977 edges: List[Tuple[str, str, int, int]]
978 if filter_graph is None:
979 node_descriptors, edges = dict(), [("video_in", "video_out", 0, 0)]
980 else:
981 node_descriptors, edges = filter_graph
982
983 graph = av.filter.Graph()
984
985 previous_node = graph.add_buffer(template=self._video_stream)
986 for filter_name, argument in filter_sequence:
987 if isinstance(argument, str):
988 current_node = graph.add(filter_name, argument)
989 else:
990 current_node = graph.add(filter_name, **argument)
991 previous_node.link_to(current_node)
992 previous_node = current_node
993
994 nodes = dict()
995 nodes["video_in"] = previous_node
996 nodes["video_out"] = graph.add("buffersink")
997 for name, (filter_name, arguments) in node_descriptors.items():
998 if isinstance(arguments, str):
999 nodes[name] = graph.add(filter_name, arguments)
1000 else:
1001 nodes[name] = graph.add(filter_name, **arguments)
1002
1003 for from_note, to_node, out_idx, in_idx in edges:
1004 nodes[from_note].link_to(nodes[to_node], out_idx, in_idx)
1005
1006 graph.configure()
1007
1008 def video_filter():
1009 # this starts a co-routine
1010 # send frames using graph.send()
1011 frame = yield None
1012
1013 # send and receive frames in "parallel"
1014 while frame is not None:
1015 graph.push(frame)
1016 try:
1017 frame = yield graph.pull()
1018 except av.error.BlockingIOError:
1019 # filter has lag and needs more frames
1020 frame = yield None
1021 except av.error.EOFError:
1022 break
1023
1024 try:
1025 # send EOF in av>=9.0
1026 graph.push(None)
1027 except ValueError: # pragma: no cover
1028 # handle av<9.0
1029 pass
1030
1031 # all frames have been sent, empty the filter
1032 while True:
1033 try:
1034 yield graph.pull()
1035 except av.error.EOFError:
1036 break # EOF
1037 except av.error.BlockingIOError: # pragma: no cover
1038 # handle av<9.0
1039 break
1040
1041 self._video_filter = video_filter()
1042 self._video_filter.send(None)

    @property
    def container_metadata(self):
        """Container-specific metadata.

        A dictionary containing metadata stored at the container level.

        """
        return self._container.metadata

    @property
    def video_stream_metadata(self):
        """Stream-specific metadata.

        A dictionary containing metadata stored at the stream level.

        """
        return self._video_stream.metadata

    # -------------------------------
    # Internals and private functions
    # -------------------------------

    def _unpack_frame(self, frame: av.VideoFrame, *, format: str = None) -> np.ndarray:
        """Convert an av.VideoFrame into a ndarray

        Parameters
        ----------
        frame : av.VideoFrame
            The frame to unpack.
        format : str
            If not None, convert the frame to the given format before unpacking.

        """

        if format is not None:
            frame = frame.reformat(format=format)

        dtype = _format_to_dtype(frame.format)
        shape = _get_frame_shape(frame)

        planes = list()
        for idx in range(len(frame.planes)):
            n_channels = sum(
                [
                    x.bits // (dtype.itemsize * 8)
                    for x in frame.format.components
                    if x.plane == idx
                ]
            )
            av_plane = frame.planes[idx]
            plane_shape = (av_plane.height, av_plane.width)
            plane_strides = (av_plane.line_size, n_channels * dtype.itemsize)
            if n_channels > 1:
                plane_shape += (n_channels,)
                plane_strides += (dtype.itemsize,)

            np_plane = as_strided(
                np.frombuffer(av_plane, dtype=dtype),
                shape=plane_shape,
                strides=plane_strides,
            )
            planes.append(np_plane)

        if len(planes) > 1:
            # Note: the planes *should* exist inside a contiguous memory block
            # somewhere inside av.Frame; however, pyAV does not appear to expose
            # this, so we are forced to copy the planes individually instead of
            # wrapping them :(
            out = np.concatenate(planes).reshape(shape)
        else:
            out = planes[0]

        return out

    def _seek(self, index, *, constant_framerate: bool = True) -> None:
        """Seek to the frame at the given index."""

        if index == self._next_idx:
            return  # fast path :)

        # we must decode at least once before we seek; otherwise the
        # returned frames become corrupt.
        if self._next_idx == 0:
            next(self._decoder)
            self._next_idx += 1

            if index == self._next_idx:
                return  # fast path :)

        # remove this branch until I find a way to efficiently find the next
        # keyframe. keeping this as a reminder
        # if self._next_idx < index and index < self._next_keyframe_idx:
        #     frames_to_yield = index - self._next_idx
        if not constant_framerate and index > self._next_idx:
            frames_to_yield = index - self._next_idx
        elif not constant_framerate:
            # seeking backwards and we can't link idx and pts
            self._container.seek(0)
            self._decoder = self._container.decode(video=0)
            self._next_idx = 0

            frames_to_yield = index
        else:
            # we know that the time between consecutive frames is constant,
            # hence we can link index and pts

            # how many pts lie between two frames
            sec_delta = 1 / self._video_stream.guessed_rate
            pts_delta = sec_delta / self._video_stream.time_base

            index_pts = int(index * pts_delta)

            # this only seeks to the closest (preceding) keyframe
            self._container.seek(index_pts, stream=self._video_stream)
            self._decoder = self._container.decode(video=0)

            # this may be made faster if we could get the keyframe's time
            # without decoding it
            keyframe = next(self._decoder)
            keyframe_time = keyframe.pts * keyframe.time_base
            keyframe_pts = int(keyframe_time / self._video_stream.time_base)
            keyframe_index = keyframe_pts // pts_delta

            self._container.seek(index_pts, stream=self._video_stream)
            self._next_idx = keyframe_index

            frames_to_yield = index - keyframe_index

        for _ in range(frames_to_yield):
            next(self._decoder)
            self._next_idx += 1

    def _flush_writer(self):
        """Flush the filter and encoder

        This will reset the filter to ``None`` and send EoF to the encoder,
        i.e., after calling this, no more frames may be written.

        """

        stream = self._video_stream

        if self._video_filter is not None:
            # flush the filter and encode any remaining frames
            for av_frame in self._video_filter:
                if stream.frames == 0:
                    stream.width = av_frame.width
                    stream.height = av_frame.height
                for packet in stream.encode(av_frame):
                    self._container.mux(packet)
            self._video_filter = None

        # flush the encoder
        for packet in stream.encode():
            self._container.mux(packet)
        self._video_stream = None