1"""Read/Write Videos (and images) using PyAV.
2
3.. note::
4 To use this plugin you need to have `PyAV <https://pyav.org/docs/stable/>`_
5 installed::
6
7 pip install av
8
9This plugin wraps pyAV, a pythonic binding for the FFMPEG library. It is similar
10to our FFMPEG plugin, has improved performance, features a robust interface, and
11aims to supersede the FFMPEG plugin in the future.


Methods
-------
.. note::
    Check the respective function for a list of supported kwargs and detailed
    documentation.

.. autosummary::
    :toctree:

    PyAVPlugin.read
    PyAVPlugin.iter
    PyAVPlugin.write
    PyAVPlugin.properties
    PyAVPlugin.metadata

Additional methods available inside the :func:`imopen <imageio.v3.imopen>`
context:

.. autosummary::
    :toctree:

    PyAVPlugin.init_video_stream
    PyAVPlugin.write_frame
    PyAVPlugin.set_video_filter
    PyAVPlugin.container_metadata
    PyAVPlugin.video_stream_metadata

Advanced API
------------

In addition to the default ImageIO v3 API, this plugin exposes custom functions
that are specific to reading/writing video and its metadata. These are available
inside the :func:`imopen <imageio.v3.imopen>` context and allow fine-grained
control over how the video is processed. The functions are documented above,
and a usage example is shown below::

    import imageio.v3 as iio

    with iio.imopen("test.mp4", "w", plugin="pyav") as file:
        file.init_video_stream("libx264")
        file.container_metadata["comment"] = "This video was created using ImageIO."

        for _ in range(5):
            for frame in iio.imiter("imageio:newtonscradle.gif"):
                file.write_frame(frame)

    meta = iio.immeta("test.mp4", plugin="pyav")
    assert meta["comment"] == "This video was created using ImageIO."


Pixel Formats (Colorspaces)
---------------------------

By default, this plugin converts the video into 8-bit RGB (called ``rgb24`` in
ffmpeg). This is a useful behavior for many use-cases, but sometimes you may
want to use the video's native colorspace or you may wish to convert the video
into an entirely different colorspace. This is controlled using the ``format``
kwarg. You can use ``format=None`` to leave the image in its native colorspace
or specify any colorspace supported by FFMPEG as long as it is stridable, i.e.,
as long as it can be represented by a single numpy array. Some useful choices
include:

- rgb24 (default; 8-bit RGB)
- rgb48le (16-bit little-endian RGB)
- bgr24 (8-bit BGR; OpenCV's default colorspace)
- gray (8-bit grayscale)
- yuv444p (8-bit channel-first YUV)

Further, FFMPEG maintains a list of available formats, albeit not as part of the
narrative docs. It can be `found here
<https://ffmpeg.org/doxygen/trunk/pixfmt_8h_source.html>`_ (warning: C source
code).
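
For example, the snippet below reads a video in grayscale and in its native
colorspace (a minimal sketch using one of ImageIO's standard images)::

    import imageio.v3 as iio

    # decode into 8-bit grayscale instead of the rgb24 default
    gray_frames = iio.imread("imageio:cockatoo.mp4", plugin="pyav", format="gray")

    # keep the native (encoded) colorspace, assuming it is stridable
    native_frames = iio.imread("imageio:cockatoo.mp4", plugin="pyav", format=None)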

Filters
-------

On top of providing basic read/write functionality, this plugin allows you to
use the full collection of `video filters available in FFMPEG
<https://ffmpeg.org/ffmpeg-filters.html#Video-Filters>`_. This means that you
can apply extensive preprocessing to your video before retrieving it as a numpy
array, or apply extensive post-processing before you encode your data.

Filters come in two forms: sequences or graphs. Filter sequences are, as the
name suggests, sequences of filters that are applied one after the other. They
are specified using the ``filter_sequence`` kwarg. Filter graphs, on the other
hand, come in the form of a directed graph and are specified using the
``filter_graph`` kwarg.

.. note::
    All filters are either sequences or graphs. If all you want is to apply a
    single filter, you can do this by specifying a filter sequence with a single
    entry.

A ``filter_sequence`` is a list of filters, each defined through a 2-element
tuple of the form ``(filter_name, filter_parameters)``. The first element of the
tuple is the name of the filter. The second element is the filter's parameters,
which can be given either as a string or a dict. The string matches the same
format that you would use when specifying the filter using the ffmpeg
command-line tool, and the dict has entries of the form ``parameter:value``. For
example::

    import imageio.v3 as iio

    # using a filter_parameters str
    img1 = iio.imread(
        "imageio:cockatoo.mp4",
        plugin="pyav",
        filter_sequence=[
            ("rotate", "45*PI/180")
        ]
    )

    # using a filter_parameters dict
    img2 = iio.imread(
        "imageio:cockatoo.mp4",
        plugin="pyav",
        filter_sequence=[
            ("rotate", {"angle": "45*PI/180", "fillcolor": "AliceBlue"})
        ]
    )

A ``filter_graph``, on the other hand, is specified using a ``(nodes, edges)``
tuple. It is best explained using an example::

    img = iio.imread(
        "imageio:cockatoo.mp4",
        plugin="pyav",
        filter_graph=(
            {
                "split": ("split", ""),
                "scale_overlay": ("scale", "512:-1"),
                "overlay": ("overlay", "x=25:y=25:enable='between(t,1,8)'"),
            },
            [
                ("video_in", "split", 0, 0),
                ("split", "overlay", 0, 0),
                ("split", "scale_overlay", 1, 0),
                ("scale_overlay", "overlay", 0, 1),
                ("overlay", "video_out", 0, 0),
            ]
        )
    )

The above transforms the video to have a picture-in-picture of itself in the
top left corner. As you can see, nodes are specified using a dict that has names
as its keys and filter tuples as values; these are the same tuples as the ones
used when defining a filter sequence. Edges are a list of 4-tuples of the form
``(node_out, node_in, output_idx, input_idx)`` and specify which two filters are
connected and which inputs/outputs should be used for this.

Further, there are two special nodes in a filter graph: ``video_in`` and
``video_out``, which represent the graph's input and output respectively. These
names cannot be used for other nodes (such nodes would simply be overwritten),
and for a graph to be valid there must be a path from the input to the output
and all nodes in the graph must be connected.

While most graphs are quite simple, they can become very complex, and we
recommend that you read through the `FFMPEG documentation
<https://ffmpeg.org/ffmpeg-filters.html#Filtergraph-description>`_ and its
examples to better understand how to use them.
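
Filters can also be used while writing. As a minimal sketch (assuming your
FFMPEG build ships with libx264), the following scales frames to a width of 256
pixels while encoding::

    import imageio.v3 as iio

    frames = iio.imread("imageio:cockatoo.mp4", plugin="pyav")
    iio.imwrite(
        "scaled.mp4",
        frames,
        plugin="pyav",
        codec="libx264",
        filter_sequence=[("scale", "256:-1")],
    )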

"""

from fractions import Fraction
from math import ceil
from typing import Any, Dict, List, Optional, Tuple, Union, Generator

import av
import av.filter
import numpy as np
from numpy.lib.stride_tricks import as_strided

from ..core import Request
from ..core.request import URI_BYTES, URI_HTTP, InitializationError, IOMode
from ..core.v3_plugin_api import ImageProperties, PluginV3


def _format_to_dtype(format: av.VideoFormat) -> np.dtype:
    """Convert a pyAV video format into a numpy dtype"""

    if len(format.components) == 0:
        # fake format
        raise ValueError(
            f"Can't determine dtype from format `{format.name}`. It has no channels."
        )

    endian = ">" if format.is_big_endian else "<"
    dtype = "f" if "f32" in format.name else "u"
    bits_per_channel = [x.bits for x in format.components]
    n_bytes = str(int(ceil(bits_per_channel[0] / 8)))

    return np.dtype(endian + dtype + n_bytes)


def _get_frame_shape(frame: av.VideoFrame) -> Tuple[int, ...]:
    """Compute the frame's array shape

    Parameters
    ----------
    frame : av.VideoFrame
        A frame for which the resulting shape should be computed.

    Returns
    -------
    shape : Tuple[int, ...]
        A tuple describing the shape of the image data in the frame.

    """

    widths = [component.width for component in frame.format.components]
    heights = [component.height for component in frame.format.components]
    bits = np.array([component.bits for component in frame.format.components])
    line_sizes = [plane.line_size for plane in frame.planes]

    subsampled_width = widths[:-1] != widths[1:]
    subsampled_height = heights[:-1] != heights[1:]
    unaligned_components = np.any(bits % 8 != 0) or (line_sizes[:-1] != line_sizes[1:])
    if subsampled_width or subsampled_height or unaligned_components:
        raise IOError(
            f"{frame.format.name} can't be expressed as a strided array. "
            "Use `format=` to select a format to convert into."
        )

    shape = [frame.height, frame.width]

    # ffmpeg doesn't have a notion of channel-first or channel-last formats.
    # Instead, it stores frames in one or more planes which contain individual
    # components of a pixel, depending on the pixel format. For channel-first
    # formats each component lives on a separate plane (n_planes) and for
    # channel-last formats all components are packed on a single plane
    # (n_channels).
    n_planes = max([component.plane for component in frame.format.components]) + 1
    if n_planes > 1:
        shape = [n_planes] + shape

    channels_per_plane = [0] * n_planes
    for component in frame.format.components:
        channels_per_plane[component.plane] += 1
    n_channels = max(channels_per_plane)

    if n_channels > 1:
        shape = shape + [n_channels]

    return tuple(shape)


class PyAVPlugin(PluginV3):
    """Support for pyAV as backend.

    Parameters
    ----------
    request : iio.Request
        A request object that represents the user's intent. It provides a
        standard interface to access the various ImageResources and serves them
        to the plugin as a file object (or file). Check the docs for details.
    container : str
        Only used in write mode (``iio_mode="w"``). If not None, overwrite the
        default container format chosen by pyav.
    kwargs : Any
        Additional kwargs are forwarded to PyAV's constructor.

    """

    def __init__(self, request: Request, *, container: str = None, **kwargs) -> None:
        """Initialize a new Plugin Instance.

        See Plugin's docstring for detailed documentation.

        Notes
        -----
        The implementation here stores the request as a local variable that is
        exposed using a @property below. If you inherit from PluginV3, remember
        to call ``super().__init__(request)``.

        """

        super().__init__(request)

        self._container = None
        self._video_stream = None
        self._video_filter = None

        if request.mode.io_mode == IOMode.read:
            self._next_idx = 0
            try:
                if request._uri_type == URI_HTTP:
                    # let pyav read from HTTP by itself. This enables reading
                    # HTTP-based streams like DASH. Note that handling streams
                    # like this is temporary until the new request object gets
                    # implemented.
                    self._container = av.open(request.raw_uri, **kwargs)
                else:
                    self._container = av.open(request.get_file(), **kwargs)
                self._video_stream = self._container.streams.video[0]
                self._decoder = self._container.decode(video=0)
            except av.AVError:
                if isinstance(request.raw_uri, bytes):
                    msg = "PyAV does not support these `<bytes>`"
                else:
                    msg = f"PyAV does not support `{request.raw_uri}`"
                raise InitializationError(msg) from None
        else:
            self.frames_written = 0
            file_handle = self.request.get_file()
            filename = getattr(file_handle, "name", None)
            extension = self.request.extension or self.request.format_hint
            if extension is None:
                raise InitializationError("Can't determine output container to use.")

            # hacky, but beats running our own format selection logic
            # (since av_guess_format is not exposed)
            try:
                setattr(file_handle, "name", filename or "tmp" + extension)
            except AttributeError:
                pass  # read-only, nothing we can do

            try:
                self._container = av.open(
                    file_handle, mode="w", format=container, **kwargs
                )
            except ValueError:
                raise InitializationError(
                    f"PyAV cannot write to `{self.request.raw_uri}`"
                )

    # ---------------------
    # Standard V3 Interface
    # ---------------------

    def read(
        self,
        *,
        index: int = ...,
        format: str = "rgb24",
        filter_sequence: List[Tuple[str, Union[str, dict]]] = None,
        filter_graph: Tuple[dict, List] = None,
        constant_framerate: bool = None,
        thread_count: int = 0,
        thread_type: str = None,
    ) -> np.ndarray:
        """Read frames from the video.

        If ``index`` is an integer, this function reads the index-th frame from
        the file. If ``index`` is ... (Ellipsis), this function reads all frames
        from the video, stacks them along the first dimension, and returns a
        batch of frames.

        Parameters
        ----------
        index : int
            The index of the frame to read, e.g. ``index=5`` reads the 5th
            frame. If ``...``, read all the frames in the video and stack them
            along a new, prepended, batch dimension.
        format : str
            Set the returned colorspace. If not None (default: rgb24), convert
            the data into the given format before returning it. If ``None``,
            return the data in the encoded format if it can be expressed as a
            strided array; otherwise raise an Exception.
        filter_sequence : List[Tuple[str, Union[str, dict]]]
            If not None, apply the given sequence of FFmpeg filters to each
            ndimage. Check the (module-level) plugin docs for details and
            examples.
        filter_graph : Tuple[dict, List]
            If not None, apply the given graph of FFmpeg filters to each
            ndimage. The graph is given as a tuple of two dicts. The first dict
            contains a (named) set of nodes, and the second dict contains a set
            of edges between nodes of the previous dict. Check the
            (module-level) plugin docs for details and examples.
        constant_framerate : bool
            If True, assume the video's framerate is constant. This allows for
            faster seeking inside the file. If False, the video is reset before
            each read and searched from the beginning. If None (default), this
            value will be read from the container format.
        thread_count : int
            How many threads to use when decoding a frame. The default is 0,
            which will set the number using ffmpeg's default, which is based on
            the codec, number of available cores, threading model, and other
            considerations.
        thread_type : str
            The threading model to be used. One of

            - `"SLICE"`: threads assemble parts of the current frame
            - `"FRAME"`: threads may assemble future frames
            - None (default): Uses ``"FRAME"`` if ``index=...`` and ffmpeg's
              default otherwise.


        Returns
        -------
        frame : np.ndarray
            A numpy array containing loaded frame data.

        Notes
        -----
        Accessing random frames repeatedly is costly (O(k), where k is the
        average distance between two keyframes). You should do so only sparingly
        if possible. In some cases, it can be faster to bulk-read the video (if
        it fits into memory) and to then access the returned ndarray randomly.

        The current implementation may cause problems for b-frames, i.e.,
        bidirectionally predicted pictures. We lack test videos to write unit
        tests for this case.

        Reading from an index other than ``...``, i.e. reading a single frame,
        currently doesn't support filters that introduce delays.

        """

        if index is ...:
            props = self.properties(format=format)
            uses_filter = (
                self._video_filter is not None
                or filter_graph is not None
                or filter_sequence is not None
            )

            self._container.seek(0)
            if not uses_filter and props.shape[0] != 0:
                frames = np.empty(props.shape, dtype=props.dtype)
                for idx, frame in enumerate(
                    self.iter(
                        format=format,
                        filter_sequence=filter_sequence,
                        filter_graph=filter_graph,
                        thread_count=thread_count,
                        thread_type=thread_type or "FRAME",
                    )
                ):
                    frames[idx] = frame
            else:
                frames = np.stack(
                    [
                        x
                        for x in self.iter(
                            format=format,
                            filter_sequence=filter_sequence,
                            filter_graph=filter_graph,
                            thread_count=thread_count,
                            thread_type=thread_type or "FRAME",
                        )
                    ]
                )

            # reset the stream, because the threading model can't change after
            # the first access
            self._video_stream.close()
            self._video_stream = self._container.streams.video[0]

            return frames

        if thread_type is not None and thread_type != self._video_stream.thread_type:
            self._video_stream.thread_type = thread_type
        if (
            thread_count != 0
            and thread_count != self._video_stream.codec_context.thread_count
        ):
            # in FFMPEG thread_count == 0 means "use the default count", which
            # we change to mean "don't change the thread count".
            self._video_stream.codec_context.thread_count = thread_count

        if constant_framerate is None:
            constant_framerate = not self._container.format.variable_fps

        # note: cheap for contiguous incremental reads
        self._seek(index, constant_framerate=constant_framerate)
        desired_frame = next(self._decoder)
        self._next_idx += 1

        self.set_video_filter(filter_sequence, filter_graph)
        if self._video_filter is not None:
            desired_frame = self._video_filter.send(desired_frame)

        return self._unpack_frame(desired_frame, format=format)

    def iter(
        self,
        *,
        format: str = "rgb24",
        filter_sequence: List[Tuple[str, Union[str, dict]]] = None,
        filter_graph: Tuple[dict, List] = None,
        thread_count: int = 0,
        thread_type: str = None,
    ) -> Generator[np.ndarray, None, None]:
        """Yield frames from the video.

        Parameters
        ----------
        format : str
            Convert the data into the given format before returning it. If
            None, return the data in the encoded format if it can be expressed
            as a strided array; otherwise raise an Exception.
        filter_sequence : List[Tuple[str, Union[str, dict]]]
            If not None, apply the given sequence of FFmpeg filters to each
            ndimage. Check the (module-level) plugin docs for details and
            examples.
        filter_graph : Tuple[dict, List]
            If not None, apply the given graph of FFmpeg filters to each
            ndimage. The graph is given as a tuple of two dicts. The first dict
            contains a (named) set of nodes, and the second dict contains a set
            of edges between nodes of the previous dict. Check the
            (module-level) plugin docs for details and examples.
        thread_count : int
            How many threads to use when decoding a frame. The default is 0,
            which will set the number using ffmpeg's default, which is based on
            the codec, number of available cores, threading model, and other
            considerations.
        thread_type : str
            The threading model to be used. One of

            - `"SLICE"` (default): threads assemble parts of the current frame
            - `"FRAME"`: threads may assemble future frames (faster for bulk
              reading)


        Yields
        ------
        frame : np.ndarray
            A (decoded) video frame.


        """

        self._video_stream.thread_type = thread_type or "SLICE"
        self._video_stream.codec_context.thread_count = thread_count

        self.set_video_filter(filter_sequence, filter_graph)

        for frame in self._decoder:
            self._next_idx += 1

            if self._video_filter is not None:
                try:
                    frame = self._video_filter.send(frame)
                except StopIteration:
                    break

            if frame is None:
                continue

            yield self._unpack_frame(frame, format=format)

        if self._video_filter is not None:
            for frame in self._video_filter:
                yield self._unpack_frame(frame, format=format)

    def write(
        self,
        ndimage: Union[np.ndarray, List[np.ndarray]],
        *,
        codec: str = None,
        is_batch: bool = True,
        fps: int = 24,
        in_pixel_format: str = "rgb24",
        out_pixel_format: str = None,
        filter_sequence: List[Tuple[str, Union[str, dict]]] = None,
        filter_graph: Tuple[dict, List] = None,
    ) -> Optional[bytes]:
        """Save a ndimage as a video.

        Given a batch of frames (stacked along the first axis) or a list of
        frames, encode them and add the result to the ImageResource.

        Parameters
        ----------
        ndimage : ArrayLike, List[ArrayLike]
            The ndimage to encode and write to the ImageResource.
        codec : str
            The codec to use when encoding frames. Only needed on the first
            write and ignored on subsequent writes.
        is_batch : bool
            If True (default), the ndimage is a batch of images, otherwise it
            is a single image. This parameter has no effect on lists of
            ndimages.
        fps : int
            The resulting video's frames per second.
        in_pixel_format : str
            The pixel format of the incoming ndarray. Defaults to "rgb24" and
            can be any stridable pix_fmt supported by FFmpeg.
        out_pixel_format : str
            The pixel format to use while encoding frames. If None (default)
            use the codec's default.
        filter_sequence : List[Tuple[str, Union[str, dict]]]
            If not None, apply the given sequence of FFmpeg filters to each
            ndimage. Check the (module-level) plugin docs for details and
            examples.
        filter_graph : Tuple[dict, List]
            If not None, apply the given graph of FFmpeg filters to each
            ndimage. The graph is given as a tuple of two dicts. The first dict
            contains a (named) set of nodes, and the second dict contains a set
            of edges between nodes of the previous dict. Check the
            (module-level) plugin docs for details and examples.

        Returns
        -------
        encoded_image : bytes or None
            If the chosen ImageResource is the special target ``"<bytes>"``,
            then write will return a byte string containing the encoded image
            data. Otherwise, it returns None.

        Notes
        -----
        When writing ``<bytes>``, the video is finalized immediately after the
        first write call, and calling write multiple times to append frames is
        not possible.

        """

        if isinstance(ndimage, list):
            # frame shapes must agree for video
            if any(f.shape != ndimage[0].shape for f in ndimage):
                raise ValueError("All frames should have the same shape")
        elif not is_batch:
            ndimage = np.asarray(ndimage)[None, ...]
        else:
            ndimage = np.asarray(ndimage)

        if self._video_stream is None:
            self.init_video_stream(codec, fps=fps, pixel_format=out_pixel_format)

        self.set_video_filter(filter_sequence, filter_graph)

        for img in ndimage:
            self.write_frame(img, pixel_format=in_pixel_format)

        if self.request._uri_type == URI_BYTES:
            # bytes are immutable, so we have to flush immediately
            # and can't support appending
            self._flush_writer()
            self._container.close()

            return self.request.get_file().getvalue()

    def properties(self, index: int = ..., *, format: str = "rgb24") -> ImageProperties:
        """Standardized ndimage metadata.

        Parameters
        ----------
        index : int
            The index of the ndimage for which to return properties. If ``...``
            (Ellipsis, default), return the properties for the resulting batch
            of frames.
        format : str
            If not None (default: rgb24), convert the data into the given
            format before returning it. If None, return the data in the encoded
            format if that can be expressed as a strided array; otherwise raise
            an Exception.

        Returns
        -------
        properties : ImageProperties
            A dataclass filled with standardized image metadata.

        Notes
        -----
        This function is efficient and won't process any pixel data.

        The provided metadata does not include modifications by any filters
        (through ``filter_sequence`` or ``filter_graph``).

        """

        video_width = self._video_stream.codec_context.width
        video_height = self._video_stream.codec_context.height
        pix_format = format or self._video_stream.codec_context.pix_fmt
        frame_template = av.VideoFrame(video_width, video_height, pix_format)

        shape = _get_frame_shape(frame_template)
        if index is ...:
            n_frames = self._video_stream.frames
            shape = (n_frames,) + shape

        return ImageProperties(
            shape=tuple(shape),
            dtype=_format_to_dtype(frame_template.format),
            n_images=shape[0] if index is ... else None,
            is_batch=index is ...,
        )

    def metadata(
        self,
        index: int = ...,
        exclude_applied: bool = True,
        constant_framerate: bool = None,
    ) -> Dict[str, Any]:
        """Format-specific metadata.

        Returns a dictionary filled with metadata that is either stored in the
        container, the video stream, or the frame's side-data.

        Parameters
        ----------
        index : int
            If ... (Ellipsis, default), return global metadata (the metadata
            stored in the container and video stream). If not ..., return the
            side data stored in the frame at the given index.
        exclude_applied : bool
            Currently, this parameter has no effect. It exists for compliance
            with the ImageIO v3 API.
        constant_framerate : bool
            If True, assume the video's framerate is constant. This allows for
            faster seeking inside the file. If False, the video is reset before
            each read and searched from the beginning. If None (default), this
            value will be read from the container format.

        Returns
        -------
        metadata : dict
            A dictionary filled with format-specific metadata fields and their
            values.

        """

        metadata = dict()

        if index is ...:
            # useful flags defined on the container and/or video stream
            metadata.update(
                {
                    "video_format": self._video_stream.codec_context.pix_fmt,
                    "codec": self._video_stream.codec.name,
                    "long_codec": self._video_stream.codec.long_name,
                    "profile": self._video_stream.profile,
                    "fps": float(self._video_stream.guessed_rate),
                }
            )
            if self._video_stream.duration is not None:
                duration = float(
                    self._video_stream.duration * self._video_stream.time_base
                )
                metadata.update({"duration": duration})

            metadata.update(self.container_metadata)
            metadata.update(self.video_stream_metadata)
            return metadata

        if constant_framerate is None:
            constant_framerate = not self._container.format.variable_fps

        self._seek(index, constant_framerate=constant_framerate)
        desired_frame = next(self._decoder)
        self._next_idx += 1

        # useful flags defined on the frame
        metadata.update(
            {
                "key_frame": bool(desired_frame.key_frame),
                "time": desired_frame.time,
                "interlaced_frame": bool(desired_frame.interlaced_frame),
                "frame_type": desired_frame.pict_type.name,
            }
        )

        # side data
        metadata.update(
            {item.type.name: item.to_bytes() for item in desired_frame.side_data}
        )

        return metadata

    def close(self) -> None:
        """Close the Video."""

        is_write = self.request.mode.io_mode == IOMode.write
        if is_write and self._video_stream is not None:
            self._flush_writer()

        if self._video_stream is not None:
            try:
                self._video_stream.close()
            except ValueError:
                pass  # stream already closed

        if self._container is not None:
            self._container.close()

        self.request.finish()

    def __enter__(self) -> "PyAVPlugin":
        return super().__enter__()

    # ------------------------------
    # Add-on Interface inside imopen
    # ------------------------------

    def init_video_stream(
        self,
        codec: str,
        *,
        fps: float = 24,
        pixel_format: str = None,
        max_keyframe_interval: int = None,
        force_keyframes: bool = None,
    ) -> None:
        """Initialize a new video stream.

        This function adds a new video stream to the ImageResource using the
        selected encoder (codec), framerate, and colorspace.

        Parameters
        ----------
        codec : str
            The codec to use, e.g. ``"libx264"`` or ``"vp9"``.
        fps : float
            The desired framerate of the video stream (frames per second).
        pixel_format : str
            The pixel format to use while encoding frames. If None (default)
            use the codec's default.
        max_keyframe_interval : int
            The maximum distance between two intra frames (I-frames), also
            known as the GOP size. If unspecified, use the codec's default.
            Note that not every I-frame is a keyframe; see the notes for
            details.
        force_keyframes : bool
            If True, limit inter-frame dependencies to frames within the
            current keyframe interval (GOP), i.e., force every I-frame to be a
            keyframe. If unspecified, use the codec's default.

        Notes
        -----
        You can usually leave ``max_keyframe_interval`` and ``force_keyframes``
        at their default values, unless you are trying to generate a
        seek-optimized video or have a similar specialist use-case. In that
        case, ``force_keyframes`` controls the ability to seek to *every*
        I-frame, and ``max_keyframe_interval`` controls how close to a random
        frame you can seek. Low values allow more fine-grained seeking at the
        expense of file size (and thus I/O performance).

        """

        stream = self._container.add_stream(codec, fps)
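        # use (approximately) 1/fps as the stream's time base so that a frame's
        # index can double as its presentation timestamp (see ``write_frame``)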
        stream.time_base = Fraction(1 / fps).limit_denominator(int(2**16 - 1))
        if pixel_format is not None:
            stream.pix_fmt = pixel_format
        if max_keyframe_interval is not None:
            stream.gop_size = max_keyframe_interval
        if force_keyframes is not None:
            stream.closed_gop = force_keyframes

        self._video_stream = stream

    def write_frame(self, frame: np.ndarray, *, pixel_format: str = "rgb24") -> None:
        """Add a frame to the video stream.

        This function appends a new frame to the video. It assumes that the
        stream has previously been initialized, i.e., ``init_video_stream`` has
        to be called before calling this function for the write to succeed.

        Parameters
        ----------
        frame : np.ndarray
            The image to be appended/written to the video stream.
        pixel_format : str
            The colorspace (pixel format) of the incoming frame.

        Notes
        -----
        Frames may be held in a buffer, e.g., by the filter pipeline used
        during writing or by FFMPEG to batch them prior to encoding. Make sure
        to ``.close()`` the plugin or to use a context manager to ensure that
        all frames are written to the ImageResource.

        """

        # manual packing of ndarray into frame
        # (this should live in pyAV, but it doesn't support all the formats we
        # want and PRs there are slow)
        pixel_format = av.VideoFormat(pixel_format)
        img_dtype = _format_to_dtype(pixel_format)
        width = frame.shape[2 if pixel_format.is_planar else 1]
        height = frame.shape[1 if pixel_format.is_planar else 0]
        av_frame = av.VideoFrame(width, height, pixel_format.name)
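        # copy the ndarray into the frame plane by plane; the strided views
        # below account for any per-row padding (line_size) FFMPEG may add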
        if pixel_format.is_planar:
            for idx, plane in enumerate(av_frame.planes):
                plane_array = np.frombuffer(plane, dtype=img_dtype)
                plane_array = as_strided(
                    plane_array,
                    shape=(plane.height, plane.width),
                    strides=(plane.line_size, img_dtype.itemsize),
                )
                plane_array[...] = frame[idx]
        else:
            if pixel_format.name.startswith("bayer_"):
                # ffmpeg doesn't describe bayer formats correctly
                # see https://github.com/imageio/imageio/issues/761#issuecomment-1059318851
                # and following for details.
                n_channels = 1
            else:
                n_channels = len(pixel_format.components)

            plane = av_frame.planes[0]
            plane_shape = (plane.height, plane.width)
            plane_strides = (plane.line_size, n_channels * img_dtype.itemsize)
            if n_channels > 1:
                plane_shape += (n_channels,)
                plane_strides += (img_dtype.itemsize,)

            plane_array = as_strided(
                np.frombuffer(plane, dtype=img_dtype),
                shape=plane_shape,
                strides=plane_strides,
            )
            plane_array[...] = frame

        stream = self._video_stream
        av_frame.time_base = stream.codec_context.time_base
        av_frame.pts = self.frames_written
        self.frames_written += 1

        if self._video_filter is not None:
            av_frame = self._video_filter.send(av_frame)
            if av_frame is None:
                return

        if stream.frames == 0:
            stream.width = av_frame.width
            stream.height = av_frame.height

        for packet in stream.encode(av_frame):
            self._container.mux(packet)

    def set_video_filter(
        self,
        filter_sequence: List[Tuple[str, Union[str, dict]]] = None,
        filter_graph: Tuple[dict, List] = None,
    ) -> None:
        """Set the filter(s) to use.

        This function creates a new FFMPEG filter graph to use when reading or
        writing video. In the case of reading, frames are passed through the
        filter graph before being returned and, in the case of writing, frames
        are passed through the filter before being written to the video.

        Parameters
        ----------
        filter_sequence : List[Tuple[str, Union[str, dict]]]
            If not None, apply the given sequence of FFmpeg filters to each
            ndimage. Check the (module-level) plugin docs for details and
            examples.
        filter_graph : Tuple[dict, List]
            If not None, apply the given graph of FFmpeg filters to each
            ndimage. The graph is given as a tuple of two dicts. The first dict
            contains a (named) set of nodes, and the second dict contains a set
            of edges between nodes of the previous dict. Check the
            (module-level) plugin docs for details and examples.

        Notes
        -----
        Changing a filter graph with lag during reading or writing will
        currently cause frames in the filter queue to be lost.

        """

        if filter_sequence is None and filter_graph is None:
            self._video_filter = None
            return

        if filter_sequence is None:
            filter_sequence = list()

        node_descriptors: Dict[str, Tuple[str, Union[str, Dict]]]
        edges: List[Tuple[str, str, int, int]]
        if filter_graph is None:
            node_descriptors, edges = dict(), [("video_in", "video_out", 0, 0)]
        else:
            node_descriptors, edges = filter_graph

        graph = av.filter.Graph()

        previous_node = graph.add_buffer(template=self._video_stream)
        for filter_name, argument in filter_sequence:
            if isinstance(argument, str):
                current_node = graph.add(filter_name, argument)
            else:
                current_node = graph.add(filter_name, **argument)
            previous_node.link_to(current_node)
            previous_node = current_node

        nodes = dict()
        nodes["video_in"] = previous_node
        nodes["video_out"] = graph.add("buffersink")
        for name, (filter_name, arguments) in node_descriptors.items():
            if isinstance(arguments, str):
                nodes[name] = graph.add(filter_name, arguments)
            else:
                nodes[name] = graph.add(filter_name, **arguments)

        for from_node, to_node, out_idx, in_idx in edges:
            nodes[from_node].link_to(nodes[to_node], out_idx, in_idx)

        graph.configure()

        def video_filter():
            # this starts a co-routine
            # send frames using graph.send()
            frame = yield None

            # send and receive frames in "parallel"
            while frame is not None:
                graph.push(frame)
                try:
                    frame = yield graph.pull()
                except av.error.BlockingIOError:
                    # filter has lag and needs more frames
                    frame = yield None
                except av.error.EOFError:
                    break

            try:
                # send EOF in av>=9.0
                graph.push(None)
            except ValueError:  # pragma: no cover
                # handle av<9.0
                pass

            # all frames have been sent, empty the filter
            while True:
                try:
                    yield graph.pull()
                except av.error.EOFError:
                    break  # EOF
                except av.error.BlockingIOError:  # pragma: no cover
                    # handle av<9.0
                    break

        self._video_filter = video_filter()
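        # prime the coroutine so that it runs up to its first ``yield`` and is
        # ready to accept frames via ``send()``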
        self._video_filter.send(None)

    @property
    def container_metadata(self):
        """Container-specific metadata.

        A dictionary containing metadata stored at the container level.

        """
        return self._container.metadata

    @property
    def video_stream_metadata(self):
        """Stream-specific metadata.

        A dictionary containing metadata stored at the stream level.

        """
        return self._video_stream.metadata

    # -------------------------------
    # Internals and private functions
    # -------------------------------

    def _unpack_frame(self, frame: av.VideoFrame, *, format: str = None) -> np.ndarray:
        """Convert an av.VideoFrame into a ndarray

        Parameters
        ----------
        frame : av.VideoFrame
            The frame to unpack.
        format : str
            If not None, convert the frame to the given format before
            unpacking.

        """

        if format is not None:
            frame = frame.reformat(format=format)

        dtype = _format_to_dtype(frame.format)
        shape = _get_frame_shape(frame)

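        # wrap each plane in a strided (zero-copy) view of the frame's buffer,
        # accounting for per-row padding (line_size) and per-plane channels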
        planes = list()
        for idx in range(len(frame.planes)):
            n_channels = sum(
                [
                    x.bits // (dtype.itemsize * 8)
                    for x in frame.format.components
                    if x.plane == idx
                ]
            )
            av_plane = frame.planes[idx]
            plane_shape = (av_plane.height, av_plane.width)
            plane_strides = (av_plane.line_size, n_channels * dtype.itemsize)
            if n_channels > 1:
                plane_shape += (n_channels,)
                plane_strides += (dtype.itemsize,)

            np_plane = as_strided(
                np.frombuffer(av_plane, dtype=dtype),
                shape=plane_shape,
                strides=plane_strides,
            )
            planes.append(np_plane)

        if len(planes) > 1:
            # Note: the planes *should* exist inside a contiguous memory block
            # somewhere inside av.VideoFrame; however, pyAV does not appear to
            # expose this, so we are forced to copy the planes individually
            # instead of wrapping them :(
            out = np.concatenate(planes).reshape(shape)
        else:
            out = planes[0]

        return out

    def _seek(self, index, *, constant_framerate: bool = True) -> None:
        """Seek to the frame at the given index."""

        if index == self._next_idx:
            return  # fast path :)

        # we must decode at least once before we seek; otherwise the
        # returned frames become corrupt.
        if self._next_idx == 0:
            next(self._decoder)
            self._next_idx += 1

        if index == self._next_idx:
            return  # fast path :)

        # remove this branch until we find a way to efficiently find the next
        # keyframe. keeping this as a reminder
        # if self._next_idx < index and index < self._next_keyframe_idx:
        #     frames_to_yield = index - self._next_idx
        if not constant_framerate and index > self._next_idx:
            frames_to_yield = index - self._next_idx
        elif not constant_framerate:
            # seeking backwards, and we can't link index and pts
            self._container.seek(0)
            self._decoder = self._container.decode(video=0)
            self._next_idx = 0

            frames_to_yield = index
        else:
            # we know that the time between consecutive frames is constant,
            # hence we can link index and pts

            # how many pts lie between two frames
            sec_delta = 1 / self._video_stream.guessed_rate
            pts_delta = sec_delta / self._video_stream.time_base

            index_pts = int(index * pts_delta)

            # this only seeks to the closest (preceding) keyframe
            self._container.seek(index_pts, stream=self._video_stream)
            self._decoder = self._container.decode(video=0)

            # this may be made faster if we could get the keyframe's time
            # without decoding it
            keyframe = next(self._decoder)
            keyframe_time = keyframe.pts * keyframe.time_base
            keyframe_pts = int(keyframe_time / self._video_stream.time_base)
            keyframe_index = keyframe_pts // pts_delta

            self._container.seek(index_pts, stream=self._video_stream)
            self._next_idx = keyframe_index

            frames_to_yield = index - keyframe_index

        for _ in range(frames_to_yield):
            next(self._decoder)
            self._next_idx += 1

    def _flush_writer(self):
        """Flush the filter and encoder

        This will reset the filter to `None` and send EOF to the encoder,
        i.e., after calling, no more frames may be written.

        """

        stream = self._video_stream

        if self._video_filter is not None:
            # flush the filter pipeline and encode any remaining frames
            for av_frame in self._video_filter:
                if stream.frames == 0:
                    stream.width = av_frame.width
                    stream.height = av_frame.height
                for packet in stream.encode(av_frame):
                    self._container.mux(packet)
            self._video_filter = None

        # flush stream
        for packet in stream.encode():
            self._container.mux(packet)
        self._video_stream = None