1"""Read/Write Videos (and images) using PyAV.
2
3.. note::
4 To use this plugin you need to have `PyAV <https://pyav.org/docs/stable/>`_
5 installed::
6
7 pip install av
8
9This plugin wraps pyAV, a pythonic binding for the FFMPEG library. It is similar
10to our FFMPEG plugin, has improved performance, features a robust interface, and
11aims to supersede the FFMPEG plugin in the future.
12
13
14Methods
15-------
16.. note::
17 Check the respective function for a list of supported kwargs and detailed
18 documentation.
19
20.. autosummary::
21 :toctree:
22
23 PyAVPlugin.read
24 PyAVPlugin.iter
25 PyAVPlugin.write
26 PyAVPlugin.properties
27 PyAVPlugin.metadata
28
29Additional methods available inside the :func:`imopen <imageio.v3.imopen>`
30context:
31
32.. autosummary::
33 :toctree:
34
35 PyAVPlugin.init_video_stream
36 PyAVPlugin.write_frame
37 PyAVPlugin.set_video_filter
38 PyAVPlugin.container_metadata
39 PyAVPlugin.video_stream_metadata
40
41Advanced API
42------------
43
44In addition to the default ImageIO v3 API this plugin exposes custom functions
45that are specific to reading/writing video and its metadata. These are available
46inside the :func:`imopen <imageio.v3.imopen>` context and allow fine-grained
47control over how the video is processed. The functions are documented above and
48below you can find a usage example::
49
50 import imageio.v3 as iio
51
52 with iio.imopen("test.mp4", "w", plugin="pyav") as file:
53 file.init_video_stream("libx264")
54 file.container_metadata["comment"] = "This video was created using ImageIO."
55
56 for _ in range(5):
57 for frame in iio.imiter("imageio:newtonscradle.gif"):
58 file.write_frame(frame)
59
60 meta = iio.immeta("test.mp4", plugin="pyav")
61 assert meta["comment"] == "This video was created using ImageIO."
62
63
64
Pixel Formats (Colorspaces)
---------------------------

By default, this plugin converts the video into 8-bit RGB (called ``rgb24`` in
ffmpeg). This is a useful behavior for many use-cases, but sometimes you may
want to use the video's native colorspace or you may wish to convert the video
into an entirely different colorspace. This is controlled using the ``format``
kwarg. You can use ``format=None`` to leave the image in its native colorspace
or specify any colorspace supported by FFMPEG as long as it is stridable, i.e.,
as long as it can be represented by a single numpy array. Some useful choices
include:

- rgb24 (default; 8-bit RGB)
- rgb48le (16-bit little-endian RGB)
- bgr24 (8-bit BGR; OpenCV's default colorspace)
- gray (8-bit grayscale)
- yuv444p (8-bit channel-first YUV)

Further, FFMPEG maintains a list of available formats, albeit not as part of the
narrative docs. It can be `found here
<https://ffmpeg.org/doxygen/trunk/pixfmt_8h_source.html>`_ (warning: C source
code).
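
For example, to read a single frame in grayscale rather than the default
``rgb24`` (a minimal sketch; ``imageio:cockatoo.mp4`` is one of ImageIO's
bundled sample videos)::

    import imageio.v3 as iio

    # decode the first frame into 8-bit grayscale instead of rgb24
    frame = iio.imread(
        "imageio:cockatoo.mp4", index=0, plugin="pyav", format="gray"
    )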

Filters
-------

On top of providing basic read/write functionality, this plugin allows you to
use the full collection of `video filters available in FFMPEG
<https://ffmpeg.org/ffmpeg-filters.html#Video-Filters>`_. This means that you
can apply extensive preprocessing to your video before retrieving it as a numpy
array or apply extensive post-processing before you encode your data.

Filters come in two forms: sequences or graphs. Filter sequences are, as the
name suggests, sequences of filters that are applied one after the other. They
are specified using the ``filter_sequence`` kwarg. Filter graphs, on the other
hand, come in the form of a directed graph and are specified using the
``filter_graph`` kwarg.

.. note::
    All filters are either sequences or graphs. If all you want is to apply a
    single filter, you can do this by specifying a filter sequence with a single
    entry.

A ``filter_sequence`` is a list of filters, each defined through a 2-element
tuple of the form ``(filter_name, filter_parameters)``. The first element of the
tuple is the name of the filter. The second element contains the filter
parameters, which can be given either as a string or a dict. The string matches
the format that you would use when specifying the filter using the ffmpeg
command-line tool, and the dict has entries of the form ``parameter:value``. For
example::

    import imageio.v3 as iio

    # using a filter_parameters str
    img1 = iio.imread(
        "imageio:cockatoo.mp4",
        plugin="pyav",
        filter_sequence=[
            ("rotate", "45*PI/180")
        ]
    )

    # using a filter_parameters dict
    img2 = iio.imread(
        "imageio:cockatoo.mp4",
        plugin="pyav",
        filter_sequence=[
            ("rotate", {"angle": "45*PI/180", "fillcolor": "AliceBlue"})
        ]
    )

A ``filter_graph``, on the other hand, is specified using a ``(nodes, edges)``
tuple. It is best explained using an example::

    img = iio.imread(
        "imageio:cockatoo.mp4",
        plugin="pyav",
        filter_graph=(
            {
                "split": ("split", ""),
                "scale_overlay": ("scale", "512:-1"),
                "overlay": ("overlay", "x=25:y=25:enable='between(t,1,8)'"),
            },
            [
                ("video_in", "split", 0, 0),
                ("split", "overlay", 0, 0),
                ("split", "scale_overlay", 1, 0),
                ("scale_overlay", "overlay", 0, 1),
                ("overlay", "video_out", 0, 0),
            ]
        )
    )

The above transforms the video to have a picture-in-picture of itself in the
top left corner. As you can see, nodes are specified using a dict which has
names as its keys and filter tuples as values; the same tuples as the ones used
when defining a filter sequence. Edges are a list of 4-tuples of the form
``(node_out, node_in, output_idx, input_idx)`` that specify which two filters
are connected and which of their inputs/outputs should be used.

Further, there are two special nodes in a filter graph: ``video_in`` and
``video_out``, which represent the graph's input and output respectively. These
names cannot be used for other nodes (such nodes would simply be overwritten),
and for a graph to be valid there must be a path from the input to the output
and all nodes in the graph must be connected.

While most graphs are quite simple, they can become very complex, and we
recommend that you read through the `FFMPEG documentation
<https://ffmpeg.org/ffmpeg-filters.html#Filtergraph-description>`_ and their
examples to better understand how to use them.

"""

from fractions import Fraction
from math import ceil
from typing import Any, Dict, List, Optional, Tuple, Union

import av
import av.filter
import numpy as np
from av.codec.context import Flags
from numpy.lib.stride_tricks import as_strided

from ..core import Request
from ..core.request import URI_BYTES, URI_HTTP, InitializationError, IOMode
from ..core.v3_plugin_api import ImageProperties, PluginV3


def _format_to_dtype(format: av.VideoFormat) -> np.dtype:
    """Convert a pyAV video format into a numpy dtype"""

    if len(format.components) == 0:
        # fake format
        raise ValueError(
            f"Can't determine dtype from format `{format.name}`. It has no channels."
        )

    endian = ">" if format.is_big_endian else "<"
    dtype = "f" if "f32" in format.name else "u"
    bits_per_channel = [x.bits for x in format.components]
    n_bytes = str(int(ceil(bits_per_channel[0] / 8)))

    return np.dtype(endian + dtype + n_bytes)


def _get_frame_shape(frame: av.VideoFrame) -> Tuple[int, ...]:
    """Compute the frame's array shape

    Parameters
    ----------
    frame : av.VideoFrame
        A frame for which the resulting shape should be computed.

    Returns
    -------
    shape : Tuple[int, ...]
        A tuple describing the shape of the image data in the frame.

    """

    widths = [component.width for component in frame.format.components]
    heights = [component.height for component in frame.format.components]
    bits = np.array([component.bits for component in frame.format.components])
    line_sizes = [plane.line_size for plane in frame.planes]

    subsampled_width = widths[:-1] != widths[1:]
    subsampled_height = heights[:-1] != heights[1:]
    unaligned_components = np.any(bits % 8 != 0) or (line_sizes[:-1] != line_sizes[1:])
    if subsampled_width or subsampled_height or unaligned_components:
        raise IOError(
            f"{frame.format.name} can't be expressed as a strided array. "
            "Use `format=` to select a format to convert into."
        )

    shape = [frame.height, frame.width]

    # ffmpeg doesn't have a notion of channel-first or channel-last formats.
    # Instead, it stores frames in one or more planes which contain individual
    # components of a pixel depending on the pixel format. For channel-first
    # formats each component lives on a separate plane (n_planes) and for
    # channel-last formats all components are packed on a single plane
    # (n_channels).
    n_planes = max([component.plane for component in frame.format.components]) + 1
    if n_planes > 1:
        shape = [n_planes] + shape

    channels_per_plane = [0] * n_planes
    for component in frame.format.components:
        channels_per_plane[component.plane] += 1
    n_channels = max(channels_per_plane)

    if n_channels > 1:
        shape = shape + [n_channels]

    return tuple(shape)


def _get_frame_type(picture_type: int) -> str:
    """Return a human-readable name for the provided picture type

    Parameters
    ----------
    picture_type : int
        The picture type extracted from Frame.pict_type

    Returns
    -------
    picture_name : str
        A human-readable name of the picture type

    """

    if not isinstance(picture_type, int):
        # old pyAV versions send an enum, not an int
        return picture_type.name

    picture_types = [
        "NONE",
        "I",
        "P",
        "B",
        "S",
        "SI",
        "SP",
        "BI",
    ]

    return picture_types[picture_type]


class PyAVPlugin(PluginV3):
    """Support for pyAV as backend.

    Parameters
    ----------
    request : iio.Request
        A request object that represents the user's intent. It provides a
        standard interface to access the various ImageResources and serves
        them to the plugin as a file object (or file). Check the docs for
        details.
    container : str
        Only used when ``iio_mode="w"``! If not None, overwrite the default
        container format chosen by pyav.
    kwargs : Any
        Additional kwargs are forwarded to PyAV's constructor.

    """

    def __init__(self, request: Request, *, container: str = None, **kwargs) -> None:
        """Initialize a new Plugin Instance.

        See Plugin's docstring for detailed documentation.

        Notes
        -----
        The implementation here stores the request as a local variable that is
        exposed using a @property below. If you inherit from PluginV3, remember
        to call ``super().__init__(request)``.

        """

        super().__init__(request)

        self._container = None
        self._video_stream = None
        self._video_filter = None

        if request.mode.io_mode == IOMode.read:
            self._next_idx = 0
            try:
                if request._uri_type == URI_HTTP:
                    # pyav should read from HTTP by itself. This enables reading
                    # HTTP-based streams like DASH. Note that solving streams
                    # like this is temporary until the new request object gets
                    # implemented.
                    self._container = av.open(request.raw_uri, **kwargs)
                else:
                    self._container = av.open(request.get_file(), **kwargs)
                self._video_stream = self._container.streams.video[0]
                self._decoder = self._container.decode(video=0)
            except av.FFmpegError:
                if isinstance(request.raw_uri, bytes):
                    msg = "PyAV does not support these `<bytes>`"
                else:
                    msg = f"PyAV does not support `{request.raw_uri}`"
                raise InitializationError(msg) from None
        else:
            self.frames_written = 0
            file_handle = self.request.get_file()
            filename = getattr(file_handle, "name", None)
            extension = self.request.extension or self.request.format_hint
            if extension is None:
                raise InitializationError("Can't determine output container to use.")

            # hacky, but beats running our own format selection logic
            # (since av_guess_format is not exposed)
            try:
                setattr(file_handle, "name", filename or "tmp" + extension)
            except AttributeError:
                pass  # read-only, nothing we can do

            try:
                self._container = av.open(
                    file_handle, mode="w", format=container, **kwargs
                )
            except ValueError:
                raise InitializationError(
                    f"PyAV cannot write to `{self.request.raw_uri}`"
                )

    # ---------------------
    # Standard V3 Interface
    # ---------------------

    def read(
        self,
        *,
        index: int = ...,
        format: str = "rgb24",
        filter_sequence: List[Tuple[str, Union[str, dict]]] = None,
        filter_graph: Tuple[dict, List] = None,
        constant_framerate: bool = None,
        thread_count: int = 0,
        thread_type: str = None,
    ) -> np.ndarray:
        """Read frames from the video.

        If ``index`` is an integer, this function reads the index-th frame from
        the file. If ``index`` is ... (Ellipsis), this function reads all frames
        from the video, stacks them along the first dimension, and returns a
        batch of frames.

        Parameters
        ----------
        index : int
            The index of the frame to read, e.g. ``index=5`` reads the 5th
            frame. If ``...``, read all the frames in the video and stack them
            along a new, prepended, batch dimension.
        format : str
            Set the returned colorspace. If not None (default: rgb24), convert
            the data into the given format before returning it. If ``None``,
            return the data in the encoded format if it can be expressed as a
            strided array; otherwise raise an Exception.
        filter_sequence : List[Tuple[str, Union[str, dict]]]
            If not None, apply the given sequence of FFmpeg filters to each
            ndimage. Check the (module-level) plugin docs for details and
            examples.
        filter_graph : Tuple[dict, List]
            If not None, apply the given graph of FFmpeg filters to each
            ndimage. The graph is given as a tuple of a dict and a list. The
            dict contains a (named) set of nodes, and the list contains a set
            of edges between nodes of the previous dict. Check the
            (module-level) plugin docs for details and examples.
        constant_framerate : bool
            If True, assume the video's framerate is constant. This allows for
            faster seeking inside the file. If False, the video is reset before
            each read and searched from the beginning. If None (default), this
            value will be read from the container format.
        thread_count : int
            How many threads to use when decoding a frame. The default is 0,
            which will set the number using ffmpeg's default, which is based on
            the codec, number of available cores, threading model, and other
            considerations.
        thread_type : str
            The threading model to be used. One of

            - `"SLICE"`: threads assemble parts of the current frame
            - `"FRAME"`: threads may assemble future frames
            - None (default): Uses ``"FRAME"`` if ``index=...`` and ffmpeg's
              default otherwise.


        Returns
        -------
        frame : np.ndarray
            A numpy array containing loaded frame data.

        Notes
        -----
        Accessing random frames repeatedly is costly (O(k), where k is the
        average distance between two keyframes). You should do so only sparingly
        if possible. In some cases, it can be faster to bulk-read the video (if
        it fits into memory) and to then access the returned ndarray randomly.

        The current implementation may cause problems for b-frames, i.e.,
        bidirectionally predicted pictures. We lack test videos to write unit
        tests for this case.

        Reading from an index other than ``...``, i.e. reading a single frame,
        currently doesn't support filters that introduce delays.

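        Examples
        --------
        A minimal usage sketch (``imageio:cockatoo.mp4`` is one of ImageIO's
        bundled sample videos)::

            import imageio.v3 as iio

            # read a single frame
            frame = iio.imread("imageio:cockatoo.mp4", index=5, plugin="pyav")

            # bulk-read all frames
            frames = iio.imread("imageio:cockatoo.mp4", plugin="pyav")
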
        """

        if index is ...:
            props = self.properties(format=format)
            uses_filter = (
                self._video_filter is not None
                or filter_graph is not None
                or filter_sequence is not None
            )

            self._container.seek(0)
            if not uses_filter and props.shape[0] != 0:
                frames = np.empty(props.shape, dtype=props.dtype)
                for idx, frame in enumerate(
                    self.iter(
                        format=format,
                        filter_sequence=filter_sequence,
                        filter_graph=filter_graph,
                        thread_count=thread_count,
                        thread_type=thread_type or "FRAME",
                    )
                ):
                    frames[idx] = frame
            else:
                frames = np.stack(
                    [
                        x
                        for x in self.iter(
                            format=format,
                            filter_sequence=filter_sequence,
                            filter_graph=filter_graph,
                            thread_count=thread_count,
                            thread_type=thread_type or "FRAME",
                        )
                    ]
                )

            # reset the stream, because the threading model can't change after
            # the first access
            self._video_stream = self._container.streams.video[0]

            return frames

        if thread_type is not None and not (
            self._video_stream.thread_type == thread_type
            or self._video_stream.thread_type.name == thread_type
        ):
            self._video_stream.thread_type = thread_type

        if (
            thread_count != 0
            and thread_count != self._video_stream.codec_context.thread_count
        ):
            # in FFMPEG thread_count == 0 means use the default count, which we
            # change to mean don't change the thread count.
            self._video_stream.codec_context.thread_count = thread_count

        if constant_framerate is None:
            # "variable_fps" is now a flag (the handle got removed); 0x400 is
            # AVFMT_VARIABLE_FPS. Full list at
            # https://pyav.org/docs/stable/api/container.html#module-av.format
            variable_fps = bool(self._container.format.flags & 0x400)
            constant_framerate = not variable_fps

        # note: cheap for contiguous incremental reads
        self._seek(index, constant_framerate=constant_framerate)
        desired_frame = next(self._decoder)
        self._next_idx += 1

        self.set_video_filter(filter_sequence, filter_graph)
        if self._video_filter is not None:
            desired_frame = self._video_filter.send(desired_frame)

        return self._unpack_frame(desired_frame, format=format)

    def iter(
        self,
        *,
        format: str = "rgb24",
        filter_sequence: List[Tuple[str, Union[str, dict]]] = None,
        filter_graph: Tuple[dict, List] = None,
        thread_count: int = 0,
        thread_type: str = None,
    ) -> np.ndarray:
        """Yield frames from the video.

        Parameters
        ----------
        format : str
            Set the returned colorspace. If not None (default: rgb24), convert
            the data into the given format before returning it. If ``None``,
            return the data in the encoded format if it can be expressed as a
            strided array; otherwise raise an Exception.
        filter_sequence : List[Tuple[str, Union[str, dict]]]
            If not None, apply the given sequence of FFmpeg filters to each
            ndimage. Check the (module-level) plugin docs for details and
            examples.
        filter_graph : Tuple[dict, List]
            If not None, apply the given graph of FFmpeg filters to each
            ndimage. The graph is given as a tuple of a dict and a list. The
            dict contains a (named) set of nodes, and the list contains a set
            of edges between nodes of the previous dict. Check the
            (module-level) plugin docs for details and examples.
        thread_count : int
            How many threads to use when decoding a frame. The default is 0,
            which will set the number using ffmpeg's default, which is based on
            the codec, number of available cores, threading model, and other
            considerations.
        thread_type : str
            The threading model to be used. One of

            - `"SLICE"` (default): threads assemble parts of the current frame
            - `"FRAME"`: threads may assemble future frames (faster for bulk
              reading)


        Yields
        ------
        frame : np.ndarray
            A (decoded) video frame.

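        Examples
        --------
        A minimal sketch iterating over a sample video's frames::

            import imageio.v3 as iio

            for frame in iio.imiter("imageio:cockatoo.mp4", plugin="pyav"):
                print(frame.shape)  # e.g. (720, 1280, 3)
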
        """

        self._video_stream.thread_type = thread_type or "SLICE"
        self._video_stream.codec_context.thread_count = thread_count

        self.set_video_filter(filter_sequence, filter_graph)

        for frame in self._decoder:
            self._next_idx += 1

            if self._video_filter is not None:
                try:
                    frame = self._video_filter.send(frame)
                except StopIteration:
                    break

            if frame is None:
                continue

            yield self._unpack_frame(frame, format=format)

        if self._video_filter is not None:
            for frame in self._video_filter:
                yield self._unpack_frame(frame, format=format)

    def write(
        self,
        ndimage: Union[np.ndarray, List[np.ndarray]],
        *,
        codec: str = None,
        is_batch: bool = True,
        fps: int = 24,
        in_pixel_format: str = "rgb24",
        out_pixel_format: str = None,
        filter_sequence: List[Tuple[str, Union[str, dict]]] = None,
        filter_graph: Tuple[dict, List] = None,
    ) -> Optional[bytes]:
        """Save a ndimage as a video.

        Given a batch of frames (stacked along the first axis) or a list of
        frames, encode them and add the result to the ImageResource.

        Parameters
        ----------
        ndimage : ArrayLike, List[ArrayLike]
            The ndimage to encode and write to the ImageResource.
        codec : str
            The codec to use when encoding frames. Only needed on the first
            write and ignored on subsequent writes.
        is_batch : bool
            If True (default), the ndimage is a batch of images, otherwise it is
            a single image. This parameter has no effect on lists of ndimages.
        fps : int
            The resulting video's frames per second.
        in_pixel_format : str
            The pixel format of the incoming ndarray. Defaults to "rgb24" and can
            be any stridable pix_fmt supported by FFmpeg.
        out_pixel_format : str
            The pixel format to use while encoding frames. If None (default),
            use the codec's default.
        filter_sequence : List[Tuple[str, Union[str, dict]]]
            If not None, apply the given sequence of FFmpeg filters to each
            ndimage. Check the (module-level) plugin docs for details and
            examples.
        filter_graph : Tuple[dict, List]
            If not None, apply the given graph of FFmpeg filters to each
            ndimage. The graph is given as a tuple of a dict and a list. The
            dict contains a (named) set of nodes, and the list contains a set
            of edges between nodes of the previous dict. Check the
            (module-level) plugin docs for details and examples.

        Returns
        -------
        encoded_image : bytes or None
            If the chosen ImageResource is the special target ``"<bytes>"`` then
            write will return a byte string containing the encoded image data.
            Otherwise, it returns None.

        Notes
        -----
        When writing ``<bytes>``, the video is finalized immediately after the
        first write call and calling write multiple times to append frames is
        not possible.

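        Examples
        --------
        A minimal sketch writing random frames to an MP4 (``test.mp4`` is a
        hypothetical output path)::

            import numpy as np
            import imageio.v3 as iio

            frames = np.random.randint(0, 255, (20, 64, 64, 3), dtype=np.uint8)
            iio.imwrite("test.mp4", frames, plugin="pyav", codec="libx264")
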
        """

        if isinstance(ndimage, list):
            # frame shapes must agree for video
            if any(f.shape != ndimage[0].shape for f in ndimage):
                raise ValueError("All frames should have the same shape")
        elif not is_batch:
            ndimage = np.asarray(ndimage)[None, ...]
        else:
            ndimage = np.asarray(ndimage)

        if self._video_stream is None:
            self.init_video_stream(codec, fps=fps, pixel_format=out_pixel_format)

        self.set_video_filter(filter_sequence, filter_graph)

        for img in ndimage:
            self.write_frame(img, pixel_format=in_pixel_format)

        if self.request._uri_type == URI_BYTES:
            # bytes are immutable, so we have to flush immediately
            # and can't support appending
            self._flush_writer()
            self._container.close()

            return self.request.get_file().getvalue()

    def properties(self, index: int = ..., *, format: str = "rgb24") -> ImageProperties:
        """Standardized ndimage metadata.

        Parameters
        ----------
        index : int
            The index of the ndimage for which to return properties. If ``...``
            (Ellipsis, default), return the properties for the resulting batch
            of frames.
        format : str
            If not None (default: rgb24), convert the data into the given format
            before returning it. If None, return the data in the encoded format
            if that can be expressed as a strided array; otherwise raise an
            Exception.

        Returns
        -------
        properties : ImageProperties
            A dataclass filled with standardized image metadata.

        Notes
        -----
        This function is efficient and won't process any pixel data.

        The provided metadata does not include modifications by any filters
        (through ``filter_sequence`` or ``filter_graph``).

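        Examples
        --------
        A minimal sketch inspecting a sample video's shape and dtype::

            import imageio.v3 as iio

            props = iio.improps("imageio:cockatoo.mp4", plugin="pyav")
            print(props.shape, props.dtype)
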
        """

        video_width = self._video_stream.codec_context.width
        video_height = self._video_stream.codec_context.height
        pix_format = format or self._video_stream.codec_context.pix_fmt
        frame_template = av.VideoFrame(video_width, video_height, pix_format)

        shape = _get_frame_shape(frame_template)
        if index is ...:
            n_frames = self._video_stream.frames
            shape = (n_frames,) + shape

        return ImageProperties(
            shape=tuple(shape),
            dtype=_format_to_dtype(frame_template.format),
            n_images=shape[0] if index is ... else None,
            is_batch=index is ...,
        )

    def metadata(
        self,
        index: int = ...,
        exclude_applied: bool = True,
        constant_framerate: bool = None,
    ) -> Dict[str, Any]:
        """Format-specific metadata.

        Returns a dictionary filled with metadata that is either stored in the
        container, the video stream, or the frame's side-data.

        Parameters
        ----------
        index : int
            If ... (Ellipsis, default), return global metadata (the metadata
            stored in the container and video stream). If not ..., return the
            side data stored in the frame at the given index.
        exclude_applied : bool
            Currently, this parameter has no effect. It exists for compliance
            with the ImageIO v3 API.
        constant_framerate : bool
            If True, assume the video's framerate is constant. This allows for
            faster seeking inside the file. If False, the video is reset before
            each read and searched from the beginning. If None (default), this
            value will be read from the container format.

        Returns
        -------
        metadata : dict
            A dictionary filled with format-specific metadata fields and their
            values.

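        Examples
        --------
        A minimal sketch reading a sample video's global metadata::

            import imageio.v3 as iio

            meta = iio.immeta("imageio:cockatoo.mp4", plugin="pyav")
            print(meta["fps"], meta["codec"])
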
        """

        metadata = dict()

        if index is ...:
            # useful flags defined on the container and/or video stream
            metadata.update(
                {
                    "video_format": self._video_stream.codec_context.pix_fmt,
                    "codec": self._video_stream.codec.name,
                    "long_codec": self._video_stream.codec.long_name,
                    "profile": self._video_stream.profile,
                    "fps": float(self._video_stream.guessed_rate),
                }
            )
            if self._video_stream.duration is not None:
                duration = float(
                    self._video_stream.duration * self._video_stream.time_base
                )
                metadata.update({"duration": duration})

            metadata.update(self.container_metadata)
            metadata.update(self.video_stream_metadata)
            return metadata

        if constant_framerate is None:
            # "variable_fps" is now a flag (the handle got removed); 0x400 is
            # AVFMT_VARIABLE_FPS. Full list at
            # https://pyav.org/docs/stable/api/container.html#module-av.format
            variable_fps = bool(self._container.format.flags & 0x400)
            constant_framerate = not variable_fps

        self._seek(index, constant_framerate=constant_framerate)
        desired_frame = next(self._decoder)
        self._next_idx += 1

        # useful flags defined on the frame
        metadata.update(
            {
                "key_frame": bool(desired_frame.key_frame),
                "time": desired_frame.time,
                "interlaced_frame": bool(desired_frame.interlaced_frame),
                "frame_type": _get_frame_type(desired_frame.pict_type),
            }
        )

        # side data
        metadata.update(
            {item.type.name: bytes(item) for item in desired_frame.side_data}
        )

        return metadata

    def close(self) -> None:
        """Close the Video."""

        is_write = self.request.mode.io_mode == IOMode.write
        if is_write and self._video_stream is not None:
            self._flush_writer()

        if self._video_stream is not None:
            self._video_stream = None

        if self._container is not None:
            self._container.close()

        self.request.finish()

    def __enter__(self) -> "PyAVPlugin":
        return super().__enter__()

    # ------------------------------
    # Add-on Interface inside imopen
    # ------------------------------

    def init_video_stream(
        self,
        codec: str,
        *,
        fps: float = 24,
        pixel_format: str = None,
        max_keyframe_interval: int = None,
        force_keyframes: bool = None,
    ) -> None:
        """Initialize a new video stream.

        This function adds a new video stream to the ImageResource using the
        selected encoder (codec), framerate, and colorspace.

        Parameters
        ----------
        codec : str
            The codec to use, e.g. ``"h264"`` or ``"vp9"``.
        fps : float
            The desired framerate of the video stream (frames per second).
        pixel_format : str
            The pixel format to use while encoding frames. If None (default),
            use the codec's default.
        max_keyframe_interval : int
            The maximum distance between two intra frames (I-frames). Also known
            as GOP size. If unspecified, use the codec's default. Note that not
            every I-frame is a keyframe; see the notes for details.
        force_keyframes : bool
            If True, limit inter-frame dependency to frames within the current
            keyframe interval (GOP), i.e., force every I-frame to be a keyframe.
            If unspecified, use the codec's default.

        Notes
        -----
        You can usually leave ``max_keyframe_interval`` and ``force_keyframes``
        at their default values, unless you try to generate seek-optimized video
        or have a similar specialist use-case. In this case, ``force_keyframes``
        controls the ability to seek to *every* I-frame, and
        ``max_keyframe_interval`` controls how close to a random frame you can
        seek. Low values allow more fine-grained seek at the expense of
        file-size (and thus I/O performance).

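        Examples
        --------
        A minimal sketch creating a seek-optimized stream (the parameter values
        are illustrative)::

            import imageio.v3 as iio

            with iio.imopen("test.mp4", "w", plugin="pyav") as file:
                file.init_video_stream(
                    "libx264",
                    fps=30,
                    max_keyframe_interval=4,
                    force_keyframes=True,
                )
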
        """

        fps = Fraction.from_float(fps)
        stream = self._container.add_stream(codec, fps)
        stream.time_base = Fraction(1 / fps).limit_denominator(int(2**16 - 1))
        if pixel_format is not None:
            stream.pix_fmt = pixel_format
        if max_keyframe_interval is not None:
            stream.gop_size = max_keyframe_interval
        if force_keyframes is not None:
            if force_keyframes:
                stream.codec_context.flags |= Flags.closed_gop
            else:
                stream.codec_context.flags &= ~Flags.closed_gop

        self._video_stream = stream

    def write_frame(self, frame: np.ndarray, *, pixel_format: str = "rgb24") -> None:
        """Add a frame to the video stream.

        This function appends a new frame to the video. It assumes that the
        stream has previously been initialized, i.e., ``init_video_stream`` has
        to be called before calling this function for the write to succeed.

        Parameters
        ----------
        frame : np.ndarray
            The image to be appended/written to the video stream.
        pixel_format : str
            The colorspace (pixel format) of the incoming frame.

        Notes
        -----
        Frames may be held in a buffer, e.g., by the filter pipeline used during
        writing or by FFMPEG to batch them prior to encoding. Make sure to
        ``.close()`` the plugin or to use a context manager to ensure that all
        frames are written to the ImageResource.

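        Examples
        --------
        A minimal sketch inside the :func:`imopen <imageio.v3.imopen>` context
        (any ``(height, width, 3)`` uint8 array works as an rgb24 frame)::

            import numpy as np
            import imageio.v3 as iio

            with iio.imopen("test.mp4", "w", plugin="pyav") as file:
                file.init_video_stream("libx264")
                for _ in range(10):
                    file.write_frame(np.zeros((64, 64, 3), dtype=np.uint8))
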
        """

        # manual packing of ndarray into frame
        # (this should live in pyAV, but it doesn't support all the formats we
        # want and PRs there are slow)
        pixel_format = av.VideoFormat(pixel_format)
        img_dtype = _format_to_dtype(pixel_format)
        width = frame.shape[2 if pixel_format.is_planar else 1]
        height = frame.shape[1 if pixel_format.is_planar else 0]
        av_frame = av.VideoFrame(width, height, pixel_format.name)
        if pixel_format.is_planar:
            for idx, plane in enumerate(av_frame.planes):
                plane_array = np.frombuffer(plane, dtype=img_dtype)
                plane_array = as_strided(
                    plane_array,
                    shape=(plane.height, plane.width),
                    strides=(plane.line_size, img_dtype.itemsize),
                )
                plane_array[...] = frame[idx]
        else:
            if pixel_format.name.startswith("bayer_"):
                # ffmpeg doesn't describe bayer formats correctly
                # see https://github.com/imageio/imageio/issues/761#issuecomment-1059318851
                # and following for details.
                n_channels = 1
            else:
                n_channels = len(pixel_format.components)

            plane = av_frame.planes[0]
            plane_shape = (plane.height, plane.width)
            plane_strides = (plane.line_size, n_channels * img_dtype.itemsize)
            if n_channels > 1:
                plane_shape += (n_channels,)
                plane_strides += (img_dtype.itemsize,)

            plane_array = as_strided(
                np.frombuffer(plane, dtype=img_dtype),
                shape=plane_shape,
                strides=plane_strides,
            )
            plane_array[...] = frame

        stream = self._video_stream
        av_frame.time_base = stream.codec_context.time_base
        av_frame.pts = self.frames_written
        self.frames_written += 1

        if self._video_filter is not None:
            av_frame = self._video_filter.send(av_frame)
            if av_frame is None:
                return

        if stream.frames == 0:
            stream.width = av_frame.width
            stream.height = av_frame.height

        for packet in stream.encode(av_frame):
            self._container.mux(packet)

    def set_video_filter(
        self,
        filter_sequence: List[Tuple[str, Union[str, dict]]] = None,
        filter_graph: Tuple[dict, List] = None,
    ) -> None:
        """Set the filter(s) to use.

        This function creates a new FFMPEG filter graph to use when reading or
        writing video. In the case of reading, frames are passed through the
        filter graph before being returned and, in the case of writing, frames
        are passed through the filter before being written to the video.

        Parameters
        ----------
        filter_sequence : List[Tuple[str, Union[str, dict]]]
            If not None, apply the given sequence of FFmpeg filters to each
            ndimage. Check the (module-level) plugin docs for details and
            examples.
        filter_graph : Tuple[dict, List]
            If not None, apply the given graph of FFmpeg filters to each
            ndimage. The graph is given as a tuple of a dict and a list. The
            dict contains a (named) set of nodes, and the list contains a set
            of edges between nodes of the previous dict. Check the
            (module-level) plugin docs for details and examples.

        Notes
        -----
        Changing a filter graph with lag during reading or writing will
        currently cause frames in the filter queue to be lost.

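        Examples
        --------
        A minimal sketch inside the :func:`imopen <imageio.v3.imopen>` context,
        mirroring what ``write`` does internally when given a
        ``filter_sequence`` (the output path and frame data are illustrative)::

            import numpy as np
            import imageio.v3 as iio

            with iio.imopen("test.mp4", "w", plugin="pyav") as file:
                file.init_video_stream("libx264")
                # flip each frame horizontally before it is encoded
                file.set_video_filter(filter_sequence=[("hflip", "")])
                for _ in range(10):
                    file.write_frame(np.zeros((64, 64, 3), dtype=np.uint8))
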
        """

        if filter_sequence is None and filter_graph is None:
            self._video_filter = None
            return

        if filter_sequence is None:
            filter_sequence = list()

        node_descriptors: Dict[str, Tuple[str, Union[str, Dict]]]
        edges: List[Tuple[str, str, int, int]]
        if filter_graph is None:
            node_descriptors, edges = dict(), [("video_in", "video_out", 0, 0)]
        else:
            node_descriptors, edges = filter_graph

        graph = av.filter.Graph()

        previous_node = graph.add_buffer(template=self._video_stream)
        for filter_name, argument in filter_sequence:
            if isinstance(argument, str):
                current_node = graph.add(filter_name, argument)
            else:
                current_node = graph.add(filter_name, **argument)
            previous_node.link_to(current_node)
            previous_node = current_node

        nodes = dict()
        nodes["video_in"] = previous_node
        nodes["video_out"] = graph.add("buffersink")
        for name, (filter_name, arguments) in node_descriptors.items():
            if isinstance(arguments, str):
                nodes[name] = graph.add(filter_name, arguments)
            else:
                nodes[name] = graph.add(filter_name, **arguments)

        for from_node, to_node, out_idx, in_idx in edges:
            nodes[from_node].link_to(nodes[to_node], out_idx, in_idx)

        graph.configure()

        def video_filter():
            # this starts a co-routine
            # send frames using graph.send()
            frame = yield None

            # send and receive frames in "parallel"
            while frame is not None:
                graph.push(frame)
                try:
                    frame = yield graph.pull()
                except av.error.BlockingIOError:
                    # filter has lag and needs more frames
                    frame = yield None
                except av.error.EOFError:
                    break

            try:
                # send EOF in av>=9.0
                graph.push(None)
            except ValueError:  # pragma: no cover
                # handle av<9.0
                pass

            # all frames have been sent, empty the filter
            while True:
                try:
                    yield graph.pull()
                except av.error.EOFError:
                    break  # EOF
                except av.error.BlockingIOError:  # pragma: no cover
                    # handle av<9.0
                    break

        self._video_filter = video_filter()
        self._video_filter.send(None)

    @property
    def container_metadata(self):
        """Container-specific metadata.

        A dictionary containing metadata stored at the container level.

        """
        return self._container.metadata

    @property
    def video_stream_metadata(self):
        """Stream-specific metadata.

        A dictionary containing metadata stored at the stream level.

        """
        return self._video_stream.metadata

    # -------------------------------
    # Internals and private functions
    # -------------------------------

    def _unpack_frame(self, frame: av.VideoFrame, *, format: str = None) -> np.ndarray:
        """Convert an av.VideoFrame into a ndarray

        Parameters
        ----------
        frame : av.VideoFrame
            The frame to unpack.
        format : str
            If not None, convert the frame to the given format before unpacking.

        """

        if format is not None:
            frame = frame.reformat(format=format)

        dtype = _format_to_dtype(frame.format)
        shape = _get_frame_shape(frame)

        planes = list()
        for idx in range(len(frame.planes)):
            n_channels = sum(
                [
                    x.bits // (dtype.itemsize * 8)
                    for x in frame.format.components
                    if x.plane == idx
                ]
            )
            av_plane = frame.planes[idx]
            plane_shape = (av_plane.height, av_plane.width)
            plane_strides = (av_plane.line_size, n_channels * dtype.itemsize)
            if n_channels > 1:
                plane_shape += (n_channels,)
                plane_strides += (dtype.itemsize,)

            np_plane = as_strided(
                np.frombuffer(av_plane, dtype=dtype),
                shape=plane_shape,
                strides=plane_strides,
            )
            planes.append(np_plane)

        if len(planes) > 1:
            # Note: the planes *should* exist inside a contiguous memory block
            # somewhere inside av.VideoFrame; however, pyAV does not appear to
            # expose this, so we are forced to copy the planes individually
            # instead of wrapping them :(
            out = np.concatenate(planes).reshape(shape)
        else:
            out = planes[0]

        return out

    def _seek(self, index, *, constant_framerate: bool = True) -> None:
        """Seek to the frame at the given index."""

        if index == self._next_idx:
            return  # fast path :)

        # we must decode at least once before we seek; otherwise the
        # returned frames become corrupt.
        if self._next_idx == 0:
            next(self._decoder)
            self._next_idx += 1

            if index == self._next_idx:
                return  # fast path :)

        # remove this branch until I find a way to efficiently find the next
        # keyframe. keeping this as a reminder
        # if self._next_idx < index and index < self._next_keyframe_idx:
        #     frames_to_yield = index - self._next_idx
        if not constant_framerate and index > self._next_idx:
            frames_to_yield = index - self._next_idx
        elif not constant_framerate:
            # seek backwards and can't link idx and pts
            self._container.seek(0)
            self._decoder = self._container.decode(video=0)
            self._next_idx = 0

            frames_to_yield = index
        else:
            # we know that the time between consecutive frames is constant,
            # hence we can link index and pts

            # how many pts lie between two frames
            sec_delta = 1 / self._video_stream.guessed_rate
            pts_delta = sec_delta / self._video_stream.time_base

            index_pts = int(index * pts_delta)

            # this only seeks to the closest (preceding) keyframe
            self._container.seek(index_pts, stream=self._video_stream)
            self._decoder = self._container.decode(video=0)

            # this may be made faster if we could get the keyframe's time
            # without decoding it
            keyframe = next(self._decoder)
            keyframe_time = keyframe.pts * keyframe.time_base
            keyframe_pts = int(keyframe_time / self._video_stream.time_base)
            keyframe_index = keyframe_pts // pts_delta

            self._container.seek(index_pts, stream=self._video_stream)
            self._next_idx = keyframe_index

            frames_to_yield = index - keyframe_index

        for _ in range(frames_to_yield):
            next(self._decoder)
            self._next_idx += 1

    def _flush_writer(self):
        """Flush the filter and encoder

        This will reset the filter to `None` and send EOF to the encoder,
        i.e., after calling, no more frames may be written.

        """

        stream = self._video_stream

        if self._video_filter is not None:
            # flush the filter
            for av_frame in self._video_filter:
                if stream.frames == 0:
                    stream.width = av_frame.width
                    stream.height = av_frame.height
                for packet in stream.encode(av_frame):
                    self._container.mux(packet)
            self._video_filter = None

        # flush the encoder
        for packet in stream.encode():
            self._container.mux(packet)
        self._video_stream = None