from __future__ import annotations

_shared_docs: dict[str, str] = {}

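# Note on consumption (illustrative, not authoritative): the templates below
# mix two placeholder styles. Entries like "aggregate" use ``str.format``
# fields such as ``{klass}``; entries like "groupby" and "melt" use
# percent-style fields such as ``%(klass)s``. A minimal sketch of how a
# consumer might fill them in:
#
# >>> _shared_docs["aggregate"].format(
# ...     klass="DataFrame", axis="", see_also="", examples=""
# ... )  # doctest: +SKIP
# >>> _shared_docs["groupby"] % {"klass": "DataFrame"}  # doctest: +SKIP
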
_shared_docs[
    "aggregate"
] = """
Aggregate using one or more operations over the specified axis.

Parameters
----------
func : function, str, list or dict
    Function to use for aggregating the data. If a function, must either
    work when passed a {klass} or when passed to {klass}.apply.

    Accepted combinations are:

    - function
    - string function name
    - list of functions and/or function names, e.g. ``[np.sum, 'mean']``
    - dict of axis labels -> functions, function names or list of such.
{axis}
*args
    Positional arguments to pass to `func`.
**kwargs
    Keyword arguments to pass to `func`.

Returns
-------
scalar, Series or DataFrame

    The return can be:

    * scalar : when Series.agg is called with a single function
    * Series : when DataFrame.agg is called with a single function
    * DataFrame : when DataFrame.agg is called with several functions
{see_also}
Notes
-----
The aggregation operations are always performed over an axis, either the
index (default) or the column axis. This behavior is different from
`numpy` aggregation functions (`mean`, `median`, `prod`, `sum`, `std`,
`var`), where the default is to compute the aggregation of the flattened
array, e.g., ``numpy.mean(arr_2d)`` as opposed to
``numpy.mean(arr_2d, axis=0)``.

`agg` is an alias for `aggregate`. Use the alias.

Functions that mutate the passed object can produce unexpected
behavior or errors and are not supported. See :ref:`gotchas.udf-mutation`
for more details.

A passed user-defined-function will be passed a Series for evaluation.
{examples}"""

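# Illustrative sketch of the return-type rules documented above (the rendered
# examples are supplied per class through the ``{examples}`` placeholder;
# this comment is not part of the docstring):
#
# >>> import numpy as np
# >>> import pandas as pd
# >>> df = pd.DataFrame([[1, 2], [3, 4]], columns=["A", "B"])
# >>> df["A"].agg("sum")       # scalar: Series.agg with a single function
# >>> df.agg("sum")            # Series: DataFrame.agg with a single function
# >>> df.agg([np.sum, "min"])  # DataFrame: DataFrame.agg with several functions
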
_shared_docs[
    "compare"
] = """
Compare to another {klass} and show the differences.

Parameters
----------
other : {klass}
    Object to compare with.

align_axis : {{0 or 'index', 1 or 'columns'}}, default 1
    Determine which axis to align the comparison on.

    * 0, or 'index' : Resulting differences are stacked vertically
        with rows drawn alternately from self and other.
    * 1, or 'columns' : Resulting differences are aligned horizontally
        with columns drawn alternately from self and other.

keep_shape : bool, default False
    If True, all rows and columns are kept.
    Otherwise, only the ones with different values are kept.

keep_equal : bool, default False
    If True, the result keeps values that are equal.
    Otherwise, equal values are shown as NaNs.

result_names : tuple, default ('self', 'other')
    Set the names of the two compared objects in the result.

    .. versionadded:: 1.5.0
"""

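# Illustrative sketch of ``compare`` (not part of the rendered docstring;
# the result uses a self/other column MultiIndex):
#
# >>> import pandas as pd
# >>> df = pd.DataFrame({"col1": ["a", "a"], "col2": [1.0, 2.0]})
# >>> other = df.copy()
# >>> other.loc[0, "col1"] = "c"
# >>> df.compare(other)                                    # differing cells only
# >>> df.compare(other, align_axis=0)                      # stack self/other rows
# >>> df.compare(other, keep_shape=True, keep_equal=True)  # keep full shape
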
_shared_docs[
    "groupby"
] = """
Group %(klass)s using a mapper or by a Series of columns.

A groupby operation involves some combination of splitting the
object, applying a function, and combining the results. This can be
used to group large amounts of data and compute operations on these
groups.

Parameters
----------
by : mapping, function, label, pd.Grouper or list of such
    Used to determine the groups for the groupby.
    If ``by`` is a function, it's called on each value of the object's
    index. If a dict or Series is passed, the Series or dict VALUES
    will be used to determine the groups (the Series' values are first
    aligned; see ``.align()`` method). If a list or ndarray of length
    equal to the selected axis is passed (see the `groupby user guide
    <https://pandas.pydata.org/pandas-docs/stable/user_guide/groupby.html#splitting-an-object-into-groups>`_),
    the values are used as-is to determine the groups. A label or list
    of labels may be passed to group by the columns in ``self``.
    Notice that a tuple is interpreted as a (single) key.
axis : {0 or 'index', 1 or 'columns'}, default 0
    Split along rows (0) or columns (1). For `Series` this parameter
    is unused and defaults to 0.

    .. deprecated:: 2.1.0

        Will be removed and behave like axis=0 in a future version.
        For ``axis=1``, do ``frame.T.groupby(...)`` instead.

level : int, level name, or sequence of such, default None
    If the axis is a MultiIndex (hierarchical), group by a particular
    level or levels. Do not specify both ``by`` and ``level``.
as_index : bool, default True
    Return object with group labels as the
    index. Only relevant for DataFrame input. as_index=False is
    effectively "SQL-style" grouped output. This argument has no effect
    on filtrations (see the `filtrations in the user guide
    <https://pandas.pydata.org/docs/dev/user_guide/groupby.html#filtration>`_),
    such as ``head()``, ``tail()``, ``nth()`` and in transformations
    (see the `transformations in the user guide
    <https://pandas.pydata.org/docs/dev/user_guide/groupby.html#transformation>`_).
sort : bool, default True
    Sort group keys. Get better performance by turning this off.
    Note this does not influence the order of observations within each
    group. Groupby preserves the order of rows within each group. If False,
    the groups will appear in the same order as they did in the original DataFrame.
    This argument has no effect on filtrations (see the `filtrations in the user guide
    <https://pandas.pydata.org/docs/dev/user_guide/groupby.html#filtration>`_),
    such as ``head()``, ``tail()``, ``nth()`` and in transformations
    (see the `transformations in the user guide
    <https://pandas.pydata.org/docs/dev/user_guide/groupby.html#transformation>`_).

    .. versionchanged:: 2.0.0

        Specifying ``sort=False`` with an ordered categorical grouper will no
        longer sort the values.

group_keys : bool, default True
    When calling apply and the ``by`` argument produces a like-indexed
    (i.e. :ref:`a transform <groupby.transform>`) result, add group keys to
    index to identify pieces. By default group keys are not included
    when the result's index (and column) labels match the inputs, and
    are included otherwise.

    .. versionchanged:: 1.5.0

        Warns that ``group_keys`` will no longer be ignored when the
        result from ``apply`` is a like-indexed Series or DataFrame.
        Specify ``group_keys`` explicitly to include the group keys or
        not.

    .. versionchanged:: 2.0.0

        ``group_keys`` now defaults to ``True``.

observed : bool, default False
    This only applies if any of the groupers are Categoricals.
    If True: only show observed values for categorical groupers.
    If False: show all values for categorical groupers.

    .. deprecated:: 2.1.0

        The default value will change to True in a future version of pandas.

dropna : bool, default True
    If True, and if group keys contain NA values, NA values together
    with row/column will be dropped.
    If False, NA values will also be treated as the key in groups.

Returns
-------
pandas.api.typing.%(klass)sGroupBy
    Returns a groupby object that contains information about the groups.

See Also
--------
resample : Convenience method for frequency conversion and resampling
    of time series.

Notes
-----
See the `user guide
<https://pandas.pydata.org/pandas-docs/stable/groupby.html>`__ for more
detailed usage and examples, including splitting an object into groups,
iterating through groups, selecting a group, aggregation, and more.
"""

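# Illustrative sketch of ``groupby`` (not part of the rendered docstring):
#
# >>> import pandas as pd
# >>> df = pd.DataFrame({"Animal": ["Falcon", "Falcon", "Parrot", "Parrot"],
# ...                    "Max Speed": [380.0, 370.0, 24.0, 26.0]})
# >>> df.groupby("Animal")["Max Speed"].mean()                  # labels as index
# >>> df.groupby("Animal", as_index=False)["Max Speed"].mean()  # "SQL-style"
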
_shared_docs[
    "melt"
] = """
Unpivot a DataFrame from wide to long format, optionally leaving identifiers set.

This function is useful to massage a DataFrame into a format where one
or more columns are identifier variables (`id_vars`), while all other
columns, considered measured variables (`value_vars`), are "unpivoted" to
the row axis, leaving just two non-identifier columns, 'variable' and
'value'.

Parameters
----------
id_vars : scalar, tuple, list, or ndarray, optional
    Column(s) to use as identifier variables.
value_vars : scalar, tuple, list, or ndarray, optional
    Column(s) to unpivot. If not specified, uses all columns that
    are not set as `id_vars`.
var_name : scalar, default None
    Name to use for the 'variable' column. If None it uses
    ``frame.columns.name`` or 'variable'.
value_name : scalar, default 'value'
    Name to use for the 'value' column, can't be an existing column label.
col_level : scalar, optional
    If columns are a MultiIndex then use this level to melt.
ignore_index : bool, default True
    If True, original index is ignored. If False, the original index is retained.
    Index labels will be repeated as necessary.

Returns
-------
DataFrame
    Unpivoted DataFrame.

See Also
--------
%(other)s : Identical method.
pivot_table : Create a spreadsheet-style pivot table as a DataFrame.
DataFrame.pivot : Return reshaped DataFrame organized
    by given index / column values.
DataFrame.explode : Explode a DataFrame from list-like
    columns to long format.

Notes
-----
Reference :ref:`the user guide <reshaping.melt>` for more examples.

Examples
--------
>>> df = pd.DataFrame({'A': {0: 'a', 1: 'b', 2: 'c'},
...                    'B': {0: 1, 1: 3, 2: 5},
...                    'C': {0: 2, 1: 4, 2: 6}})
>>> df
   A  B  C
0  a  1  2
1  b  3  4
2  c  5  6

>>> %(caller)sid_vars=['A'], value_vars=['B'])
   A variable  value
0  a        B      1
1  b        B      3
2  c        B      5

>>> %(caller)sid_vars=['A'], value_vars=['B', 'C'])
   A variable  value
0  a        B      1
1  b        B      3
2  c        B      5
3  a        C      2
4  b        C      4
5  c        C      6

The names of 'variable' and 'value' columns can be customized:

>>> %(caller)sid_vars=['A'], value_vars=['B'],
...         var_name='myVarname', value_name='myValname')
   A myVarname  myValname
0  a         B          1
1  b         B          3
2  c         B          5

Original index values can be kept around:

>>> %(caller)sid_vars=['A'], value_vars=['B', 'C'], ignore_index=False)
   A variable  value
0  a        B      1
1  b        B      3
2  c        B      5
0  a        C      2
1  b        C      4
2  c        C      6

If you have multi-index columns:

>>> df.columns = [list('ABC'), list('DEF')]
>>> df
   A  B  C
   D  E  F
0  a  1  2
1  b  3  4
2  c  5  6

>>> %(caller)scol_level=0, id_vars=['A'], value_vars=['B'])
   A variable  value
0  a        B      1
1  b        B      3
2  c        B      5

>>> %(caller)sid_vars=[('A', 'D')], value_vars=[('B', 'E')])
  (A, D) variable_0 variable_1  value
0      a          B          E      1
1      b          B          E      3
2      c          B          E      5
"""

_shared_docs[
    "transform"
] = """
Call ``func`` on self producing a {klass} with the same axis shape as self.

Parameters
----------
func : function, str, list-like or dict-like
    Function to use for transforming the data. If a function, must either
    work when passed a {klass} or when passed to {klass}.apply. If func
    is both list-like and dict-like, dict-like behavior takes precedence.

    Accepted combinations are:

    - function
    - string function name
    - list-like of functions and/or function names, e.g. ``[np.exp, 'sqrt']``
    - dict-like of axis labels -> functions, function names or list-like of such.
{axis}
*args
    Positional arguments to pass to `func`.
**kwargs
    Keyword arguments to pass to `func`.

Returns
-------
{klass}
    A {klass} that must have the same length as self.

Raises
------
ValueError : If the returned {klass} has a different length than self.

See Also
--------
{klass}.agg : Only perform aggregating type operations.
{klass}.apply : Invoke function on a {klass}.

Notes
-----
Functions that mutate the passed object can produce unexpected
behavior or errors and are not supported. See :ref:`gotchas.udf-mutation`
for more details.

Examples
--------
>>> df = pd.DataFrame({{'A': range(3), 'B': range(1, 4)}})
>>> df
   A  B
0  0  1
1  1  2
2  2  3
>>> df.transform(lambda x: x + 1)
   A  B
0  1  2
1  2  3
2  3  4

Even though the resulting {klass} must have the same length as the
input {klass}, it is possible to provide several input functions:

>>> s = pd.Series(range(3))
>>> s
0    0
1    1
2    2
dtype: int64
>>> s.transform([np.sqrt, np.exp])
       sqrt        exp
0  0.000000   1.000000
1  1.000000   2.718282
2  1.414214   7.389056

You can call transform on a GroupBy object:

>>> df = pd.DataFrame({{
...     "Date": [
...         "2015-05-08", "2015-05-07", "2015-05-06", "2015-05-05",
...         "2015-05-08", "2015-05-07", "2015-05-06", "2015-05-05"],
...     "Data": [5, 8, 6, 1, 50, 100, 60, 120],
... }})
>>> df
         Date  Data
0  2015-05-08     5
1  2015-05-07     8
2  2015-05-06     6
3  2015-05-05     1
4  2015-05-08    50
5  2015-05-07   100
6  2015-05-06    60
7  2015-05-05   120
>>> df.groupby('Date')['Data'].transform('sum')
0     55
1    108
2     66
3    121
4     55
5    108
6     66
7    121
Name: Data, dtype: int64

>>> df = pd.DataFrame({{
...     "c": [1, 1, 1, 2, 2, 2, 2],
...     "type": ["m", "n", "o", "m", "m", "n", "n"]
... }})
>>> df
   c type
0  1    m
1  1    n
2  1    o
3  2    m
4  2    m
5  2    n
6  2    n
>>> df['size'] = df.groupby('c')['type'].transform(len)
>>> df
   c type  size
0  1    m     3
1  1    n     3
2  1    o     3
3  2    m     4
4  2    m     4
5  2    n     4
6  2    n     4
"""

_shared_docs[
    "storage_options"
] = """storage_options : dict, optional
    Extra options that make sense for a particular storage connection, e.g.
    host, port, username, password, etc. For HTTP(S) URLs the key-value pairs
    are forwarded to ``urllib.request.Request`` as header options. For other
    URLs (e.g. starting with "s3://", and "gcs://") the key-value pairs are
    forwarded to ``fsspec.open``. Please see ``fsspec`` and ``urllib`` for more
    details, and for more examples on storage options refer `here
    <https://pandas.pydata.org/docs/user_guide/io.html?
    highlight=storage_options#reading-writing-remote-files>`_."""

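# Illustrative sketch: forwarding ``storage_options`` to ``fsspec`` for an
# "s3://" URL (placeholder bucket and credentials; needs the optional s3fs
# dependency):
#
# >>> import pandas as pd
# >>> pd.read_csv(
# ...     "s3://bucket/data.csv",
# ...     storage_options={"key": "<access-key>", "secret": "<secret-key>"},
# ... )  # doctest: +SKIP
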
_shared_docs[
    "compression_options"
] = """compression : str or dict, default 'infer'
    For on-the-fly compression of the output data. If 'infer' and '%s' is
    path-like, then detect compression from the following extensions: '.gz',
    '.bz2', '.zip', '.xz', '.zst', '.tar', '.tar.gz', '.tar.xz' or '.tar.bz2'
    (otherwise no compression).
    Set to ``None`` for no compression.
    Can also be a dict with key ``'method'`` set
    to one of {``'zip'``, ``'gzip'``, ``'bz2'``, ``'zstd'``, ``'xz'``, ``'tar'``} and
    other key-value pairs are forwarded to
    ``zipfile.ZipFile``, ``gzip.GzipFile``,
    ``bz2.BZ2File``, ``zstandard.ZstdCompressor``, ``lzma.LZMAFile`` or
    ``tarfile.TarFile``, respectively.
    As an example, the following could be passed for faster compression and to create
    a reproducible gzip archive:
    ``compression={'method': 'gzip', 'compresslevel': 1, 'mtime': 1}``.

    .. versionadded:: 1.5.0
        Added support for `.tar` files."""

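# Illustrative sketch of the dict form on the write side, mirroring the
# reproducible-gzip example in the text above ("out.csv.gz" is a placeholder
# path):
#
# >>> import pandas as pd
# >>> pd.DataFrame({"a": [1, 2]}).to_csv(
# ...     "out.csv.gz",
# ...     compression={"method": "gzip", "compresslevel": 1, "mtime": 1},
# ... )  # doctest: +SKIP
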
_shared_docs[
    "decompression_options"
] = """compression : str or dict, default 'infer'
    For on-the-fly decompression of on-disk data. If 'infer' and '%s' is
    path-like, then detect compression from the following extensions: '.gz',
    '.bz2', '.zip', '.xz', '.zst', '.tar', '.tar.gz', '.tar.xz' or '.tar.bz2'
    (otherwise no compression).
    If using 'zip' or 'tar', the ZIP file must contain only one data file to be read in.
    Set to ``None`` for no decompression.
    Can also be a dict with key ``'method'`` set
    to one of {``'zip'``, ``'gzip'``, ``'bz2'``, ``'zstd'``, ``'xz'``, ``'tar'``} and
    other key-value pairs are forwarded to
    ``zipfile.ZipFile``, ``gzip.GzipFile``,
    ``bz2.BZ2File``, ``zstandard.ZstdDecompressor``, ``lzma.LZMAFile`` or
    ``tarfile.TarFile``, respectively.
    As an example, the following could be passed for Zstandard decompression using a
    custom compression dictionary:
    ``compression={'method': 'zstd', 'dict_data': my_compression_dict}``.

    .. versionadded:: 1.5.0
        Added support for `.tar` files."""

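# Illustrative sketch of the read side ("data.csv.zst" is a placeholder path;
# with the default ``compression='infer'`` the '.zst' extension would be
# detected automatically):
#
# >>> import pandas as pd
# >>> pd.read_csv("data.csv.zst", compression={"method": "zstd"})  # doctest: +SKIP
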
_shared_docs[
    "replace"
] = """
    Replace values given in `to_replace` with `value`.

    Values of the {klass} are replaced with other values dynamically.
    This differs from updating with ``.loc`` or ``.iloc``, which require
    you to specify a location to update with some value.

    Parameters
    ----------
    to_replace : str, regex, list, dict, Series, int, float, or None
        How to find the values that will be replaced.

        * numeric, str or regex:

            - numeric: numeric values equal to `to_replace` will be
              replaced with `value`
            - str: string exactly matching `to_replace` will be replaced
              with `value`
            - regex: regexes matching `to_replace` will be replaced with
              `value`

        * list of str, regex, or numeric:

            - First, if `to_replace` and `value` are both lists, they
              **must** be the same length.
            - Second, if ``regex=True`` then all of the strings in **both**
              lists will be interpreted as regexes, otherwise they will match
              directly. This doesn't matter much for `value` since there
              are only a few possible substitution regexes you can use.
            - str, regex and numeric rules apply as above.

        * dict:

            - Dicts can be used to specify different replacement values
              for different existing values. For example,
              ``{{'a': 'b', 'y': 'z'}}`` replaces the value 'a' with 'b' and
              'y' with 'z'. To use a dict in this way, the optional `value`
              parameter should not be given.
            - For a DataFrame a dict can specify that different values
              should be replaced in different columns. For example,
              ``{{'a': 1, 'b': 'z'}}`` looks for the value 1 in column 'a'
              and the value 'z' in column 'b' and replaces these values
              with whatever is specified in `value`. The `value` parameter
              should not be ``None`` in this case. You can treat this as a
              special case of passing two lists except that you are
              specifying the column to search in.
            - For a DataFrame nested dictionaries, e.g.,
              ``{{'a': {{'b': np.nan}}}}``, are read as follows: look in column
              'a' for the value 'b' and replace it with NaN. The optional `value`
              parameter should not be specified to use a nested dict in this
              way. You can nest regular expressions as well. Note that
              column names (the top-level dictionary keys in a nested
              dictionary) **cannot** be regular expressions.

        * None:

            - This means that the `regex` argument must be a string,
              compiled regular expression, or list, dict, ndarray or
              Series of such elements. If `value` is also ``None`` then
              this **must** be a nested dictionary or Series.

        See the examples section for examples of each of these.
    value : scalar, dict, list, str, regex, default None
        Value to replace any values matching `to_replace` with.
        For a DataFrame a dict of values can be used to specify which
        value to use for each column (columns not in the dict will not be
        filled). Regular expressions, strings and lists or dicts of such
        objects are also allowed.
    {inplace}
    limit : int, default None
        Maximum size gap to forward or backward fill.

        .. deprecated:: 2.1.0
    regex : bool or same types as `to_replace`, default False
        Whether to interpret `to_replace` and/or `value` as regular
        expressions. Alternatively, this could be a regular expression or a
        list, dict, or array of regular expressions in which case
        `to_replace` must be ``None``.
    method : {{'pad', 'ffill', 'bfill'}}
        The method to use for replacement, when `to_replace` is a
        scalar, list or tuple and `value` is ``None``.

        .. deprecated:: 2.1.0

    Returns
    -------
    {klass}
        Object after replacement.

    Raises
    ------
    AssertionError
        * If `regex` is not a ``bool`` and `to_replace` is not
          ``None``.

    TypeError
        * If `to_replace` is not a scalar, array-like, ``dict``, or ``None``
        * If `to_replace` is a ``dict`` and `value` is not a ``list``,
          ``dict``, ``ndarray``, or ``Series``
        * If `to_replace` is ``None`` and `regex` is not compilable
          into a regular expression or is a list, dict, ndarray, or
          Series.
        * When replacing multiple ``bool`` or ``datetime64`` objects and
          the arguments to `to_replace` do not match the type of the
          value being replaced

    ValueError
        * If a ``list`` or an ``ndarray`` is passed to `to_replace` and
          `value` but they are not the same length.

    See Also
    --------
    Series.fillna : Fill NA values.
    DataFrame.fillna : Fill NA values.
    Series.where : Replace values based on boolean condition.
    DataFrame.where : Replace values based on boolean condition.
    DataFrame.map : Apply a function to a DataFrame elementwise.
    Series.map : Map values of Series according to an input mapping or function.
    Series.str.replace : Simple string replacement.

    Notes
    -----
    * Regex substitution is performed under the hood with ``re.sub``. The
      rules for substitution for ``re.sub`` are the same.
    * Regular expressions will only substitute on strings, meaning you
      cannot provide, for example, a regular expression matching floating
      point numbers and expect the columns in your frame that have a
      numeric dtype to be matched. However, if those floating point
      numbers *are* strings, then you can do this.
    * This method has *a lot* of options. You are encouraged to experiment
      and play with this method to gain intuition about how it works.
    * When a dict is used as the `to_replace` value, the dict's keys act as
      the `to_replace` part and the dict's values act as the `value`
      parameter.

    Examples
    --------

    **Scalar `to_replace` and `value`**

    >>> s = pd.Series([1, 2, 3, 4, 5])
    >>> s.replace(1, 5)
    0    5
    1    2
    2    3
    3    4
    4    5
    dtype: int64

    >>> df = pd.DataFrame({{'A': [0, 1, 2, 3, 4],
    ...                    'B': [5, 6, 7, 8, 9],
    ...                    'C': ['a', 'b', 'c', 'd', 'e']}})
    >>> df.replace(0, 5)
       A  B  C
    0  5  5  a
    1  1  6  b
    2  2  7  c
    3  3  8  d
    4  4  9  e

    **List-like `to_replace`**

    >>> df.replace([0, 1, 2, 3], 4)
       A  B  C
    0  4  5  a
    1  4  6  b
    2  4  7  c
    3  4  8  d
    4  4  9  e

    >>> df.replace([0, 1, 2, 3], [4, 3, 2, 1])
       A  B  C
    0  4  5  a
    1  3  6  b
    2  2  7  c
    3  1  8  d
    4  4  9  e

    >>> s.replace([1, 2], method='bfill')
    0    3
    1    3
    2    3
    3    4
    4    5
    dtype: int64

    **dict-like `to_replace`**

    >>> df.replace({{0: 10, 1: 100}})
         A  B  C
    0   10  5  a
    1  100  6  b
    2    2  7  c
    3    3  8  d
    4    4  9  e

    >>> df.replace({{'A': 0, 'B': 5}}, 100)
         A    B  C
    0  100  100  a
    1    1    6  b
    2    2    7  c
    3    3    8  d
    4    4    9  e

    >>> df.replace({{'A': {{0: 100, 4: 400}}}})
         A  B  C
    0  100  5  a
    1    1  6  b
    2    2  7  c
    3    3  8  d
    4  400  9  e

    **Regular expression `to_replace`**

    >>> df = pd.DataFrame({{'A': ['bat', 'foo', 'bait'],
    ...                    'B': ['abc', 'bar', 'xyz']}})
    >>> df.replace(to_replace=r'^ba.$', value='new', regex=True)
          A    B
    0   new  abc
    1   foo  new
    2  bait  xyz

    >>> df.replace({{'A': r'^ba.$'}}, {{'A': 'new'}}, regex=True)
          A    B
    0   new  abc
    1   foo  bar
    2  bait  xyz

    >>> df.replace(regex=r'^ba.$', value='new')
          A    B
    0   new  abc
    1   foo  new
    2  bait  xyz

    >>> df.replace(regex={{r'^ba.$': 'new', 'foo': 'xyz'}})
          A    B
    0   new  abc
    1   xyz  new
    2  bait  xyz

    >>> df.replace(regex=[r'^ba.$', 'foo'], value='new')
          A    B
    0   new  abc
    1   new  new
    2  bait  xyz

    Compare the behavior of ``s.replace({{'a': None}})`` and
    ``s.replace('a', None)`` to understand the peculiarities
    of the `to_replace` parameter:

    >>> s = pd.Series([10, 'a', 'a', 'b', 'a'])

    When one uses a dict as the `to_replace` value, it is as if the
    value(s) in the dict are equal to the `value` parameter.
    ``s.replace({{'a': None}})`` is equivalent to
    ``s.replace(to_replace={{'a': None}}, value=None, method=None)``:

    >>> s.replace({{'a': None}})
    0      10
    1    None
    2    None
    3       b
    4    None
    dtype: object

    When ``value`` is not explicitly passed and `to_replace` is a scalar, list
    or tuple, `replace` uses the method parameter (default 'pad') to do the
    replacement. This is why the 'a' values are replaced by 10 in rows 1
    and 2, and by 'b' in row 4 in this case.

    >>> s.replace('a')
    0    10
    1    10
    2    10
    3     b
    4     b
    dtype: object

    .. deprecated:: 2.1.0
        The 'method' parameter and padding behavior are deprecated.

    On the other hand, if ``None`` is explicitly passed for ``value``, it will
    be respected:

    >>> s.replace('a', None)
    0      10
    1    None
    2    None
    3       b
    4    None
    dtype: object

    .. versionchanged:: 1.4.0
        Previously the explicit ``None`` was silently ignored.

    When ``regex=True``, ``value`` is not ``None`` and `to_replace` is a string,
    the replacement will be applied in all columns of the DataFrame.

    >>> df = pd.DataFrame({{'A': [0, 1, 2, 3, 4],
    ...                    'B': ['a', 'b', 'c', 'd', 'e'],
    ...                    'C': ['f', 'g', 'h', 'i', 'j']}})

    >>> df.replace(to_replace='^[a-g]', value='e', regex=True)
       A  B  C
    0  0  e  e
    1  1  e  e
    2  2  e  h
    3  3  e  i
    4  4  e  j

    If ``value`` is not ``None`` and `to_replace` is a dictionary, the
    dictionary keys will be the DataFrame columns to which the replacement
    will be applied.

    >>> df.replace(to_replace={{'B': '^[a-c]', 'C': '^[h-j]'}}, value='e', regex=True)
       A  B  C
    0  0  e  f
    1  1  e  g
    2  2  e  e
    3  3  d  e
    4  4  e  e
"""

_shared_docs[
    "idxmin"
] = """
    Return index of first occurrence of minimum over requested axis.

    NA/null values are excluded.

    Parameters
    ----------
    axis : {{0 or 'index', 1 or 'columns'}}, default 0
        The axis to use. 0 or 'index' for row-wise, 1 or 'columns' for column-wise.
    skipna : bool, default True
        Exclude NA/null values. If an entire row/column is NA, the result
        will be NA.
    numeric_only : bool, default {numeric_only_default}
        Include only `float`, `int` or `boolean` data.

        .. versionadded:: 1.5.0

    Returns
    -------
    Series
        Indexes of minima along the specified axis.

    Raises
    ------
    ValueError
        * If the row/column is empty

    See Also
    --------
    Series.idxmin : Return index of the minimum element.

    Notes
    -----
    This method is the DataFrame version of ``ndarray.argmin``.

    Examples
    --------
    Consider a dataset containing food consumption in Argentina.

    >>> df = pd.DataFrame({{'consumption': [10.51, 103.11, 55.48],
    ...                     'co2_emissions': [37.2, 19.66, 1712]}},
    ...                   index=['Pork', 'Wheat Products', 'Beef'])

    >>> df
                    consumption  co2_emissions
    Pork                  10.51          37.20
    Wheat Products       103.11          19.66
    Beef                  55.48        1712.00

    By default, it returns the index for the minimum value in each column.

    >>> df.idxmin()
    consumption                Pork
    co2_emissions    Wheat Products
    dtype: object

    To return the index for the minimum value in each row, use ``axis="columns"``.

    >>> df.idxmin(axis="columns")
    Pork                consumption
    Wheat Products    co2_emissions
    Beef                consumption
    dtype: object
"""

_shared_docs[
    "idxmax"
] = """
    Return index of first occurrence of maximum over requested axis.

    NA/null values are excluded.

    Parameters
    ----------
    axis : {{0 or 'index', 1 or 'columns'}}, default 0
        The axis to use. 0 or 'index' for row-wise, 1 or 'columns' for column-wise.
    skipna : bool, default True
        Exclude NA/null values. If an entire row/column is NA, the result
        will be NA.
    numeric_only : bool, default {numeric_only_default}
        Include only `float`, `int` or `boolean` data.

        .. versionadded:: 1.5.0

    Returns
    -------
    Series
        Indexes of maxima along the specified axis.

    Raises
    ------
    ValueError
        * If the row/column is empty

    See Also
    --------
    Series.idxmax : Return index of the maximum element.

    Notes
    -----
    This method is the DataFrame version of ``ndarray.argmax``.

    Examples
    --------
    Consider a dataset containing food consumption in Argentina.

    >>> df = pd.DataFrame({{'consumption': [10.51, 103.11, 55.48],
    ...                     'co2_emissions': [37.2, 19.66, 1712]}},
    ...                   index=['Pork', 'Wheat Products', 'Beef'])

    >>> df
                    consumption  co2_emissions
    Pork                  10.51          37.20
    Wheat Products       103.11          19.66
    Beef                  55.48        1712.00

    By default, it returns the index for the maximum value in each column.

    >>> df.idxmax()
    consumption     Wheat Products
    co2_emissions             Beef
    dtype: object

    To return the index for the maximum value in each row, use ``axis="columns"``.

    >>> df.idxmax(axis="columns")
    Pork              co2_emissions
    Wheat Products      consumption
    Beef              co2_emissions
    dtype: object
"""