from __future__ import annotations

_shared_docs: dict[str, str] = {}

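# Note on consumption (illustrative, not authoritative): the templates below
# mix two placeholder styles. Entries like "aggregate" use ``str.format``
# fields such as ``{klass}``; entries like "groupby" and "melt" use
# percent-style fields such as ``%(klass)s``. A minimal sketch of how a
# consumer might fill them in:
#
# >>> _shared_docs["aggregate"].format(
# ...     klass="DataFrame", axis="", see_also="", examples=""
# ... )  # doctest: +SKIP
# >>> _shared_docs["groupby"] % {"klass": "DataFrame"}  # doctest: +SKIP
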
_shared_docs[
    "aggregate"
] = """
Aggregate using one or more operations over the specified axis.

Parameters
----------
func : function, str, list or dict
    Function to use for aggregating the data. If a function, must either
    work when passed a {klass} or when passed to {klass}.apply.

    Accepted combinations are:

    - function
    - string function name
    - list of functions and/or function names, e.g. ``[np.sum, 'mean']``
    - dict of axis labels -> functions, function names or list of such.
{axis}
*args
    Positional arguments to pass to `func`.
**kwargs
    Keyword arguments to pass to `func`.

Returns
-------
scalar, Series or DataFrame

    The return can be:

    * scalar : when Series.agg is called with a single function
    * Series : when DataFrame.agg is called with a single function
    * DataFrame : when DataFrame.agg is called with several functions
{see_also}
Notes
-----
The aggregation operations are always performed over an axis, either the
index (default) or the column axis. This behavior is different from
`numpy` aggregation functions (`mean`, `median`, `prod`, `sum`, `std`,
`var`), where the default is to compute the aggregation of the flattened
array, e.g., ``numpy.mean(arr_2d)`` as opposed to
``numpy.mean(arr_2d, axis=0)``.

`agg` is an alias for `aggregate`. Use the alias.

Functions that mutate the passed object can produce unexpected
behavior or errors and are not supported. See :ref:`gotchas.udf-mutation`
for more details.

A passed user-defined-function will be passed a Series for evaluation.
{examples}"""

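# Illustrative sketch of the return-type rules documented above (the rendered
# examples are supplied per class through the ``{examples}`` placeholder;
# this comment is not part of the docstring):
#
# >>> import numpy as np
# >>> import pandas as pd
# >>> df = pd.DataFrame([[1, 2], [3, 4]], columns=["A", "B"])
# >>> df["A"].agg("sum")       # scalar: Series.agg with a single function
# >>> df.agg("sum")            # Series: DataFrame.agg with a single function
# >>> df.agg([np.sum, "min"])  # DataFrame: DataFrame.agg with several functions
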
_shared_docs[
    "compare"
] = """
Compare to another {klass} and show the differences.

Parameters
----------
other : {klass}
    Object to compare with.

align_axis : {{0 or 'index', 1 or 'columns'}}, default 1
    Determine which axis to align the comparison on.

    * 0, or 'index' : Resulting differences are stacked vertically
        with rows drawn alternately from self and other.
    * 1, or 'columns' : Resulting differences are aligned horizontally
        with columns drawn alternately from self and other.

keep_shape : bool, default False
    If True, all rows and columns are kept.
    Otherwise, only the ones with different values are kept.

keep_equal : bool, default False
    If True, the result keeps values that are equal.
    Otherwise, equal values are shown as NaNs.

result_names : tuple, default ('self', 'other')
    Set the names of the two compared objects in the result.

    .. versionadded:: 1.5.0
"""

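# Illustrative sketch of ``compare`` (not part of the rendered docstring;
# the result uses a self/other column MultiIndex):
#
# >>> import pandas as pd
# >>> df = pd.DataFrame({"col1": ["a", "a"], "col2": [1.0, 2.0]})
# >>> other = df.copy()
# >>> other.loc[0, "col1"] = "c"
# >>> df.compare(other)                                    # differing cells only
# >>> df.compare(other, align_axis=0)                      # stack self/other rows
# >>> df.compare(other, keep_shape=True, keep_equal=True)  # keep full shape
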
_shared_docs[
    "groupby"
] = """
Group %(klass)s using a mapper or by a Series of columns.

A groupby operation involves some combination of splitting the
object, applying a function, and combining the results. This can be
used to group large amounts of data and compute operations on these
groups.

Parameters
----------
by : mapping, function, label, pd.Grouper or list of such
    Used to determine the groups for the groupby.
    If ``by`` is a function, it's called on each value of the object's
    index. If a dict or Series is passed, the Series or dict VALUES
    will be used to determine the groups (the Series' values are first
    aligned; see ``.align()`` method). If a list or ndarray of length
    equal to the selected axis is passed (see the `groupby user guide
    <https://pandas.pydata.org/pandas-docs/stable/user_guide/groupby.html#splitting-an-object-into-groups>`_),
    the values are used as-is to determine the groups. A label or list
    of labels may be passed to group by the columns in ``self``.
    Notice that a tuple is interpreted as a (single) key.
axis : {0 or 'index', 1 or 'columns'}, default 0
    Split along rows (0) or columns (1). For `Series` this parameter
    is unused and defaults to 0.

    .. deprecated:: 2.1.0

        Will be removed and behave like axis=0 in a future version.
        For ``axis=1``, do ``frame.T.groupby(...)`` instead.

level : int, level name, or sequence of such, default None
    If the axis is a MultiIndex (hierarchical), group by a particular
    level or levels. Do not specify both ``by`` and ``level``.
as_index : bool, default True
    Return object with group labels as the
    index. Only relevant for DataFrame input. as_index=False is
    effectively "SQL-style" grouped output. This argument has no effect
    on filtrations (see the `filtrations in the user guide
    <https://pandas.pydata.org/docs/dev/user_guide/groupby.html#filtration>`_),
    such as ``head()``, ``tail()``, ``nth()`` and in transformations
    (see the `transformations in the user guide
    <https://pandas.pydata.org/docs/dev/user_guide/groupby.html#transformation>`_).
sort : bool, default True
    Sort group keys. Get better performance by turning this off.
    Note this does not influence the order of observations within each
    group. Groupby preserves the order of rows within each group. If False,
    the groups will appear in the same order as they did in the original DataFrame.
    This argument has no effect on filtrations (see the `filtrations in the user guide
    <https://pandas.pydata.org/docs/dev/user_guide/groupby.html#filtration>`_),
    such as ``head()``, ``tail()``, ``nth()`` and in transformations
    (see the `transformations in the user guide
    <https://pandas.pydata.org/docs/dev/user_guide/groupby.html#transformation>`_).

    .. versionchanged:: 2.0.0

        Specifying ``sort=False`` with an ordered categorical grouper will no
        longer sort the values.

group_keys : bool, default True
    When calling apply and the ``by`` argument produces a like-indexed
    (i.e. :ref:`a transform <groupby.transform>`) result, add group keys to
    index to identify pieces. By default group keys are not included
    when the result's index (and column) labels match the inputs, and
    are included otherwise.

    .. versionchanged:: 1.5.0

        Warns that ``group_keys`` will no longer be ignored when the
        result from ``apply`` is a like-indexed Series or DataFrame.
        Specify ``group_keys`` explicitly to include the group keys or
        not.

    .. versionchanged:: 2.0.0

        ``group_keys`` now defaults to ``True``.

observed : bool, default False
    This only applies if any of the groupers are Categoricals.
    If True: only show observed values for categorical groupers.
    If False: show all values for categorical groupers.

    .. deprecated:: 2.1.0

        The default value will change to True in a future version of pandas.

dropna : bool, default True
    If True, and if group keys contain NA values, NA values together
    with row/column will be dropped.
    If False, NA values will also be treated as the key in groups.

Returns
-------
pandas.api.typing.%(klass)sGroupBy
    Returns a groupby object that contains information about the groups.

See Also
--------
resample : Convenience method for frequency conversion and resampling
    of time series.

Notes
-----
See the `user guide
<https://pandas.pydata.org/pandas-docs/stable/groupby.html>`__ for more
detailed usage and examples, including splitting an object into groups,
iterating through groups, selecting a group, aggregation, and more.
"""

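# Illustrative sketch of ``groupby`` (not part of the rendered docstring):
#
# >>> import pandas as pd
# >>> df = pd.DataFrame({"Animal": ["Falcon", "Falcon", "Parrot", "Parrot"],
# ...                    "Max Speed": [380.0, 370.0, 24.0, 26.0]})
# >>> df.groupby("Animal")["Max Speed"].mean()                  # labels as index
# >>> df.groupby("Animal", as_index=False)["Max Speed"].mean()  # "SQL-style"
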
_shared_docs[
    "melt"
] = """
Unpivot a DataFrame from wide to long format, optionally leaving identifiers set.

This function is useful to massage a DataFrame into a format where one
or more columns are identifier variables (`id_vars`), while all other
columns, considered measured variables (`value_vars`), are "unpivoted" to
the row axis, leaving just two non-identifier columns, 'variable' and
'value'.

Parameters
----------
id_vars : scalar, tuple, list, or ndarray, optional
    Column(s) to use as identifier variables.
value_vars : scalar, tuple, list, or ndarray, optional
    Column(s) to unpivot. If not specified, uses all columns that
    are not set as `id_vars`.
var_name : scalar, default None
    Name to use for the 'variable' column. If None it uses
    ``frame.columns.name`` or 'variable'.
value_name : scalar, default 'value'
    Name to use for the 'value' column, can't be an existing column label.
col_level : scalar, optional
    If columns are a MultiIndex then use this level to melt.
ignore_index : bool, default True
    If True, original index is ignored. If False, the original index is retained.
    Index labels will be repeated as necessary.

Returns
-------
DataFrame
    Unpivoted DataFrame.

See Also
--------
%(other)s : Identical method.
pivot_table : Create a spreadsheet-style pivot table as a DataFrame.
DataFrame.pivot : Return reshaped DataFrame organized
    by given index / column values.
DataFrame.explode : Explode a DataFrame from list-like
    columns to long format.

Notes
-----
Reference :ref:`the user guide <reshaping.melt>` for more examples.

Examples
--------
>>> df = pd.DataFrame({'A': {0: 'a', 1: 'b', 2: 'c'},
...                    'B': {0: 1, 1: 3, 2: 5},
...                    'C': {0: 2, 1: 4, 2: 6}})
>>> df
   A  B  C
0  a  1  2
1  b  3  4
2  c  5  6

>>> %(caller)sid_vars=['A'], value_vars=['B'])
   A variable  value
0  a        B      1
1  b        B      3
2  c        B      5

>>> %(caller)sid_vars=['A'], value_vars=['B', 'C'])
   A variable  value
0  a        B      1
1  b        B      3
2  c        B      5
3  a        C      2
4  b        C      4
5  c        C      6

The names of 'variable' and 'value' columns can be customized:

>>> %(caller)sid_vars=['A'], value_vars=['B'],
...         var_name='myVarname', value_name='myValname')
   A myVarname  myValname
0  a         B          1
1  b         B          3
2  c         B          5

Original index values can be kept around:

>>> %(caller)sid_vars=['A'], value_vars=['B', 'C'], ignore_index=False)
   A variable  value
0  a        B      1
1  b        B      3
2  c        B      5
0  a        C      2
1  b        C      4
2  c        C      6

If you have multi-index columns:

>>> df.columns = [list('ABC'), list('DEF')]
>>> df
   A  B  C
   D  E  F
0  a  1  2
1  b  3  4
2  c  5  6

>>> %(caller)scol_level=0, id_vars=['A'], value_vars=['B'])
   A variable  value
0  a        B      1
1  b        B      3
2  c        B      5

>>> %(caller)sid_vars=[('A', 'D')], value_vars=[('B', 'E')])
  (A, D) variable_0 variable_1  value
0      a          B          E      1
1      b          B          E      3
2      c          B          E      5
"""

_shared_docs[
    "transform"
] = """
Call ``func`` on self producing a {klass} with the same axis shape as self.

Parameters
----------
func : function, str, list-like or dict-like
    Function to use for transforming the data. If a function, must either
    work when passed a {klass} or when passed to {klass}.apply. If func
    is both list-like and dict-like, dict-like behavior takes precedence.

    Accepted combinations are:

    - function
    - string function name
    - list-like of functions and/or function names, e.g. ``[np.exp, 'sqrt']``
    - dict-like of axis labels -> functions, function names or list-like of such.
{axis}
*args
    Positional arguments to pass to `func`.
**kwargs
    Keyword arguments to pass to `func`.

Returns
-------
{klass}
    A {klass} that must have the same length as self.

Raises
------
ValueError : If the returned {klass} has a different length than self.

See Also
--------
{klass}.agg : Only perform aggregating type operations.
{klass}.apply : Invoke function on a {klass}.

Notes
-----
Functions that mutate the passed object can produce unexpected
behavior or errors and are not supported. See :ref:`gotchas.udf-mutation`
for more details.

Examples
--------
>>> df = pd.DataFrame({{'A': range(3), 'B': range(1, 4)}})
>>> df
   A  B
0  0  1
1  1  2
2  2  3
>>> df.transform(lambda x: x + 1)
   A  B
0  1  2
1  2  3
2  3  4

Even though the resulting {klass} must have the same length as the
input {klass}, it is possible to provide several input functions:

>>> s = pd.Series(range(3))
>>> s
0    0
1    1
2    2
dtype: int64
>>> s.transform([np.sqrt, np.exp])
       sqrt        exp
0  0.000000   1.000000
1  1.000000   2.718282
2  1.414214   7.389056

You can call transform on a GroupBy object:

>>> df = pd.DataFrame({{
...     "Date": [
...         "2015-05-08", "2015-05-07", "2015-05-06", "2015-05-05",
...         "2015-05-08", "2015-05-07", "2015-05-06", "2015-05-05"],
...     "Data": [5, 8, 6, 1, 50, 100, 60, 120],
... }})
>>> df
         Date  Data
0  2015-05-08     5
1  2015-05-07     8
2  2015-05-06     6
3  2015-05-05     1
4  2015-05-08    50
5  2015-05-07   100
6  2015-05-06    60
7  2015-05-05   120
>>> df.groupby('Date')['Data'].transform('sum')
0     55
1    108
2     66
3    121
4     55
5    108
6     66
7    121
Name: Data, dtype: int64

>>> df = pd.DataFrame({{
...     "c": [1, 1, 1, 2, 2, 2, 2],
...     "type": ["m", "n", "o", "m", "m", "n", "n"]
... }})
>>> df
   c type
0  1    m
1  1    n
2  1    o
3  2    m
4  2    m
5  2    n
6  2    n
>>> df['size'] = df.groupby('c')['type'].transform(len)
>>> df
   c type  size
0  1    m     3
1  1    n     3
2  1    o     3
3  2    m     4
4  2    m     4
5  2    n     4
6  2    n     4
"""

_shared_docs[
    "storage_options"
] = """storage_options : dict, optional
    Extra options that make sense for a particular storage connection, e.g.
    host, port, username, password, etc. For HTTP(S) URLs the key-value pairs
    are forwarded to ``urllib.request.Request`` as header options. For other
    URLs (e.g. starting with "s3://", and "gcs://") the key-value pairs are
    forwarded to ``fsspec.open``. Please see ``fsspec`` and ``urllib`` for more
    details, and for more examples on storage options refer `here
    <https://pandas.pydata.org/docs/user_guide/io.html?
    highlight=storage_options#reading-writing-remote-files>`_."""

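# Illustrative sketch: forwarding ``storage_options`` to ``fsspec`` for an
# "s3://" URL (placeholder bucket and credentials; needs the optional s3fs
# dependency):
#
# >>> import pandas as pd
# >>> pd.read_csv(
# ...     "s3://bucket/data.csv",
# ...     storage_options={"key": "<access-key>", "secret": "<secret-key>"},
# ... )  # doctest: +SKIP
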
_shared_docs[
    "compression_options"
] = """compression : str or dict, default 'infer'
    For on-the-fly compression of the output data. If 'infer' and '%s' is
    path-like, then detect compression from the following extensions: '.gz',
    '.bz2', '.zip', '.xz', '.zst', '.tar', '.tar.gz', '.tar.xz' or '.tar.bz2'
    (otherwise no compression).
    Set to ``None`` for no compression.
    Can also be a dict with key ``'method'`` set
    to one of {``'zip'``, ``'gzip'``, ``'bz2'``, ``'zstd'``, ``'xz'``, ``'tar'``} and
    other key-value pairs are forwarded to
    ``zipfile.ZipFile``, ``gzip.GzipFile``,
    ``bz2.BZ2File``, ``zstandard.ZstdCompressor``, ``lzma.LZMAFile`` or
    ``tarfile.TarFile``, respectively.
    As an example, the following could be passed for faster compression and to create
    a reproducible gzip archive:
    ``compression={'method': 'gzip', 'compresslevel': 1, 'mtime': 1}``.

    .. versionadded:: 1.5.0
        Added support for `.tar` files."""

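# Illustrative sketch of the dict form on the write side, mirroring the
# reproducible-gzip example in the text above ("out.csv.gz" is a placeholder
# path):
#
# >>> import pandas as pd
# >>> pd.DataFrame({"a": [1, 2]}).to_csv(
# ...     "out.csv.gz",
# ...     compression={"method": "gzip", "compresslevel": 1, "mtime": 1},
# ... )  # doctest: +SKIP
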
_shared_docs[
    "decompression_options"
] = """compression : str or dict, default 'infer'
    For on-the-fly decompression of on-disk data. If 'infer' and '%s' is
    path-like, then detect compression from the following extensions: '.gz',
    '.bz2', '.zip', '.xz', '.zst', '.tar', '.tar.gz', '.tar.xz' or '.tar.bz2'
    (otherwise no compression).
    If using 'zip' or 'tar', the ZIP file must contain only one data file to be read in.
    Set to ``None`` for no decompression.
    Can also be a dict with key ``'method'`` set
    to one of {``'zip'``, ``'gzip'``, ``'bz2'``, ``'zstd'``, ``'xz'``, ``'tar'``} and
    other key-value pairs are forwarded to
    ``zipfile.ZipFile``, ``gzip.GzipFile``,
    ``bz2.BZ2File``, ``zstandard.ZstdDecompressor``, ``lzma.LZMAFile`` or
    ``tarfile.TarFile``, respectively.
    As an example, the following could be passed for Zstandard decompression using a
    custom compression dictionary:
    ``compression={'method': 'zstd', 'dict_data': my_compression_dict}``.

    .. versionadded:: 1.5.0
        Added support for `.tar` files."""

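# Illustrative sketch of the read side ("data.csv.zst" is a placeholder path;
# with the default ``compression='infer'`` the '.zst' extension would be
# detected automatically):
#
# >>> import pandas as pd
# >>> pd.read_csv("data.csv.zst", compression={"method": "zstd"})  # doctest: +SKIP
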
_shared_docs[
    "replace"
] = """
    Replace values given in `to_replace` with `value`.

    Values of the {klass} are replaced with other values dynamically.
    This differs from updating with ``.loc`` or ``.iloc``, which require
    you to specify a location to update with some value.

    Parameters
    ----------
    to_replace : str, regex, list, dict, Series, int, float, or None
        How to find the values that will be replaced.

        * numeric, str or regex:

            - numeric: numeric values equal to `to_replace` will be
              replaced with `value`
            - str: string exactly matching `to_replace` will be replaced
              with `value`
            - regex: regexes matching `to_replace` will be replaced with
              `value`

        * list of str, regex, or numeric:

            - First, if `to_replace` and `value` are both lists, they
              **must** be the same length.
            - Second, if ``regex=True`` then all of the strings in **both**
              lists will be interpreted as regexes, otherwise they will match
              directly. This doesn't matter much for `value` since there
              are only a few possible substitution regexes you can use.
            - str, regex and numeric rules apply as above.

        * dict:

            - Dicts can be used to specify different replacement values
              for different existing values. For example,
              ``{{'a': 'b', 'y': 'z'}}`` replaces the value 'a' with 'b' and
              'y' with 'z'. To use a dict in this way, the optional `value`
              parameter should not be given.
            - For a DataFrame a dict can specify that different values
              should be replaced in different columns. For example,
              ``{{'a': 1, 'b': 'z'}}`` looks for the value 1 in column 'a'
              and the value 'z' in column 'b' and replaces these values
              with whatever is specified in `value`. The `value` parameter
              should not be ``None`` in this case. You can treat this as a
              special case of passing two lists except that you are
              specifying the column to search in.
            - For a DataFrame nested dictionaries, e.g.,
              ``{{'a': {{'b': np.nan}}}}``, are read as follows: look in column
              'a' for the value 'b' and replace it with NaN. The optional `value`
              parameter should not be specified to use a nested dict in this
              way. You can nest regular expressions as well. Note that
              column names (the top-level dictionary keys in a nested
              dictionary) **cannot** be regular expressions.

        * None:

            - This means that the `regex` argument must be a string,
              compiled regular expression, or list, dict, ndarray or
              Series of such elements. If `value` is also ``None`` then
              this **must** be a nested dictionary or Series.

        See the examples section for examples of each of these.
    value : scalar, dict, list, str, regex, default None
        Value to replace any values matching `to_replace` with.
        For a DataFrame a dict of values can be used to specify which
        value to use for each column (columns not in the dict will not be
        filled). Regular expressions, strings and lists or dicts of such
        objects are also allowed.
    {inplace}
    limit : int, default None
        Maximum size gap to forward or backward fill.

        .. deprecated:: 2.1.0
    regex : bool or same types as `to_replace`, default False
        Whether to interpret `to_replace` and/or `value` as regular
        expressions. Alternatively, this could be a regular expression or a
        list, dict, or array of regular expressions in which case
        `to_replace` must be ``None``.
    method : {{'pad', 'ffill', 'bfill'}}
        The method to use for replacement, when `to_replace` is a
        scalar, list or tuple and `value` is ``None``.

        .. deprecated:: 2.1.0

    Returns
    -------
    {klass}
        Object after replacement.

    Raises
    ------
    AssertionError
        * If `regex` is not a ``bool`` and `to_replace` is not
          ``None``.

    TypeError
        * If `to_replace` is not a scalar, array-like, ``dict``, or ``None``
        * If `to_replace` is a ``dict`` and `value` is not a ``list``,
          ``dict``, ``ndarray``, or ``Series``
        * If `to_replace` is ``None`` and `regex` is not compilable
          into a regular expression or is a list, dict, ndarray, or
          Series.
        * When replacing multiple ``bool`` or ``datetime64`` objects and
          the arguments to `to_replace` do not match the type of the
          value being replaced

    ValueError
        * If a ``list`` or an ``ndarray`` is passed to `to_replace` and
          `value` but they are not the same length.

    See Also
    --------
    Series.fillna : Fill NA values.
    DataFrame.fillna : Fill NA values.
    Series.where : Replace values based on boolean condition.
    DataFrame.where : Replace values based on boolean condition.
    DataFrame.map : Apply a function to a DataFrame elementwise.
    Series.map : Map values of Series according to an input mapping or function.
    Series.str.replace : Simple string replacement.

    Notes
    -----
    * Regex substitution is performed under the hood with ``re.sub``. The
      rules for substitution for ``re.sub`` are the same.
    * Regular expressions will only substitute on strings, meaning you
      cannot provide, for example, a regular expression matching floating
      point numbers and expect the columns in your frame that have a
      numeric dtype to be matched. However, if those floating point
      numbers *are* strings, then you can do this.
    * This method has *a lot* of options. You are encouraged to experiment
      and play with this method to gain intuition about how it works.
    * When a dict is used as the `to_replace` value, the dict's keys act as
      the `to_replace` part and the dict's values act as the `value`
      parameter.

    Examples
    --------

    **Scalar `to_replace` and `value`**

    >>> s = pd.Series([1, 2, 3, 4, 5])
    >>> s.replace(1, 5)
    0    5
    1    2
    2    3
    3    4
    4    5
    dtype: int64

    >>> df = pd.DataFrame({{'A': [0, 1, 2, 3, 4],
    ...                    'B': [5, 6, 7, 8, 9],
    ...                    'C': ['a', 'b', 'c', 'd', 'e']}})
    >>> df.replace(0, 5)
       A  B  C
    0  5  5  a
    1  1  6  b
    2  2  7  c
    3  3  8  d
    4  4  9  e

    **List-like `to_replace`**

    >>> df.replace([0, 1, 2, 3], 4)
       A  B  C
    0  4  5  a
    1  4  6  b
    2  4  7  c
    3  4  8  d
    4  4  9  e

    >>> df.replace([0, 1, 2, 3], [4, 3, 2, 1])
       A  B  C
    0  4  5  a
    1  3  6  b
    2  2  7  c
    3  1  8  d
    4  4  9  e

    >>> s.replace([1, 2], method='bfill')
    0    3
    1    3
    2    3
    3    4
    4    5
    dtype: int64

    **dict-like `to_replace`**

    >>> df.replace({{0: 10, 1: 100}})
         A  B  C
    0   10  5  a
    1  100  6  b
    2    2  7  c
    3    3  8  d
    4    4  9  e

    >>> df.replace({{'A': 0, 'B': 5}}, 100)
         A    B  C
    0  100  100  a
    1    1    6  b
    2    2    7  c
    3    3    8  d
    4    4    9  e

    >>> df.replace({{'A': {{0: 100, 4: 400}}}})
         A  B  C
    0  100  5  a
    1    1  6  b
    2    2  7  c
    3    3  8  d
    4  400  9  e

    **Regular expression `to_replace`**

    >>> df = pd.DataFrame({{'A': ['bat', 'foo', 'bait'],
    ...                    'B': ['abc', 'bar', 'xyz']}})
    >>> df.replace(to_replace=r'^ba.$', value='new', regex=True)
          A    B
    0   new  abc
    1   foo  new
    2  bait  xyz

    >>> df.replace({{'A': r'^ba.$'}}, {{'A': 'new'}}, regex=True)
          A    B
    0   new  abc
    1   foo  bar
    2  bait  xyz

    >>> df.replace(regex=r'^ba.$', value='new')
          A    B
    0   new  abc
    1   foo  new
    2  bait  xyz

    >>> df.replace(regex={{r'^ba.$': 'new', 'foo': 'xyz'}})
          A    B
    0   new  abc
    1   xyz  new
    2  bait  xyz

    >>> df.replace(regex=[r'^ba.$', 'foo'], value='new')
          A    B
    0   new  abc
    1   new  new
    2  bait  xyz

    Compare the behavior of ``s.replace({{'a': None}})`` and
    ``s.replace('a', None)`` to understand the peculiarities
    of the `to_replace` parameter:

    >>> s = pd.Series([10, 'a', 'a', 'b', 'a'])

    When one uses a dict as the `to_replace` value, it is as if the
    value(s) in the dict are equal to the `value` parameter.
    ``s.replace({{'a': None}})`` is equivalent to
    ``s.replace(to_replace={{'a': None}}, value=None, method=None)``:

    >>> s.replace({{'a': None}})
    0      10
    1    None
    2    None
    3       b
    4    None
    dtype: object

    When ``value`` is not explicitly passed and `to_replace` is a scalar, list
    or tuple, `replace` uses the method parameter (default 'pad') to do the
    replacement. This is why the 'a' values are replaced by 10 in rows 1
    and 2, and by 'b' in row 4 in this case.

    >>> s.replace('a')
    0    10
    1    10
    2    10
    3     b
    4     b
    dtype: object

    .. deprecated:: 2.1.0
        The 'method' parameter and padding behavior are deprecated.

    On the other hand, if ``None`` is explicitly passed for ``value``, it will
    be respected:

    >>> s.replace('a', None)
    0      10
    1    None
    2    None
    3       b
    4    None
    dtype: object

    .. versionchanged:: 1.4.0
        Previously the explicit ``None`` was silently ignored.

    When ``regex=True``, ``value`` is not ``None`` and `to_replace` is a string,
    the replacement will be applied in all columns of the DataFrame.

    >>> df = pd.DataFrame({{'A': [0, 1, 2, 3, 4],
    ...                    'B': ['a', 'b', 'c', 'd', 'e'],
    ...                    'C': ['f', 'g', 'h', 'i', 'j']}})

    >>> df.replace(to_replace='^[a-g]', value='e', regex=True)
       A  B  C
    0  0  e  e
    1  1  e  e
    2  2  e  h
    3  3  e  i
    4  4  e  j

    If ``value`` is not ``None`` and `to_replace` is a dictionary, the
    dictionary keys will be the DataFrame columns to which the replacement
    will be applied.

    >>> df.replace(to_replace={{'B': '^[a-c]', 'C': '^[h-j]'}}, value='e', regex=True)
       A  B  C
    0  0  e  f
    1  1  e  g
    2  2  e  e
    3  3  d  e
    4  4  e  e
"""

_shared_docs[
    "idxmin"
] = """
    Return index of first occurrence of minimum over requested axis.

    NA/null values are excluded.

    Parameters
    ----------
    axis : {{0 or 'index', 1 or 'columns'}}, default 0
        The axis to use. 0 or 'index' for row-wise, 1 or 'columns' for column-wise.
    skipna : bool, default True
        Exclude NA/null values. If an entire row/column is NA, the result
        will be NA.
    numeric_only : bool, default {numeric_only_default}
        Include only `float`, `int` or `boolean` data.

        .. versionadded:: 1.5.0

    Returns
    -------
    Series
        Indexes of minima along the specified axis.

    Raises
    ------
    ValueError
        * If the row/column is empty

    See Also
    --------
    Series.idxmin : Return index of the minimum element.

    Notes
    -----
    This method is the DataFrame version of ``ndarray.argmin``.

    Examples
    --------
    Consider a dataset containing food consumption in Argentina.

    >>> df = pd.DataFrame({{'consumption': [10.51, 103.11, 55.48],
    ...                     'co2_emissions': [37.2, 19.66, 1712]}},
    ...                   index=['Pork', 'Wheat Products', 'Beef'])

    >>> df
                    consumption  co2_emissions
    Pork                  10.51          37.20
    Wheat Products       103.11          19.66
    Beef                  55.48        1712.00

    By default, it returns the index for the minimum value in each column.

    >>> df.idxmin()
    consumption                Pork
    co2_emissions    Wheat Products
    dtype: object

    To return the index for the minimum value in each row, use ``axis="columns"``.

    >>> df.idxmin(axis="columns")
    Pork                consumption
    Wheat Products    co2_emissions
    Beef                consumption
    dtype: object
"""

_shared_docs[
    "idxmax"
] = """
    Return index of first occurrence of maximum over requested axis.

    NA/null values are excluded.

    Parameters
    ----------
    axis : {{0 or 'index', 1 or 'columns'}}, default 0
        The axis to use. 0 or 'index' for row-wise, 1 or 'columns' for column-wise.
    skipna : bool, default True
        Exclude NA/null values. If an entire row/column is NA, the result
        will be NA.
    numeric_only : bool, default {numeric_only_default}
        Include only `float`, `int` or `boolean` data.

        .. versionadded:: 1.5.0

    Returns
    -------
    Series
        Indexes of maxima along the specified axis.

    Raises
    ------
    ValueError
        * If the row/column is empty

    See Also
    --------
    Series.idxmax : Return index of the maximum element.

    Notes
    -----
    This method is the DataFrame version of ``ndarray.argmax``.

    Examples
    --------
    Consider a dataset containing food consumption in Argentina.

    >>> df = pd.DataFrame({{'consumption': [10.51, 103.11, 55.48],
    ...                     'co2_emissions': [37.2, 19.66, 1712]}},
    ...                   index=['Pork', 'Wheat Products', 'Beef'])

    >>> df
                    consumption  co2_emissions
    Pork                  10.51          37.20
    Wheat Products       103.11          19.66
    Beef                  55.48        1712.00

    By default, it returns the index for the maximum value in each column.

    >>> df.idxmax()
    consumption     Wheat Products
    co2_emissions             Beef
    dtype: object

    To return the index for the maximum value in each row, use ``axis="columns"``.

    >>> df.idxmax(axis="columns")
    Pork              co2_emissions
    Wheat Products      consumption
    Beef              co2_emissions
    dtype: object
"""