"""Joblib is a set of tools to provide **lightweight pipelining in
Python**. In particular:

1. transparent disk-caching of functions and lazy re-evaluation
   (memoize pattern)

2. easy, simple parallel computing

Joblib is optimized to be **fast** and **robust** on large
data in particular and has specific optimizations for `numpy` arrays. It is
**BSD-licensed**.


    ==================== ===============================================
    **Documentation:**   https://joblib.readthedocs.io

    **Download:**        https://pypi.python.org/pypi/joblib#downloads

    **Source code:**     https://github.com/joblib/joblib

    **Report issues:**   https://github.com/joblib/joblib/issues
    ==================== ===============================================


Vision
------

The vision is to provide tools to easily achieve better performance and
reproducibility when working with long-running jobs.

  * **Avoid computing the same thing twice**: code is often rerun again and
    again, for instance when prototyping computationally heavy jobs (as in
    scientific development), but hand-crafted solutions to alleviate this
    issue are error-prone and often lead to unreproducible results.

  * **Persist to disk transparently**: efficiently persisting
    arbitrary objects containing large data is hard. Using
    joblib's caching mechanism avoids hand-written persistence and
    implicitly links the file on disk to the execution context of
    the original Python object. As a result, joblib's persistence is
    good for resuming an application status or computational job, e.g.
    after a crash.

Joblib addresses these problems while **leaving your code and your flow
control as unmodified as possible** (no framework, no new paradigms).

Main features
-------------

1) **Transparent and fast disk-caching of output values:** a memoize or
   make-like functionality for Python functions that works well for
   arbitrary Python objects, including very large numpy arrays. Separate
   persistence and flow-execution logic from domain logic or algorithmic
   code by writing the operations as a set of steps with well-defined
   inputs and outputs: Python functions. Joblib can save their
   computation to disk and rerun it only if necessary::

      >>> from joblib import Memory
      >>> location = 'your_cache_dir_goes_here'
      >>> mem = Memory(location, verbose=1)
      >>> import numpy as np
      >>> a = np.vander(np.arange(3)).astype(float)
      >>> square = mem.cache(np.square)
      >>> b = square(a)                                   # doctest: +ELLIPSIS
      ______________________________________________________________________...
      [Memory] Calling ...square...
      square(array([[0., 0., 1.],
             [1., 1., 1.],
             [4., 2., 1.]]))
      _________________________________________________...square - ...s, 0.0min

      >>> c = square(a)
      >>> # The above call did not trigger an evaluation

2) **Embarrassingly parallel helper:** to make it easy to write readable
   parallel code and debug it quickly::

      >>> from joblib import Parallel, delayed
      >>> from math import sqrt
      >>> Parallel(n_jobs=1)(delayed(sqrt)(i**2) for i in range(10))
      [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
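
   The active backend and default parameters such as ``n_jobs`` can also be
   set for a whole block of code with ``parallel_config`` (the ``threading``
   backend and the parameter values below are only illustrative choices)::

      >>> from joblib import parallel_config
      >>> with parallel_config(backend='threading', n_jobs=2):  # illustrative settings
      ...     Parallel()(delayed(sqrt)(i**2) for i in range(3))
      [0.0, 1.0, 2.0]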


3) **Fast compressed persistence**: a replacement for pickle to work
   efficiently on Python objects containing large data
   (*joblib.dump* & *joblib.load*).
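
   For instance, reusing the array ``a`` and the ``location`` cache directory
   from the example above (the target filename here is only illustrative), a
   save/load round trip looks like::

      >>> from joblib import dump, load
      >>> import os
      >>> filename = os.path.join(location, 'data.joblib')  # illustrative name
      >>> _ = dump(a, filename)  # dump returns the list of files written
      >>> np.array_equal(load(filename), a)
      True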

..
    >>> import shutil ; shutil.rmtree(location)

"""

# PEP 440 compatible formatted version, see:
# https://www.python.org/dev/peps/pep-0440/
#
# Generic release markers:
#   X.Y
#   X.Y.Z   # For bugfix releases
#
# Admissible pre-release markers:
#   X.YaN   # Alpha release
#   X.YbN   # Beta release
#   X.YrcN  # Release Candidate
#   X.Y     # Final release
#
# Dev branch marker is: 'X.Y.dev' or 'X.Y.devN' where N is an integer.
# 'X.Y.dev0' is the canonical version of 'X.Y.dev'.
#
__version__ = "1.5.1"


import os

from ._cloudpickle_wrapper import wrap_non_picklable_objects
from ._parallel_backends import ParallelBackendBase
from ._store_backends import StoreBackendBase
from .compressor import register_compressor
from .hashing import hash
from .logger import Logger, PrintTime
from .memory import MemorizedResult, Memory, expires_after, register_store_backend
from .numpy_pickle import dump, load
from .parallel import (
    Parallel,
    cpu_count,
    delayed,
    effective_n_jobs,
    parallel_backend,
    parallel_config,
    register_parallel_backend,
)

__all__ = [
    # On-disk result caching
    "Memory",
    "MemorizedResult",
    "expires_after",
    # Parallel code execution
    "Parallel",
    "delayed",
    "cpu_count",
    "effective_n_jobs",
    "wrap_non_picklable_objects",
    # Context to change the backend globally
    "parallel_config",
    "parallel_backend",
    # Helpers to define and register store/parallel backends
    "ParallelBackendBase",
    "StoreBackendBase",
    "register_compressor",
    "register_parallel_backend",
    "register_store_backend",
    # Helpers kept for backward compatibility
    "PrintTime",
    "Logger",
    "hash",
    "dump",
    "load",
]


# Workaround issue discovered in intel-openmp 2019.5:
# https://github.com/ContinuumIO/anaconda-issues/issues/11294
os.environ.setdefault("KMP_INIT_AT_FORK", "FALSE")