

.. _sphx_glr_auto_examples_parallel_distributed_backend_simple.py:


Using dask distributed for single-machine parallel computing
=============================================================

This example shows the simplest usage of the dask `distributed
<https://distributed.readthedocs.io>`__ backend, on the local computer.

This is useful for prototyping a solution that will later run on a truly
distributed cluster, since the only change needed is the address of the
scheduler.
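
The switch between a local and a remote scheduler can be isolated in one
place. The following is a minimal sketch, not part of the original example:
the helper names and the ``MY_SCHEDULER_ADDRESS`` environment variable are
hypothetical, chosen only for illustration. It assumes that ``Client()``
called without an address starts a local cluster, and that passing an
address connects to an already running scheduler.

.. code-block:: python

    import os


    def resolve_scheduler_address(env_var="MY_SCHEDULER_ADDRESS"):
        """Return the configured scheduler address, or None to run locally.

        The environment variable name is a hypothetical convention.
        """
        # Treat an unset or empty variable as "no remote scheduler".
        return os.environ.get(env_var) or None


    def make_client(env_var="MY_SCHEDULER_ADDRESS"):
        """Create a distributed Client for a remote or a local scheduler."""
        # Imported lazily so the address helper works without distributed.
        from distributed import Client

        address = resolve_scheduler_address(env_var)
        # With an address, connect to that scheduler; without one, Client()
        # starts a local cluster on this machine.
        return Client(address) if address else Client()

With such a helper, the rest of the script stays the same whether it runs on
a laptop or against a cluster; only the environment differs.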

Another realistic scenario is combining dask code with joblib code, for
instance using dask to preprocess data and scikit-learn for machine
learning. In such a setting, it can be useful to use distributed as the
backend scheduler for both dask and joblib, so that a single scheduler
orchestrates the whole computation.



Set up the distributed client
##############################################################################



.. code-block:: python

    from distributed import Client
    # Typically, to execute on a remote machine, the address of the scheduler
    # would go here
    client = Client()

    # Retrieve the address of the scheduler the client is connected to
    address = client.scheduler_info()['address']

    # This import registers the dask.distributed backend for joblib.
    # (Recent joblib releases ship a built-in 'dask' backend instead.)
    import distributed.joblib  # noqa







Run a parallel computation using dask.distributed
##############################################################################



.. code-block:: python


    import time
    import joblib


    def long_running_function(i):
        # Simulate a slow task: sleep for 0.1 s, then return the input
        time.sleep(.1)
        return i








The verbose messages below show that the backend is indeed the
dask.distributed one.



.. code-block:: python

    with joblib.parallel_backend('dask.distributed', scheduler_host=address):
        joblib.Parallel(n_jobs=2, verbose=100)(
            joblib.delayed(long_running_function)(i)
            for i in range(10))





.. rst-class:: sphx-glr-script-out

 Out::

    [Parallel(n_jobs=2)]: Using backend DaskDistributedBackend with 4 concurrent workers.
    [Parallel(n_jobs=2)]: Done   1 tasks      | elapsed:    0.3s
    [Parallel(n_jobs=2)]: Done   2 tasks      | elapsed:    0.3s
    [Parallel(n_jobs=2)]: Done   3 tasks      | elapsed:    0.3s
    [Parallel(n_jobs=2)]: Done   4 out of  10 | elapsed:    0.4s remaining:    0.6s
    [Parallel(n_jobs=2)]: Done   5 out of  10 | elapsed:    0.4s remaining:    0.4s
    [Parallel(n_jobs=2)]: Done   6 out of  10 | elapsed:    0.4s remaining:    0.3s
    [Parallel(n_jobs=2)]: Done   7 out of  10 | elapsed:    0.4s remaining:    0.2s
    [Parallel(n_jobs=2)]: Done   8 out of  10 | elapsed:    0.5s remaining:    0.1s
    [Parallel(n_jobs=2)]: Done  10 out of  10 | elapsed:    0.5s remaining:    0.0s
    [Parallel(n_jobs=2)]: Done  10 out of  10 | elapsed:    0.5s finished


The progress of the computation can be followed on the distributed web
interface; see http://distributed.readthedocs.io/en/latest/web.html
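
The timings in the verbose output above can be sanity-checked with a little
arithmetic. The following is a rough sketch, assuming tasks of equal
duration that are spread evenly over the workers, and ignoring scheduling
and communication overhead. ``ideal_wall_time`` is a hypothetical helper,
not part of joblib or distributed.

.. code-block:: python

    import math


    def ideal_wall_time(n_tasks, n_workers, task_seconds):
        """Lower bound on the wall time for equal-length tasks."""
        # Tasks run in "waves" of at most n_workers tasks at a time.
        n_waves = math.ceil(n_tasks / n_workers)
        return n_waves * task_seconds


    # Ten 0.1 s tasks on the 4 concurrent workers reported in the output
    parallel_estimate = ideal_wall_time(10, 4, 0.1)
    # The same tasks run one after the other
    sequential_estimate = ideal_wall_time(10, 1, 0.1)

Under these assumptions the estimate is about 0.3 s on four workers, against
about 1.0 s sequentially. The 0.5 s reported above is consistent with this
lower bound plus some overhead.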


**Total running time of the script:** ( 0 minutes  1.152 seconds)



.. only:: html

 .. container:: sphx-glr-footer
    :class: sphx-glr-footer-example



  .. container:: sphx-glr-download

     :download:`Download Python source code: distributed_backend_simple.py <distributed_backend_simple.py>`



  .. container:: sphx-glr-download

     :download:`Download Jupyter notebook: distributed_backend_simple.ipynb <distributed_backend_simple.ipynb>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.readthedocs.io>`_
