API documentation

This package provides a single function parq.run(), which runs jobs using multiple Python processes and returns a parq.Result value when all jobs have completed.

Note

By default, the return value of each job is ignored. To obtain the return value of each job, pass results=True to parq.run().

Warning

If you use parq.run() to run jobs that return very large data structures, you should consider saving the results of each job to an external file, rather than passing results=True.
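For example, a job can write its large output to disk and return only the file path, which is then cheap to collect via results=True. This is an illustrative sketch with a hypothetical job function, not part of parq itself:

```python
import json
import os
import tempfile

def big_job(n):
    # Hypothetical job: compute a (potentially large) result, write it
    # to disk, and return only the small file path to the caller.
    data = list(range(n))
    fd, path = tempfile.mkstemp(suffix=".json")
    with os.fdopen(fd, "w") as f:
        json.dump(data, f)
    return path

# Run serially here for illustration; with parq this might be
# parq.run(big_job, [(5,), (10,)], n_proc=2, results=True).
path = big_job(5)
with open(path) as f:
    print(json.load(f))  # [0, 1, 2, 3, 4]
os.remove(path)
```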

parq.run(func, iterable, n_proc, fail_early=True, trace=True, level=None, results=False, timeout=10)

Perform multiple jobs in parallel by spawning multiple processes.

Parameters
  • func – The function that performs a single job.

  • iterable – A sequence of argument tuples, one per job; each tuple is unpacked when calling func (i.e., func(*args)).

  • n_proc – The number of processes to spawn.

  • fail_early – Whether to stop running jobs if one fails.

  • trace – Whether to print stack traces for jobs that raise an exception.

  • level – The logging level for worker processes. By default, only warnings and errors will be shown.

  • results – Whether to return the results of each job.

  • timeout – The optional timeout (in seconds) when polling for job results. Set this to None to block until a result is received.
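The calling convention for iterable can be illustrated without parq: each tuple is unpacked into a call to func, so running the jobs serially is equivalent to the list comprehension below (add is a hypothetical job function):

```python
def add(x, y):
    # Hypothetical job function taking two positional arguments.
    return x + y

# Each element of the iterable is one job's argument tuple.
jobs = [(1, 2), (3, 4), (5, 6)]

# parq.run(add, jobs, n_proc=2) calls add(*args) for each tuple;
# serially, that is equivalent to:
results = [add(*args) for args in jobs]
print(results)  # [3, 7, 11]
```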

Returns

A Result instance.

Return type

parq.Result

Warning

If a worker process is terminated unexpectedly (e.g., by running out of memory), this function will deadlock if timeout is None.
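The role of timeout can be illustrated with a plain queue.Queue. This is not parq's implementation, only a sketch of why polling with a finite timeout avoids blocking forever when no result will ever arrive:

```python
import queue

q = queue.Queue()  # no worker will ever put a result here

try:
    # With timeout=None, q.get() would block indefinitely; a finite
    # timeout lets the caller notice that the result is missing.
    q.get(timeout=0.1)
    got_result = True
except queue.Empty:
    got_result = False

print(got_result)  # False
```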

class parq.Result(success: bool, job_count: int, successful_jobs: List[Any], unsuccessful_jobs: List[Any], failed_worker_count: int, job_results: Optional[Dict[int, Any]] = None)

The result of running a number of jobs.

Parameters
  • success (bool) – Whether all jobs completed successfully.

  • job_count (int) – The number of jobs that were submitted.

  • successful_jobs (List[Any]) – A list containing the arguments of each job that completed successfully.

  • unsuccessful_jobs (List[Any]) – A list containing the arguments of each job that did not complete successfully.

  • failed_worker_count (int) – The number of worker processes that terminated early.

  • job_results (Optional[Dict[int, Any]]) – An optional dictionary that maps successful job numbers to the returned results of those jobs. If results were not collected, this will be None.

Instances are truthy if success is True, and falsy otherwise.

>>> from parq import Result
>>> res_good = Result(True, 0, [], [], 0)
>>> assert res_good
>>> res_bad = Result(False, 0, [], [], 0)
>>> assert not res_bad

num_successful()

Return the number of jobs that were completed successfully.

num_unsuccessful()

Return the number of jobs that were not completed successfully.