API documentation
This package provides a single function, parq.run(), which runs jobs using multiple Python processes and returns a parq.Result value when all jobs have completed.
Note
By default, the return value of each job is ignored. To obtain the return value of each job, pass results=True to parq.run().
Warning
If you use parq.run() to run jobs that return very large data structures, consider saving the results of each job to an external file rather than passing results=True.
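The warning above can be illustrated with a minimal sketch in which each job writes its own output to disk and returns nothing. The names simulate_and_save and run_all, the JSON file layout, and the placeholder computation are all hypothetical; only the parq.run() call follows the signature documented below.

```python
import json
from pathlib import Path


def simulate_and_save(job_id, n_values, out_dir):
    """Run one job and write its (potentially large) output to a JSON file."""
    # Placeholder computation standing in for a job that produces a lot of data.
    values = [i * i for i in range(n_values)]
    out_file = Path(out_dir) / f"job_{job_id}.json"
    out_file.write_text(json.dumps(values))
    # Nothing is returned, so no large object is passed between processes.


def run_all(out_dir, n_proc=2):
    """Run three jobs with parq.run(), leaving results at its default (False)."""
    # Imported here (an illustrative choice) so that the job function above
    # can be loaded and tested without parq installed.
    import parq

    Path(out_dir).mkdir(parents=True, exist_ok=True)
    # Each job is a tuple of arguments for simulate_and_save.
    jobs = [(job_id, 1000, str(out_dir)) for job_id in range(3)]
    return parq.run(simulate_and_save, jobs, n_proc=n_proc)
```

Calling run_all("output") would be expected to leave one file per job in the output directory, which can then be loaded one at a time by the parent process.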
- parq.run(func, iterable, n_proc, fail_early=True, trace=True, level=None, results=False, timeout=10)
Perform multiple jobs in parallel by spawning multiple processes.
- Parameters
func – The function that performs a single job.
iterable – A sequence of job arguments, represented as tuples and unpacked before passing to func (i.e., func(*args)).
n_proc – The number of processes to spawn.
fail_early – Whether to stop running jobs if one fails.
trace – Whether to print stack traces for jobs that raise an exception.
level – The logging level for worker processes. By default, only warnings and errors will be shown.
results – Whether to return the results of each job.
timeout – The optional timeout (in seconds) when polling for job results. Set this to None to block until a result is received.
- Returns
A Result instance.
- Return type
parq.Result
Warning
If a worker process is terminated unexpectedly (e.g., by running out of memory), this function will deadlock if timeout is None.
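As a rough illustration of the parameters described above, the following sketch submits a few jobs as argument tuples and asks for their return values. The function power, the job list JOBS, and the wrapper run_jobs are hypothetical names chosen for this example; the keyword arguments are those listed in the signature.

```python
def power(base, exponent):
    """A single job: receives its arguments already unpacked from a tuple."""
    return base ** exponent


# Each job is a tuple of positional arguments; parq.run() calls power(*args).
# By the same rule, a job with a single argument would be written as (5,).
JOBS = [(2, 3), (3, 2), (10, 0)]


def run_jobs(n_proc=2):
    """Run every job in JOBS and return the parq.Result."""
    # Imported here (an illustrative choice) so that the job definitions
    # can be loaded and tested without parq installed.
    import parq

    # results=True collects return values; timeout=10 is the documented
    # default and avoids blocking forever if a worker dies unexpectedly.
    return parq.run(power, JOBS, n_proc=n_proc, results=True, timeout=10)
```

Keeping a finite timeout is the conservative choice here, since passing timeout=None is the case that the warning above says can deadlock.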
- class parq.Result(success: bool, job_count: int, successful_jobs: List[Any], unsuccessful_jobs: List[Any], failed_worker_count: int, job_results: Optional[Dict[int, Any]] = None)
The result of running a number of jobs.
- Parameters
success (bool) – Whether all jobs completed successfully.
job_count (int) – The number of jobs that were submitted.
successful_jobs ([Any]) – A list that contains the arguments for each job that was completed successfully.
unsuccessful_jobs ([Any]) – A list that contains the arguments for each job that was not completed successfully.
failed_worker_count (int) – The number of worker processes that terminated early.
job_results (Optional[Dict[int, Any]]) – An optional dictionary that maps successful job numbers to the returned results of those jobs. If results were not collected, this will be None.
Instances are considered true if success is true; otherwise they are considered false.
>>> from parq import Result
>>> res_good = Result(True, 0, [], [], 0)
>>> assert res_good
>>> res_bad = Result(False, 0, [], [], 0)
>>> assert not res_bad
- num_successful()
Return the number of jobs that were completed successfully.
- num_unsuccessful()
Return the number of jobs that were not completed successfully.
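One way to report on a finished run is sketched below. It assumes that the constructor parameters documented above are also readable as attributes of the same names, which this page does not state explicitly; summarise and results_or_empty are hypothetical helpers, not part of parq.

```python
def summarise(result):
    """Return a one-line summary of a parq.Result-like object."""
    # Assumption: success, job_count and failed_worker_count are attributes.
    status = "ok" if result.success else "FAILED"
    return (
        f"{status}: {result.num_successful()} of {result.job_count} jobs succeeded, "
        f"{result.num_unsuccessful()} did not, "
        f"{result.failed_worker_count} worker(s) terminated early"
    )


def results_or_empty(result):
    """Return the collected job results, or an empty dict if there are none."""
    # job_results is None when results were not collected.
    return {} if result.job_results is None else result.job_results
```

Checking job_results against None, rather than relying on truthiness, distinguishes a run where results were not collected from one where results were collected but no job succeeded.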