Title: | Unified Interface to Parallelization Back-Ends |
Version: | 1.5.1 |
Description: | Unified parallelization framework for multiple back-ends, designed for internal package and interactive usage. The main operation is parallel mapping over lists. Supports 'local', 'multicore', 'mpi' and 'BatchJobs' modes. Allows tagging of the parallel operation with a level name that can later be selected by the user to switch on parallel execution for exactly this operation. |
License: | BSD_2_clause + file LICENSE |
URL: | https://parallelmap.mlr-org.com, https://github.com/mlr-org/parallelMap |
BugReports: | https://github.com/mlr-org/parallelMap/issues |
Depends: | R (≥ 3.0.0) |
Imports: | BBmisc (≥ 1.8), checkmate (≥ 1.8.0), parallel, stats, utils |
Suggests: | BatchJobs (≥ 1.8), batchtools (≥ 0.9.6), data.table, Rmpi, rpart, snow, testthat |
ByteCompile: | yes |
Encoding: | UTF-8 |
RoxygenNote: | 7.1.0 |
NeedsCompilation: | no |
Packaged: | 2021-06-27 13:21:26 UTC; pjs |
Author: | Bernd Bischl [cre, aut],
Michel Lang |
Maintainer: | Bernd Bischl <bernd_bischl@gmx.net> |
Repository: | CRAN |
Date/Publication: | 2021-06-28 06:40:04 UTC |
Export R objects for parallelization.
Description
Makes sure that the objects are exported to the slave processes so that they can be used in a job function which is later run with parallelMap().
Usage
parallelExport(
...,
objnames,
master = TRUE,
level = NA_character_,
show.info = NA
)
Arguments
... |
|
objnames |
( |
master |
( |
level |
( |
show.info |
( |
Value
Nothing.
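The export workflow described above can be sketched as follows. This is a minimal illustration, assuming the parallelMap package is installed; it uses local mode so that it runs without a cluster. The object name `offset` and the function `add_offset` are made up for this example.

```r
library(parallelMap)

# Hypothetical object that the job function reads from its environment
offset <- 10

# Hypothetical job function that relies on the exported object
# instead of receiving it as an argument
add_offset <- function(x) x + offset

parallelStartLocal(show.info = FALSE)
# Objects are exported by name, i.e. as character strings
parallelExport("offset")
res <- parallelMap(add_offset, 1:3)
parallelStop()

print(unlist(res))
```

In local mode the job function could find `offset` anyway; the export matters for modes in which the jobs run in separate R processes.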
Retrieve the configured package options.
Description
Returned are current and default settings, both as lists.
The return value has the elements settings and defaults, which are both lists of the same structure, named by option names.
A print method exists to display this object.
For details on the configuration procedure please read parallelStart() and https://github.com/mlr-org/parallelMap.
Usage
parallelGetOptions()
Value
ParallelMapOptions. See above.
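A short sketch of how the returned object might be inspected, assuming the parallelMap package is installed. The variable name `opts` is illustrative only.

```r
library(parallelMap)

opts <- parallelGetOptions()

# Display the object with its dedicated print method
print(opts)

# Both elements are lists named by option names
names(opts$settings)
names(opts$defaults)
```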
Get registered parallelization levels for all currently loaded packages.
Description
With flatten = FALSE, a structured S3 object is returned. The S3 object has only one element, called levels, which contains a named list. Each name refers to the package argument from the call to parallelRegisterLevels(), while the entries are character vectors of the form “package.level”. With flatten = TRUE, a simple character vector is returned that contains all concatenated entries of levels from above.
Usage
parallelGetRegisteredLevels(flatten = FALSE)
Arguments
flatten |
( |
Value
RegisteredLevels | character. See above.
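The two return forms can be compared with a small sketch like the one below, assuming the parallelMap package is installed. The package name "foo" and the level names "outer" and "inner" are hypothetical and are registered here only so that there is something to query.

```r
library(parallelMap)

# Register two hypothetical levels under the made-up package name "foo"
parallelRegisterLevels(package = "foo", levels = c("outer", "inner"))

# Structured form: a named list stored in the element `levels`
structured <- parallelGetRegisteredLevels(flatten = FALSE)
print(structured$levels$foo)

# Flat form: one character vector with all "package.level" entries
flat <- parallelGetRegisteredLevels(flatten = TRUE)
print(flat)
```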
Parallel versions of apply-family functions.
Description
parallelLapply: A parallel lapply() version.
parallelSapply: A parallel sapply() version.
All functions are simple wrappers for parallelMap().
Usage
parallelLapply(xs, fun, ..., impute.error = NULL, level = NA_character_)
parallelSapply(
xs,
fun,
...,
simplify = TRUE,
use.names = TRUE,
impute.error = NULL,
level = NA_character_
)
Arguments
xs |
( |
fun |
|
... |
(any) |
impute.error |
( |
level |
( |
simplify |
( |
use.names |
( |
Value
For parallelLapply a named list; for parallelSapply it depends on the return value of fun and the settings of simplify and use.names.
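A minimal sketch of both wrappers in local mode, assuming the parallelMap package is installed. It also assumes that, as with lapply(), extra arguments given in ... are forwarded to fun; the argument name `p` is made up for this example.

```r
library(parallelMap)

parallelStartLocal(show.info = FALSE)

# List result, one element per input value
squares_list <- parallelLapply(1:3, function(x) x^2)

# Simplified result; the extra argument p is passed on to the function
cubes <- parallelSapply(1:3, function(x, p) x^p, p = 3)

parallelStop()

print(squares_list)
print(cubes)
```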
Load packages for parallelization.
Description
Makes sure that the packages are loaded in the slave processes so that they can be used in a job function which is later run with parallelMap().
For all modes, the packages are also (potentially) loaded on the master.
Usage
parallelLibrary(
...,
packages,
master = TRUE,
level = NA_character_,
show.info = NA
)
Arguments
... |
character |
packages |
( |
master |
( |
level |
( |
show.info |
( |
Value
Nothing.
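The loading step might look like the sketch below, assuming the parallelMap package is installed. The stats package is used only because it ships with R; any installed package name could be given instead.

```r
library(parallelMap)

parallelStartLocal(show.info = FALSE)

# Package names are passed as character strings
parallelLibrary("stats")

# The job function can now rely on functions from the loaded package
res <- parallelMap(function(x) median(c(x, 10)), 1:3)

parallelStop()

print(unlist(res))
```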
Maps a function over lists or vectors in parallel.
Description
Uses the parallelization mode and the other options specified in parallelStart().
Libraries and source files can be initialized on slaves with parallelLibrary() and parallelSource().
Large objects can be exported separately via parallelExport(); they can then be used under their exported names in the slave body code.
Regarding error handling, see the argument impute.error.
Usage
parallelMap(
fun,
...,
more.args = list(),
simplify = FALSE,
use.names = FALSE,
impute.error = NULL,
level = NA_character_,
show.info = NA
)
Arguments
fun |
function |
... |
(any) |
more.args |
list |
simplify |
( |
use.names |
( |
impute.error |
( |
level |
( |
show.info |
( |
Value
Result.
Examples
parallelStart()
parallelMap(identity, 1:2)
parallelStop()
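A slightly larger sketch, assuming the parallelMap package is installed: it maps over two vectors at once and passes a constant through more.args, on the assumption that more.args behaves like MoreArgs in mapply(). The function `scaled_sum` and its argument names are hypothetical.

```r
library(parallelMap)

# Hypothetical job function: x and y vary per job, scale is constant
scaled_sum <- function(x, y, scale) (x + y) * scale

parallelStartLocal(show.info = FALSE)
res <- parallelMap(
  scaled_sum,
  1:3, 4:6,                     # vectors that are mapped over in parallel
  more.args = list(scale = 2),  # constant argument for every job
  simplify = TRUE               # ask for a simplified result
)
parallelStop()

print(res)
```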
Register a parallelization level
Description
Package developers should call this function in their packages' base::.onLoad(). This enables the user to query available levels and bind parallelization to specific levels. This is especially helpful for nested calls to parallelMap(), e.g. where the inner call should be parallelized instead of the outer one.
To avoid name clashes, we encourage developers to always specify the argument package. This will prefix the specified levels with the package name, e.g. parallelRegisterLevels(package="foo", levels="dummy") will register the level “foo.dummy”, and users can start parallelization for this level with parallelStart(<backend>, level = "foo.dummy"). If you do not provide package, the level names will be associated with the category “custom” and can later be referred to as “custom.dummy”.
Usage
parallelRegisterLevels(package = "custom", levels)
Arguments
package |
( |
levels |
( |
Value
Nothing.
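The nested case mentioned above can be sketched as follows, assuming the parallelMap package is installed. The package name "foo", the level names and the function `inner_job` are hypothetical. Local mode is used so that the sketch runs anywhere; with a real back-end, only the calls tagged with the selected level would be parallelized.

```r
library(parallelMap)

# In a real package this call would live in .onLoad()
parallelRegisterLevels(package = "foo", levels = c("outer", "inner"))

# Hypothetical inner job, tagged with the level "foo.inner"
inner_job <- function(i) {
  parallelMap(function(j) i * j, 1:2, level = "foo.inner")
}

# Select the inner level; the outer call is tagged with a different level
parallelStart(mode = "local", level = "foo.inner", show.info = FALSE)
res <- parallelMap(inner_job, 1:2, level = "foo.outer")
parallelStop()

print(unlist(res))
```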
Source R files for parallelization.
Description
Makes sure that the files are sourced in the slave processes so that they can be used in a job function which is later run with parallelMap().
For all modes, the files are also (potentially) loaded on the master.
Usage
parallelSource(
...,
files,
master = TRUE,
level = NA_character_,
show.info = NA
)
Arguments
... |
character |
files |
character |
master |
( |
level |
( |
show.info |
( |
Value
Nothing.
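A minimal sketch of sourcing a helper file before mapping, assuming the parallelMap package is installed. The temporary file and the helper function `double_it` are created only for this illustration, and the sketch relies on the file also being sourced on the master, as stated above.

```r
library(parallelMap)

# Write a small, hypothetical helper file to a temporary location
helper_file <- tempfile(fileext = ".R")
writeLines("double_it <- function(x) 2 * x", helper_file)

parallelStartLocal(show.info = FALSE)

# File paths are passed as character strings
parallelSource(helper_file)

# The job function uses the helper defined in the sourced file
res <- parallelMap(function(x) double_it(x), 1:3)

parallelStop()
unlink(helper_file)

print(unlist(res))
```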
Parallelization setup for parallelMap.
Description
Defines the underlying parallelization mode for parallelMap(). Also allows setting a “level” of parallelization. Only calls to parallelMap() with a matching level are parallelized. The defaults of all settings are taken from your options, which you can also define in your R profile. For an introductory tutorial and information on the options configuration, please go to the project's GitHub page at https://github.com/mlr-org/parallelMap.
Usage
parallelStart(
mode,
cpus,
socket.hosts,
bj.resources = list(),
bt.resources = list(),
logging,
storagedir,
level,
load.balancing = FALSE,
show.info,
suppress.local.errors = FALSE,
reproducible,
...
)
parallelStartLocal(show.info, suppress.local.errors = FALSE, ...)
parallelStartMulticore(
cpus,
logging,
storagedir,
level,
load.balancing = FALSE,
show.info,
reproducible,
...
)
parallelStartSocket(
cpus,
socket.hosts,
logging,
storagedir,
level,
load.balancing = FALSE,
show.info,
reproducible,
...
)
parallelStartMPI(
cpus,
logging,
storagedir,
level,
load.balancing = FALSE,
show.info,
reproducible,
...
)
parallelStartBatchJobs(
bj.resources = list(),
logging,
storagedir,
level,
show.info,
...
)
parallelStartBatchtools(
bt.resources = list(),
logging,
storagedir,
level,
show.info,
...
)
Arguments
mode |
( |
cpus |
( |
socket.hosts |
character |
bj.resources |
list |
bt.resources |
list |
logging |
( |
storagedir |
( |
level |
( |
load.balancing |
( |
show.info |
( |
suppress.local.errors |
( |
reproducible |
( |
... |
(any) |
Details
Currently the following modes are supported, which internally dispatch the mapping operation to functions from different parallelization packages:
- local: No parallelization with mapply().
- multicore: Multicore execution on a single machine with parallel::mclapply().
- socket: Socket cluster on one or multiple machines with parallel::makePSOCKcluster() and parallel::clusterMap().
- mpi: Snow MPI cluster on one or multiple machines with parallel::makeCluster() and parallel::clusterMap().
- BatchJobs: Parallelization on batch queuing HPC clusters, e.g., Torque, SLURM, etc., with BatchJobs::batchMap().
For BatchJobs mode you need to define a storage directory through the argument storagedir or the option parallelMap.default.storagedir.
Value
Nothing.
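As a rough illustration of switching from local to socket mode, the sketch below starts two workers on the local machine. It assumes that the parallelMap package is installed and that the machine is allowed to spawn two R worker processes; the worker count is arbitrary.

```r
library(parallelMap)

# Start a socket cluster with two workers on this machine
parallelStartSocket(cpus = 2, show.info = FALSE)

# tryCatch(..., finally = ) makes sure the cluster is stopped even on error
res <- tryCatch(
  parallelMap(function(x) x^2, 1:4),
  finally = parallelStop()
)

print(unlist(res))
```

The same mapping code runs unchanged in local mode; only the start call differs.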
Stops parallelization.
Description
Sets mode to “local”, i.e., parallelization is turned off and all resources are cleaned up.
For socket and mpi mode parallel::stopCluster() is called.
For BatchJobs mode the subdirectory of the storagedir containing the exported objects is removed.
After a subsequent call of parallelStart(), no exported objects are present on the slaves and no libraries are loaded, i.e., you have clean R sessions on the slaves.
Usage
parallelStop()
Value
Nothing.
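One way to make sure that parallelStop() is always reached is to pair it with on.exit() inside a wrapper function. The sketch below assumes the parallelMap package is installed; the wrapper name `run_squared` is hypothetical.

```r
library(parallelMap)

# Hypothetical wrapper that starts and reliably stops parallelization
run_squared <- function(xs) {
  parallelStartLocal(show.info = FALSE)
  # Runs when the function exits, whether normally or through an error
  on.exit(parallelStop(), add = TRUE)
  parallelMap(function(x) x^2, xs, simplify = TRUE)
}

res <- run_squared(1:3)
print(res)
```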