Public API
Database Management
Experimenter.open_db — Function
open_db(database_name, [experiment_folder, create_folder]; in_memory=false)
Opens a database and prepares it with the Experimenter.jl schema, which has tables for Experiment, Trial and Snapshot. If the database already exists, it is opened without overwriting the existing data.
Setting in_memory to true ignores the other arguments and creates the database in memory, so it will not persist.
Experimenter.export_db — Function
export_db(db::ExperimentDatabase, outfile::AbstractString, experiment_names...)
Opens a new database at outfile and copies into it the experiments from db whose names are listed in experiment_names.
Experimenter.restore_from_db — Function
restore_from_db(db::ExperimentDatabase, experiment::Experiment)
Searches the db for the supplied experiment, matching on the configuration and the name and disregarding the unique ID.
If the experiment already exists in the db, returns that experiment with the db's UUID for it; otherwise returns the input experiment.
Throws an error if the experiment exists but does not match the input experiment's configuration.
Experimenter.merge_databases! — Function
merge_databases!(primary_db, secondary_db)
Reads all of the records from the secondary database and adds them to the primary database.
Experiments
Experimenter.Experiment — Type
Experiment
A database object for storing the configuration options of an experiment.
The signature of the function supplied should be:
fn(configuration::Dict{Symbol, Any}, trial_id::UUID)
The function should be available when including the file provided.
A name is required to uniquely label this experiment.
Experimenter.get_progress — Function
get_progress(db::ExperimentDatabase, name)
Returns a table of the trials of an experiment, identified by the name parameter. Returns details of the progress and configuration, but not the results.
Experimenter.get_experiment — Function
get_experiment(db::ExperimentDatabase, experiment_id)
Searches the db for the given experiment_id, which can be given as a string or UUID.
Experimenter.get_experiments — Function
get_experiments(db::ExperimentDatabase)
Returns a vector of all experiments in the database.
Experimenter.get_experiment_by_name — Function
get_experiment_by_name(db::ExperimentDatabase, name)
Searches the db for an experiment whose name is set to name and returns that experiment.
Experimenter.get_ratio_completed_trials_by_name — Function
get_ratio_completed_trials_by_name(db::ExperimentDatabase, name)
Calculates the ratio of completed trials for the experiment with the given name, without fetching the results.
Data Storage
Experimenter.get_global_store — Function
get_global_store()
Tries to get the global store, which is initialised by the function whose name is given by init_store_function_name in the running experiment. This store is local to each worker.
Setup
To create the store, add a function to your include file that returns a dictionary of type Dict{Symbol, Any} and has a signature similar to:
function create_global_store(config)
    # config is the global configuration given to the experiment
    data = Dict{Symbol, Any}(
        :dataset => rand(1000),
        :flag => false,
        # etc...
    )
    return data
end
Inside your main experiment execution function, you can get this store via get_global_store, which is exported by Experimenter.
function myrunner(config, trial_id)
    store = get_global_store()
    dataset = store[:dataset] # Retrieve a value from the store by its key
    # process data
    results = Dict{Symbol, Any}(:n => length(dataset)) # placeholder result
    return results
end
Experimenter.get_results_from_trial_global_database — Function
get_results_from_trial_global_database(trial_id::UUID)
Gets the results of a specific trial from the global database. Redirects to the master node if on a worker node. Locks to secure access.
Trials
Experimenter.get_trial — Function
get_trial(db::ExperimentDatabase, trial_id)
Gets the trial with the matching trial_id (string or UUID) from the database.
Experimenter.get_trials — Function
get_trials(db::ExperimentDatabase, experiment_id)
Gets all trials from the database under the supplied experiment_id.
Experimenter.get_trials_by_name — Function
get_trials_by_name(db::ExperimentDatabase, name)
Gets all trials from the database for the experiment with the given name.
Experimenter.get_trials_ids_by_name — Function
get_trials_ids_by_name(db::ExperimentDatabase, name)
Gets just the trial IDs from the database for the experiment with the given name.
Execution
Experimenter.@execute — Macro
@execute experiment database [mode=SerialMode use_progress=false directory=pwd()]
Runs the experiment out of global scope, saving results in the database and skipping all already executed trials.
Args:
mode: Specifies SerialMode, MultithreadedMode or DistributedMode to execute serially or in parallel.
use_progress: Shows a progress bar.
directory: Directory to change the current process (or worker processes) to for execution.
Experimenter.ExecutionModes.SerialMode — Type
Executes the trials of the experiment one after the other, sequentially.
Experimenter.ExecutionModes.MultithreadedMode — Type
Executes the trials of the experiment in parallel using Threads.@threads.
Experimenter.ExecutionModes.DistributedMode — Type
Executes the trials of the experiment in parallel using Distributed.jl's pmap.
Experimenter.ExecutionModes.HeterogeneousMode — Type
Executes the trials of the experiment in parallel using a custom scheduler that uses all threads of each worker.
Experimenter.ExecutionModes.MPIMode — Type
Executes the trials of the experiment in parallel using MPI, which uses one MPI node for coordination and saving of jobs.
Cluster Management
Experimenter.Cluster.init — Function
init(; kwargs...)
Checks the environment variables to see if a script is running on a cluster and then launches the processes as determined by the environment variables.
Arguments
The keyword arguments are forwarded to the init function for each cluster management system. Check the ext folder for extensions to see which keywords are supported.
Snapshots
Experimenter.get_snapshots — Function
get_snapshots(db::ExperimentDatabase, trial_id)
Gets all the associated snapshots (as a vector) from the database for the trial with the matching trial_id.
Experimenter.latest_snapshot — Function
latest_snapshot(db::ExperimentDatabase, trial_id)
Gets the latest snapshot from the database for the trial with the matching trial_id, using the date of the most recent snapshot.
Known to have issues when snapshots are created within the same second.
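The same-second caveat can be illustrated without the library. The sketch below is not how Experimenter.jl stores or queries snapshots; it only assumes, for illustration, that each snapshot carries a timestamp truncated to whole seconds. The `snapshots` vector and its fields `label` and `created_at` are hypothetical.

```julia
using Dates

# Hypothetical snapshot records with second-resolution timestamps.
snapshots = [
    (label = "a", created_at = DateTime(2024, 1, 1, 12, 0, 0)),
    (label = "b", created_at = DateTime(2024, 1, 1, 12, 0, 5)),
    (label = "c", created_at = DateTime(2024, 1, 1, 12, 0, 5)),
]

# Selecting "the latest" by timestamp alone is ambiguous when two
# snapshots share the same second.
latest_time = maximum(s.created_at for s in snapshots)
candidates = filter(s -> s.created_at == latest_time, snapshots)
```

Here `candidates` holds both "b" and "c", so a query that orders only by date cannot tell which was saved last. Spacing snapshot saves more than a second apart avoids the tie.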
Experimenter.save_snapshot! — Function
save_snapshot!(db::ExperimentDatabase, trial_id::UUID, state::Dict{Symbol,Any}, [label])
Saves the snapshot with the given state in the database, associating it with the trial with the matching trial_id. Automatically saves the time of the snapshot.
Experimenter.get_latest_snapshot_from_global_database — Function
get_latest_snapshot_from_global_database(trial_id::UUID)
Same as latest_snapshot, but uses the global database. Redirects to the master worker if on a distributed node. Only works when using @execute.
Experimenter.save_snapshot_in_global_database — Function
save_snapshot_in_global_database(trial_id::UUID, state, [label])
Saves a snapshot for a specific trial in the global database, with the supplied state and optional label. Redirects to the master node if on a worker node. Locks to secure access.
Misc
Experimenter.LinearVariable — Type
LinearVariable(min, max, n)
Specifies a range for a parameter variable to take, from min to max inclusive with n total values.
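As a rough illustration of the spacing described above, the values can be reproduced with Base Julia's `range`. This is a sketch of the concept only, not the library's implementation; `linear_values` is a hypothetical helper.

```julia
# Hypothetical helper (not part of Experimenter.jl): n evenly spaced
# values from min to max, with both endpoints included.
linear_values(min, max, n) = collect(range(min, max; length = n))

vals = linear_values(0.0, 1.0, 5)
```

With these inputs `vals` contains five values running from 0.0 to 1.0 in steps of 0.25.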
Experimenter.LogLinearVariable — Type
LogLinearVariable(min, max, n)
A linearly spaced parameter variable in log space. If min=1 and max=100 and n=3 then the values are [1,10,100].
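The example values above can be checked with a short Base Julia sketch that spaces the exponents linearly and then exponentiates. This only illustrates the idea of log-space spacing and is not taken from the library's source; `log_linear_values` is a hypothetical helper, and base 10 is chosen for convenience.

```julia
# Hypothetical helper (not part of Experimenter.jl): n values whose
# base-10 logarithms are evenly spaced between log10(min) and log10(max).
log_linear_values(min, max, n) =
    10.0 .^ range(log10(min), log10(max); length = n)

vals = log_linear_values(1, 100, 3)
```

For min=1, max=100 and n=3 the exponents are 0, 1 and 2, which gives the values 1, 10 and 100 quoted in the description.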
Experimenter.RepeatVariable — Type
RepeatVariable(val, n)
Specifies a parameter variable that outputs the same value val n times.
Experimenter.IterableVariable — Type
IterableVariable(iter)
Wraps a given iterator iter to tell the experiment to perform a grid search over each element of the iterator for the given parameter.
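To make "grid search" concrete, the sketch below builds the Cartesian product of two parameter lists in Base Julia. It does not use Experimenter.jl, and the parameter names `learning_rates` and `batch_sizes` are made up for illustration; the library's own trial ordering may differ.

```julia
# Two hypothetical parameters, each given as an iterable of candidate values.
learning_rates = [0.1, 0.01]
batch_sizes = [16, 32, 64]

# A grid search visits every combination: 2 * 3 = 6 configurations.
grid = vec([(lr = lr, batch = b) for lr in learning_rates, b in batch_sizes])
```

Each element of `grid` is one configuration, so adding a value to either list adds a whole row or column of trials.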
Experimenter.MatchIterableVariable — Type
MatchIterableVariable(iter)
This type of variable matches with the product from the other AbstractVariables in the configuration.
This does not form part of the product variables (grid search), but instead uniquely matches with that product.
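One way to picture the difference from a grid variable is sketched below in Base Julia. It assumes, as a reading of the description and not a confirmed detail of the library, that the matched iterable supplies exactly one element per grid configuration. The names `grid`, `labels` and `matched` are hypothetical.

```julia
# Product (grid search) over two hypothetical parameters: 2 * 3 = 6 configs.
grid = vec([(a = a, b = b) for a in 1:2, b in 1:3])

# A matched iterable with one entry per grid configuration. It is paired
# element-by-element with the grid and does not multiply its size.
labels = ["run-$i" for i in 1:length(grid)]
matched = [merge(cfg, (label = l,)) for (cfg, l) in zip(grid, labels)]
```

The number of configurations stays at six after matching, whereas adding `labels` as another grid variable would have produced 36.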