Optimizer
The Comet Optimizer is a powerful, intuitive tool in your automated hyperparameter tuning toolbox. Use Optimizer to dynamically find the set of hyperparameter values that minimizes or maximizes a particular metric. It can suggest hyperparameter values to try next, either serially, in parallel, or in a combination of the two.
In its simplest form, you can use the hyperparameter search this way:
```python
# file: example-1.py
from comet_ml import Optimizer

# You need to specify the algorithm and hyperparameters to use:
config = {
    # Pick the Bayes algorithm:
    "algorithm": "bayes",

    # Declare your hyperparameters:
    "parameters": {
        "x": {"type": "integer", "min": 1, "max": 5},
    },

    # Declare what to optimize, and how:
    "spec": {
        "metric": "loss",
        "objective": "minimize",
    },
}

# Next, create an optimizer, passing in the configuration:
opt = Optimizer(config)

# define fit function here!

# Finally, get experiments, and train your models:
for experiment in opt.get_experiments(project_name="optimizer-search-01"):
    # Test the model
    loss = fit(experiment.get_parameter("x"))
    experiment.log_metric("loss", loss)
```
That's it! Comet will provide you with an Experiment object already set up with the suggested parameters to try. You merely need to train the model and log the metric to optimize ("loss" in this case).
See the Optimizer class for more details on creating an optimizer.
Optimizer configuration¶
Optimizer Configuration is performed through a dictionary, either specified in code, or in a config file. The dictionary format is a JSON structure similar to the following:
```python
{
    "algorithm": "bayes",
    "spec": {
        "maxCombo": 0,
        "objective": "minimize",
        "metric": "loss",
        "minSampleSize": 100,
        "retryLimit": 20,
        "retryAssignLimit": 0,
    },
    "parameters": {
        "hidden-layer-size": {"type": "integer", "min": 5, "max": 100},
        "hidden2-layer-size": {"type": "discrete", "values": [16, 32, 64]},
    },
    "name": "My Bayesian Search",
    "trials": 1,
}
```
As shown, the Optimizer configuration dictionary has five sections:
Section | Description |
---|---|
algorithm | String, indicating the search algorithm to use |
spec | Dictionary, defining the algorithm-specific specifications |
parameters | Dictionary, defining the parameter distribution space |
name | (Optional) String, specifying a personalizable name to associate with this search instance |
trials | (Optional) Integer, specifying the number of trials to run per experiment. Defaults to 1. |
Details of the mandatory sections (algorithm, spec, and parameters) follow.
algorithm¶
Algorithm | Description |
---|---|
random | For the Random sampling algorithm. |
grid | For the Grid search algorithm. Grid is a sweep algorithm based on picking parameter values from discrete, possibly sampled, regions. |
bayes | For the Bayesian search algorithm, based on distributions, balancing exploitation and exploration. |
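For example, a grid search is selected simply by changing the algorithm key in the configuration dictionary. This sketch uses illustrative parameter names; only the keys shown in the tables above are part of the documented format:

```python
# A minimal grid-search configuration (parameter names are illustrative).
grid_config = {
    # Use the grid sweep algorithm instead of bayes:
    "algorithm": "grid",
    "parameters": {
        # Integer ranges are divided into gridSize bins (default 10):
        "hidden-layer-size": {"type": "integer", "min": 5, "max": 100},
        "batch_size": {"type": "discrete", "values": [16, 32, 64]},
    },
    "spec": {
        "metric": "loss",
        # 0 means use 10 times the number of hyperparameters (the default):
        "maxCombo": 0,
    },
}
```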
spec¶
This table describes algorithm-specific specifications. Relevant options are indicated for the different algorithms.
Option | Description | Relevant algorithm |
---|---|---|
maxCombo | Integer. The limit of parameter combinations to try (default 0, meaning use 10 times the number of hyperparameters). | random, grid, bayes |
metric | String. The metric name that you are logging and want to minimize or maximize (default loss). | random, grid, bayes |
gridSize | Integer. When creating a grid, the number of bins per parameter (default 10). | random, grid |
minSampleSize | Integer. The number of samples to help find appropriate grid ranges (default 100). | random, grid, bayes |
retryLimit | Integer. The limit on attempts to create a unique parameter set before giving up (default 20). | random, grid, bayes |
retryAssignLimit | Integer. The limit on re-assigning non-completed experiments (default 0). | random, grid, bayes |
objective | String. Specify minimize or maximize for the objective metric (default minimize). | bayes |
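Putting a few of these options together, a spec for a capped Bayesian search might look like the following sketch (the metric name and values are illustrative):

```python
# An illustrative spec section for a capped Bayesian search.
spec = {
    "metric": "val_accuracy",  # the metric each experiment logs
    "objective": "maximize",   # maximize accuracy rather than minimize loss
    "maxCombo": 30,            # stop after 30 parameter combinations
    "retryLimit": 20,          # attempts to generate a unique parameter set
}
```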
parameters¶
The parameters section of the Optimizer configuration is a dictionary containing all the hyperparameters to be optimized.
The format of each parameter was inspired by Google's Vizier, and exemplified by the open source version called Advisor.
The following example shows the configuration of the hyperparameters hidden-layer-size, momentum, and batch_size:

```python
"parameters": {
    "hidden-layer-size": {"type": "integer", "scaling_type": "uniform", "min": 5, "max": 100},
    "momentum": {"type": "float", "scaling_type": "normal", "mu": 10, "sigma": 5},
    "batch_size": {"type": "discrete", "values": [16, 32, 64]},
}
```
For each hyperparameter, you can set the following fields:

- `type` (mandatory). Specify one of the following options: `integer`, `float`, `double`, `categorical`, or `discrete`.
  - `discrete`: the values must be numbers.
  - `categorical`: the values must be strings.
- `scaling_type` (optional, and not available when type is `categorical` or `discrete`). Specify one of the following options: `linear` (default), `uniform`, `normal`, `loguniform`, or `lognormal`.
- Depending on the `type` and `scaling_type` used, you must also specify one or more of the following:
  - `values`: only when the type is `categorical` or `discrete`.
  - `min` and `max`: only when the scaling is one of `linear`, `uniform`, `loguniform`, or `lognormal`.
  - `mu` and `sigma`: only when the scaling is one of `normal` or `lognormal`.
  - `gridSize`: only when the algorithm is `grid`. Each parameter is treated as a distribution by the algorithms that sample randomly, namely `bayes` and `random`. The `grid` algorithm, however, needs a resolution for dividing the parameter space into discrete bins, so for it an additional `gridSize` entry can be set for each parameter.
Note
The "integer" type with "linear" scaling_type, when using the bayes algorithm, indicates an independent distribution. This is useful for integer values that have no relationship with one another, such as seed values. If your distribution is meaningful (for example, 2 is closer to 1 than it is to 6), then you should use the "uniform" scaling_type.
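As a sketch of that distinction (the parameter names here are illustrative):

```python
# Illustrative parameters section for a bayes search:
parameters = {
    # Seeds have no ordering, so treat each integer value independently:
    "seed": {"type": "integer", "scaling_type": "linear", "min": 1, "max": 100},
    # Layer counts are ordered (2 is closer to 1 than to 6), so sample uniformly:
    "num-layers": {"type": "integer", "scaling_type": "uniform", "min": 1, "max": 6},
}
```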
Running the Comet Optimizer in parallel¶
When tuning hyperparameters, it is common to want to speed up the search by parallelizing it. This is easily achieved with the Comet Optimizer by running the search script using comet optimize.
The hyperparameter search can be run in parallel by passing the -j parameter to the command-line function comet optimize. For example, to train two models in parallel:

```shell
comet optimize -j 2 training_script.py optimizer.config
```
Parallel execution example¶
In order to run the example defined above in parallel, we make two small changes:

- Move the Optimizer config to a separate file called optimizer.config
- Update the training script to read the optimizer config via sys.argv

The code becomes:

```shell
comet optimize -j 2 training_script.py optimizer.config
```

```python
# file: training_script.py
from comet_ml import Optimizer
import sys

# Next, create an optimizer, passing in the config:
# (You can leave out API_KEY if you already set it)
opt = Optimizer(sys.argv[1])

# define fit function here!

# Finally, get experiments, and train your models:
for experiment in opt.get_experiments(project_name="optimizer-search-02"):
    # Test the model
    loss = fit(experiment.get_parameter("x"))
    experiment.log_metric("loss", loss)
```

```python
# file: optimizer.config
{
    # We pick the Bayes algorithm:
    "algorithm": "bayes",

    # Declare your hyperparameters in the Vizier-inspired format:
    "parameters": {
        "x": {"type": "integer", "min": 1, "max": 5},
    },

    # Declare what we will be optimizing, and how:
    "spec": {
        "metric": "loss",
        "objective": "minimize",
    },
}
```
End-to-end example¶
This Colab Notebook is an end-to-end program using Keras with the Comet Optimizer.
Comet optimize¶
comet is a command-line utility that is installed with comet_ml. optimize is one of the commands that comet can use. The format is:

```shell
$ comet optimize [options] [PYTHON_SCRIPT] OPTIMIZER
```

For more information on comet optimize, see Comet Command-Line Utilities.