Automatically logging LightGBM experiments
Comet.ml can automatically log your model graph, metrics, and parameters from your LightGBM code. Nothing is required beyond adding these lines to your LightGBM script:
```python
from comet_ml import Experiment
import lightgbm as lgbm

experiment = Experiment()

# Your code here...
gbm = lgbm.train()
```
For more information on getting started, see details on the Comet config file.
LightGBM Auto-logging Controls
The Comet LightGBM auto-logger can automatically log:
- model/graph description
- steps
- metrics (such as loss and accuracy)
- hyperparameters
- command-line arguments
You can turn these off by passing arguments to the Comet Experiment class:
```python
experiment = Experiment(
    auto_metric_logging=False,
    auto_param_logging=False,
    log_graph=False,
    parse_args=False,
)
```
Each of these can be controlled through an experiment parameter, environment variable, or a configuration setting:
| Item | Experiment Parameter | Environment Setting | Configuration Setting |
|---|---|---|---|
| model/graph description | `log_graph` | `COMET_AUTO_LOG_GRAPH` | `comet.auto_log.graph` |
| metrics | `auto_metric_logging` | `COMET_AUTO_LOG_METRICS` | `comet.auto_log.metrics` |
| hyperparameters | `auto_param_logging` | `COMET_AUTO_LOG_PARAMETERS` | `comet.auto_log.parameters` |
| command-line arguments | `parse_args` | `COMET_AUTO_LOG_CLI_ARGUMENTS` | `comet.auto_log.cli_arguments` |
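For example, the same controls can be set once in your `.comet.config` file rather than in code. A sketch, assuming the INI layout Comet uses for its `comet.auto_log.*` settings (section and key names here mirror the configuration settings in the table above; check the config reference for your Comet version):

```ini
# .comet.config -- disable metric and parameter auto-logging
# (section/key names assumed from the comet.auto_log.* settings above)
[comet_auto_log]
metrics = False
parameters = False
```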
For a complete list of items logged, please see Experiment Configuration Parameters.
How to report manually
You can log additional parameters beyond what Comet.ml automatically collects using `Experiment.log_parameter()`.
```python
from comet_ml import Experiment
import lightgbm as lgbm

# Create an experiment
experiment = Experiment(
    project_name="mnist",
)

batch_size = 128
experiment.log_parameter("batch_size", batch_size)
```
You can log an entirely customized list of parameters to your experiment by using `Experiment.log_parameters()`.
```python
from comet_ml import Experiment
import lightgbm as lgbm

experiment = Experiment(
    project_name="my project name",
    auto_param_logging=False,
)

batch_size = 128
num_classes = 10
epochs = 20

params = {
    "batch_size": batch_size,
    "epochs": epochs,
    "num_classes": num_classes,
}
experiment.log_parameters(params)
```
Context Manager (Train/Test/Validate)
You can also log specific metrics to training and test contexts with the context managers `Experiment.train()`, `Experiment.validate()`, and `Experiment.test()`:
```python
from comet_ml import Experiment
import lightgbm as lgbm
from sklearn.metrics import mean_squared_error

experiment = Experiment(
    project_name="my project name",
    auto_param_logging=True,
)

batch_size = 128
num_classes = 10
epochs = 20

params = {
    "batch_size": batch_size,
    "epochs": epochs,
    "num_classes": num_classes,
}
experiment.log_parameters(params)

# Define your dataset here:
X_test = ...
y_test = ...

# Train:
with experiment.train():
    gbm = lgbm.train()

# Test:
with experiment.test():
    y_pred = gbm.predict(X_test, num_iteration=gbm.best_iteration)
    experiment.log_metric("rmse", mean_squared_error(y_test, y_pred) ** 0.5)
```
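Metrics logged inside these contexts are namespaced by context (for example, `rmse` logged in the test context shows up as `test_rmse` in the Comet UI). A runnable sketch of that namespacing idea, using a plain Python context manager rather than Comet's implementation (the `MetricLog` class is purely illustrative):

```python
from contextlib import contextmanager

class MetricLog:
    """Illustration only: mimics context-based metric prefixing, not Comet's API."""

    def __init__(self):
        self.metrics = {}
        self._prefix = ""

    @contextmanager
    def context(self, name):
        # Temporarily prefix every metric logged inside the `with` block.
        previous, self._prefix = self._prefix, name + "_"
        try:
            yield self
        finally:
            self._prefix = previous

    def log_metric(self, name, value):
        self.metrics[self._prefix + name] = value

log = MetricLog()
with log.context("train"):
    log.log_metric("rmse", 0.45)
with log.context("test"):
    log.log_metric("rmse", 0.52)

print(log.metrics)  # {'train_rmse': 0.45, 'test_rmse': 0.52}
```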
End-to-end example
Here is a simple end-to-end LightGBM example.
```python
# Get the data for this script:
# wget https://raw.githubusercontent.com/microsoft/LightGBM/master/examples/regression/regression.train -qq
# wget https://raw.githubusercontent.com/microsoft/LightGBM/master/examples/regression/regression.test -qq

import os

import comet_ml
import lightgbm as lgb
import pandas as pd
from sklearn.metrics import mean_squared_error

experiment = comet_ml.Experiment()

dirname = os.path.dirname(__file__)
df_train = pd.read_csv(os.path.join(dirname, "regression.train"), header=None, sep="\t")
df_test = pd.read_csv(os.path.join(dirname, "regression.test"), header=None, sep="\t")

# The first column holds the target; the rest are features.
y_train = df_train[0]
y_test = df_test[0]
X_train = df_train.drop(0, axis=1)
X_test = df_test.drop(0, axis=1)

lgb_train = lgb.Dataset(X_train, y_train)
lgb_eval = lgb.Dataset(X_test, y_test, reference=lgb_train)

params = {
    "boosting_type": "gbdt",
    "objective": "regression",
    "metric": {"rmse", "l2", "l1", "huber"},
    "num_leaves": 31,
    "learning_rate": 0.05,
    "feature_fraction": 0.9,
    "bagging_fraction": 0.8,
    "bagging_freq": 5,
    "verbosity": -1,
}

gbm = lgb.train(
    params,
    lgb_train,
    num_boost_round=20,
    valid_sets=lgb_eval,
    valid_names=["validation"],
    early_stopping_rounds=5,
)

y_pred = gbm.predict(X_test, num_iteration=gbm.best_iteration)
print("The rmse of prediction is:", mean_squared_error(y_test, y_pred) ** 0.5)
```
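The script reports RMSE by taking the square root of scikit-learn's mean squared error. A quick sanity check of that arithmetic against the definition computed directly in NumPy (synthetic numbers, not the LightGBM predictions above):

```python
import numpy as np
from sklearn.metrics import mean_squared_error

y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])

# Square root of sklearn's MSE, as in the script above...
rmse_sklearn = mean_squared_error(y_true, y_pred) ** 0.5

# ...matches RMSE computed directly from its definition:
rmse_numpy = np.sqrt(np.mean((y_true - y_pred) ** 2))

print(round(rmse_sklearn, 4))  # 0.6124
```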