
LightGBM Key Hyperparameters

Last Updated : 06 Jun, 2025

LightGBM is a popular machine learning algorithm used for solving classification and regression problems. It is known for its speed and accuracy, and it works especially well on large datasets. However, to get the best results from LightGBM, its hyperparameters need to be tuned well. Hyperparameters are settings you give to the model before training; they decide how the model learns and how complex it becomes. Unlike model parameters such as weights, they are not learned from the data: you choose them manually or use automated tools to tune them.

Important Hyperparameters in LightGBM

Below are some important hyperparameters in LightGBM:

1. num_leaves (Number of Leaves)

  • What it does: This sets the maximum number of leaf nodes in each decision tree.
  • Why it matters: More leaves make the tree more complex and better at learning patterns but too many can lead to overfitting.
  • Tip: Start with 31 (the default) and increase slowly. A good rule: num_leaves <= 2^(max_depth). Try 31, 63, 127, etc.

2. max_depth (Maximum Tree Depth)

  • What it does: Limits how deep the tree can grow.
  • Why it matters: Deeper trees can learn more complex patterns but are also more likely to overfit.
  • Tip: Use this to control overfitting. Common values are between 3 and 15; setting it to -1 means no depth limit (see the sketch below).
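
For illustration, here is a minimal sketch (using the scikit-learn wrapper; the values are placeholders rather than tuned recommendations) of how these two settings are usually chosen together:

```python
import lightgbm as lgb

# Keep num_leaves at or below 2**max_depth so the depth limit actually binds.
# max_depth=6 allows at most 2**6 = 64 leaves, so 63 stays just under that cap.
model = lgb.LGBMClassifier(max_depth=6, num_leaves=63)
print(model.get_params()["max_depth"], model.get_params()["num_leaves"])
```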

3. learning_rate (Step Size)

  • What it does: Controls how much the model learns in each round.
  • Why it matters: A smaller value means the model learns slowly but can give better results. A larger value trains faster but can be less accurate.
  • Tip: Use a small value like 0.01 or 0.05 and increase the number of trees (num_iterations) to balance it.

4. n_estimators / num_iterations (Number of Trees)

  • What it does: This sets how many decision trees the model will build.
  • Why it matters: More trees usually mean better accuracy, but too many can make training slow and may overfit.
  • Tip: If your learning_rate is small, use more trees. Common values are between 100 and 1000 (see the sketch below).
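
A minimal sketch of this trade-off (scikit-learn wrapper; the exact numbers are only illustrative): a smaller learning_rate generally needs more trees to reach a comparable fit.

```python
import lightgbm as lgb

# Fewer, larger steps: trains quickly but each tree has a big influence.
fast_model = lgb.LGBMRegressor(learning_rate=0.1, n_estimators=100)

# Many small steps: slower to train but often generalises better.
careful_model = lgb.LGBMRegressor(learning_rate=0.01, n_estimators=1000)
```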

5. min_data_in_leaf (Minimum Data in One Leaf)

  • What it does: Sets the smallest number of data points that a leaf can have.
  • Why it matters: If this number is too small, the model may create too many small leaves and overfit.
  • Tip: For large datasets use higher values such as 100 or more; for small datasets start with 20 or 30.
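
In the native (train/Dataset) API the parameter keeps the name min_data_in_leaf, while the scikit-learn wrapper exposes it as min_child_samples; a small sketch of both spellings (the values are placeholders):

```python
import lightgbm as lgb

# Native-API parameter dict.
params = {
    "objective": "binary",
    "min_data_in_leaf": 100,  # larger values force bigger leaves and reduce overfitting
}

# Equivalent setting through the scikit-learn wrapper.
model = lgb.LGBMClassifier(min_child_samples=100)
```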

6. feature_fraction (Column Sampling)

  • What it does: Only uses a portion of features (columns) to build each tree.
  • Why it matters: Helps the model avoid overfitting and trains faster.
  • Tip: Set it between 0.6 and 0.9. For example 0.8 means use 80% of features randomly for each tree.
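
A short sketch (the value is a placeholder): feature_fraction in the native API corresponds to colsample_bytree in the scikit-learn wrapper.

```python
import lightgbm as lgb

# Native API: each tree is built from a random 80% of the columns.
params = {"objective": "binary", "feature_fraction": 0.8}

# scikit-learn wrapper equivalent.
model = lgb.LGBMClassifier(colsample_bytree=0.8)
```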

7. bagging_fraction and bagging_freq (Row Sampling)

  • What it does: bagging_fraction uses only a portion of the data (rows) to build each tree, while bagging_freq sets how often to re-sample the rows (every n rounds).
  • Why it matters: Helps with randomness, reduces overfitting and speeds up training.
  • Tip: Try bagging_fraction = 0.8 and bagging_freq = 5.
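
A small sketch of the two settings together; note that bagging_fraction only takes effect when bagging_freq is greater than 0 (the scikit-learn wrapper calls these subsample and subsample_freq):

```python
import lightgbm as lgb

params = {
    "objective": "binary",
    "bagging_fraction": 0.8,  # each bagging round uses a random 80% of the rows
    "bagging_freq": 5,        # re-sample the rows every 5 boosting iterations
}

# scikit-learn wrapper equivalent.
model = lgb.LGBMClassifier(subsample=0.8, subsample_freq=5)
```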

8. lambda_l1 and lambda_l2 (Regularization)

  • What they do: lambda_l1 adds L1 regularization to the leaf weights, while lambda_l2 adds L2 regularization to the leaf weights.
  • Why they matter: Help prevent overfitting by keeping the model simpler.
  • Tip: Common values are 0, 0.1, 1 or 10. Try tuning them if your model overfits.
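
A minimal sketch (the values here are just starting points to tune): the scikit-learn wrapper exposes these as reg_alpha and reg_lambda.

```python
import lightgbm as lgb

params = {
    "objective": "binary",
    "lambda_l1": 0.1,  # L1 penalty on leaf weights (reg_alpha in the sklearn wrapper)
    "lambda_l2": 1.0,  # L2 penalty on leaf weights (reg_lambda in the sklearn wrapper)
}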

9. objective (Learning Task)

  • What it does: Tells LightGBM what kind of problem you are solving.
  • Common values: "binary" for binary classification, "multiclass" for multiclass classification, "regression" for regression problems
  • Tip: Set this correctly based on your task or the model will not work well.

10. metric (Evaluation Metric)

  • What it does: This decides how the model’s performance is measured during training.
  • Common metrics: "binary_logloss" for binary classification, "multi_logloss" for multiclass, "rmse" for regression, "auc" for classification (area under ROC curve)
  • Tip: You can track more than one metric by passing a list, such as ["auc", "binary_logloss"].
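
A small sketch showing objective and metric set together for a binary classification task; passing a list tracks several metrics at once during training.

```python
import lightgbm as lgb

params = {
    "objective": "binary",                # binary classification task
    "metric": ["binary_logloss", "auc"],  # track both log loss and AUC during training
}
```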

11. early_stopping_rounds

  • What it does: Stops training if the model does not improve for a certain number of rounds.
  • Why it matters: Saves time and avoids overfitting.
  • Tip: Use values like 50 or 100. You also need to provide a validation set for this to work (see the sketch below).
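
A runnable sketch on synthetic data (in recent LightGBM releases early stopping is supplied as a callback; older versions accepted an early_stopping_rounds argument to lgb.train):

```python
import lightgbm as lgb
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic data just for illustration.
X, y = make_classification(n_samples=5000, n_features=20, random_state=42)
X_train, X_valid, y_train, y_valid = train_test_split(X, y, test_size=0.2, random_state=42)

train_set = lgb.Dataset(X_train, label=y_train)
valid_set = lgb.Dataset(X_valid, label=y_valid, reference=train_set)

params = {"objective": "binary", "metric": "binary_logloss", "learning_rate": 0.05}

booster = lgb.train(
    params,
    train_set,
    num_boost_round=1000,                                # upper bound on the number of trees
    valid_sets=[valid_set],
    callbacks=[lgb.early_stopping(stopping_rounds=50)],  # stop after 50 rounds without improvement
)
print("Best iteration:", booster.best_iteration)
```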

12. boosting_type

  • What it does: Sets the method used for boosting.
  • Options: "gbdt": traditional gradient boosting, "dart": randomly drops trees during boosting (similar to dropout in neural networks), "goss": Gradient-based One-Side Sampling, which keeps samples with large gradients and randomly subsamples the rest
  • Tip: Use gbdt for most problems. Try dart if your model overfits.
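
A sketch of the three options through the scikit-learn wrapper (note that in newer LightGBM releases GOSS is configured via data_sample_strategy="goss" rather than boosting_type):

```python
import lightgbm as lgb

gbdt_model = lgb.LGBMClassifier(boosting_type="gbdt")  # traditional gradient boosting (default)
dart_model = lgb.LGBMClassifier(boosting_type="dart")  # randomly drops trees, like dropout
goss_model = lgb.LGBMClassifier(boosting_type="goss")  # gradient-based one-side sampling
```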

13. verbosity

  • What it does: Controls how much information is printed during training.
  • Tip: Set it to -1 to turn off messages or 1 to see progress.
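
A short sketch; with the native API, the log_evaluation callback additionally controls how often validation scores are printed.

```python
import lightgbm as lgb

params = {"objective": "binary", "verbosity": -1}  # -1 silences LightGBM's own log output

# With lgb.train, the log_evaluation callback prints validation scores every N rounds.
callbacks = [lgb.log_evaluation(period=100)]
```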

Tips for Tuning Hyperparameters

  1. Start simple: Use default values and only tune a few key parameters first (num_leaves, learning_rate, n_estimators).
  2. Use a validation set: Always test your model on data it hasn’t seen during training.
  3. Use grid search or random search: These are methods that try many combinations of hyperparameters (a random-search sketch follows this list).
  4. Use early stopping: This saves time and improves results.
  5. Try automated tools: Libraries like Optuna, Hyperopt or Scikit-Optimize can help tune parameters automatically.
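
As a concrete illustration of random search, here is a minimal runnable sketch on synthetic data; the parameter ranges are only examples, not tuned recommendations.

```python
import lightgbm as lgb
from scipy.stats import randint, uniform
from sklearn.datasets import make_classification
from sklearn.model_selection import RandomizedSearchCV

# Synthetic data just for illustration.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

param_distributions = {
    "num_leaves": randint(15, 128),
    "learning_rate": uniform(0.01, 0.19),   # samples from [0.01, 0.20)
    "n_estimators": randint(100, 1000),
    "min_child_samples": randint(10, 100),  # sklearn alias of min_data_in_leaf
}

search = RandomizedSearchCV(
    estimator=lgb.LGBMClassifier(),
    param_distributions=param_distributions,
    n_iter=20,
    cv=3,
    scoring="roc_auc",
    random_state=0,
)
search.fit(X, y)
print(search.best_params_)
print("Best CV AUC:", search.best_score_)
```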
