All-in-one tuning class

tune_easy.all_in_one_tuning module

class tune_easy.all_in_one_tuning.AllInOneTuning

Bases: object

LEARNING_ALGOS = {'binary': ['svm', 'logistic', 'randomforest', 'lightgbm'], 'multiclass': ['svm', 'logistic', 'randomforest', 'lightgbm'], 'regression': ['linear_regression', 'elasticnet', 'svr', 'randomforest', 'lightgbm']}
N_ITER = {'binary': {'lightgbm': 200, 'logistic': 500, 'randomforest': 300, 'svm': 500, 'xgboost': 100}, 'multiclass': {'lightgbm': 200, 'logistic': 500, 'randomforest': 300, 'svm': 50, 'xgboost': 100}, 'regression': {'elasticnet': 500, 'lightgbm': 200, 'randomforest': 300, 'svr': 500, 'xgboost': 100}}
OTHER_SCORES = {'binary': ['accuracy', 'precision', 'recall', 'f1', 'logloss', 'auc'], 'multiclass': ['accuracy', 'precision_macro', 'recall_macro', 'f1_macro', 'logloss', 'auc_ovr'], 'regression': ['rmse', 'mae', 'mape', 'r2']}
SCORING = {'binary': 'logloss', 'multiclass': 'logloss', 'regression': 'rmse'}
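
The constants above supply the defaults whenever the corresponding arguments are left as None. A minimal sketch of inspecting them (assuming only the class documented on this page):

    from tune_easy.all_in_one_tuning import AllInOneTuning

    # Defaults applied when scoring, learning_algos, or n_iter is None
    print(AllInOneTuning.SCORING['regression'])        # 'rmse'
    print(AllInOneTuning.LEARNING_ALGOS['binary'])     # ['svm', 'logistic', 'randomforest', 'lightgbm']
    print(AllInOneTuning.N_ITER['multiclass']['svm'])  # 50
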
all_in_one_tuning(x, y, data=None, x_colnames=None, cv_group=None, objective=None, scoring=None, other_scores=None, learning_algos=None, n_iter=None, cv=5, tuning_algo='optuna', seed=42, estimators=None, tuning_params=None, mlflow_logging=False, mlflow_tracking_uri=None, mlflow_artifact_location=None, mlflow_experiment_name=None, tuning_kws=None)

Parameter tuning with multiple estimators, extremely easy to use.

Parameters
  • x (list[str] or numpy.ndarray) – Explanatory variables. Should be list[str] if data is pd.DataFrame. Should be numpy.ndarray if data is None.

  • y (str or numpy.ndarray) – Target variable. Should be str if data is pd.DataFrame. Should be numpy.ndarray if data is None.

  • data (pd.DataFrame, default=None) – Input data structure.

  • x_colnames (list[str], default=None) – Names of explanatory variables. Available only if data is NOT pd.DataFrame.

  • cv_group (str or numpy.ndarray, default=None) – Grouping variable that will be used for GroupKFold or LeaveOneGroupOut. Should be str if data is pd.DataFrame.

  • objective ({'classification', 'regression'}, default=None) – Specify the learning task. If None, the task is selected automatically based on the target variable.

  • scoring (str, default=None) –

    Score name used for parameter tuning.

    • In regression:
      • ’rmse’ : Root mean squared error

      • ’mse’ : Mean squared error

      • ’mae’ : Mean absolute error

      • ’rmsle’ : Root mean squared logarithmic error

      • ’mape’ : Mean absolute percentage error

      • ’r2’ : R2 Score

    • In binary classification:
      • ’logloss’ : Logarithmic Loss

      • ’accuracy’ : Accuracy

      • ’precision’ : Precision

      • ’recall’ : Recall

      • ’f1’ : F1 score

      • ’pr_auc’ : PR-AUC

      • ’auc’ : AUC

    • In multiclass classification:
      • ’logloss’ : Logarithmic Loss

      • ’accuracy’ : Accuracy

      • ’precision_macro’ : Precision macro

      • ’recall_macro’ : Recall macro

      • ’f1_micro’ : F1 micro

      • ’f1_macro’ : F1 macro

      • ’f1_weighted’ : F1 weighted

      • ’auc_ovr’ : One-vs-rest AUC

      • ’auc_ovo’ : One-vs-one AUC

      • ’auc_ovr_weighted’ : One-vs-rest AUC weighted

      • ’auc_ovo_weighted’ : One-vs-one AUC weighted

    If None, the SCORING constant is used.

    See https://c60evaporator.github.io/tune-easy/all_in_one_tuning.html#tune_easy.all_in_one_tuning.AllInOneTuning.SCORING

  • other_scores (list[str], default=None) –

    Score names calculated after tuning. Available score names are listed in the explanation of the scoring argument.

    If None, the OTHER_SCORES constant is used.

    See https://c60evaporator.github.io/tune-easy/all_in_one_tuning.html#tune_easy.all_in_one_tuning.AllInOneTuning.OTHER_SCORES

  • learning_algos (list[str], default=None) –

    Estimator algorithms. Select from the following algorithms and pass them as a list.

    • In regression:
      • ’linear_regression’ : LinearRegression

      • ’elasticnet’ : ElasticNet

      • ’svr’ : SVR

      • ’randomforest’ : RandomForestRegressor

      • ’lightgbm’ : LGBMRegressor

      • ’xgboost’ : XGBRegressor

    • In classification:
      • ’svm’ : SVC

      • ’logistic’ : LogisticRegression

      • ’randomforest’ : RandomForestClassifier

      • ’lightgbm’ : LGBMClassifier

      • ’xgboost’ : XGBClassifier

    If None, the LEARNING_ALGOS constant is used.

    See https://c60evaporator.github.io/tune-easy/all_in_one_tuning.html#tune_easy.all_in_one_tuning.AllInOneTuning.LEARNING_ALGOS

  • n_iter (dict[str, int], default=None) –

    Number of iterations for parameter tuning. Keys should be members of the learning_algos argument. Values should be iteration counts.

    If None, the N_ITER constant is used.

    See https://c60evaporator.github.io/tune-easy/all_in_one_tuning.html#tune_easy.all_in_one_tuning.AllInOneTuning.N_ITER

  • cv (int, cross-validation generator, or an iterable, default=5) – Determines the cross-validation splitting strategy. If None, the default 5-fold cross-validation is used. If int, specifies the number of folds in a KFold.

  • tuning_algo ({'grid', 'random', 'bo', 'optuna'}, default='optuna') – Tuning algorithm, implemented by the following libraries: ‘grid’: sklearn.model_selection.GridSearchCV, ‘random’: sklearn.model_selection.RandomizedSearchCV, ‘bo’: BayesianOptimization, ‘optuna’: Optuna.

  • seed (int, default=42) – Seed for the random number generators of cross-validation, estimators, and optuna.sampler.

  • estimators (dict[str, estimator object implementing 'fit'], default=None) –

    Classification or regression estimators used for tuning. Keys should be members of the learning_algos argument. Values are assumed to implement the scikit-learn estimator interface.

    If None, the default estimators of the tuning instances are used.

    See https://c60evaporator.github.io/tune-easy/each_estimators.html

  • tuning_params (dict[str, dict[str, {list, tuple}]], default=None) –

    Keys should be members of the learning_algos argument. Values should be dictionaries with parameter names as keys and lists of parameter settings or parameter ranges to try as values; see the sketch after this parameter list.

    If None, the default values of the tuning instances are used.

    See https://c60evaporator.github.io/tune-easy/each_estimators.html

  • mlflow_logging (bool, default=False) –

    Strategy to record the result by MLflow library.

    If True, nested runs are created. The parent run records a comparison of all estimators, such as the max score history. The child runs are created by each tuning instance with its mlflow_logging argument set to “outside”.

    If False, MLflow runs are not created.

  • mlflow_tracking_uri (str, default=None) –

    Tracking URI for MLflow. This argument is passed to mlflow.set_tracking_uri().

    See https://mlflow.org/docs/latest/python_api/mlflow.html#mlflow.set_tracking_uri

  • mlflow_artifact_location (str, default=None) –

    Artifact store for MLflow. This argument is passed to artifact_location in mlflow.create_experiment().

    See https://mlflow.org/docs/latest/tracking.html#artifact-stores

  • mlflow_experiment_name (str, default=None) –

    Experiment name for MLflow. This argument is passed to name in mlflow.create_experiment().

    See https://mlflow.org/docs/latest/python_api/mlflow.html#mlflow.create_experiment

  • tuning_kws (dict[str, dict], default=None) –

    Additional parameters passed to the tuning instances. Keys should be members of the learning_algos argument. Values should be dicts of parameters passed to the tuning instances, e.g. {'not_opt_params': {'kernel': 'rbf'}} (see the sketch after this parameter list).

    See API Reference of tuning instances.
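
Because n_iter, tuning_params, and tuning_kws are all dictionaries keyed by learning-algorithm name, their shapes are easy to get wrong. A minimal sketch of plausible values (the inner parameter names and ranges are illustrative, not the library's verified defaults):

    # Keys must match entries in learning_algos; inner contents are illustrative.
    n_iter = {'svm': 200, 'randomforest': 100}

    tuning_params = {
        # Parameter names as keys; candidate lists or (min, max) ranges as values
        'svm': {'C': (0.01, 100), 'gamma': (0.0001, 10)},
        'randomforest': {'n_estimators': [50, 100, 200], 'max_depth': (2, 16)},
    }

    tuning_kws = {
        # Extra keyword arguments forwarded to each tuning instance
        'svm': {'not_opt_params': {'kernel': 'rbf'}},
    }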

Returns

df_result – Validation scores of the model before and after tuning.

Return type

pd.DataFrame
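
A minimal end-to-end sketch of the pd.DataFrame interface (the dataset and column names come from scikit-learn's iris data; instantiating AllInOneTuning with no constructor arguments is assumed):

    from sklearn.datasets import load_iris
    from tune_easy.all_in_one_tuning import AllInOneTuning

    # One DataFrame holding the feature columns plus a 'target' column
    df = load_iris(as_frame=True).frame

    tuner = AllInOneTuning()
    df_result = tuner.all_in_one_tuning(
        x=['petal length (cm)', 'petal width (cm)'],  # list[str] because data is a DataFrame
        y='target',
        data=df,
        learning_algos=['svm', 'logistic'],  # restrict the comparison to two algorithms
        n_iter={'svm': 50, 'logistic': 50},  # fewer iterations for a quick run
        cv=3,
        tuning_algo='optuna',
        seed=42,
    )
    print(df_result)  # validation scores before and after tuning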

print_estimator(learner_name, printed_name, mlflow_logging=False)

Print estimator after tuning

Parameters

learner_name ({'linear_regression', 'elasticnet', 'svr', 'randomforest', 'lightgbm', 'xgboost', 'svm', 'logistic'}) – Name of the learning algorithm whose tuned estimator is printed.
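
A brief sketch of calling it after the tuning run shown above (the printed_name string is only a display label):

    # Print the tuned SVM estimator, labelled 'SVM after tuning' in the output
    tuner.print_estimator('svm', 'SVM after tuning')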