All-in-one tuning class¶
tune_easy.all_in_one_tuning module¶
- class tune_easy.all_in_one_tuning.AllInOneTuning¶
Bases:
object
- LEARNING_ALGOS = {'binary': ['svm', 'logistic', 'randomforest', 'lightgbm'], 'multiclass': ['svm', 'logistic', 'randomforest', 'lightgbm'], 'regression': ['linear_regression', 'elasticnet', 'svr', 'randomforest', 'lightgbm']}¶
- N_ITER = {'binary': {'lightgbm': 200, 'logistic': 500, 'randomforest': 300, 'svm': 500, 'xgboost': 100}, 'multiclass': {'lightgbm': 200, 'logistic': 500, 'randomforest': 300, 'svm': 50, 'xgboost': 100}, 'regression': {'elasticnet': 500, 'lightgbm': 200, 'randomforest': 300, 'svr': 500, 'xgboost': 100}}¶
- OTHER_SCORES = {'binary': ['accuracy', 'precision', 'recall', 'f1', 'logloss', 'auc'], 'multiclass': ['accuracy', 'precision_macro', 'recall_macro', 'f1_macro', 'logloss', 'auc_ovr'], 'regression': ['rmse', 'mae', 'mape', 'r2']}¶
- SCORING = {'binary': 'logloss', 'multiclass': 'logloss', 'regression': 'rmse'}¶
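The constants above act as per-task fallbacks: when the corresponding argument is None, the value is looked up by task type. As a rough sketch of that fallback (the helper `resolve_defaults` is hypothetical, for illustration only; the dicts copy the SCORING and LEARNING_ALGOS constants above):

```python
# Sketch of how the class-level defaults map a task type to settings.
# These dicts copy the SCORING and LEARNING_ALGOS constants shown above.
SCORING = {'binary': 'logloss', 'multiclass': 'logloss', 'regression': 'rmse'}
LEARNING_ALGOS = {
    'binary': ['svm', 'logistic', 'randomforest', 'lightgbm'],
    'multiclass': ['svm', 'logistic', 'randomforest', 'lightgbm'],
    'regression': ['linear_regression', 'elasticnet', 'svr', 'randomforest', 'lightgbm'],
}

def resolve_defaults(objective, scoring=None, learning_algos=None):
    """Mimic the fallback rule: an explicit argument wins, else the class constant."""
    return (scoring or SCORING[objective],
            learning_algos or LEARNING_ALGOS[objective])

print(resolve_defaults('regression'))
# ('rmse', ['linear_regression', 'elasticnet', 'svr', 'randomforest', 'lightgbm'])
```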
- all_in_one_tuning(x, y, data=None, x_colnames=None, cv_group=None, objective=None, scoring=None, other_scores=None, learning_algos=None, n_iter=None, cv=5, tuning_algo='optuna', seed=42, estimators=None, tuning_params=None, mlflow_logging=False, mlflow_tracking_uri=None, mlflow_artifact_location=None, mlflow_experiment_name=None, tuning_kws=None)¶
Parameter tuning with multiple estimators, extremely easy to use.
- Parameters
x (list[str] or numpy.ndarray) – Explanatory variables. Should be list[str] if data is pd.DataFrame. Should be numpy.ndarray if data is None.
y (str or numpy.ndarray) – Target variable. Should be str if data is pd.DataFrame. Should be numpy.ndarray if data is None.
data (pd.DataFrame, default=None) – Input data structure.
x_colnames (list[str], default=None) – Names of explanatory variables. Available only if data is NOT pd.DataFrame.
cv_group (str or numpy.ndarray, default=None) – Grouping variable that will be used for GroupKFold or LeaveOneGroupOut. Should be str if data is pd.DataFrame.
objective ({'classification', 'regression'}, default=None) – Specify the learning task. If None, the task is selected automatically from the target variable.
scoring (str, default=None) –
Score name used to parameter tuning.
- In regression:
'rmse' : Root mean squared error
'mse' : Mean squared error
'mae' : Mean absolute error
'rmsle' : Root mean squared logarithmic error
'mape' : Mean absolute percentage error
'r2' : R2 Score
- In binary classification:
'logloss' : Logarithmic Loss
'accuracy' : Accuracy
'precision' : Precision
'recall' : Recall
'f1' : F1 score
'pr_auc' : PR-AUC
'auc' : AUC
- In multiclass classification:
'logloss' : Logarithmic Loss
'accuracy' : Accuracy
'precision_macro' : Precision macro
'recall_macro' : Recall macro
'f1_micro' : F1 micro
'f1_macro' : F1 macro
'f1_weighted' : F1 weighted
'auc_ovr' : One-vs-rest AUC
'auc_ovo' : One-vs-one AUC
'auc_ovr_weighted' : One-vs-rest AUC weighted
'auc_ovo_weighted' : One-vs-one AUC weighted
If None, the SCORING constant is used.
other_scores (list[str], default=None) –
Score names calculated after tuning. Available score names are listed in the explanation of the scoring argument.
If None, the OTHER_SCORES constant is used.
learning_algos (list[str], default=None) –
Estimator algorithms. Select from the following algorithms and pass them as a list.
- In regression:
'linear_regression' : LinearRegression
'elasticnet' : ElasticNet
'svr' : SVR
'randomforest' : RandomForestRegressor
'lightgbm' : LGBMRegressor
'xgboost' : XGBRegressor
- In classification:
'svm' : SVC
'logistic' : LogisticRegression
'randomforest' : RandomForestClassifier
'lightgbm' : LGBMClassifier
'xgboost' : XGBClassifier
If None, the LEARNING_ALGOS constant is used.
n_iter (dict[str, int], default=None) –
Number of tuning iterations. Keys should be members of the learning_algos argument; values are the iteration counts.
If None, the N_ITER constant is used.
cv (int, cross-validation generator, or an iterable, default=5) – Determines the cross-validation splitting strategy. If None, the default 5-fold cross-validation is used. If int, specifies the number of folds in a KFold.
tuning_algo ({'grid', 'random', 'bo', 'optuna'}, default='optuna') – Tuning algorithm using following libraries. ‘grid’: sklearn.model_selection.GridSearchCV, ‘random’: sklearn.model_selection.RandomizedSearchCV, ‘bo’: BayesianOptimization, ‘optuna’: Optuna.
seed (int, default=42) – Seed for random number generator of cross validation, estimators, and optuna.sampler.
estimators (dict[str, estimator object implementing 'fit'], default=None) –
Classification or regression estimators used for tuning. Keys should be members of the learning_algos argument. Values are assumed to implement the scikit-learn estimator interface.
If None, the default estimators of the tuning instances are used.
See https://c60evaporator.github.io/tune-easy/each_estimators.html
tuning_params (dict[str, dict[str, {list, tuple}]], default=None) –
Values should be dictionaries with parameter names as keys and lists of parameter settings or parameter ranges to try as values. Keys should be members of the learning_algos argument.
If None, the default values of the tuning instances are used.
See https://c60evaporator.github.io/tune-easy/each_estimators.html
mlflow_logging (bool, default=False) –
Strategy for recording the results with the MLflow library.
If True, nested runs are created. The parent run records the comparison of all estimators, such as the max score history. The child runs are created in each tuning instance by setting its mlflow_logging argument to "outside".
If False, no MLflow runs are created.
mlflow_tracking_uri (str, default=None) –
Tracking URI for MLflow. This argument is passed to tracking_uri in mlflow.set_tracking_uri()
See https://mlflow.org/docs/latest/python_api/mlflow.html#mlflow.set_tracking_uri
mlflow_artifact_location (str, default=None) –
Artifact store for MLflow. This argument is passed to artifact_location in mlflow.create_experiment()
See https://mlflow.org/docs/latest/tracking.html#artifact-stores
mlflow_experiment_name (str, default=None) –
Experiment name for MLflow. This argument is passed to name in mlflow.create_experiment()
See https://mlflow.org/docs/latest/python_api/mlflow.html#mlflow.create_experiment
tuning_kws (dict[str, dict], default=None) –
Additional parameters passed to the tuning instances. Keys should be members of the learning_algos argument. Values should be dicts of parameters passed to the tuning instances, e.g. {'not_opt_params': {'kernel': 'rbf'}}.
See the API Reference of the tuning instances.
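The per-algorithm arguments (n_iter, tuning_params, estimators, tuning_kws) all follow the same shape: dicts keyed by the names in learning_algos. A minimal sketch of building such dicts (the parameter names 'C' and 'gamma' are illustrative assumptions, not an exhaustive or authoritative search space):

```python
# Sketch of the per-algorithm argument dicts described above.
# Keys mirror the entries in learning_algos; tuples illustrate ranges,
# lists illustrate explicit candidate values.
learning_algos = ['svm', 'logistic']
n_iter = {'svm': 100, 'logistic': 200}  # tuning iterations per algorithm
tuning_params = {
    'svm': {'C': (0.01, 100.0), 'gamma': (0.0001, 10.0)},  # tuples: ranges to search
    'logistic': {'C': [0.1, 1.0, 10.0]},                   # lists: candidate values
}

# Keys of n_iter and tuning_params must be members of learning_algos.
assert set(n_iter) <= set(learning_algos)
assert set(tuning_params) <= set(learning_algos)
```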
- Returns
df_result – Validation scores of the models before and after tuning.
- Return type
pd.DataFrame
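When cv_group is supplied, cross-validation splitting is group-aware (GroupKFold or LeaveOneGroupOut). As a rough, stdlib-only illustration of the leave-one-group-out idea (the actual splitting is performed by scikit-learn, not by this sketch):

```python
# Stdlib-only illustration of leave-one-group-out splitting, as applied
# when cv_group is passed: each fold holds out all rows of one group.
def leave_one_group_out(groups):
    """Yield (train_idx, test_idx) pairs, holding out one group per fold."""
    for g in sorted(set(groups)):
        test = [i for i, v in enumerate(groups) if v == g]
        train = [i for i, v in enumerate(groups) if v != g]
        yield train, test

groups = ['a', 'a', 'b', 'b', 'c']
folds = list(leave_one_group_out(groups))
print(len(folds))  # 3 folds, one per unique group
```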
- print_estimator(learner_name, printed_name, mlflow_logging=False)¶
Print the estimator after tuning.
- Parameters
learner_name ({'linear_regression', 'elasticnet', 'svr', 'randomforest', 'lightgbm', 'xgboost', 'svm', 'logistic'}) – Name of the learning algorithm whose estimator is printed.