NAME
AI::XGBoost::Booster - XGBoost main class for training, prediction and evaluation
VERSION
version 0.11
SYNOPSIS
use 5.010;
use aliased 'AI::XGBoost::DMatrix';
use AI::XGBoost qw(train);
# We are going to solve a binary classification problem:
# is a mushroom poisonous or not?
my $train_data = DMatrix->From(file => 'agaricus.txt.train');
my $test_data = DMatrix->From(file => 'agaricus.txt.test');
# With XGBoost we can solve this problem using the 'gbtree' booster
# (Gradient Boosting Regression Tree) with the logistic regression
# loss function 'binary:logistic'.
# The XGBoost tree booster has many parameters that we can tune
# (https://github.com/dmlc/xgboost/blob/master/doc/parameter.md)
my $booster = train(data => $train_data, number_of_rounds => 10, params => {
    objective => 'binary:logistic',
    eta       => 1.0,
    max_depth => 2,
    silent    => 1
});
# For binary classification, predictions are probability scores in [0, 1]
# indicating that the label is positive (1 in the first column of agaricus.txt.test)
my $predictions = $booster->predict(data => $test_data);
say join "\n", @$predictions[0 .. 10];
DESCRIPTION
Booster objects control training, prediction and evaluation.
This module is a work in progress and the API may change. Comments and suggestions are welcome!
METHODS
update
Update one iteration
Parameters
- iteration: Current iteration number
- dtrain: Training data (AI::XGBoost::DMatrix)
boost
Boost one iteration using your own gradient
Parameters
- dtrain: Training data (AI::XGBoost::DMatrix)
- grad: Gradient of your objective function (reference to an array)
- hess: Hessian of your objective function, that is, the second-order gradient (reference to an array)
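As an illustration of what grad and hess might contain, here is a minimal sketch that computes them for the logistic loss in plain Perl. The helper names (sigmoid, logistic_grad_hess) and the sample margins and labels are hypothetical and not part of this module. The commented-out boost call assumes named arguments matching the parameter names listed above.

```perl
use strict;
use warnings;

# Logistic (sigmoid) function: maps a raw margin to a probability.
sub sigmoid {
    my ($x) = @_;
    return 1 / (1 + exp(-$x));
}

# First- and second-order gradients of the log loss with respect
# to the raw margin. Both arguments are array references of equal length.
sub logistic_grad_hess {
    my ($margins, $labels) = @_;
    my (@grad, @hess);
    for my $i (0 .. $#$margins) {
        my $p = sigmoid($margins->[$i]);
        push @grad, $p - $labels->[$i];    # first derivative
        push @hess, $p * (1 - $p);         # second derivative
    }
    return (\@grad, \@hess);
}

# Hypothetical margins and labels, for illustration only.
my ($grad, $hess) = logistic_grad_hess([0, 0, 2], [1, 0, 1]);

# With a booster and a training DMatrix at hand, the call would look like
# this (assumption: named arguments as in the parameter list):
# $booster->boost(dtrain => $train_data, grad => $grad, hess => $hess);
```

Both array references must have one entry per training row, in the same order as the rows of dtrain.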
predict
Predict data using the trained model
Parameters
- data: Data to predict
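As the synopsis shows, predict returns a reference to an array of scores. For a binary:logistic objective these are probabilities, so a common next step is turning them into class labels. The sketch below uses hypothetical probabilities in place of a real predict call, and the 0.5 threshold is an assumed choice, not a library setting.

```perl
use strict;
use warnings;

# Hypothetical probabilities standing in for the array reference
# returned by $booster->predict(data => $test_data).
my $predictions = [0.02, 0.91, 0.50, 0.37];

# Assumed decision threshold; tune it for your own problem.
my $threshold = 0.5;

# Map each probability to a class label (1 = positive, 0 = negative).
my @labels = map { $_ >= $threshold ? 1 : 0 } @$predictions;

print join(' ', @labels), "\n";
```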
set_param
Set booster parameter
Example
$booster->set_param('objective', 'binary:logistic');
set_attr
Set a string attribute
get_attr
Get a string attribute
get_score
Get importance of each feature
Parameters
- importance_type: Type of importance. Valid values:
  - weight: Number of times a feature is used to split the data across all trees
  - gain: Average gain of the feature when it is used in trees
  - cover: Average coverage of the feature when it is used in trees
- fmap: Name of feature map file
get_dump
attributes
Returns all attributes of the booster as a HASHREF.
TO_JSON
Serialize the booster to JSON.
This method is intended for use with the convert_blessed option
of JSON. (See https://metacpan.org/pod/JSON#OBJECT-SERIALISATION)
Warning: this API is subject to change.
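To show how convert_blessed interacts with a TO_JSON method, here is a minimal sketch using JSON::PP, the core pure-Perl implementation that supports the same option. My::Model is a hypothetical stand-in class, not part of this distribution. A real booster would be passed to the encoder in the same way, but the structure it serializes to is not shown here.

```perl
use strict;
use warnings;
use JSON::PP;

{
    # Hypothetical stand-in class, for illustration only.
    package My::Model;

    sub new {
        my ($class, %args) = @_;
        return bless {%args}, $class;
    }

    # Called by the encoder when convert_blessed is enabled;
    # must return a plain (unblessed) data structure.
    sub TO_JSON {
        my ($self) = @_;
        return {%$self};
    }
}

# canonical sorts hash keys so the output is stable between runs.
my $json = JSON::PP->new->canonical->convert_blessed;

my $text = $json->encode(
    { model => My::Model->new(name => 'demo', rounds => 10) }
);

print "$text\n";
```

Without convert_blessed, the encoder refuses to serialize blessed objects, which is why the option must be enabled explicitly.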
BUILD
Use new instead; this method is just an internal helper.
DEMOLISH
Internal destructor. This method is called automatically.
AUTHOR
Pablo Rodríguez González <pablo.rodriguez.gonzalez@gmail.com>
COPYRIGHT AND LICENSE
Copyright (c) 2017 by Pablo Rodríguez González.