R - Confusion matrix of bstTree predictions, Error: 'The data must contain some levels that overlap the reference.'
I am trying to train a model using the bstTree method and print out the confusion matrix. adverse_effects is my class attribute.
    set.seed(1234)
    splitindex <- createDataPartition(attended_num_new_bsttree$adverse_effects, p = .80, list = FALSE, times = 1)
    trainsplit <- attended_num_new_bsttree[ splitindex,]
    testsplit <- attended_num_new_bsttree[-splitindex,]
    ctrl <- trainControl(method = "cv", number = 5)
    model_bsttree <- train(adverse_effects ~ ., data = trainsplit, method = "bstTree", trControl = ctrl)
    predictors <- names(trainsplit)[names(trainsplit) != 'adverse_effects']
    pred_bsttree <- predict(model_bsttree$finalModel, testsplit[,predictors])
    plot.roc(auc_bsttree)
    conf_bsttree = confusionMatrix(pred_bsttree, testsplit$adverse_effects)
But I get the error 'Error in confusionMatrix.default(pred_bsttree, testsplit$adverse_effects) : The data must contain some levels that overlap the reference.'
    > max(pred_bsttree)
    [1] 1.03385
    > min(pred_bsttree)
    [1] 1.011738
    > unique(trainsplit$adverse_effects)
    [1] 0 1
    Levels: 0 1
How can I fix this issue?
    > head(trainsplit)
       type new_missed therapytypename new_diesease gender adverse_effects change_in_exposure other_reasons other_medication
    5     2          1              14           13      2               0                  0             0                0
    7     2          0              14           13      2               0                  0             0                0
    8     2          0              14           13      2               0                  0             0                0
    9     2          0              14           13      2               1                  0             0                0
    11    2          1              14           13      2               0                  0             0                0
    12    2          0              14           13      2               0                  0             0                0
       uvb_puva_type missed_prev_dose skintypea skintypeb age doseb dosea
    5              5                1         1         1  22 3.000     0
    7              5                0         1         1  22 4.320     0
    8              5                0         1         1  22 4.752     0
    9              5                0         1         1  22 5.000     0
    11             5                1         1         1  22 5.000     0
    12             5                0         1         1  22 5.000     0
    > max(pred_bsttree)
    [1] 1.03385
    > min(pred_bsttree)
    [1] 1.011738

These two values and the error message tell it all. Plotting the ROC curve is a way of checking the effect of different threshold points. The threshold decides how each score is rounded to a class: with a threshold of 0.5, for example, 0.7 is converted to 1 (the TRUE class) and 0.3 goes to 0 (the FALSE class). Threshold values lie in the range (0, 1).
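The thresholding step can be sketched in base R as follows. The probabilities and the 0.5 threshold are made-up illustration values, not output from the model in the question.

```r
# Hypothetical predicted probabilities (illustration only, not from the post).
prob <- c(0.7, 0.3, 0.5, 0.92, 0.08)
threshold <- 0.5

# Scores strictly above the threshold become class 1, all others class 0.
# Fixing levels = c(0, 1) keeps both classes as levels even if one is absent.
pred_class <- factor(ifelse(prob > threshold, 1, 0), levels = c(0, 1))

print(pred_class)
```

Setting the levels explicitly matters here: a factor that happens to contain only one class would otherwise carry only that one level.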
In your case, regardless of the threshold, all observations end up in the TRUE class, because the minimum prediction is already greater than 1. (That is why @phiver was wondering whether you are doing regression instead of classification.) Without any 0 in the predictions, there is no level in the 'prediction' that coincides with the 0 level in adverse_effects, hence the error.
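A minimal base-R sketch of the level mismatch, and of one possible way around it, is shown below. The raw scores are invented values in the range reported in the question, and the cut-off of 1.02 is an arbitrary assumption chosen only so that both classes appear; it is not a recommended threshold.

```r
# Reference labels as a factor with levels 0 and 1, as in the question.
reference <- factor(c(0, 1, 0, 0, 1), levels = c(0, 1))

# Hypothetical raw scores in the range reported in the question.
raw_pred <- c(1.011738, 1.03385, 1.02, 1.015, 1.03)

# Treated as a factor, the raw scores get levels such as "1.011738",
# none of which match the reference levels "0" and "1".
raw_levels <- levels(factor(raw_pred))
overlap <- intersect(raw_levels, levels(reference))
print(overlap)  # character(0)

# One option: map the scores onto the reference levels with a cut-off.
# The cut-off is an assumption made for illustration only.
cutoff <- 1.02
pred_class <- factor(ifelse(raw_pred > cutoff, 1, 0),
                     levels = levels(reference))

# Prediction and reference now share levels, so a confusion table can be built.
print(table(Prediction = pred_class, Reference = reference))
```

In caret, calling predict() on the train object itself (model_bsttree) rather than on model_bsttree$finalModel normally returns a factor of class labels for a classification model. Whether that alone resolves the case in the question is an assumption, since the full data is not available.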
PS: It is difficult to tell the root cause of the error without the data being posted.