R - Confusion matrix of bstTree predictions, Error: 'The data must contain some levels that overlap the reference.'


I am trying to train a model using the bstTree method and print out the confusion matrix. adverse_effects is my class attribute.

    library(caret)  # createDataPartition, trainControl, train, confusionMatrix
    library(pROC)   # plot.roc

    set.seed(1234)
    splitindex <- createDataPartition(attended_num_new_bsttree$adverse_effects, p = .80, list = FALSE, times = 1)
    trainsplit <- attended_num_new_bsttree[ splitindex, ]
    testsplit  <- attended_num_new_bsttree[-splitindex, ]

    ctrl <- trainControl(method = "cv", number = 5)
    model_bsttree <- train(adverse_effects ~ ., data = trainsplit, method = "bstTree", trControl = ctrl)

    predictors <- names(trainsplit)[names(trainsplit) != 'adverse_effects']
    pred_bsttree <- predict(model_bsttree$finalModel, testsplit[, predictors])

    plot.roc(auc_bsttree)  # auc_bsttree is not defined in the code shown

    conf_bsttree <- confusionMatrix(pred_bsttree, testsplit$adverse_effects)

But I get the error 'Error in confusionMatrix.default(pred_bsttree, testsplit$adverse_effects) : The data must contain some levels that overlap the reference.'

    > max(pred_bsttree)
    [1] 1.03385
    > min(pred_bsttree)
    [1] 1.011738
    > unique(trainsplit$adverse_effects)
    [1] 0 1
    Levels: 0 1

How can I fix this issue?

    > head(trainsplit)
       type new_missed therapytypename new_diesease gender adverse_effects change_in_exposure other_reasons other_medication
    5     2          1              14           13      2               0                  0             0                0
    7     2          0              14           13      2               0                  0             0                0
    8     2          0              14           13      2               0                  0             0                0
    9     2          0              14           13      2               1                  0             0                0
    11    2          1              14           13      2               0                  0             0                0
    12    2          0              14           13      2               0                  0             0                0
       uvb_puva_type missed_prev_dose skintypea skintypeb age doseb dosea
    5              5                1         1         1  22 3.000     0
    7              5                0         1         1  22 4.320     0
    8              5                0         1         1  22 4.752     0
    9              5                0         1         1  22 5.000     0
    11             5                1         1         1  22 5.000     0
    12             5                0         1         1  22 5.000     0

    > max(pred_bsttree)
    [1] 1.03385
    > min(pred_bsttree)
    [1] 1.011738

These values, together with the error message, tell the whole story. Plotting the ROC curve checks the effect of different threshold points. Rounding happens based on the threshold: with a threshold of 0.5, for example, 0.7 is converted to 1 (the TRUE class) and 0.3 goes to 0 (the FALSE class). Threshold values lie in the range (0,1).
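The thresholding step described above can be sketched in base R. The scores and the 0.5 cutoff below are made-up illustration values, not output from the model in the question.

```r
# Hypothetical scores in (0, 1); these do not come from the question's model.
scores <- c(0.7, 0.3, 0.55, 0.1)
threshold <- 0.5

# Scores above the threshold become class 1, all others class 0.
# Fixing the levels keeps both classes present even if one is never predicted.
pred_class <- factor(ifelse(scores > threshold, 1, 0), levels = c(0, 1))
print(pred_class)
```

Setting the levels explicitly is a design choice here: it keeps the predicted factor comparable with a 0/1 reference even when every score lands on the same side of the cutoff.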

In your case, regardless of the threshold, all observations end up in the TRUE class, because even the minimum prediction is greater than 1. (That is why @phiver was wondering whether you are doing regression instead of classification.) Without a 0 in the predictions, there is no level in 'prediction' that coincides with the 0 level in adverse_effects, hence this error.
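A minimal base R sketch of that mismatch follows. The prediction values are illustrative numbers inside the range reported in the question, the reference labels are made up, and the `intersect` call only approximates the kind of level check the error message describes; it is not taken from the caret source.

```r
# Made-up reference labels with levels "0" and "1", as in the question.
reference <- factor(c(0, 0, 1, 0), levels = c(0, 1))

# Illustrative predictions inside the reported range [1.011738, 1.03385].
pred <- c(1.011738, 1.02, 1.03385, 1.015)

# Treated as classes, none of the predicted values equals "0" or "1",
# so the two sets of levels share nothing.
overlap <- intersect(levels(factor(pred)), levels(reference))
print(length(overlap))

# Applying a cutoff first yields matching levels, but every case lands in
# class 1, so the resulting table carries no real information.
pred_class <- factor(ifelse(pred > 0.5, 1, 0), levels = levels(reference))
print(table(pred_class, reference))
```

The second half shows that thresholding alone removes the level mismatch without making the predictions useful, which is consistent with the suspicion that the model output is not a class probability.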

PS: It will be difficult to tell the root cause of the error without you posting your data.

