Identifying Variables in Decision Trees

Xpress Insight identifies variables in a way that is specific to decision trees.

Variables that can be profiled through the tree are available in the Profiles panel. Each of these variables is either continuous or categorical. A categorical variable defines a set of possible categories. Continuous variables are normally numeric, although a numeric variable with a small number of unique values may also be classed as categorical (see the table below). All categorical profile variables are treated as possible targets for the tree and are available for selection in the Target dialog.

The following table lists each data type available in decision trees and how it is used in Xpress Insight.

Variables in Decision Trees
Data Type	Identification Criteria	Inserting Splits	Target-Driven Decision Trees	Statistics for Profiling Variables
`real`	All numeric variables	Enter branch thresholds.	Best Split algorithm is supported.	Categorical: numeric variables with 10 or fewer unique values. Continuous: numeric variables with more than 10 unique values.
`enum` (enumeration)	All string variables with 100 or fewer unique values	All unique values are automatically added to the tree after you click APPLY. Each value is shown as its own node in the tree.	Best Split algorithm is not supported.	Categorical
`string`	All string variables with more than 100 unique values	Each unique value must be entered manually as a branch value.	Best Split algorithm is not supported.	Not available as a profile variable.
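
The identification criteria in the table can be sketched as a small classification routine. This is only an illustration of the thresholds listed above (10 unique values for numeric variables, 100 for string variables); the function name `classify_variable` and its return shape are hypothetical and are not part of any Xpress Insight API.

```python
from typing import Optional, Sequence, Tuple, Union

Value = Union[int, float, str]

# Thresholds taken from the table above.
CATEGORICAL_MAX_UNIQUE = 10   # numeric: at most 10 unique values -> categorical
ENUM_MAX_UNIQUE = 100         # string: at most 100 unique values -> enum


def classify_variable(values: Sequence[Value]) -> Tuple[str, Optional[str]]:
    """Return (data_type, profile_kind) for a column of values.

    Hypothetical helper: profile_kind is "categorical", "continuous",
    or None when the variable is not available as a profile variable.
    """
    unique = set(values)
    if not unique:
        raise ValueError("cannot classify an empty column")

    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in unique):
        kind = "categorical" if len(unique) <= CATEGORICAL_MAX_UNIQUE else "continuous"
        return "real", kind

    if all(isinstance(v, str) for v in unique):
        if len(unique) <= ENUM_MAX_UNIQUE:
            return "enum", "categorical"
        return "string", None  # not available as a profile variable

    raise ValueError("mixed numeric and string values are not covered by this sketch")
```

For example, under these assumptions a column holding only `"yes"` and `"no"` is classified as `enum`, while a numeric column with eleven distinct values is classified as `real` and profiled as continuous.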

Tip: You can verify the data type of a variable in a decision tree by right-clicking its level and selecting Properties to open its Properties dialog box.

Note: Any split on a string variable in a decision tree contains an extra 'Any other value' branch, to which you must assign a treatment. This applies even to string variables with a finite set of known values (for example, yes/no or hot/cold). The branch is stored as an FJDT special condition. Decision trees that use string variables are not fully supported; for example, you cannot create them with TAO scenarios. Decision trees that use the FJDT special condition 'Any other value' also cannot be used as base trees in tree templates.
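
The effect of the extra branch can be pictured with a short sketch. It is a conceptual model only, not how Xpress Insight stores or evaluates the FJDT special condition; the names `build_branches` and `route` are hypothetical.

```python
from typing import List, Sequence

ANY_OTHER_VALUE = "Any other value"  # label of the catch-all branch described above


def build_branches(entered_values: Sequence[str]) -> List[str]:
    """Return the branch labels for a split on a string variable.

    Hypothetical helper: keeps the entered values in order (without
    duplicates) and always appends the catch-all branch.
    """
    branches = list(dict.fromkeys(entered_values))
    branches.append(ANY_OTHER_VALUE)
    return branches


def route(value: str, branches: Sequence[str]) -> str:
    """Return the branch that a value falls into."""
    explicit = [b for b in branches if b != ANY_OTHER_VALUE]
    return value if value in explicit else ANY_OTHER_VALUE
```

In this model, a split on yes/no still yields three branches, and a value such as `"maybe"` lands in 'Any other value', which is why that branch needs a treatment of its own.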
