## setup
causal forest
- 70/30 train-test split
- fit models to 70% of data
- evaluated models separately on the train and test data
- outcome: lean candidate preference (0 to 100)
- treatment: persuadeHarris (0), persuadeTrump (1)
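
A minimal sketch of the 70/30 split described above, using only the Python standard library. The function name, the seed, and the row count of 1000 are assumptions for illustration; the notes do not say how the split was actually drawn.

```python
import random

def train_test_split_indices(n_rows, train_frac=0.7, seed=0):
    """Shuffle row indices and split them into train and test partitions."""
    rng = random.Random(seed)           # fixed seed so the split is reproducible
    indices = list(range(n_rows))
    rng.shuffle(indices)
    n_train = round(n_rows * train_frac)
    return indices[:n_train], indices[n_train:]

# Hypothetical dataset size, for illustration only.
train_idx, test_idx = train_test_split_indices(1000)
```

The forests would be fitted on the rows in `train_idx` only, then evaluated once on `train_idx` and once on `test_idx`.
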
fitted 2 forests (following Athey's recommendations)
- forest 1: 61 features, $X$ (so many because of dummy/one-hot encoded policy issues/personality traits)
- from forest 1, identified the important features, $X_{sub}$: those with feature importance > mean(feature importance); 19 of them
- forest 2: fitted forest using just $X_{sub}$
- see heterogeneous treatment effects/partial dependence plots below
- for each feature, the remaining features are held at their mean values
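
A rough sketch of the two steps above: keeping features whose importance exceeds the mean importance, and building the query rows for one partial dependence curve with every other feature held at its mean. The feature names, importance scores, and data rows are made up for illustration, and both helper functions are hypothetical; the notes do not name the forest library, so the fitting and prediction calls are left out.

```python
from statistics import mean

def select_important(importances):
    """Keep features whose importance exceeds the mean importance."""
    threshold = mean(importances.values())
    return [name for name, score in importances.items() if score > threshold]

def partial_dependence_grid(rows, feature, grid_values):
    """One query row per grid value: `feature` varies, all others sit at their mean."""
    baseline = {name: mean(row[name] for row in rows) for name in rows[0]}
    return [{**baseline, feature: value} for value in grid_values]

# Hypothetical importances from forest 1 (not the real values).
importances = {
    "age": 0.40,
    "ideology": 0.35,
    "issue_economy": 0.15,
    "trait_openness": 0.10,
}
selected = select_important(importances)  # features above the mean of 0.25

# Hypothetical data restricted to the selected features.
rows = [
    {"age": 20.0, "ideology": 1.0},
    {"age": 40.0, "ideology": 3.0},
    {"age": 60.0, "ideology": 5.0},
]
grid = partial_dependence_grid(rows, "age", [20.0, 40.0, 60.0])
```

Each row in `grid` would then be passed to forest 2's treatment effect prediction, giving one point on the curve for `age`.
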
## train partition
![[1727722395.png]]
## test partition
![[1727722476.png]]