## setup causal forest

- 70/30 train-test split
    - fit models to 70% of data
    - evaluated models separately on the train and test data
- outcome: lean candidate preference (0 to 100)
- treatment: persuadeHarris (0), persuadeTrump (1)

fitted 2 forests (following Athey's recommendations)

- forest 1: 61 features, $X$ (so many because of dummy/one-hot encoded policy issues/personality traits)
    - identified the important features from this forest, $X_{sub}$ (feature importance > mean(feature importance)); 19 of them
- forest 2: fitted forest using just $X_{sub}$
- see heterogeneous treatment effects/partial dependence plots below
    - for each feature, the remaining features are held at their mean values

## train partition

![[1727722395.png]]

## test partition

![[1727722476.png]]
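
## sketch of the setup steps

A minimal, library-free sketch of three steps from the setup: the 70/30 split, keeping features whose importance exceeds the mean importance, and building partial dependence query rows with the other features held at their means. This is not the code used for the plots above. The function names, the toy feature names, and the toy importance values are all hypothetical, and the forest fitting itself is left out.

```python
import random
from statistics import mean


def train_test_split(rows, train_frac=0.7, seed=0):
    """Shuffle a copy of the rows and cut it into train/test partitions."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_frac * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def select_important(importances):
    """Keep features whose importance is above the mean importance (X_sub)."""
    threshold = mean(importances.values())
    return [name for name, imp in importances.items() if imp > threshold]


def partial_dependence_rows(rows, feature, grid_values):
    """One query row per grid value: `feature` varies, all others sit at their mean."""
    baseline = {name: mean(r[name] for r in rows) for name in rows[0]}
    return [{**baseline, feature: value} for value in grid_values]


# Toy data (hypothetical): 10 rows with two features.
data = [{"age": 20 + 2 * i, "income": float(i)} for i in range(10)]
train, test = train_test_split(data)

# Toy importances (hypothetical), standing in for forest 1's output.
toy_importances = {"age": 0.5, "income": 0.3, "region_west": 0.1, "trait_open": 0.1}
x_sub = select_important(toy_importances)

# Query rows for a partial dependence curve over `age` on the train partition.
queries = partial_dependence_rows(train, "age", [20, 30, 40])
print(len(train), len(test), x_sub, len(queries))
```

Each row in `queries` would be passed to the fitted forest 2 to get a treatment effect estimate, once for the train partition and once for the test partition.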