Of the 100 posts, I identified **46 posts** on which the three fact-checkers agreed more closely, i.e. `abs(fc_i - fc_j) < 30` for each pair of fact-checkers (i, j).
- 50-50 train-test split: 23 train posts, 23 test posts
- Open question: are 23 posts too few for the training set?
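
A minimal sketch of the agreement filter and the 50-50 split described above, in base R. The `ratings` data frame, its column names (`fc_1`, `fc_2`, `fc_3`), the 0-100 rating scale, and the seed are all assumptions made for illustration; they are not the actual data or pipeline.

```r
# Hypothetical ratings: one row per post, one column per fact-checker.
# A 0-100 scale is assumed because the agreement threshold is 30.
ratings <- data.frame(
  post_id = 1:6,
  fc_1 = c(10, 80, 50, 95, 20, 60),
  fc_2 = c(15, 40, 55, 90, 70, 65),
  fc_3 = c(30, 85, 45, 70, 25, 62)
)

# With three raters, every pairwise abs(fc_i - fc_j) is below 30 exactly when
# the spread (max - min) of the three ratings is below 30.
fc <- as.matrix(ratings[, c("fc_1", "fc_2", "fc_3")])
max_pairwise_diff <- apply(fc, 1, function(x) max(x) - min(x))
agreed <- ratings[max_pairwise_diff < 30, ]

# Random 50-50 split of the high-agreement posts (seed chosen arbitrarily).
set.seed(1)
n_train <- floor(nrow(agreed) / 2)
train_idx <- sample(nrow(agreed), n_train)
train_posts <- agreed[train_idx, ]
test_posts <- agreed[-train_idx, ]
```

With 46 high-agreement posts, the same split yields 23 training and 23 test posts. Repeating the split under several seeds would be one way to check how much the results depend on which 23 posts land in the training set.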
```r
# all 80k parameter combinations, ranked by training-set correlation (top 30 shown)
param_idx param_rank_train fc_veracity_corr_train fc_veracity_corr_test cost_per_m_post_train cost_per_m_post_test cost_bin max_models disagreement_threshold aggregation_method perf_metric model_selection_method
<num> <num> <num> <num> <num> <num> <num> <num> <num> <char> <char> <char>
1: 69990 1 0.854 0.326 1982.073 1660.269 6 6 0.25 combined corr_only random
2: 75980 2 0.843 0.393 2211.208 2302.026 6 7 0.10 simple perf_min random
3: 80910 3 0.837 0.358 9560.579 9286.530 8 7 0.20 combined corr_only random
4: 50783 4 0.834 0.478 1600.414 1600.414 6 5 0.10 simple perf_min worst2_then_best_to_worst
5: 52583 5 0.834 0.478 1600.414 1600.414 6 5 0.15 simple perf_min worst2_then_best_to_worst
6: 54383 6 0.834 0.478 1600.414 1600.414 6 5 0.20 simple perf_min worst2_then_best_to_worst
7: 56183 7 0.834 0.478 1600.414 1600.414 6 5 0.25 simple perf_min worst2_then_best_to_worst
8: 57983 8 0.834 0.478 1600.414 1600.414 6 5 0.30 simple perf_min worst2_then_best_to_worst
9: 59783 9 0.834 0.478 1600.414 1600.414 6 5 0.35 simple perf_min worst2_then_best_to_worst
10: 61583 10 0.834 0.478 1600.414 1600.414 6 5 0.40 simple perf_min worst2_then_best_to_worst
11: 14052 11 0.824 0.459 227.198 227.198 4 2 0.10 combined perf_mean best_to_worst
12: 15852 12 0.824 0.459 227.198 227.198 4 2 0.15 combined perf_mean best_to_worst
13: 17652 13 0.824 0.459 227.198 227.198 4 2 0.20 combined perf_mean best_to_worst
14: 19452 14 0.824 0.459 227.198 227.198 4 2 0.25 combined perf_mean best_to_worst
15: 21252 15 0.824 0.459 227.198 227.198 4 2 0.30 combined perf_mean best_to_worst
16: 23052 16 0.824 0.459 227.198 227.198 4 2 0.35 combined perf_mean best_to_worst
17: 24852 17 0.824 0.459 227.198 227.198 4 2 0.40 combined perf_mean best_to_worst
18: 14042 18 0.824 0.460 227.198 227.198 4 2 0.10 combined perf_geometric best_to_worst
19: 15842 19 0.824 0.460 227.198 227.198 4 2 0.15 combined perf_geometric best_to_worst
20: 17642 20 0.824 0.460 227.198 227.198 4 2 0.20 combined perf_geometric best_to_worst
21: 19442 21 0.824 0.460 227.198 227.198 4 2 0.25 combined perf_geometric best_to_worst
22: 21242 22 0.824 0.460 227.198 227.198 4 2 0.30 combined perf_geometric best_to_worst
23: 23042 23 0.824 0.460 227.198 227.198 4 2 0.35 combined perf_geometric best_to_worst
24: 24842 24 0.824 0.460 227.198 227.198 4 2 0.40 combined perf_geometric best_to_worst
25: 75500 25 0.824 0.329 8205.930 7174.146 8 6 0.40 combined perf_min random
26: 13452 26 0.824 0.467 227.198 227.198 4 2 0.10 weighted perf_mean best_to_worst
27: 15252 27 0.824 0.467 227.198 227.198 4 2 0.15 weighted perf_mean best_to_worst
28: 17052 28 0.824 0.467 227.198 227.198 4 2 0.20 weighted perf_mean best_to_worst
29: 18852 29 0.824 0.467 227.198 227.198 4 2 0.25 weighted perf_mean best_to_worst
30: 20652 30 0.824 0.467 227.198 227.198 4 2 0.30 weighted perf_mean best_to_worst
param_idx param_rank_train fc_veracity_corr_train fc_veracity_corr_test cost_per_m_post_train cost_per_m_post_test cost_bin max_models disagreement_threshold aggregation_method perf_metric model_selection_method
# only parameter combinations where the training-set cost per million posts is < 100 (top 30 by training-set correlation)
param_idx param_rank_train fc_veracity_corr_train fc_veracity_corr_test cost_per_m_post_train cost_per_m_post_test cost_bin max_models disagreement_threshold aggregation_method perf_metric model_selection_method
<num> <num> <num> <num> <num> <num> <num> <num> <num> <char> <char> <char>
1: 66040 1006 0.786 0.069 81.049 81.527 0 6 0.15 combined icc_only random
2: 13810 1346 0.774 -0.194 22.654 33.485 0 2 0.10 combined perf_mean random
3: 69620 1837 0.762 0.207 75.794 83.923 0 6 0.25 combined perf_min random
4: 21630 2004 0.759 -0.134 22.654 33.432 0 2 0.35 simple corr_only random
5: 84630 2460 0.752 0.386 93.031 114.381 0 7 0.35 simple corr_only random
6: 13820 2731 0.748 0.048 26.151 33.583 0 2 0.10 combined perf_min random
7: 56400 2767 0.748 0.483 23.457 34.205 0 5 0.25 weighted perf_geometric random
8: 78010 2770 0.747 -0.137 23.457 32.629 0 7 0.15 weighted perf_mean random
9: 76810 2881 0.744 0.331 90.915 104.470 0 7 0.10 combined perf_mean random
10: 52240 2883 0.744 0.047 76.043 69.874 0 5 0.15 simple icc_only random
11: 61210 2962 0.742 0.426 62.657 77.154 0 5 0.40 simple perf_mean random
12: 76832 2999 0.741 0.290 87.258 90.162 0 7 0.10 combined corr_only best_to_worst
13: 78632 3000 0.741 0.290 87.258 90.162 0 7 0.15 combined corr_only best_to_worst
14: 80432 3001 0.741 0.290 87.258 90.162 0 7 0.20 combined corr_only best_to_worst
15: 82232 3002 0.741 0.290 87.258 90.162 0 7 0.25 combined corr_only best_to_worst
16: 84032 3003 0.741 0.290 87.258 90.162 0 7 0.30 combined corr_only best_to_worst
17: 85832 3004 0.741 0.290 87.258 90.162 0 7 0.35 combined corr_only best_to_worst
18: 87632 3005 0.741 0.290 87.258 90.162 0 7 0.40 combined corr_only best_to_worst
19: 26432 3035 0.741 0.238 38.730 39.324 0 3 0.10 combined corr_only best_to_worst
20: 28232 3036 0.741 0.238 38.730 39.324 0 3 0.15 combined corr_only best_to_worst
21: 30032 3037 0.741 0.238 38.730 39.324 0 3 0.20 combined corr_only best_to_worst
22: 31832 3038 0.741 0.238 38.730 39.324 0 3 0.25 combined corr_only best_to_worst
23: 33632 3039 0.741 0.238 38.730 39.324 0 3 0.30 combined corr_only best_to_worst
24: 35432 3040 0.741 0.238 38.730 39.324 0 3 0.35 combined corr_only best_to_worst
25: 37232 3041 0.741 0.238 38.730 39.324 0 3 0.40 combined corr_only best_to_worst
26: 79820 3058 0.740 0.114 23.171 34.214 0 7 0.20 weighted perf_min random
27: 24050 3078 0.740 0.046 22.654 26.809 0 2 0.40 weighted cost_only random
28: 84030 3238 0.737 0.318 93.179 112.923 0 7 0.30 combined corr_only random
29: 43220 3465 0.734 0.175 46.819 46.197 0 4 0.25 simple perf_min random
30: 40810 3504 0.732 0.122 54.543 44.415 0 4 0.15 combined perf_mean random
param_idx param_rank_train fc_veracity_corr_train fc_veracity_corr_test cost_per_m_post_train cost_per_m_post_test cost_bin max_models disagreement_threshold aggregation_method perf_metric model_selection_method
```
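
To make the train-test comparison in these tables explicit, here is a small base-R sketch. It uses seven rows copied from the printouts above, not the full 80k-row result set. The names `results`, `train_test_gap`, `cheap`, and `rank_agreement` are illustrative and not part of the original analysis.

```r
# Seven rows copied from the two printouts above (a subset, for illustration).
results <- data.frame(
  param_idx = c(69990, 75980, 50783, 14052, 66040, 56400, 26432),
  fc_veracity_corr_train = c(0.854, 0.843, 0.834, 0.824, 0.786, 0.748, 0.741),
  fc_veracity_corr_test = c(0.326, 0.393, 0.478, 0.459, 0.069, 0.483, 0.238),
  cost_per_m_post_train = c(1982.073, 2211.208, 1600.414, 227.198,
                            81.049, 23.457, 38.730)
)

# Gap between training and test correlation; a large positive gap suggests
# the combination was fitted to the 23 training posts.
results$train_test_gap <-
  results$fc_veracity_corr_train - results$fc_veracity_corr_test

# Same cost filter as the second table, applied to the training-set cost,
# then sorted by training-set correlation (best first).
cheap <- results[results$cost_per_m_post_train < 100, ]
cheap <- cheap[order(-cheap$fc_veracity_corr_train), ]

# Spearman rank correlation between training and test performance: values
# near 0 mean the training rank says little about the test rank.
rank_agreement <- cor(results$fc_veracity_corr_train,
                      results$fc_veracity_corr_test,
                      method = "spearman")

print(results[, c("param_idx", "train_test_gap")])
print(cheap$param_idx)
print(rank_agreement)
```

Running the same computation on the full result table, rather than on this hand-picked subset, would show whether the training rank carries over to the test posts in general.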