Of the 100 posts, I identified **46 posts** on which the three fact-checkers agreed more closely, i.e. `abs(fc_i - fc_j) < 30`.

- 50-50 train-test split: 23 train posts, 23 test posts
  - maybe too few posts in the training set?

```r
# all 80k parameter combinations
    param_idx param_rank_train fc_veracity_corr_train fc_veracity_corr_test cost_per_m_post_train cost_per_m_post_test cost_bin max_models disagreement_threshold aggregation_method    perf_metric    model_selection_method
        <num>            <num>                  <num>                 <num>                 <num>                <num>    <num>      <num>                  <num>             <char>         <char>                    <char>
 1:     69990                1                  0.854                 0.326              1982.073             1660.269        6          6                   0.25           combined      corr_only                    random
 2:     75980                2                  0.843                 0.393              2211.208             2302.026        6          7                   0.10             simple       perf_min                    random
 3:     80910                3                  0.837                 0.358              9560.579             9286.530        8          7                   0.20           combined      corr_only                    random
 4:     50783                4                  0.834                 0.478              1600.414             1600.414        6          5                   0.10             simple       perf_min worst2_then_best_to_worst
 5:     52583                5                  0.834                 0.478              1600.414             1600.414        6          5                   0.15             simple       perf_min worst2_then_best_to_worst
 6:     54383                6                  0.834                 0.478              1600.414             1600.414        6          5                   0.20             simple       perf_min worst2_then_best_to_worst
 7:     56183                7                  0.834                 0.478              1600.414             1600.414        6          5                   0.25             simple       perf_min worst2_then_best_to_worst
 8:     57983                8                  0.834                 0.478              1600.414             1600.414        6          5                   0.30             simple       perf_min worst2_then_best_to_worst
 9:     59783                9                  0.834                 0.478              1600.414             1600.414        6          5                   0.35             simple       perf_min worst2_then_best_to_worst
10:     61583               10                  0.834                 0.478              1600.414             1600.414        6          5                   0.40             simple       perf_min worst2_then_best_to_worst
11:     14052               11                  0.824                 0.459               227.198              227.198        4          2                   0.10           combined      perf_mean             best_to_worst
12:     15852               12                  0.824                 0.459               227.198              227.198        4          2                   0.15           combined      perf_mean             best_to_worst
13:     17652               13                  0.824                 0.459               227.198              227.198        4          2                   0.20           combined      perf_mean             best_to_worst
14:     19452               14                  0.824                 0.459               227.198              227.198        4          2                   0.25           combined      perf_mean             best_to_worst
15:     21252               15                  0.824                 0.459               227.198              227.198        4          2                   0.30           combined      perf_mean             best_to_worst
16:     23052               16                  0.824                 0.459               227.198              227.198        4          2                   0.35           combined      perf_mean             best_to_worst
17:     24852               17                  0.824                 0.459               227.198              227.198        4          2                   0.40           combined      perf_mean             best_to_worst
18:     14042               18                  0.824                 0.460               227.198              227.198        4          2                   0.10           combined perf_geometric             best_to_worst
19:     15842               19                  0.824                 0.460               227.198              227.198        4          2                   0.15           combined perf_geometric             best_to_worst
20:     17642               20                  0.824                 0.460               227.198              227.198        4          2                   0.20           combined perf_geometric             best_to_worst
21:     19442               21                  0.824                 0.460               227.198              227.198        4          2                   0.25           combined perf_geometric             best_to_worst
22:     21242               22                  0.824                 0.460               227.198              227.198        4          2                   0.30           combined perf_geometric             best_to_worst
23:     23042               23                  0.824                 0.460               227.198              227.198        4          2                   0.35           combined perf_geometric             best_to_worst
24:     24842               24                  0.824                 0.460               227.198              227.198        4          2                   0.40           combined perf_geometric             best_to_worst
25:     75500               25                  0.824                 0.329              8205.930             7174.146        8          6                   0.40           combined       perf_min                    random
26:     13452               26                  0.824                 0.467               227.198              227.198        4          2                   0.10           weighted      perf_mean             best_to_worst
27:     15252               27                  0.824                 0.467               227.198              227.198        4          2                   0.15           weighted      perf_mean             best_to_worst
28:     17052               28                  0.824                 0.467               227.198              227.198        4          2                   0.20           weighted      perf_mean             best_to_worst
29:     18852               29                  0.824                 0.467               227.198              227.198        4          2                   0.25           weighted      perf_mean             best_to_worst
30:     20652               30                  0.824                 0.467               227.198              227.198        4          2                   0.30           weighted      perf_mean             best_to_worst
    param_idx param_rank_train fc_veracity_corr_train fc_veracity_corr_test cost_per_m_post_train cost_per_m_post_test cost_bin max_models disagreement_threshold aggregation_method    perf_metric    model_selection_method

# only parameter combinations where cost per million posts is < 100
    param_idx param_rank_train fc_veracity_corr_train fc_veracity_corr_test cost_per_m_post_train cost_per_m_post_test cost_bin max_models disagreement_threshold aggregation_method    perf_metric model_selection_method
        <num>            <num>                  <num>                 <num>                 <num>                <num>    <num>      <num>                  <num>             <char>         <char>                 <char>
 1:     66040             1006                  0.786                 0.069                81.049               81.527        0          6                   0.15           combined       icc_only                 random
 2:     13810             1346                  0.774                -0.194                22.654               33.485        0          2                   0.10           combined      perf_mean                 random
 3:     69620             1837                  0.762                 0.207                75.794               83.923        0          6                   0.25           combined       perf_min                 random
 4:     21630             2004                  0.759                -0.134                22.654               33.432        0          2                   0.35             simple      corr_only                 random
 5:     84630             2460                  0.752                 0.386                93.031              114.381        0          7                   0.35             simple      corr_only                 random
 6:     13820             2731                  0.748                 0.048                26.151               33.583        0          2                   0.10           combined       perf_min                 random
 7:     56400             2767                  0.748                 0.483                23.457               34.205        0          5                   0.25           weighted perf_geometric                 random
 8:     78010             2770                  0.747                -0.137                23.457               32.629        0          7                   0.15           weighted      perf_mean                 random
 9:     76810             2881                  0.744                 0.331                90.915              104.470        0          7                   0.10           combined      perf_mean                 random
10:     52240             2883                  0.744                 0.047                76.043               69.874        0          5                   0.15             simple       icc_only                 random
11:     61210             2962                  0.742                 0.426                62.657               77.154        0          5                   0.40             simple      perf_mean                 random
12:     76832             2999                  0.741                 0.290                87.258               90.162        0          7                   0.10           combined      corr_only          best_to_worst
13:     78632             3000                  0.741                 0.290                87.258               90.162        0          7                   0.15           combined      corr_only          best_to_worst
14:     80432             3001                  0.741                 0.290                87.258               90.162        0          7                   0.20           combined      corr_only          best_to_worst
15:     82232             3002                  0.741                 0.290                87.258               90.162        0          7                   0.25           combined      corr_only          best_to_worst
16:     84032             3003                  0.741                 0.290                87.258               90.162        0          7                   0.30           combined      corr_only          best_to_worst
17:     85832             3004                  0.741                 0.290                87.258               90.162        0          7                   0.35           combined      corr_only          best_to_worst
18:     87632             3005                  0.741                 0.290                87.258               90.162        0          7                   0.40           combined      corr_only          best_to_worst
19:     26432             3035                  0.741                 0.238                38.730               39.324        0          3                   0.10           combined      corr_only          best_to_worst
20:     28232             3036                  0.741                 0.238                38.730               39.324        0          3                   0.15           combined      corr_only          best_to_worst
21:     30032             3037                  0.741                 0.238                38.730               39.324        0          3                   0.20           combined      corr_only          best_to_worst
22:     31832             3038                  0.741                 0.238                38.730               39.324        0          3                   0.25           combined      corr_only          best_to_worst
23:     33632             3039                  0.741                 0.238                38.730               39.324        0          3                   0.30           combined      corr_only          best_to_worst
24:     35432             3040                  0.741                 0.238                38.730               39.324        0          3                   0.35           combined      corr_only          best_to_worst
25:     37232             3041                  0.741                 0.238                38.730               39.324        0          3                   0.40           combined      corr_only          best_to_worst
26:     79820             3058                  0.740                 0.114                23.171               34.214        0          7                   0.20           weighted       perf_min                 random
27:     24050             3078                  0.740                 0.046                22.654               26.809        0          2                   0.40           weighted      cost_only                 random
28:     84030             3238                  0.737                 0.318                93.179              112.923        0          7                   0.30           combined      corr_only                 random
29:     43220             3465                  0.734                 0.175                46.819               46.197        0          4                   0.25             simple       perf_min                 random
30:     40810             3504                  0.732                 0.122                54.543               44.415        0          4                   0.15           combined      perf_mean                 random
    param_idx param_rank_train fc_veracity_corr_train fc_veracity_corr_test cost_per_m_post_train cost_per_m_post_test cost_bin max_models disagreement_threshold aggregation_method    perf_metric model_selection_method
```
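
For reference, the agreement filter and the 50-50 split could be sketched roughly as below. This is only an illustration, not the code that produced the tables: the `posts` data frame, its column names (`fc1`, `fc2`, `fc3`), the toy ratings, and the seed are all made up, and it assumes that `abs(fc_i - fc_j) < 30` has to hold for every pair of fact-checkers.

```r
# Hypothetical toy data: one row per post, one column per fact-checker rating.
# Column names and values are invented for illustration only.
posts <- data.frame(
  post_id = 1:8,
  fc1 = c(10, 80, 50, 20, 95, 40, 60, 5),
  fc2 = c(15, 40, 55, 45, 90, 35, 65, 40),
  fc3 = c(30, 75, 45, 25, 85, 60, 70, 10)
)
fc_cols <- c("fc1", "fc2", "fc3")

# Assumption: the criterion must hold for ALL pairs of fact-checkers.
# The largest pairwise absolute difference equals max - min of the ratings,
# so "every pair differs by < 30" is the same as "range < 30".
posts$max_gap <- apply(posts[, fc_cols], 1, function(x) max(x) - min(x))
agreed <- posts[posts$max_gap < 30, ]

# Random 50-50 split of the retained posts (seed chosen arbitrarily).
set.seed(1)
n_train <- floor(nrow(agreed) / 2)
train_idx <- sample(nrow(agreed), n_train)
test_idx <- setdiff(seq_len(nrow(agreed)), train_idx)
train <- agreed[train_idx, ]
test <- agreed[test_idx, ]
```

With only 23 posts per half, repeating the split under several seeds (or using cross-validation) would be one way to check how much the train and test correlations depend on the particular split.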