Exclude screen names with too many followers (e.g., news sites) and too small a friend-follow ratio, i.e., accounts with more than 100,000 followers that follow very few people (ratio below 0.001). See `assign.Rmd`.

```r
> exclude <- dt1[followers_count > 100000 & friend_follow_ratio < 0.001, .(screen_name, followers_count, friend_follow_ratio)][order(-followers_count)]
> exclude
        screen_name followers_count friend_follow_ratio
 1:          cnnbrk        62076267        1.949215e-06
 2:              ft         4898037        1.588391e-04
 3:   economictimes         4131225        1.113471e-05
 4:      benshapiro         3803462        8.439677e-05
 5:        france24         3796408        1.193233e-04
 6:  twittermoments          804819        1.615268e-05
 7:     enesfreedom          607914        1.644967e-06
 8:     leshchenkos          314355        4.103628e-04
 9:     dominic2306          271720        1.361691e-04
10:         presstv          265906        2.519678e-04
11:      aawsat_eng          142046        3.519962e-05
12: global_mil_info          110253        5.623379e-04
13:   aymanrashdanw          103465        8.698510e-04
```

Covariates/features used for blocking

```r
features <- c(
  "total_tweets", "en_count", "uk_count", "ru_count",
  "topic_count_all", "topic_1_count", "topic_2_count", "topic_3_count", "topic_4_count",
  "followers_count", "friends_count", "favourites_count", "statuses_count",
  "friend_follow_ratio", "days_since_create"
)
```

After winsorizing

![[Pasted image 20220224145814.png|800]]

Blocking

```r
condition
   c    t
6285 6314

> dt2[, table(table(block))]
  4   5   6   7   8   9  10  11  12  13  14  15  16  17  19
699 507 371 212 152  85  55  33  21  12   7   6   2   1   1

# condition differences in covariates (accounting for blocking)
> pvals_adjust
              covariate adjusted_pval
 1:        total_tweets     0.6214045
 2:            en_count     0.9535791
 3:            uk_count     0.7457448
 4:            ru_count     0.6734039
 5:     topic_count_all     0.2794515
 6:       topic_1_count     0.2335795
 7:       topic_2_count     0.4256212
 8:       topic_3_count     0.7392818
 9:       topic_4_count     0.1125286
10:     followers_count     0.3963671
11:       friends_count     0.2679797
12:    favourites_count     0.1904098
13:      statuses_count     0.6869106
14: friend_follow_ratio     0.9125938
15:   days_since_create     0.3415026
```

Sub-divided into two groups

```r
features_grp <- c("total_tweets", "topic_count_all")

   group condition    n
1:     0         c 3063
2:     0         t 3073
3:     1         c 3222
4:     1         t 3241
```

![[assignment_condition_group.png|1000]]

Audience status

![[Pasted image 20220225012053.png]]

```r
upload                 audience_size      match_rate
  3063 id_220224224745          3028 true     0.9886 # control group0
  3222 id_220224224832          3180 true     0.987  # control group1
  3073 id_220224224906          3037 true     0.9883 # treatment group0
  3241 id_220224224942          3201 true     0.9877 # treatment group1
```
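
The reported match rates appear to be the audience size divided by the number of uploaded accounts. A minimal base R sketch that recomputes them from the figures in the audience table and checks the uploads against the assignment totals. The data frame and the column names `uploaded` and `recomputed` are made up for illustration and do not come from `assign.Rmd`.

```r
# Illustrative only: object and column names are assumptions;
# the numbers are copied from the audience table in these notes.
audiences <- data.frame(
  condition     = c("c", "c", "t", "t"),
  group         = c(0, 1, 0, 1),
  uploaded      = c(3063, 3222, 3073, 3241),
  audience_size = c(3028, 3180, 3037, 3201),
  match_rate    = c(0.9886, 0.987, 0.9883, 0.9877)
)

# Share of uploaded accounts that ended up in each audience.
audiences$recomputed <- audiences$audience_size / audiences$uploaded

# Largest gap between the reported and the recomputed rate.
max_gap <- max(abs(audiences$recomputed - audiences$match_rate))

# Uploads per condition, to compare with the assignment totals (c = 6285, t = 6314).
uploaded_by_condition <- tapply(audiences$uploaded, audiences$condition, sum)

print(audiences)
print(max_gap)
print(uploaded_by_condition)
```

If `max_gap` stays within rounding error of the four-decimal rates, the reported `match_rate` is consistent with `audience_size / uploaded`.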