Hello
Please help me understand how to interpret the results of a full factorial experiment with 4 factors when one of two main effects is not significant, but their interaction is significant.
My dataset:
    A  B  C  D  Trials  Succeses
1   0  0  0  0    1852        11
2   0  0  0  1    1878         3
3   0  0  1  0    1869         9
4   0  0  1  1    1881        14
5   0  1  0  0    1926         4
6   0  1  0  1    1920         6
7   0  1  1  0    1891         4
8   0  1  1  1    1841         5
9   1  0  0  0    1921         9
10  1  0  0  1    1827         2
11  1  0  1  0    1837        13
12  1  0  1  1    1908        11
13  1  1  0  0    1827         8
14  1  1  0  1    1860         5
15  1  1  1  0    1854        10
16  1  1  1  1    1922        10
My final model is (Succeses, Trials) ~ (C + D + C * D)
glm(formula = cbind(Succeses, Trials) ~ (C + D + C * D), family = binomial(link = logit),
data = df)
Deviance Residuals:
    Min      1Q  Median      3Q     Max
-1.9108 -0.6633  0.1795  0.5882  1.2902
Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -5.54543    0.09587 -57.843  < 2e-16 ***
C1          -0.25879    0.09587  -2.699  0.00695 **
D1           0.14895    0.09587   1.554  0.12028
C1:D1        0.19490    0.09587   2.033  0.04206 *

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 27.866 on 15 degrees of freedom
Residual deviance: 16.007 on 12 degrees of freedom
AIC: 84.463
Number of Fisher Scoring iterations: 4
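A reading aid for the coefficient table above, offered as a sketch rather than a statement about how the model was actually set up: identical standard errors on all four terms are what one would expect from sum-to-zero (±1) contrasts, not from treatment (0/1) coding. Assuming that coding, with x_C = +1 at C = 0 and x_C = -1 at C = 1 (and likewise for x_D), the linear predictor and the effect of switching C at a fixed level of D would be:

```latex
\begin{aligned}
\eta(x_C, x_D) &= \beta_0 + \beta_C x_C + \beta_D x_D + \beta_{CD}\, x_C x_D,
  \qquad x_C, x_D \in \{+1, -1\} \\
\eta(C{=}1, D) - \eta(C{=}0, D) &= -2\,\bigl(\beta_C + \beta_{CD}\, x_D\bigr)
\end{aligned}
```

Under this assumption the sign of C1 on its own does not say which level of C is better, and the effect of C differs between the two levels of D whenever the interaction term is non-zero. The contrast settings (for example via `options("contrasts")`) would need checking to confirm which coding was used.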
Model with all interactions:
glm(formula = cbind(Succeses, Trials) ~ (A + B + C + D)^5, family = binomial(link = logit),
data = df)
Deviance Residuals:
[1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Coefficients:
             Estimate Std. Error z value Pr(>|z|)
(Intercept) -5.620799   0.104319 -53.881   <2e-16 ***
A1          -0.105993   0.104319  -1.016   0.3096
B1           0.109620   0.104319   1.051   0.2933
C1          -0.259357   0.104319  -2.486   0.0129 *
D1           0.150130   0.104319   1.439   0.1501
A1:B1        0.166693   0.104319   1.598   0.1101
A1:C1        0.108473   0.104319   1.040   0.2984
A1:D1       -0.122721   0.104319  -1.176   0.2394
B1:C1       -0.165999   0.104319  -1.591   0.1116
B1:D1        0.166955   0.104319   1.600   0.1095
C1:D1        0.205677   0.104319   1.972   0.0487 *
A1:B1:C1    -0.015377   0.104319  -0.147   0.8828
A1:B1:D1     0.025085   0.104319   0.240   0.8100
A1:C1:D1    -0.006925   0.104319  -0.066   0.9471
B1:C1:D1     0.169022   0.104319   1.620   0.1052
A1:B1:C1:D1  0.069391   0.104319   0.665   0.5059

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 2.7866e+01 on 15 degrees of freedom
Residual deviance: 1.0081e-13 on 0 degrees of freedom
AIC: 92.456
Number of Fisher Scoring iterations: 4
Model with 2-way interactions:
glm(formula = cbind(Succeses, Trials) ~ (A + B + C + D)^2, family = binomial(link = logit),
data = df)
Deviance Residuals:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
0.4816 0.8870 0.3733 0.4042 0.5775 0.7016 0.4976 0.5010 0.2084 0.2152 0.2617 0.2079 0.2944 0.2655 0.4013 0.2842
Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -5.60250    0.10146 -55.219   <2e-16 ***
A1          -0.09102    0.09661  -0.942   0.3461
B1           0.14402    0.09658   1.491   0.1359
C1          -0.23302    0.09782  -2.382   0.0172 *
D1           0.12723    0.09791   1.299   0.1938
A1:B1        0.18278    0.09613   1.901   0.0572 .
A1:C1        0.12331    0.09800   1.258   0.2083
A1:D1       -0.13583    0.09594  -1.416   0.1569
B1:C1       -0.14472    0.09857  -1.468   0.1420
B1:D1        0.13080    0.09692   1.350   0.1771
C1:D1        0.22362    0.09918   2.255   0.0242 *

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 27.8660 on 15 degrees of freedom
Residual deviance: 3.2462 on 5 degrees of freedom
AIC: 85.702
Number of Fisher Scoring iterations: 4
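The final model (C, D and their interaction) is nested in the two-way model, so the two residual deviances quoted in the summaries can be compared directly. A worked version of that arithmetic, using only the printed deviances and degrees of freedom plus the standard chi-squared reference value:

```latex
\Delta D = 16.007 - 3.2462 = 12.761,
\qquad \Delta\mathrm{df} = 12 - 5 = 7,
\qquad \chi^2_{7,\,0.95} \approx 14.07
```

Since 12.761 is below 14.07, the extra terms in the two-way model do not improve the fit significantly at the 5% level by this test, which is in line with the smaller model having the lower AIC (84.463 versus 85.702).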
I would be extremely grateful if you could shed light on my 3 questions:

1. I know that if there is a significant interaction effect, it should be included in the model even when one of the main effects is not significant, as is the case here. But how should I interpret the fact that with C at level 0 and D at level 1 the success rate decreases by 50%, whereas with C and D both at level 1 the success rate increases by 20%? How should I report this to stakeholders?
2. How confident can I be that factor C has a positive effect? What confuses me is that in the model with all interactions, setting factor C to level 1 decreases the predicted success rate by 18%, while in the model with 2-way interactions it increases the success rate by 6%.
3. What are my next steps to reach a clear conclusion? Should I accept factor C as the most successful factor and run a follow-up experiment testing factor D against a control that includes factor C?
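The percentages in the first question can be checked against the raw counts. The following is a minimal base-R sketch that rebuilds the table from the post and pools trials and successes over A and B within each C x D cell. The column name `relative_to_baseline` and the choice of C = 0, D = 0 as the baseline cell are illustrative assumptions, not something the post specifies.

```r
# Data copied from the table in the post (16 runs of the 2^4 design).
df <- data.frame(
  A = rep(0:1, each = 8),
  B = rep(rep(0:1, each = 4), times = 2),
  C = rep(rep(0:1, each = 2), times = 4),
  D = rep(0:1, times = 8),
  Trials   = c(1852, 1878, 1869, 1881, 1926, 1920, 1891, 1841,
               1921, 1827, 1837, 1908, 1827, 1860, 1854, 1922),
  Succeses = c(11, 3, 9, 14, 4, 6, 4, 5, 9, 2, 13, 11, 8, 5, 10, 10)
)

# Pool trials and successes over A and B within each C x D cell.
cells <- aggregate(cbind(Trials, Succeses) ~ C + D, data = df, FUN = sum)
cells$rate <- cells$Succeses / cells$Trials

# Rate of each cell relative to the (assumed) baseline cell C = 0, D = 0.
base <- cells$rate[cells$C == 0 & cells$D == 0]
cells$relative_to_baseline <- cells$rate / base

print(cells)
```

On these counts the C = 0, D = 1 cell comes out at roughly half the baseline rate and the C = 1, D = 1 cell at roughly 1.25 times the baseline. Raw cell rates like these may be easier to present to stakeholders than individual coefficients.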