We expect that *size* and *covered* are highly correlated for unordered sets, and that this correlation is weaker for ordered sets. We hope to incorporate *entropy* in the model for ordered sets to account for some of the extra noise.

- Ordered: 0.38
- Unordered: 0.40

**Size and coverage slightly more correlated in unordered**
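As a rough illustration of how the per-group correlations reported in this section could be computed, here is a minimal sketch using the Pearson correlation. The `records` list is made-up toy data, not the simulation output, and the helper names `pearson` and `correlation_by_group` are hypothetical. Because *covered* is a 0/1 indicator, this is the point-biserial special case of the Pearson correlation.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Hypothetical toy records (is_ordered, size, covered); NOT the real data.
records = [
    (True, 0.91, 1), (True, 0.95, 1), (True, 0.99, 0), (True, 0.92, 0),
    (False, 0.90, 0), (False, 0.96, 1), (False, 0.99, 1), (False, 0.93, 0),
]


def correlation_by_group(records):
    """Correlation between size and covered, separately for each set type."""
    result = {}
    for flag, label in ((True, "ordered"), (False, "unordered")):
        size = [s for is_ordered, s, _ in records if is_ordered == flag]
        covered = [c for is_ordered, _, c in records if is_ordered == flag]
        result[label] = pearson(size, covered)
    return result


result = correlation_by_group(records)
print(result)
```

The same function can be applied to any other pair of columns (for example *OR* and *entropy*) by swapping the fields that are extracted from each record.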

We expect that *OR* and *entropy* are highly correlated. These variables are the same in the ordered and non-ordered datasets, as they reflect information on the system, and the same systems were used to form ordered and non-ordered credible sets.

- Ordered / Unordered: 0.545

**OR and entropy are highly correlated**

We expect that *entropy* and *covered* are more correlated in ordered than non-ordered sets. We hope to include *entropy* as a predictor for coverage in ordered sets.

- Ordered: 0.202
- Unordered: 0.202

Since the correlation is low, perhaps entropy will not be a significant predictor of coverage in the following logistic regression section.

We see there is much higher correlation between *OR* and *covered* in ordered sets. We hope that by incorporating *entropy* as a predictor for coverage in ordered sets, we do not need to incorporate information on the *OR*, as this is not known to experimenters.

- Ordered: 0.447
- Unordered: 0.217

**OR and covered show high correlation in ordered sets**

We see that *nvar* and *entropy* have a stronger negative correlation in ordered sets.

- Ordered: -0.212
- Unordered: 0.023

**nvar and entropy show higher negative correlation in ordered sets**

Similarly, *nvar* and *OR* have a stronger (negative) correlation in ordered sets.

- Ordered: -0.246
- Unordered: -0.064

**nvar and OR show higher negative correlation in ordered sets**

We see that *nsnps* and *nvar* are much more correlated in unordered sets. This intuitively makes sense.

- Ordered: 0.52
- Unordered: 0.875

**nsnps and nvar highly correlated in unordered sets**

We see that *thr* and *nvar* are more correlated in ordered sets: as the threshold increases, so does *nvar*. I would have expected this correlation to be higher in unordered sets. If there is a SNP with very high posterior probability, it will be included in the set sooner by the ordered method than by the non-ordered method, which should make the set smaller? For non-ordered sets, more SNPs have to be added to the set before 'finding' this high-PP SNP.

- Ordered: 0.452
- Unordered: 0.189

**thr and nvar more correlated in ordered sets**
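To make the reasoning about *thr* and *nvar* concrete, here is a small sketch of the two ways of building a credible set. It assumes that the ordered method adds SNPs in decreasing order of posterior probability until the cumulative sum reaches *thr*, and that the non-ordered method adds them in their given order. The function name `credible_set` and the posterior probabilities are hypothetical, and treating the cumulative posterior probability as the set's *size* is also an assumption.

```python
def credible_set(pp, thr, ordered=True):
    """Add SNPs until the cumulative posterior probability reaches thr.

    Returns (indices of the SNPs in the set, cumulative posterior probability).
    Assumption: ordered=True sorts SNPs by decreasing posterior probability,
    ordered=False keeps them in the order they were given.
    """
    if not 0 < thr <= 1:
        raise ValueError("thr must be in (0, 1]")
    indices = list(range(len(pp)))
    if ordered:
        indices.sort(key=lambda i: pp[i], reverse=True)
    chosen, total = [], 0.0
    for i in indices:
        chosen.append(i)
        total += pp[i]
        if total >= thr:
            break
    return chosen, total


# Hypothetical posterior probabilities with one dominant SNP (index 3).
pp = [0.02, 0.03, 0.04, 0.86, 0.05]

ordered_set, ordered_size = credible_set(pp, thr=0.9, ordered=True)
unordered_set, unordered_size = credible_set(pp, thr=0.9, ordered=False)

print("ordered:  nvar =", len(ordered_set), "SNPs", ordered_set)
print("unordered: nvar =", len(unordered_set), "SNPs", unordered_set)
```

In this toy example the dominant SNP enters the ordered set first, so the ordered set needs fewer variants than the non-ordered set to reach the same threshold.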

The next section will analyse the following claims:

Claim 1: \[\log\left(\frac{p}{1-p}\right)\sim \log\left(\frac{\text{size}}{1-\text{size}}\right)\] works well for non-ordered sets, but less well for ordered sets.

Claim 2: We can improve the accuracy of the above model in ordered sets by incorporating entropy as a predictor.

Claim 3: Adding OR to the \(\log\left(\frac{p}{1-p}\right)\sim \log\left(\frac{\text{size}}{1-\text{size}}\right)+\text{entropy}\) model does not improve it much. We hope that entropy has already absorbed the information carried by OR.

Claim 4: Entropy has a non-linear effect on coverage. Use the `rcs` function to analyse its non-linear effect.
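To show what the predictors in these claims look like numerically, here is a minimal sketch that builds one design row for a model of the form logit(p) ~ logit(size) + spline(entropy). The spline is the standard restricted cubic spline (truncated power) basis, which is linear beyond the outer knots. The knot locations, the input values and the helper names `logit`, `rcs_basis` and `design_row` are hypothetical, and the scaling by the squared knot range is a convention chosen for this sketch, so the numbers need not match the output of `rcs`.

```python
from math import log


def logit(p):
    """Log-odds transform used on both sides of the model in Claim 1."""
    if not 0 < p < 1:
        raise ValueError("logit is only defined on (0, 1)")
    return log(p / (1 - p))


def rcs_basis(x, knots):
    """Restricted cubic spline basis at x: [x, k - 2 non-linear terms]."""
    t = sorted(knots)
    k = len(t)
    if k < 3 or len(set(t)) != k:
        raise ValueError("need at least 3 distinct knots")

    def cube(u):
        # Truncated cubic: (u)_+^3
        return max(u, 0.0) ** 3

    scale = (t[-1] - t[0]) ** 2   # scaling convention assumed for this sketch
    gap = t[-1] - t[-2]
    basis = [x]
    for j in range(k - 2):
        term = (cube(x - t[j])
                - cube(x - t[-2]) * (t[-1] - t[j]) / gap
                + cube(x - t[-1]) * (t[-2] - t[j]) / gap)
        basis.append(term / scale)
    return basis


def design_row(size, entropy, knots):
    """Row of predictors: intercept, logit(size), then spline terms of entropy."""
    return [1.0, logit(size)] + rcs_basis(entropy, knots)


knots = [0.5, 1.0, 2.0, 3.0]          # hypothetical knot locations
row = design_row(size=0.95, entropy=1.4, knots=knots)
print(row)
```

With four knots the spline contributes three columns (one linear, two non-linear), so testing whether the non-linear columns improve the fit is one way to examine Claim 4.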