Gaussian models are the motorway network of statistics!
Binary data ($Z$) can be modelled by Gaussians, using the Probit model:
$$Z = \begin{cases} 0 & \text{if } Y \le 0 \\ 1 & \text{otherwise} \end{cases}$$
where $Y \sim N(\alpha + \beta x, \sigma^2)$
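A minimal simulation sketch of the Probit construction, assuming illustrative parameter values and a hypothetical function name `simulate_probit` (neither comes from the slides):

```python
import random

def simulate_probit(xs, alpha, beta, sigma, seed=0):
    """Draw latent Y ~ N(alpha + beta*x, sigma^2) and threshold it at zero."""
    rng = random.Random(seed)
    ys = [rng.gauss(alpha + beta * x, sigma) for x in xs]
    zs = [0 if y <= 0 else 1 for y in ys]   # Z = 0 if Y <= 0, 1 otherwise
    return ys, zs

# Illustrative values only: x on a grid from 0 to 10.
xs = [0.5 * i for i in range(21)]
ys, zs = simulate_probit(xs, alpha=-2.0, beta=0.5, sigma=1.0)
```

Only `zs` would be observed in practice; `ys` is the latent Gaussian variable.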
[Figure: two panels, latent $Y$ against $x$ and binary $Z$ against $x$, for $x$ from 0 to 10]
So can non-negative data ($Z$), using the Tobit (or Latent Gaussian) model:
$$Z = \begin{cases} 0 & \text{if } Y \le 0 \\ f(Y) & \text{otherwise} \end{cases}$$
where $Y \sim N(\alpha + \beta x, \sigma^2)$
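A small sketch of the Tobit construction. The default `f` (the identity), the parameter values and the function names are assumptions for illustration; the slides leave $f$ general at this point.

```python
import math
import random

def normal_cdf(u):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))

def prob_zero(x, alpha, beta, sigma):
    """P(Z = 0) = P(Y <= 0) = Phi(-(alpha + beta*x) / sigma)."""
    return normal_cdf(-(alpha + beta * x) / sigma)

def simulate_tobit(xs, alpha, beta, sigma, f=lambda y: y, seed=0):
    """Draw latent Y, then set Z = 0 if Y <= 0 and Z = f(Y) otherwise."""
    rng = random.Random(seed)
    ys = [rng.gauss(alpha + beta * x, sigma) for x in xs]
    zs = [0.0 if y <= 0 else f(y) for y in ys]
    return ys, zs

xs = [0.5 * i for i in range(21)]
ys, zs = simulate_tobit(xs, alpha=-2.0, beta=0.5, sigma=1.0)
```

The point mass at zero is handled by the censoring, so the model needs no separate "zero-inflation" component.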
[Figure: two panels, latent $Y$ against $x$ and non-negative $Z$ against $x$, for $x$ from 0 to 10]
James Tobin (Econometrica, 1958)
PLAN
0. Introduction
1. Univariate data – crop lodging
2. Multivariate data – food intake
3. Spatio-temporal data – rainfall
4. Compositional data – food composition
5. Summary
1. UNIVARIATE DATA – CROP LODGING
[Table: crop lodging ($Z$) for varieties 1 to 32 in trials 1 to 4; the values shown include 0, 0, 0, 0.3, 0, 66.7, 0, 0, 0]
Diagnostic plots using standardised residuals:
$$\hat{e}_{ij} = \frac{Z_{ij} - \hat{v}_i - \hat{t}_j}{\hat{\sigma}}$$
• censored scatter plot
• Kaplan-Meier estimator & $\Phi$
2. MULTIVARIATE DATA – FOOD INTAKE
[Figure: scatter plot of brown bread (g) against white bread (g), both axes from 0 to 3000]
UK Data Archive (Essex University): weekly intakes of 51 food types by 2200 adults.
Model intake of food $j$ by adult $i$ by:
$$Z_{ij} = \begin{cases} 0 & \text{if } Y_{ij} \le 0 \\ f_j(Y_{ij}) & \text{otherwise} \end{cases}$$
where $Y_{ij} \sim N(\mu_j, 1)$
and $f_j^{-1}$ is a quadratic power transformation through the origin:
$$Y = f_j^{-1}(Z) = \alpha_1 Z^{\gamma} + \alpha_2 Z^{2\gamma}$$
Model fitting step 1: Estimate $\mu_j$, $\alpha$ and $\gamma$ by regressing non-zero $Z$'s on normal scores
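A sketch of the quadratic power transformation and of normal scores for censored data. It assumes $\alpha_1 > 0$ and $\alpha_2 \ge 0$ so that the map is increasing and invertible, and it uses the plotting position $(\text{rank} - 0.5)/n$; both are assumptions, as are the function names. The regression that estimates $\mu_j$, $\alpha$ and $\gamma$ is not shown.

```python
from statistics import NormalDist

def to_latent(z, a1, a2, gamma):
    """f^{-1}: Y = a1 * z^gamma + a2 * z^(2*gamma), for z > 0."""
    w = z ** gamma
    return a1 * w + a2 * w * w

def from_latent(y, a1, a2, gamma):
    """f: solve a2*w^2 + a1*w - y = 0 for w = z^gamma (assumes a1 > 0, a2 >= 0)."""
    if y <= 0:
        return 0.0                      # censored: zero intake
    if a2 == 0:
        w = y / a1
    else:
        w = (-a1 + (a1 * a1 + 4.0 * a2 * y) ** 0.5) / (2.0 * a2)
    return w ** (1.0 / gamma)

def normal_scores(zs):
    """Normal scores of the non-zero values; zeros occupy the lowest ranks."""
    n = len(zs)
    order = sorted(range(n), key=lambda i: zs[i])
    scores = [None] * n                 # None marks a censored (zero) value
    for rank, i in enumerate(order, start=1):
        if zs[i] > 0:
            scores[i] = NormalDist().inv_cdf((rank - 0.5) / n)
    return scores

y = to_latent(250.0, a1=0.1, a2=0.002, gamma=0.5)
z_back = from_latent(y, a1=0.1, a2=0.002, gamma=0.5)
scores = normal_scores([0.0, 0.0, 5.0, 10.0])
```

Because the zeros take the lowest ranks, the scores of the non-zero values already reflect the proportion of zeros, which is what ties $\mu_j$ to the censoring point.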
For example, for intake of white bread:
[Figure: two panels, untransformed ($Y$) and transformed ($Z = f(Y)$)]
Further assume $Y_{i\cdot} \sim \mathrm{MVN}(\mu, V)$
($V_{ii} \equiv 1$, so $V$ is also the correlation matrix)
Model fitting step 2:
Estimate $V_{jk}$ by maximising the pairwise likelihood:
$$\prod_i p(Z_{ij}, Z_{ik})$$
where
$$p(Z_{ij}, Z_{ik}) = \begin{cases}
\Phi_2(-\mu_j, -\mu_k; V_{jk}) & \text{if } Z_{ij} = 0,\ Z_{ik} = 0 \\[4pt]
\phi(Y_{ij} - \mu_j)\, \Phi\!\left( \dfrac{-\mu_k - V_{jk}(Y_{ij} - \mu_j)}{\sqrt{1 - V_{jk}^2}} \right) & \text{if } Z_{ij} > 0,\ Z_{ik} = 0 \\[8pt]
\phi(Y_{ik} - \mu_k)\, \Phi\!\left( \dfrac{-\mu_j - V_{jk}(Y_{ik} - \mu_k)}{\sqrt{1 - V_{jk}^2}} \right) & \text{if } Z_{ij} = 0,\ Z_{ik} > 0 \\[8pt]
\phi_2(Y_{ij} - \mu_j, Y_{ik} - \mu_k; V_{jk}) & \text{otherwise}
\end{cases}$$
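The four cases of $p(Z_{ij}, Z_{ik})$ can be sketched as follows. This is an illustration rather than the authors' implementation: the function names are hypothetical, a censored value is encoded as `None` (an assumption), and $\Phi_2$ is evaluated by Simpson's rule on $\int_{-\infty}^{a} \phi(u)\,\Phi\big((b - \rho u)/\sqrt{1-\rho^2}\big)\,du$ instead of a library routine.

```python
import math

def phi(u):
    """Standard normal density."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def Phi(u):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))

def phi2(u, v, rho):
    """Standard bivariate normal density with correlation rho."""
    q = (u * u - 2.0 * rho * u * v + v * v) / (1.0 - rho * rho)
    return math.exp(-0.5 * q) / (2.0 * math.pi * math.sqrt(1.0 - rho * rho))

def Phi2(a, b, rho, n=2000, lower=-8.0):
    """P(U <= a, V <= b) by Simpson's rule (n even), truncating the tail at `lower`."""
    if a <= lower:
        return 0.0
    s = math.sqrt(1.0 - rho * rho)
    h = (a - lower) / n
    total = 0.0
    for i in range(n + 1):
        u = lower + i * h
        w = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += w * phi(u) * Phi((b - rho * u) / s)
    return total * h / 3.0

def pair_density(y_j, y_k, mu_j, mu_k, v_jk):
    """p(Z_ij, Z_ik) on the latent scale; y = None means the intake was zero."""
    s = math.sqrt(1.0 - v_jk * v_jk)
    if y_j is None and y_k is None:
        return Phi2(-mu_j, -mu_k, v_jk)
    if y_k is None:
        return phi(y_j - mu_j) * Phi((-mu_k - v_jk * (y_j - mu_j)) / s)
    if y_j is None:
        return phi(y_k - mu_k) * Phi((-mu_j - v_jk * (y_k - mu_k)) / s)
    return phi2(y_j - mu_j, y_k - mu_k, v_jk)

def pairwise_loglik(pairs, mu_j, mu_k, v_jk):
    """Log of the product over adults i of p(Z_ij, Z_ik)."""
    return sum(math.log(pair_density(yj, yk, mu_j, mu_k, v_jk)) for yj, yk in pairs)

# Made-up latent values for five adults, for illustration only.
pairs = [(1.0, 1.2), (0.5, 0.4), (None, None), (2.0, 1.5), (None, 0.1)]
grid = [r / 10.0 for r in range(-9, 10)]
best_v = max(grid, key=lambda v: pairwise_loglik(pairs, 0.5, 0.5, v))
```

A grid search is used here only to keep the sketch short; any one-dimensional optimiser over $(-1, 1)$ would serve for each pair $(j, k)$.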
[Figure: estimated correlation matrix $\hat{V}$, foods re-ordered]
We prefer to have fewer than $N(N-1)/2 = 1275$ parameters in $V$
In Factor Analysis:
$$V = BB^T + \Sigma = \sum_{l=1}^{L} \beta_l \beta_l^T + \mathrm{diag}(\sigma_1^2, \ldots, \sigma_N^2)$$
Equivalently
$$Y_{ij} = \mu_j + \sum_{l=1}^{L} B_{jl} f_{il} + e_{ij}$$
where $f_{i1}, f_{i2}, \ldots, f_{iL} \sim N(0, 1)$ are latent variables
and $e_{ij} \sim N(0, \sigma_j^2)$
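A short sketch of the covariance implied by the factor model, with a raw parameter count. The loadings below are made up, the function names are hypothetical, and the count ignores the rotational indeterminacy of $B$ and the unit-diagonal constraint on $V$.

```python
def implied_covariance(B, sigma2):
    """V = B B^T + diag(sigma2) for an N x L loading matrix B given as a list of rows."""
    n = len(B)
    V = [[sum(bj * bk for bj, bk in zip(B[j], B[k])) for k in range(n)]
         for j in range(n)]
    for j in range(n):
        V[j][j] += sigma2[j]
    return V

def n_free_correlations(n):
    """Number of distinct off-diagonal entries of an unstructured V."""
    return n * (n - 1) // 2

def n_factor_parameters(n, n_factors):
    """Raw count: N*L loadings plus N specific variances."""
    return n * n_factors + n

# One factor, two foods; sigma2 chosen so that the diagonal of V equals 1.
V = implied_covariance([[0.6], [0.8]], [0.64, 0.36])
counts = (n_free_correlations(51), n_factor_parameters(51, 2))
```

With one factor the off-diagonal entry is just the product of the two loadings, which is the sense in which a few columns of $B$ stand in for the full correlation matrix.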
Model fitting step 3:
Estimate $B$ and $\Sigma$ using the maximum likelihood algorithm due to Jöreskog (1967), modified by using $\hat{V}$ in place of the sample covariance matrix
To maximise:
$$\mathcal{L} = -\log|BB^T + \Sigma| - \mathrm{trace}\big[(BB^T + \Sigma)^{-1} \hat{V}\big]$$
1. Obtain initial estimate of $\Sigma$: $\hat{\sigma}_j^2 = 1 - \max_{k \ne j} |\hat{V}_{jk}|$
2. $\hat{B} = \hat{\Sigma}^{1/2}\, \Omega\, (\Theta - I)^{1/2}$, where $\Theta$ is the $L \times L$ diagonal matrix of the largest eigenvalues of $\hat{\Sigma}^{-1/2} \hat{V} \hat{\Sigma}^{-1/2}$ and $\Omega$ is the $N \times L$ matrix of corresponding eigenvectors
3. Numerically maximise $\mathcal{L}$ with respect to $\Sigma$
4. Repeat steps 2 and 3 until convergence
[Figure: $\hat{V}$ compared with fitted $V$ for $L = 1$, $L = 2$, $L = 3$ and $L = 4$]
Factor loadings $\hat{B}$ ($L = 2$)
3. SPATIO-TEMPORAL DATA – RAINFALL
We have 12 hourly arrays (1200km × 600km) of a storm in Arkansas, USA
Here are hours 3-5:
We will build a model using fine-resolution data
Then use it to disaggregate data at a coarser scale and see how well we recover the fine scale
Similar to the multivariate model:
Step 1: We transform rainfall to a censored Gaussian variable ($Y$) via a quadratic power transformation
Step 2: We estimate autocorrelations ($V$) at a range of spatial and temporal lags by maximising pairwise likelihoods
$\hat{V}$ (row and column labels are spatial lags 0 to 4 in the two directions)

Time lag 0
lag   0     1     2     3     4
4                             .49
3                       .57   .52
2                 .68   .62   .56
1           .83   .73   .65   .58
0     1.    .89   .75   .66   .59

Time lag 1 hour
lag   0     1     2     3     4
4                             .44
3                       .50   .47
2                 .57   .53   .49
1           .63   .59   .55   .51
0     .68   .65   .60   .55   .51
To model $V$ we use a spatio-temporal Gaussian Markov Random Field (GMRF), because rainfall disaggregation requires simulation from conditional distributions
Therefore
$$p(Y) \propto \frac{1}{|V|^{1/2}} \exp\left\{ -\frac{1}{2} (Y - \mu)^T V^{-1} (Y - \mu) \right\}$$
where $V^{-1}$ is the precision matrix, with non-zero entries specifying the conditional dependencies between elements in $Y$
For example, a $3 \times 3 \times 3$ neighbourhood:
[Figure: neighbourhood shown at times $t-1$, $t$ and $t+1$]
requires 5 parameters, if we allow for symmetries
Extending Rue and Tjelmeland (2002), we approximate both space and time by a torus. Therefore, all matrices are Toeplitz block circulant (TBC), and
• the first row summarises a matrix
• we can compute $V$ from $V^{-1}$ via two 3-D Fourier transforms:
$$V^{*}_{ijt} = \sum_{k=0}^{N_i-1} \sum_{l=0}^{N_j-1} \sum_{s=0}^{N_t-1} V^{-1}_{000,kls} \exp\left\{ -2\pi\iota \left( \frac{ik}{N_i} + \frac{jl}{N_j} + \frac{ts}{N_t} \right) \right\}$$
then
$$V_{000,kls} = \frac{1}{N_i N_j N_t} \sum_{i=0}^{N_i-1} \sum_{j=0}^{N_j-1} \sum_{t=0}^{N_t-1} \frac{1}{V^{*}_{ijt}} \exp\left\{ 2\pi\iota \left( \frac{ik}{N_i} + \frac{jl}{N_j} + \frac{ts}{N_t} \right) \right\}$$
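A one-dimensional analogue of the two-transform inversion, as a sketch. The slides use 3-D transforms on a space-time torus; the sketch keeps a single circular dimension, where the same steps apply. The precision row is made up and the function names are hypothetical.

```python
import cmath

def dft(row, sign):
    """Plain discrete Fourier transform with exponent sign * 2*pi*i*j*k/n."""
    n = len(row)
    return [sum(row[k] * cmath.exp(sign * 2j * cmath.pi * i * k / n) for k in range(n))
            for i in range(n)]

def covariance_first_row(precision_row):
    """First row of V from the first row of a symmetric circulant V^{-1}."""
    n = len(precision_row)
    eig = dft(precision_row, -1)                 # forward transform: eigenvalues of V^{-1}
    back = dft([1.0 / e for e in eig], +1)       # inverse transform of the reciprocals
    return [(v / n).real for v in back]

def circulant(row):
    """Full circulant matrix whose first row is `row`."""
    n = len(row)
    return [[row[(c - r) % n] for c in range(n)] for r in range(n)]

# Illustrative precision on a ring of 6 sites: each site depends on its two neighbours.
q = [2.5, -1.0, 0.0, 0.0, 0.0, -1.0]
v = covariance_first_row(q)
Q, V = circulant(q), circulant(v)
product = [[sum(Q[r][k] * V[k][c] for k in range(len(q))) for c in range(len(q))]
           for r in range(len(q))]               # should be close to the identity
```

The full matrix is never inverted: only the first row is transformed, which is what makes the torus approximation attractive for large grids.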
Model fitting step 3:
We estimate GMRF parameters by minimising
$$\sum_i \sum_j \sum_t \frac{1}{i^2 + j^2 + t^2} \left( \hat{V}_{ijt} - V_{ijt} \right)^2$$
For neighbourhood size $5 \times 5 \times 3$:
[Figure: fitted autocorrelations at time lag 0 and time lag 1 hour]
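The weighted least-squares criterion can be sketched as below. Skipping the lag $(0, 0, 0)$, where the weight is undefined, is an assumption, as are the dictionary layout, the function name and the numbers.

```python
def weighted_sse(v_hat, v_model):
    """Sum over lags (i, j, t) of (v_hat - v_model)^2 / (i^2 + j^2 + t^2)."""
    total = 0.0
    for (i, j, t), target in v_hat.items():
        d2 = i * i + j * j + t * t
        if d2 == 0:
            continue                    # assumption: the zero lag is left out
        total += (target - v_model[(i, j, t)]) ** 2 / d2
    return total

# Made-up autocorrelations keyed by (spatial lag i, spatial lag j, time lag t).
v_hat = {(0, 0, 0): 1.0, (1, 0, 0): 0.9, (2, 0, 0): 0.7, (0, 0, 1): 0.7}
v_model = {(0, 0, 0): 1.0, (1, 0, 0): 0.8, (2, 0, 0): 0.5, (0, 0, 1): 0.7}
criterion = weighted_sse(v_hat, v_model)
```

The weights decay with squared lag, so the fit is driven mainly by the short-range autocorrelations.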
Model diagnostics: Bivariate histogram of pairs of wet locations at a spatial separation of 8km
[Figure: observed and expected bivariate histograms, axes from 0 to 50mm]
Model diagnostics: Histogram of rainfall for locations for which the adjacent location was dry
— observed, - - - expected
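Disaggregation
Gibbs sampling to update blocks of $5 \times 5$ pixels ($Y_A$)
Conditional distribution is multivariate normal, obtained from
$$\begin{pmatrix} Y_A \\ Y_B \end{pmatrix} \sim \mathrm{MVN}\left( \begin{pmatrix} \mu_A \\ \mu_B \end{pmatrix}, \begin{pmatrix} V_{AA} & V_{AB} \\ V_{BA} & V_{BB} \end{pmatrix} \right)$$
where dimension of neighbourhood $Y_B$ is $(3 \times 9^2 - 5^2) = 218$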
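A sketch of the conditional draw used inside one Gibbs update, based on the standard result $Y_A \mid Y_B \sim \mathrm{MVN}\big(\mu_A + V_{AB} V_{BB}^{-1}(Y_B - \mu_B),\ V_{AA} - V_{AB} V_{BB}^{-1} V_{BA}\big)$. The helper names and the tiny example are hypothetical, and the rejection step is not shown.

```python
import random

def solve(M, R):
    """Solve M X = R by Gauss-Jordan elimination with partial pivoting."""
    n = len(M)
    A = [list(M[r]) + list(R[r]) for r in range(n)]      # augmented matrix
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(n):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [[A[r][n + k] / A[r][r] for k in range(len(R[0]))] for r in range(n)]

def conditional(mu_a, mu_b, V_aa, V_ab, V_bb, y_b):
    """Mean and covariance of Y_A given Y_B = y_b."""
    V_ba = [list(col) for col in zip(*V_ab)]
    X = solve(V_bb, V_ba)                                # X = V_BB^{-1} V_BA
    resid = [y - m for y, m in zip(y_b, mu_b)]
    na, nb = len(mu_a), len(mu_b)
    mean = [mu_a[a] + sum(X[b][a] * resid[b] for b in range(nb)) for a in range(na)]
    cov = [[V_aa[a][c] - sum(V_ab[a][b] * X[b][c] for b in range(nb))
            for c in range(na)] for a in range(na)]
    return mean, cov

def cholesky(C):
    """Lower-triangular L with L L^T = C."""
    n = len(C)
    L = [[0.0] * n for _ in range(n)]
    for r in range(n):
        for c in range(r + 1):
            s = C[r][c] - sum(L[r][k] * L[c][k] for k in range(c))
            L[r][c] = s ** 0.5 if r == c else s / L[c][c]
    return L

def draw(mean, cov, rng):
    """One draw from MVN(mean, cov)."""
    L = cholesky(cov)
    e = [rng.gauss(0.0, 1.0) for _ in mean]
    return [m + sum(L[r][k] * e[k] for k in range(r + 1)) for r, m in enumerate(mean)]

# Tiny illustration: two pixels in A, one neighbour in B.
mean, cov = conditional([0.0, 0.0], [0.0], [[1.0, 0.3], [0.3, 1.0]],
                        [[0.5], [0.5]], [[1.0]], [2.0])
sample = draw(mean, cov, random.Random(0))
neighbourhood_size = 3 * 9 ** 2 - 5 ** 2                 # dimension of Y_B
```

In the setting of the slides $Y_A$ has 25 elements and $Y_B$ has 218, so the linear solve is small enough to repeat at every block update.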
Use rejection sampling to constrain $Y_A$ such that rainfall ...