examples/case_studies/bayesian_sem_workflow.ipynb
Lines changed: 5 additions & 5 deletions
Original file line number
Diff line number
Diff line change
@@ -671,7 +671,7 @@
671
671
"\n",
672
672
"\n",
673
673
"\n",
674
-
"In the model below we sample draws from the latent factors `eta` and relate them to the observables by the matrix computation `pt.dot(eta, Lambda.T)`. This computation results in a \"psuedo-observation\" matrix which we then feed through our likelihood to calibrate the latent structures against the observed dats. The covariances (i.e. red arrows) among the latent factors is determined with `chol`. These are the general patterns we'll see in all models below, but we add complexity as we go."
674
+
"In the model below we sample draws from the latent factors `eta` and relate them to the observables by the matrix computation `pt.dot(eta, Lambda.T)`. This computation results in a \"pseudo-observation\" matrix which we then feed through our likelihood to calibrate the latent structures against the observed data. The covariances (i.e. red arrows) among the latent factors are determined with `chol`. These are the general patterns we'll see in all models below, but we add complexity as we go."
675
675
]
676
676
},
677
677
{
@@ -1757,7 +1757,7 @@
1757
1757
"id": "3b5c7ecb",
1758
1758
"metadata": {},
1759
1759
"source": [
1760
-
"The sampler diagnostics give no indication of trouble. This is a promising start. "
1760
+
"The sampler diagnostics give no indication of trouble. This is a promising start. We now shift to the SEM setting and layer in structural regressions. These relations are usually the focus of an analysis. "
1761
1761
]
1762
1762
},
1763
1763
{
@@ -1783,7 +1783,7 @@
1783
1783
"id": "544e9848",
1784
1784
"metadata": {},
1785
1785
"source": [
1786
-
"The isolation or conditional independence of interest is encoded in the model with the sampling of the `gamma` variable. These are drawn from a process that is structuraly divorced from the influence of the exogenous variables. For instance if we have $\\gamma_{cts} \\perp\\!\\!\\!\\perp \\eta_{dtp}$ then the $\\beta_{cts -> dpt}$ coefficient is an unbiased estimate of the direct effect of `CTS` on `DTP` because the remaining variation in $\\eta_{dtp}$ is noise by construction. \n",
1786
+
"The isolation or conditional independence of interest is encoded in the model with the sampling of the `gamma` variable. These are drawn from a process that is structurally divorced from the influence of the exogenous variables. For instance, if we have $\\gamma_{cts} \\perp\\!\\!\\!\\perp \\eta_{dtp}$ then the $\\beta_{cts -> dtp}$ coefficient is an unbiased estimate of the direct effect of `CTS` on `DTP` because the remaining variation in $\\eta_{dtp}$ is noise by construction. \n",
1787
1787
"\n",
1788
1788
"Each additional arrow in the structural model thus encodes a substantive theoretical claim about causal influence. You are making claims of causal influence. Arrows should be added in line with plausible theory, while parameter identification is well supported. In our case we have structured the DAG following the discussion in {cite:p}`vehkalahti2019multivariate` which will allow us to unpick the direct and indirect effects below. In Lavaan syntax the model we want to specify is: \n",
1789
1789
"\n",
@@ -2824,7 +2824,7 @@
2824
2824
"id": "6de4795b",
2825
2825
"metadata": {},
2826
2826
"source": [
2827
-
"The sampler diagnostics suggest that the model is having trouble samplng from the B matrix. This is a little concerning because the structural relations are the primary parameters of interest in the SEM setting. Anything which undercuts our confidence in their estimation, undermines the whole modelling exercise."
2827
+
"The sampler diagnostics suggest that the model is having trouble sampling from the B matrix. This is a little concerning because the structural relations are the primary parameters of interest in the SEM setting. Anything which undercuts our confidence in their estimation undermines the whole modelling exercise. Because the conditional SEM showed sampler challenges on `mu_betas`, we now try a marginal formulation."
2828
2828
]
2829
2829
},
2830
2830
{
@@ -4687,7 +4687,7 @@
4687
4687
"source": [
4688
4688
"## Hierarchical Model on Structural Components\n",
4689
4689
"\n",
4690
-
"The mean-structure model above offers us an interesting way to implement hierarchical Bayesian SEMs. We can simply add a hierarchical component to the `tau` parameter and aim to infer how the baseline expectation for each indicator variable, shifts across groups. However, a more interesting and complex route is to assess if hierarchical structure holds across the `B` matrix estimates. Or putting it another way - can we determine if the relationships between these latent factors have a different causal structure as we vary the group of interest. \n",
4690
+
"The mean-structure SEM provides a baseline description of how latent factors generate observed indicators. In real applications, however, we often expect these relationships to differ across groups or conditions. A hierarchical extension allows us to model those differences directly—testing whether the structural paths encoded in `B` are invariant across groups. For example, employees differ by team, firm, or treatment condition. The next natural question in a Bayesian workflow is therefore not just “What are the structural relations?” but “How stable are they across contexts?” This represents a key step in the Bayesian workflow: expanding the model’s expressive power while checking how robust its assumptions remain.\n",
examples/case_studies/bayesian_sem_workflow.myst.md
Lines changed: 5 additions & 5 deletions
Original file line number
Diff line number
Diff line change
@@ -485,7 +485,7 @@ In this section, we translate the theoretical structure of a confirmatory factor
485
485
486
486

487
487
488
-
In the model below we sample draws from the latent factors `eta` and relate them to the observables by the matrix computation `pt.dot(eta, Lambda.T)`. This computation results in a "psuedo-observation" matrix which we then feed through our likelihood to calibrate the latent structures against the observed dats. The covariances (i.e. red arrows) among the latent factors is determined with `chol`. These are the general patterns we'll see in all models below, but we add complexity as we go.
488
+
In the model below we sample draws from the latent factors `eta` and relate them to the observables by the matrix computation `pt.dot(eta, Lambda.T)`. This computation results in a "pseudo-observation" matrix which we then feed through our likelihood to calibrate the latent structures against the observed data. The covariances (i.e. red arrows) among the latent factors are determined with `chol`. These are the general patterns we'll see in all models below, but we add complexity as we go.
489
489
490
490
```{code-cell} ipython3
491
491
with pm.Model(coords=coords) as cfa_model_v1:
@@ -656,7 +656,7 @@ Below these model checks we will now plot some diagnostics for the sampler. The
656
656
plot_diagnostics(idata_cfa_model_v1, parameters);
657
657
```
658
658
659
-
The sampler diagnostics give no indication of trouble. This is a promising start.
659
+
The sampler diagnostics give no indication of trouble. This is a promising start. We now shift to the SEM setting and layer in structural regressions. These relations are usually the focus of an analysis.
660
660
661
661
+++
662
662
@@ -674,7 +674,7 @@ This is a claim of conditional independence which licenses the causal interpreta
674
674
675
675
+++
676
676
677
-
The isolation or conditional independence of interest is encoded in the model with the sampling of the `gamma` variable. These are drawn from a process that is structuraly divorced from the influence of the exogenous variables. For instance if we have $\gamma_{cts} \perp\!\!\!\perp \eta_{dtp}$ then the $\beta_{cts -> dpt}$ coefficient is an unbiased estimate of the direct effect of `CTS` on `DTP` because the remaining variation in $\eta_{dtp}$ is noise by construction.
677
+
The isolation or conditional independence of interest is encoded in the model with the sampling of the `gamma` variable. These are drawn from a process that is structurally divorced from the influence of the exogenous variables. For instance, if we have $\gamma_{cts} \perp\!\!\!\perp \eta_{dtp}$ then the $\beta_{cts -> dtp}$ coefficient is an unbiased estimate of the direct effect of `CTS` on `DTP` because the remaining variation in $\eta_{dtp}$ is noise by construction.
678
678
679
679
Each additional arrow in the structural model thus encodes a substantive theoretical claim about causal influence. You are making claims of causal influence. Arrows should be added in line with plausible theory, while parameter identification is well supported. In our case we have structured the DAG following the discussion in {cite:p}`vehkalahti2019multivariate` which will allow us to unpick the direct and indirect effects below. In Lavaan syntax the model we want to specify is:
680
680
@@ -777,7 +777,7 @@ However, the model diagnostics appear less robust. The sampler seemed to have di
777
777
plot_diagnostics(idata_sem_model_v1, parameters);
778
778
```
779
779
780
-
The sampler diagnostics suggest that the model is having trouble samplng from the B matrix. This is a little concerning because the structural relations are the primary parameters of interest in the SEM setting. Anything which undercuts our confidence in their estimation, undermines the whole modelling exercise.
780
+
The sampler diagnostics suggest that the model is having trouble sampling from the B matrix. This is a little concerning because the structural relations are the primary parameters of interest in the SEM setting. Anything which undercuts our confidence in their estimation undermines the whole modelling exercise. Because the conditional SEM showed sampler challenges on `mu_betas`, we now try a marginal formulation.
781
781
782
782
+++
783
783
@@ -1121,7 +1121,7 @@ This kind of sensitivity analysis is one approach to model validation, but we ca
1121
1121
1122
1122
## Hierarchical Model on Structural Components
1123
1123
1124
-
The mean-structure model above offers us an interesting way to implement hierarchical Bayesian SEMs. We can simply add a hierarchical component to the `tau` parameter and aim to infer how the baseline expectation for each indicator variable, shifts across groups. However, a more interesting and complex route is to assess if hierarchical structure holds across the `B` matrix estimates. Or putting it another way - can we determine if the relationships between these latent factors have a different causal structure as we vary the group of interest.
1124
+
The mean-structure SEM provides a baseline description of how latent factors generate observed indicators. In real applications, however, we often expect these relationships to differ across groups or conditions. A hierarchical extension allows us to model those differences directly—testing whether the structural paths encoded in `B` are invariant across groups. For example, employees differ by team, firm, or treatment condition. The next natural question in a Bayesian workflow is therefore not just “What are the structural relations?” but “How stable are they across contexts?” This represents a key step in the Bayesian workflow: expanding the model’s expressive power while checking how robust its assumptions remain.