1 year ago · cd2541bc77
--- a/README.md
+++ b/README.md
@@ -42,33 +42,34 @@ usage: main.py [-h] [--group {corpus,child}] [--chains CHAINS] [--samples SAMPLE
 
				 main model described throughout the notes.
			
 
				 
			
 
				 optional arguments:
			
 
				-  -h, --help            show this help message and exit
			
 
				-  --group {corpus,child}
			
 
				-  --chains CHAINS
			
 
				-  --samples SAMPLES
			
 
				-  --validation VALIDATION
			
 
				-  --output OUTPUT
			
 
				-  ```
			
 
				+-h, --help            show this help message and exit
			
 
				+--group {corpus,child}
			
 
				+--chains CHAINS
			
 
				+--samples SAMPLES
			
 
				+--validation VALIDATION
			
 
				+--output OUTPUT
			
 
				+```
			
 
				+
			
 
				+The ``--group`` parameter controls the primary level of the hierarchical model: the model assumes that confusion rates (i.e. confusion probabilities) vary across either corpora (``corpus``) or children (``child``).
			
 
				 
			
 
				-  The ``--group`` parameter controls the primary level of the hierarchical model. The model indeed assumes that confusion rates (i.e. confusion probabilities) vary across corpora (``corpus``) or children (``child``).
			
 
				+The ``--chains`` parameter sets the number of MCMC chains, and ``--samples`` sets the number of MCMC samples, warmup excluded.
			
 
				 
			
 
				-  The ``--chains`` parameter sets the amount of MCMC chains, and ``--samples`` controls the amount of MCMC samples, warmup excluded.
			
 
				+The ``--validation`` parameter sets the number of annotation clips used for validation rather than training. Set it to 0 to use as much data as possible for training.
			
 
				 
			
 
				-  The ``--validation`` parameter sets the amount of annotation clips used for validation rather than training. Set it to 0 in order to use as much data for training as possible.
			
 
				+The ``--output`` parameter controls the output destination. Training data are saved to ``output/samples/data_{output}.pickle`` and the MCMC samples are saved to ``output/samples/fit_{output}.parquet``.
			
 
				 
			
 
				-  The ``--output`` parameter controls the output destination. Training data will be saved to ``output/samples/data_{output}.pickle`` and the MCMC samples are saved as ``output/samples/fit_{output}.parquet``
			
 
				 
			
 
				-  ### Confusion probabilities
			
 
				+### Confusion probabilities
			
 
				 
			
 
				-  The marginal posterior distribution of the confusion matrix is shown below:
			
 
				+The marginal posterior distribution of the confusion matrix is shown below:
			
 
				 
			
 
				-  ![](output/fit_vanuatu.png)
			
 
				+![](output/fit_vanuatu.png)
			
 
				 
			
 
				 
			
 
				-  ### Speech distribution
			
 
				+### Speech distribution
			
 
				 
			
 
				-  Speech distributions used to generate the simulated "null-hypothesis" corpora are fitted against the training data using Gamma distributions. The code used to fit these distributions is found in ``code/models/speech_distribution``. 
			
 
				+Speech distributions used to generate the simulated "null-hypothesis" corpora are fitted against the training data using Gamma distributions. The code used to fit these distributions is found in ``code/models/speech_distribution``. 
			
 
				 
			
 
				-  The match between the training data and the Gamma parametrization can be observed in various plots in ``output``. See below for the Key child:
			
 
				+The match between the training data and the Gamma parametrization can be observed in various plots in ``output``. See below for the Key child:
			
 
				 
			
 
				-  ![](output/dist_CHI.png)
			
 
				+![](output/dist_CHI.png)
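
The README text in this diff says that speech distributions are fitted with Gamma distributions. As a rough illustration of what a Gamma parametrization involves, the sketch below estimates a shape and a scale by the method of moments on synthetic data. This is not the code in ``code/models/speech_distribution``, whose fitting method is not shown here. The function name ``fit_gamma_moments`` and the synthetic values are assumptions made for the example.

```python
import random
import statistics


def fit_gamma_moments(values):
    """Hypothetical helper: method-of-moments Gamma fit.

    For a Gamma distribution, mean = shape * scale and
    variance = shape * scale**2, so both parameters can be
    recovered from the sample mean and variance.
    """
    mean = statistics.fmean(values)
    var = statistics.variance(values)  # unbiased sample variance
    if mean <= 0 or var <= 0:
        raise ValueError("Gamma fit needs positive mean and variance")
    shape = mean ** 2 / var
    scale = var / mean
    return shape, scale


# Synthetic stand-in for per-recording speech quantities (assumed values,
# not taken from the repository's training data).
rng = random.Random(0)
synthetic = [rng.gammavariate(2.0, 3.0) for _ in range(20000)]

shape, scale = fit_gamma_moments(synthetic)
print(f"estimated shape={shape:.3f}, scale={scale:.3f}")
```

With enough samples the estimates should land close to the parameters used to generate the data (shape 2.0, scale 3.0 here). Comparing a histogram of the data with the fitted density is the kind of check that plots such as ``output/dist_CHI.png`` provide.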