[DRAFT/RFC] Gaussian Copula Cholesky LPDF #3206
@spinkney would you have a look at the general signature/idea here? I'm not 100% sure that's the way to go, but I can't think of a better alternative |
Force-pushed from 31116c0 to 8b68066.
@andrjohns thanks for getting to this! In the signature
real gaussian_copula_cholesky_lpdf(vector | tuple-of-tuples, cholesky_factor_corr)
Does this imply that it is looped over for each observation, that is for each
Yeah, I see that there are a few different options with different pros and cons. I like the current implementation, but I think it would be nice to have additional signatures. Here are a few questions and options:
To make this a bit more concrete:
data {
  int<lower=0> N;
  int<lower=1> K; // K different marginals
  array[K] int<lower=0> P; // number of parameters for each k marginal
  array[N] vector[K] y; // continuous outcome and an array of vector
}
transformed data {
  int P_total = to_int(sum(to_vector(P)));
}
parameters {
  vector[P_total] theta_free;
  cholesky_factor_corr[K] L;
}
model {
  tuple( tuple(real, ...), ..., tuple(real, ...) ) marginal_lcdf_tuples =
    tuple(
      tuple(k1_lcdf, theta_free[1], theta_free[2], ...),
      tuple(k2_lcdf, theta_free[p], theta_free[p + 1], ...),
      ....,
      tuple(K_lcdf, ..., theta_free[P_total])
    );
  // Can the above also be any container for the parameters?
  // such as
  //
  // tuple( tuple(real, ...), ..., tuple(real, ...) ) marginal_tuples =
  //   tuple(
  //     tuple(k1_lcdf, vector),
  //     tuple(k2_lcdf, array[] real),
  //     ....,
  //     tuple(K_lcdf, ..., matrix)
  //   );
  //
  // Why not also have another input for the lpdfs?
  //
  // tuple( tuple(real, ...), ..., tuple(real, ...) ) marginal_lpdf_tuples =
  //   tuple(
  //     tuple(k1_lpdf, theta_free[1], theta_free[2], ...),
  //     tuple(k2_lpdf, theta_free[p], theta_free[p + 1], ...),
  //     ....,
  //     tuple(K_lpdf, ..., theta_free[P_total])
  //   );
  //
  y ~ gaussian_copula_cholesky(marginal_lcdf_tuples | L);
  // or optionally
  // y ~
  //   gaussian_copula_cholesky(
  //     marginal_lcdf_tuples | marginal_lpdf_tuples, L
  //   );
  //
} |
|
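On the question of looping over observations: if an array of vectors were accepted and the N observations were treated as independent given the parameters (both are assumptions here, since the thread has not settled on this), the log density would presumably be the sum of the per-observation copula terms. In the notation below, $\theta_k$ stands for the arguments of the k-th marginal and $\Phi^{-1}$ is the standard normal quantile function; this shows only the copula term, without the marginal densities.

```latex
z_{n,k} = \Phi^{-1}\!\left(\exp\!\left(\mathrm{lcdf}_k(y_{n,k} \mid \theta_k)\right)\right),
\qquad
\log c(y \mid L) = \sum_{n=1}^{N}\left[-\sum_{k=1}^{K}\log L_{kk}
  - \tfrac{1}{2}\left(\lVert L^{-1} z_n\rVert^{2} - \lVert z_n\rVert^{2}\right)\right]
```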
Just typing that out was tiresome and made me think of another way:
// lcdf functors
tuple( real, real, ... ) marginal_lcdf_tuples =
  tuple( k1_lcdf, k2_lcdf, ..., K_lcdf );
// lpdf functors, same K number of objects in the tuple as marginal_lcdf_tuples
tuple( real, real, ... ) marginal_lpdf_tuples =
  tuple( k1_lpdf, k2_lpdf, ..., K_lpdf );
// parameters
// I'm assuming that the containers in this tuple would match the
// allowable parameters in their corresponding lcdf/lpdf functor signatures
tuple( params_k1, ..., params_K) marginal_params_tuple =
  tuple( k1_params, ..., K_params );
y ~ gaussian_copula_cholesky(
  marginal_lcdf_tuples |
  marginal_params_tuple,
  marginal_lpdf_tuples, // optional; if omitted, it is assumed you are doing this somewhere else
  L); |
|
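To make the flatter layout above easier to reason about, here is a minimal Python sketch (not Stan, and not the C++ in this PR). It assumes K = 2 so the copula term has a closed form in the correlation `rho = L[1][0]`. The function `bivariate_gaussian_copula_lpdf` and the exponential/logistic helpers are hypothetical names chosen for illustration. Passing `lpdfs` adds the marginal log densities; omitting it returns only the copula term, on the assumption that the marginals are handled elsewhere.

```python
import math
from statistics import NormalDist

# Hypothetical marginal functors, written out so the sketch is self-contained.
def exponential_lcdf(y, rate):
    return math.log(-math.expm1(-rate * y))

def exponential_lpdf(y, rate):
    return math.log(rate) - rate * y

def logistic_lcdf(y, mu, s):
    return -math.log1p(math.exp(-(y - mu) / s))

def logistic_lpdf(y, mu, s):
    t = (y - mu) / s
    return -t - math.log(s) - 2.0 * math.log1p(math.exp(-t))

def bivariate_gaussian_copula_lpdf(y, lcdfs, params, L, lpdfs=None):
    """Copula term for K = 2, plus marginal lpdfs when they are supplied."""
    std_normal = NormalDist()
    # Map each outcome to the unit scale with its lcdf, then to a normal score.
    z1, z2 = (std_normal.inv_cdf(math.exp(f(y_k, *p)))
              for y_k, f, p in zip(y, lcdfs, params))
    rho = L[1][0]  # for a 2x2 Cholesky factor of a correlation matrix
    one_minus = 1.0 - rho * rho
    total = (-0.5 * math.log(one_minus)
             - (rho * rho * (z1 * z1 + z2 * z2) - 2.0 * rho * z1 * z2)
             / (2.0 * one_minus))
    if lpdfs is not None:
        # The same parameter tuple is reused for the matching lpdf functor.
        total += sum(g(y_k, *p) for y_k, g, p in zip(y, lpdfs, params))
    return total

y = (0.7, -0.2)
lcdfs = (exponential_lcdf, logistic_lcdf)
lpdfs = (exponential_lpdf, logistic_lpdf)
params = ((2.0,), (0.0, 1.0))
L = [[1.0, 0.0], [0.6, 0.8]]
print(bivariate_gaussian_copula_lpdf(y, lcdfs, params, L))
print(bivariate_gaussian_copula_lpdf(y, lcdfs, params, L, lpdfs=lpdfs))
```

Keeping the parameters in their own tuple means they are written once and shared by both functor tuples, which is the main saving over the tuple-of-tuples form.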
Is it legal to have a user defined |
|
Yes, we currently allow lpdfs in reduce_sum, and I don’t think cdfs would require any different code generation than what we already have |
Summary
Opening this as a draft PR for feedback on the signature/approach.
For the Gaussian copula (and other copula families), the user needs to provide both the y variable and a means of transforming that variable to the unit scale.
The method I've gone with is to require a tuple of the same length as y, where each element is itself a tuple, with a functor for computing the LCDF as the first element and any additional args as the remaining elements.
For example, if the user wanted to model the correlation between a Gamma(2, 1) variable and an Exp(2) variable, they would pass the tuple-of-tuples:
So the final signature would be:
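Separately from the signature itself, the tuple-of-tuples idea described above can be illustrated with a rough Python sketch. This is not the C++ implementation in this PR. The Python functions `gamma_lcdf`, `exponential_lcdf`, and `gaussian_copula_cholesky_lpdf` are hypothetical stand-ins, and `gamma_lcdf` only handles positive integer shapes (the Erlang closed form) so that the example needs nothing beyond the standard library. The sketch returns only the copula term, without the marginal densities.

```python
import math
from statistics import NormalDist

def gamma_lcdf(y, shape, rate):
    # Erlang closed form: valid only for positive integer shape.
    tail = sum((rate * y) ** n / math.factorial(n) for n in range(shape))
    return math.log1p(-math.exp(-rate * y) * tail)

def exponential_lcdf(y, rate):
    return math.log(-math.expm1(-rate * y))

def gaussian_copula_cholesky_lpdf(y, marginals, L):
    """Gaussian copula log density for one observation.

    marginals holds one tuple per element of y: (lcdf functor, *args).
    L is the lower-triangular Cholesky factor of the correlation matrix.
    """
    std_normal = NormalDist()
    # Unit-scale value exp(lcdf), then the standard normal quantile.
    z = [std_normal.inv_cdf(math.exp(lcdf(y_k, *args)))
         for y_k, (lcdf, *args) in zip(y, marginals)]
    # Solve L w = z by forward substitution.
    w = []
    for i, row in enumerate(L):
        w.append((z[i] - sum(row[j] * w[j] for j in range(i))) / row[i])
    log_det = sum(math.log(L[i][i]) for i in range(len(L)))
    return -log_det - 0.5 * (sum(v * v for v in w) - sum(v * v for v in z))

# Gamma(2, 1) and Exp(2) marginals, as in the example above.
marginals = ((gamma_lcdf, 2, 1.0), (exponential_lcdf, 2.0))
y = [1.5, 0.3]
rho = 0.5
L = [[1.0, 0.0], [rho, math.sqrt(1.0 - rho ** 2)]]
print(gaussian_copula_cholesky_lpdf(y, marginals, L))
```

With an identity Cholesky factor the copula term is zero, which is a quick sanity check that the marginals are not being counted twice.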
The current framework only supports continuous outcomes, but once we settle on a good approach I can expand to discrete outcomes by requiring that a uniform parameter for data-augmentation is also included in the tuple for that outcome.
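For the discrete extension, one common data-augmentation construction places the unit-scale value uniformly between the CDF just below the observed count and the CDF at it. The PR does not say which form it would use, so the Python sketch below is an assumption, and `poisson_cdf` and `augmented_unit_value` are hypothetical names. In the tuple for a discrete outcome, `u` would be the extra uniform parameter.

```python
import math

def poisson_cdf(y, lam):
    # P(Y <= y) for an integer y; zero below the support.
    if y < 0:
        return 0.0
    return sum(math.exp(-lam) * lam ** n / math.factorial(n)
               for n in range(y + 1))

def augmented_unit_value(y, u, cdf, *args):
    """Map a discrete outcome y and a uniform u in (0, 1) to the unit scale.

    The result lies between cdf(y - 1) and cdf(y), so it can be passed to
    the normal quantile function like a continuous outcome's CDF value.
    """
    lower = cdf(y - 1, *args)
    upper = cdf(y, *args)
    return lower + u * (upper - lower)

print(augmented_unit_value(2, 0.25, poisson_cdf, 3.0))
```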
Tests
Basic prim and mix are added, but the fwd components of the mix tests are currently failing
Side Effects
Extended a few of the utilities (size, vector_seq_view) for tuples
Release notes
Replace this text with a short note on what will change if this pull request is merged in which case this will be included in the release notes.
Checklist
Copyright holder: Andrew Johnson
The copyright holder is typically you or your assignee, such as a university or company. By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses:
- Code: BSD 3-clause (https://opensource.org/licenses/BSD-3-Clause)
- Documentation: CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
- the basic tests are passing
  - ./runTests.py test/unit
  - make test-headers
  - make test-math-dependencies
  - make doxygen
  - make cpplint
- the code is written in idiomatic C++ and changes are documented in the doxygen
- the new changes are tested