
Use Eigen::Map to replace arr functions #1425

Closed · andrjohns opened this issue Oct 28, 2019 · 66 comments

@andrjohns (Collaborator)

Description

As initially discussed in this PR, there is some code duplication in having the same functions implemented for std::vector and Eigen separately. This could be avoided by using a wrapper which compiles to a no-op for Eigen types and returns an Eigen::Map for std::vector types prior to calling the mat definition:

template <typename Derived>
const auto& as_eigen(const Eigen::MatrixBase<Derived>& v) {
  return v;
}

template <typename T>
const auto as_eigen(const std::vector<T>& v) {
  return Eigen::Map<const Eigen::Matrix<T, Eigen::Dynamic, 1>>(v.data(),
                                                               v.size());
}

This also means that the templating for the mat definitions needs to be changed from Eigen::Matrix<double, R, C> to Eigen::MatrixBase<Derived> so that it works with Eigen::Map inputs.

After some initial testing, the speed gains are minor:

2019-10-28 20:40:26
Running ./log_sum_exp_arr
Run on (8 X 4000 MHz CPU s)
CPU Caches:
  L1 Data 32K (x4)
  L1 Instruction 32K (x4)
  L2 Unified 256K (x4)
  L3 Unified 8192K (x1)
-----------------------------------------------------
Benchmark              Time           CPU Iterations
-----------------------------------------------------
LogSumExp_Old    8175394 ns    8175190 ns         92
LogSumExp_New    5511970 ns    5511907 ns        124
DotSelf_Old      5136511 ns    5136391 ns        132
DotSelf_New      4497041 ns    4496989 ns        157
Sum_Old          4868005 ns    4867939 ns        139
Sum_New          4608184 ns    4608127 ns        149

Benchmarking code here

The main benefits would be in code reduction for functions with existing arr definitions, as well as making it easier to extend other functions to take std::vector inputs (without requiring additional definitions).

Does this seem like a worthwhile change?

Current Version:

v3.0.0

@rok-cesnovar (Member)

The way I read this, there is a 30% reduction in execution time for LogSumExp, 15% for DotSelf, and 5% for Sum? I would not call that minor. And even if the execution times were exactly the same and we "only" gained the code reduction, I would be all for it.

So if I understand correctly, we can remove the /arr folder completely in this case? If that is the case, I am 100% on board with this idea; please let me know if I can help in any way.

This also means that the templating for the mat definitions needs to be changed from Eigen::Matrix<double, R, C> to Eigen::MatrixBase<Derived> so that it works with Eigen::Map inputs.

This idea has been brewing for some time already anyways for many reasons (most recently sparse matrices I think). This is just another example of why we really need to do this. Some latest discussions on this topic are here: https://discourse.mc-stan.org/t/refactoring-math-with-more-general-signatures/10157/6

What would be the best way forward here? A PR (or multiple PRs) to refactor the /mat signatures and then a PR to add what you are proposing here?

@syclik (Member) commented Oct 28, 2019 via email

@syclik (Member) commented Oct 28, 2019

Ha. @rok-cesnovar, I think our messages crossed. And you're right -- we can remove the arr implementation completely and just use the mat implementation. We still need to flatten everything, but that's a different issue and we don't need to hold up anything for that.

@andrjohns (Collaborator, Author)

Great! I'll go ahead then.

I think Rok's workflow was a good idea:

  • Generalise mat signatures
  • Replace arr functions

After some testing, I've realised that the signatures need to be more general than MatrixBase<Derived>, since they need to be able to take std::vectors as inputs. At the moment, I'm thinking of something like this, using log_sum_exp as an example:

template <typename T, typename = require_eigen_st<std::is_arithmetic,T>>
double log_sum_exp(const T& x) {
...
}

Then, once we're ready to collapse the arr functions, the template can be updated to:

template <typename T, typename = require_vector_like_st<std::is_arithmetic,T>>
double log_sum_exp(const T& x) {
...
}

PS @SteveBronder your require generics are an absolute lifesaver here!

@bbbales2 (Member)

This looks cool! Questions:

  1. How does this compare to scalar_seq_view in https://github.com/stan-dev/math/blob/develop/stan/math/prim/scal/prob/normal_lpdf.hpp#L62 ? (I guess that one can absorb scalars so maybe I answered myself -- but they seem very similar?)

  2. Can this mechanism absorb row_vectors as well?

  3. Can this be used to write functions that take a std::vector of any of a std::vector, an Eigen vector or an Eigen row vector?

No pressure. Just asking cause these seem related.

Also it probably makes sense to just work this out for one or two representative functions first. No need to try to get rid of everything in array in one go (that'll just make the first review painful for whoever does it).

@andrjohns (Collaborator, Author)

How does this compare to scalar_seq_view in https://github.com/stan-dev/math/blob/develop/stan/math/prim/scal/prob/normal_lpdf.hpp#L62 ? (I guess that one can absorb scalars so maybe I answered myself -- but they seem very similar?)

Pretty much the same conceptually, in that it's taking different vector types and returning a single type. The main difference is that it needs to return an Eigen type specifically, so that we can directly call their functions (e.g. v.squaredNorm())

Can this mechanism absorb row_vectors as well?

That depends on what you mean. The as_eigen wrapper will return all Eigen objects unchanged, and the require_eigen_st<std::is_arithmetic, T> templating will allow all types under the EigenBase umbrella (which includes row vectors). So it will work in the sense that the templating is inclusive of row vectors, but not in the sense that row vectors would be mapped to column vectors. That said, that's an easy enough inclusion to make as well.

Can this be used to write functions that take a std::vector of any of a std::vector, an Eigen vector or an Eigen row vector?

Tricky one. For a given function it would be easy to add another definition explicitly for nested vectors, but for a single function that handles both vectors and nested vectors, I could look into implementing the as_eigen wrapper in a similar way to vector_seq_view (a la eigen_seq_view). That should make it pretty trivial (touch wood) to add nested vectors to current mat functions as well.

Also it probably makes sense to just work this out for one or two representative functions first. No need to try to get rid of everything in array in one go (that'll just make the first review painful for whoever does it).

Good call!

@syclik (Member) commented Oct 29, 2019

The main difference is that it needs to return an Eigen type specifically, so that we can directly call their functions (e.g. v.squaredNorm())

@andrjohns, can you explain this in more detail? I want to make sure I understand. On first read, I think what you've written here is not correct: if the return type is always an Eigen type, then we can't mix the arr and mat implementations, because this will lose the type information that's necessary to work with Stan. But maybe what you've written isn't what you were envisioning?

Also, starting with one function and seeing it all the way through is definitely the right way to go.

@andrjohns (Collaborator, Author)

Sure! It's not that the return type of the function (e.g. log_sum_exp) will always be an Eigen type; it's that the return type of the wrapper (i.e. as_eigen) will always be an Eigen type.

Using log_sum_exp as an example. The current definition is:

template <int R, int C>
double log_sum_exp(const Eigen::Matrix<double, R, C>& x) {
  const double max = x.maxCoeff();
  if (!std::isfinite(max)) {
    return max;
  }
  return max + std::log((x.array() - max).exp().sum());
}

What I'm proposing is something like:

template <typename T, typename = require_vector_like_st<std::is_arithmetic,T>>
double log_sum_exp(const T& x) {
  auto v = as_eigen(x);
  const double max = v.maxCoeff();
  if (!std::isfinite(max)) {
    return max;
  }
  return max + std::log((v.array() - max).exp().sum());
}

So the return type (double) doesn't change, it's just that std::vectors are mapped to Eigen types within the function.

@t4c1 (Contributor) commented Oct 29, 2019

You don't have to implement as_eigen(). I think as_column_vector_or_scalar(), which I implemented when working with GLMs, will do exactly what you want.

@andrjohns (Collaborator, Author)

You don't have to implement as_eigen(). I think as_column_vector_or_scalar(), which I implemented when working with GLMs, will do exactly what you want.

I did have a look at that, the only issue was that it was specifically for vectors, whereas I'm hoping to work with any Eigen object (including Map).

I've put together a rough proof-of-concept here. I've implemented a view framework (similar to vector_seq_view) and updated the mat definition to work with both std::vectors and std::vectors containing other vector types

The templating for the nested views is all explicit at the moment, since I'm still figuring out how to navigate the specialisations. But this should give a good overview of what I had in mind.

@t4c1 (Contributor) commented Oct 29, 2019

I like the idea. I think in that case the existing functions can be reworked to work with any Eigen expression. We don't need two functions that do almost the same thing.

@t4c1 (Contributor) commented Oct 29, 2019

If you want I can do that.

@andrjohns (Collaborator, Author)

I'm not sure the functions are aiming to do the same thing - since Eigen types are meant to be returned unchanged, not as a column vector. I don't think as_column_vector_or_scalar could be applied to a matrix?

@t4c1 (Contributor) commented Oct 29, 2019

No, it can't be applied to a matrix. But that is not a problem, since a matrix can not be stored in a std::vector. But I see it could be a problem if you want to return row vectors for row vector inputs.

@SteveBronder (Collaborator)

Glad to hear the requires are useful! Are they missing anything conceptually that would be useful? Also for some of the above I think you want the _vt (value_type) instead of _st (scalar_type) since value type only looks one level down. Scalar type looks all the way down so vector of vector would also go in that signature. If as_eigen can work with vector of vectors then _st is the right choice.
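To illustrate the one-level vs. all-the-way-down distinction, here are toy versions of the two traits (hypothetical and much simpler than Stan's actual value_type_t/scalar_type_t, but showing the same idea):

```cpp
#include <type_traits>
#include <vector>

// value_type looks exactly one level down ...
template <typename T> struct value_type { using type = T; };
template <typename T> struct value_type<std::vector<T>> { using type = T; };

// ... while scalar_type recurses to the innermost scalar.
template <typename T> struct scalar_type { using type = T; };
template <typename T> struct scalar_type<std::vector<T>> {
  using type = typename scalar_type<T>::type;
};

using nested = std::vector<std::vector<double>>;
static_assert(std::is_same<value_type<nested>::type, std::vector<double>>::value,
              "value_type of a nested vector is the inner vector");
static_assert(std::is_same<scalar_type<nested>::type, double>::value,
              "scalar_type of a nested vector is the scalar");
```

So a `_st` constraint on `std::is_arithmetic` admits nested vectors of doubles, while the corresponding `_vt` constraint only admits the flat case.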

@bob-carpenter (Contributor) commented Oct 29, 2019 via email

@bob-carpenter (Contributor) commented Oct 29, 2019 via email

@bob-carpenter (Contributor) commented Oct 29, 2019 via email

@andrjohns (Collaborator, Author)

Why do these need to directly call Eigen functions? Don't we usually want to call Stan functions instead to get more efficient derivatives?

I was referring to the matrix functions that have separate prim and rev implementations (e.g. log_sum_exp), so they would just be working with matrices/vectors of doubles (i.e. no need for autodiff/derivatives). The example I used (v.squaredNorm()) is from prim/mat/fun/dot_self.

This will not be an efficient way to write this if it's not calling our vectorized exp and specialized sum functions.

They're calling Eigen's vectorised functions, which can be faster (see this old forum post for benchmarking I've done with exp)

For instance, what happens to transpose as a function?

I'm not sure what you mean here. There isn't an arr definition of transpose, so I'm not sure why this would need to change?

I think this may be tricky to get return types back in a lot of cases

I think this comes down to the taxonomy of different types of matrix functions:

  • Reductions that return a scalar (e.g. log_sum_exp, dot_self)
  • Functions that directly modify the input (e.g. sort_asc)
  • Functions that return a vector of the same type as the input (e.g. value_of)

The first two should be simple enough - the return type is known for the first, and the input for the second is modified in-place. The third type will need some work to deduce the return container type (similar to apply_scalar_unary)

@andrjohns (Collaborator, Author)

The third type will need some work to deduce the return container type

I've put together a very rough working example for this using log_softmax (suggestions appreciated). At the moment, log_softmax can only take column vectors as inputs (and subsequently return a column vector). The below function can work with any type of vector (column, row, and std::) as well as those vectors nested within an std::vector:

template <typename T, typename T_v = typename T::value_type,
          typename nest_vec_t = std::conditional_t<is_vector_like<T_v>::value,
                                                   T_v, T>,
          typename = require_vector_like_st<std::is_arithmetic, T>>
inline auto log_softmax(const T& v) {
  eigen_seq_view<T> u(v);
  check_nonzero_size("log_softmax", "v", u[0]);
  std::vector<nest_vec_t> result(u.size());
  for(int i = 0; i < u.size(); i++){
    result[i].resize(u[i].size());
    as_eigen(result[i]) = u[i].array() - log_sum_exp(u[i]);
  }
  return match_input_dim<T>(result);
}

eigen_seq_view works like vector_seq_view, except the view is to an eigen object (via as_eigen).

The results are stored in an std::vector (result), which is the size of the view (where the sizing works the same as vector_seq_view). For a non-nested vector (e.g. Eigen::VectorXd), this results vector will be size 1, whereas for a nested vector (e.g. std::vector<Eigen::VectorXd>) it will be the length of the array.

The match_input_dim struct then determines whether to return the std::vector of results as-is (in the case of a nested input), or whether to return only the first entry (in the case of a non-nested input).

I probably didn't explain that very well, but here's the branch I'm working from which also includes tests for the new functionality in log_softmax as well as a similar refactor for log_sum_exp

@SteveBronder (Collaborator)

Getting over a cold so I'll post more tmrw. But why not have two methods? One that iterates over containers of scalars, and the other that iterates over containers of containers, with the container-of-containers version calling the iterator over scalars.

@andrjohns (Collaborator, Author)

But why not have two methods?

There's a non-zero chance I was trying to be too clever by squishing it all into one method. Putting it into two makes for much cleaner code:

template <typename T, typename = require_vector_like_st<std::is_arithmetic, T>>
inline T log_softmax(const T& v) {
  check_nonzero_size("log_softmax", "v", v);
  auto u = as_eigen(v);
  T result(u.size());
  as_eigen(result) = u.array() - log_sum_exp(u);
  return result;
}

template <typename T, typename = require_vector_like_st<std::is_arithmetic, T>>
inline std::vector<T> log_softmax(const std::vector<T>& v) {
  check_nonzero_size("log_softmax", "v", v);
  std::vector<T> result(v.size());
  for(int i = 0; i < v.size(); i++){
    result[i] = log_softmax(v[i]);
  }
  return result;
}

@bbbales2 (Member) commented Oct 30, 2019

This is really interesting. Since it seems you were able to ratchet this up to std::vector, what are the chances of me ever writing type-agnostic C++ code and controlling what types go into a function at the interface level?

This is almost definitely a Very Bad idea at some level, but the reason I'm asking is I recently worked on this bit of code (Gaussian process kernel stuff): https://github.com/stan-dev/math/blob/f573698ea3c6095a6d24de65915a501e9f3b84bc/stan/math/prim/mat/fun/gp_dot_prod_cov.hpp

The x arguments to the function can be std::vector<Eigen::VectorXd>-like.

It'd be cool if the arguments could actually be any of `std::vector<std::vector<double>>`, `std::vector<Eigen::RowVectorXd>`, or `Eigen::MatrixXd`. When I told @avehtari that he'd have to give his inputs as arrays of vectors in Stan, he was pretty bummed. But I am also not keen on writing and testing 6 more gp_dot_prod_cov variants to allow him to use Stan matrix types (especially since this would require me to update a bunch of other gp stuff to keep it all equivalent).

Also, when I've played with this stuff before there have been issues where super-generic templated functions can absorb Eigen intermediate expression types and give unhappy results (like things you would have regularly called eval on). How's that work here?

Edits: Added some backticks so that template stuff would print correctly

@bbbales2 (Member)

I may be going off the rails there too -- feel free to say so if you think this is out of scope.

@avehtari

feel free to say so if you think this is out of scope.

No problem, you're absolutely right that I was pretty bummed.

@SteveBronder (Collaborator)

I think the conversation has gone semi off the rails, but in a pretty good direction! Ben, I'll look over your code tmrw and see if we can do something like the above.

@andrjohns (Collaborator, Author)

Since it seems you were able to ratchet this up to std::vector, what are the chances of me ever writing type-agnostic C++ code and controlling what types go into a function at the interface level?

In a simple, proof-of-concept kind of way, it's already done. I've currently got log_sum_exp working with std::vectors and almost every Eigen type (Array, Map, Diagonal), as well as std::vectors of all of those. But functions like log_softmax are currently restricted to std::vectors and 'standard' Eigen types (Matrix, Vector, Row-Vector, Array), as well as std::vectors of all of those, since it's harder to deduce the correct return type when the input is an expression (e.g. Map or Diagonal).

I'm not sure how well (or easily) this would extend to something like gp_dot_prod_cov, but given that the return type is always an Eigen::Matrix, it might not be too crazy

@SteveBronder (Collaborator)

Should the second one be for vectors that contain vectors of scalars? Something like below in pseudocode

template <typename Vec, require_vector_st<std::is_arithmetic, Vec>...,
 require_vector_vt<is_vector, Vec>...>
inline auto log_softmax(Vec&& v) {
  check_nonzero_size("log_softmax", "v", v);
// return_container does not exist yet but gives back either eigen or std vector type
  return_container_t<Vec> result(v.size());
  for(int i = 0; i < v.size(); i++){
    result[i] = log_softmax(std::forward<decltype(v[i])>(v[i]));
  }
  return result;
}

vector_like works for std::vectors, eigen vectors, and pointers (which are all array like accessible with []). I think here we may just want require_vector_*

In the future we may also want the above to work over containers of eigen vs. std vectors so we can have the eigen one just return a unaryExpr template type that can be lazy evaluated later

@bbbales2 (Member)

Well at some point copies would be necessary. I don't think I can write a function that takes all the Stan types matrix, vector[], row_vector[], real[,] and doesn't require copies?

As another wrench, how do ragged arrays fit into this: stan-dev/design-docs#7? That wouldn't break what you have so far, but it would break any sort of matrix/real[,] equivalency we tried to set up, right? Or wrong?

@bob-carpenter (Contributor) commented Oct 30, 2019 via email

@SteveBronder (Collaborator)

The other types this could work with are all the intermediate expression templates that get generated from operations like .transpose().

Yeah that's why I mentioned the coefficient access stuff. I think the issue in the past was that coefficient access on an expression template will access the untransformed data member

@andrjohns (Collaborator, Author)

Why do we need the template cast here? Not saying we don't, just curious.

To promote any int inputs, since Eigen doesn't automatically promote: https://godbolt.org/z/RnS3BY

What is return_container_t doing? Is it matching the container type of T or something?

That was determining whether the return type should be an Eigen type or an std::vector, but it might not be necessary (explained in the next comment)

The other types this could work with are all the intermediate expression templates that get generated from operations like .transpose()

To get this working I went back to three definitions:

template <typename T, typename = require_eigen_st<std::is_arithmetic, T>,
          typename PromotedScalarT = return_type_t<typename T::Scalar>>
inline auto log_softmax(const T& v) {
  check_nonzero_size("log_softmax", "v", v);
  auto u = v.template cast<PromotedScalarT>().eval();
  return (u.array() - log_sum_exp(u)).eval();
}

template <typename T, typename = require_any_arithmetic_t<T>,
          typename PromotedScalarT = return_type_t<T>>
inline auto log_softmax(const std::vector<T>& v) {
  check_nonzero_size("log_softmax", "v", v);
  auto u = as_column_vector_or_scalar(v).template cast<return_type_t<T>>().eval();
  std::vector<T> result(u.size());
  as_eigen(result) = u.array() - log_sum_exp(u);
  return result;
}

template <typename Vec, require_vector_st<std::is_arithmetic, Vec>...,
          require_vector_vt<is_vector, Vec>...>
inline auto log_softmax(Vec&& v) {
  check_nonzero_size("log_softmax", "v", v);
  std::vector<return_container_t<value_type_t<Vec>>> result(v.size());
  for(int i = 0; i < v.size(); i++){
    result[i] = log_softmax(std::forward<decltype(v[i])>(v[i]));
  }
  return result;
}

The first definition works with general Eigen types and expressions (things like log_softmax(m.transpose().diagonal())). The expression has to be evaluated before it's returned; otherwise there's the issue that @SteveBronder mentioned, where the 'finished' value isn't returned.

The second definition is for std::vector<T> where T is an int or double, and the third is for nested containers.

It's less concise than just having two definitions, but it makes the function much more flexible with input types.

@t4c1 (Contributor) commented Nov 8, 2019

The expression has to be evaluated before it's returned, otherwise there's the issue that @SteveBronder mentioned, where the 'finished' value isn't returned.

I can't find that. Can you link the explanation or explain it here?

@bob-carpenter (Contributor) commented Nov 8, 2019 via email

@andrjohns (Collaborator, Author)

I can't find that. Can you link the explanation or explain it here?

It was this comment:

Yeah that's why I mentioned the coefficient access stuff. I think the issue in the past was that coefficient access on an expression template will access the untransformed data member

I've got an example of that occurring here: https://godbolt.org/z/8k-zqz

@t4c1 (Contributor) commented Nov 8, 2019

Thanks. I asked this for the same reason @bob-carpenter nicely explained: for performance reasons it is better to avoid .eval() if it is not necessary.

I think we can work around this by requiring that functions that need coefficient access evaluate their expressions. Then we don't need to eval every return value. I don't know why piping to an ostream doesn't do it, but that should not bother us.

@andrjohns (Collaborator, Author)

Oh hold on, it looks like the problem is the cast statement. Everything returns correctly if there's no cast: https://godbolt.org/z/EywVwH

Odd.

@andrjohns (Collaborator, Author) commented Nov 11, 2019

I think I have a final framework ready for comment. I've taken a similar approach to apply_scalar_unary, as Bob recommended, and implemented an apply_vector_unary framework which will work with arbitrary Eigen types, std::vector types, and containers of both (while also avoiding the use of .eval()).

The function header itself looks like this:

struct log_softmax_fun {
  template <typename T>
  static inline auto fun(const T& v) {
    check_nonzero_size("log_softmax", "v", v);
    return v.array() - log_sum_exp(v);
  }
};

template <typename T>
inline auto log_softmax(const T& x) {
  return apply_vector_unary<log_softmax_fun, T>::apply(x);
}

And the apply_vector_unary header looks like this:

template<typename F, typename T, typename Enable = void>
struct apply_vector_unary {};

template <typename F, typename T>
struct apply_vector_unary<F, T, require_eigen_st<std::is_floating_point, T>> {

  static inline auto apply(const T& x) {
    return F::fun(x);
  }
};


template <typename F, typename T>
struct apply_vector_unary<F, T, require_eigen_st<std::is_integral, T>> {

  static inline auto apply(const T& x) {
    return F::fun(x.template cast<double>());
  }
};


template <typename F, typename T>
struct apply_vector_unary<F, std::vector<T>, require_stan_scalar_t<T>> {

  using map_t = typename Eigen::Map<const Eigen::Matrix<T, Eigen::Dynamic, 1>>;
  using return_t = typename std::vector<return_type_t<T>>;

  static inline return_t apply(const std::vector<T>& x) {
    return_t result(x.size());
    as_eigen(result) = apply_vector_unary<F, map_t>::apply(as_eigen(x));
    return result;
  }
};


template <typename F, typename T>
struct apply_vector_unary<F, T, require_vector_vt<is_vector, T>> {

  using return_t = typename std::vector<return_container_t<value_type_t<T>>>;

  static inline return_t apply(const T& x) {
    return_t result(x.size());
    for(int i = 0; i < x.size(); i++){
      as_eigen(result[i])
                  = apply_vector_unary<F, decltype(x[i])>::apply(as_eigen(x[i]));
    }
    return result;
  }
};
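As a self-contained illustration of the dispatch mechanics (std::vector only, a toy trait standing in for the require_ generics; all names here are hypothetical, not the proposed apply_vector_unary itself):

```cpp
#include <type_traits>
#include <vector>

// Toy trait standing in for Stan's require_ generics.
template <typename T> struct is_std_vector : std::false_type {};
template <typename T> struct is_std_vector<std::vector<T>> : std::true_type {};

// Primary template left empty; specializations select via Enable.
template <typename F, typename T, typename Enable = void>
struct apply_vec {};

// Base case: a vector of scalars is forwarded whole to F.
template <typename F, typename T>
struct apply_vec<F, std::vector<T>,
                 std::enable_if_t<std::is_arithmetic<T>::value>> {
  static auto apply(const std::vector<T>& x) { return F::fun(x); }
};

// Recursive case: a vector of vectors delegates element-wise.
template <typename F, typename T>
struct apply_vec<F, std::vector<T>,
                 std::enable_if_t<is_std_vector<T>::value>> {
  static auto apply(const std::vector<T>& x) {
    std::vector<decltype(apply_vec<F, T>::apply(x[0]))> out;
    out.reserve(x.size());
    for (const auto& xi : x) out.push_back(apply_vec<F, T>::apply(xi));
    return out;
  }
};

// Example functor in the style of log_softmax_fun above.
struct sum_fun {
  template <typename T>
  static double fun(const std::vector<T>& v) {
    double s = 0;
    for (auto e : v) s += e;
    return s;
  }
};
```

The key point is that the Enable parameter lets each specialization opt in via a trait, which is how the require_ generics slot into the real framework.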

@andrjohns (Collaborator, Author)

The wrinkle is that this assumes a vector-type return. Functions that reduce a vector down to a scalar (e.g. log_sum_exp) would need either a separate framework (e.g. apply_vector_unary_reduction) or a flag could be added to signal that a scalar return is needed (e.g. apply_vector_unary<F, T, reduce>). Thoughts?

@SteveBronder (Collaborator)

I like it!

The wrinkle is that this assumes a vector-type return.

imo I think that's fine; I don't think we want to think about this in the R way of applying to matrix rows and columns. If we wanted to, we could have an apply_vector_unary specialization for types that are Eigen matrices but not vectors (require_t<conjunction<is_eigen<T>, !is_eigen_vector<T>>>) and then apply over the columns with .col(), right? Or something of the sort.

Functions that reduce a vector down to a scalar (e.g. log_sum_exp) would need either a separate framework (e.g. apply_vector_unary_reduction) or a flag could be added to signal that a scalar return is needed (e.g. apply_vector_unary<F, T, reduce>). Thoughts?

I think scatter, broadcast, and reduce should be their own structs. With reduce we might want something about how to reduce, i.e. addition for a sum, multiplication for a product, etc.

If we are going to rely on more metaprogramming, I think we should try to be more formal about our template arg names, à la

template <typename Func, typename VecVec>
struct apply_vector_unary<Func, VecVec, require_vector_vt<is_vector, VecVec>> {
  using return_t = typename std::vector<return_container_t<value_type_t<VecVec>>>;

  static inline return_t apply(VecVec&& x) {
    return_t result(x.size());
    for (int i = 0; i < x.size(); i++) {
      as_eigen(result[i])
          = apply_vector_unary<Func, decltype(x[i])>::apply(as_eigen(x[i]));
    }
    return result;
  }
};

@SteveBronder (Collaborator)

Could we just replace the call to apply scalar unary for vectors etc. with the above stuff? That would be an easy plug and play

@andrjohns (Collaborator, Author)

I think scatter, broadcast, and reduce should be their own structs. With reduce we might want something about how to reduce it i.e. do addition for a sum or product etc.

I don't think we'll be able to get to that level of generality, since some functions have implementation-specific reductions to avoid over-/under-flow (e.g. log_sum_exp), while others have the reduction built into the function call (e.g. m.mean()). I was thinking that apply_vector_unary_reduce would be more to signal that a scalar is being returned from a matrix function, rather than a generic class for reducing matrices down to scalars.

For example, this what I've got mean looking like:

struct mean_fun {
  template <typename T>
  static inline auto fun(const T& v) {
    check_nonzero_size("mean", "v", v);
    return v.mean();
  }
};

template <typename T>
inline auto mean(const T& x) {
  return apply_vector_unary_reduce<mean_fun, T>::apply(x);
}

With apply_vector_unary_reduce:

template<typename F, typename T, typename Enable = void>
struct apply_vector_unary_reduce {};

template <typename F, typename T>
struct apply_vector_unary_reduce<F, T, require_eigen_st<std::is_floating_point, T>> {

  static inline auto apply(const T& x) {
    return F::fun(x);
  }
};

template <typename F, typename T>
struct apply_vector_unary_reduce<F, T, require_eigen_st<std::is_integral, T>> {

  static inline auto apply(const T& x) {
    return F::fun(x.template cast<double>());
  }
};

template <typename F, typename T>
struct apply_vector_unary_reduce<F, std::vector<T>, require_stan_scalar_t<T>> {

  using map_t = typename Eigen::Map<const Eigen::Matrix<T, Eigen::Dynamic, 1>>;
  using return_t = return_type_t<T>;

  static inline return_t apply(const std::vector<T>& x) {
    return apply_vector_unary_reduce<F, map_t>::apply(as_eigen(x));
  }
};

template <typename F, typename T>
struct apply_vector_unary_reduce<F, T, require_vector_vt<is_vector, T>> {

  using vec_type = value_type_t<T>;
  using return_t = typename std::vector<scalar_type_t<vec_type>>;

  static inline return_t apply(const T& x) {
    return_t result(x.size());
    for(int i = 0; i < x.size(); i++){
      result[i] = apply_vector_unary_reduce<F, vec_type>::apply(x[i]);
    }
    return result;
  }
};

You'll notice that the Eigen declarations are the same as in apply_vector_unary, since the auto return type lets us be agnostic here. But for the std::vector<T> inputs, the return type needs to be specified.

@andrjohns (Collaborator, Author)

Could we just replace the call to apply scalar unary for vectors etc. with the above stuff? That would be an easy plug and play

That's a bit trickier, since apply_scalar_unary is overloaded for scalar inputs, and several functions only have a prim/mat definition without a corresponding prim/scal one (e.g. ceil and cos). Those functions would need a prim/scal definition introduced before apply_scalar_unary could be replaced.

@bob-carpenter (Contributor) commented Nov 12, 2019 via email

@bob-carpenter (Contributor) commented Nov 12, 2019 via email

@SteveBronder (Collaborator)

the m-word.

Anyone who mentions those things and forces me to google what they are for the 19th time will be reported for CoC violation

@bob-carpenter (Contributor) commented Nov 12, 2019 via email

@t4c1 (Contributor) commented Nov 12, 2019

Just a random idea. Do we for some reason need to split function implementations and function signatures into separate structs? I think we could get code that is a bit easier to read if we used generic lambdas and template deduction. So instead of this:

struct log_softmax_fun {
  template <typename T>
  static inline auto fun(const T& v) {
  }
};

template <typename T>
inline auto log_softmax(const T& x) {
  return apply_vector_unary<log_softmax_fun, T>::apply(x);
}

Is there any reason not to do this:

template <typename T>
inline auto log_softmax(const T& x) {
  return apply_vector_unary<T>::apply(x, [](auto& v){
    check_nonzero_size("log_softmax", "v", v);
    return v.array() - log_sum_exp(v);
  });
}

Of course, apply_vector_unary would have to be modified a bit to accept a functor instead of a template type with a fun member function.
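A minimal sketch of what that lambda-based variant could look like (hypothetical names, std::vector only, no Eigen), showing the functor arriving as a runtime argument rather than a template parameter:

```cpp
#include <vector>

// Flat case: forward the whole container to the supplied functor.
template <typename T>
struct apply_vec_lambda {
  template <typename F>
  static auto apply(const T& x, F&& f) {
    return f(x);
  }
};

// Nested case: recurse element-wise, applying f to each inner vector.
template <typename T>
struct apply_vec_lambda<std::vector<std::vector<T>>> {
  template <typename F>
  static auto apply(const std::vector<std::vector<T>>& x, F&& f) {
    std::vector<decltype(f(x[0]))> out;
    out.reserve(x.size());
    for (const auto& xi : x) out.push_back(f(xi));
    return out;
  }
};
```

A caller would then write the body inline as a generic lambda, e.g. apply_vec_lambda<std::vector<double>>::apply(v, [](const auto& u) { ... }), which keeps the check and the computation next to the function definition.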

@bob-carpenter (Contributor) commented Nov 13, 2019 via email

@SteveBronder (Collaborator)

They are trivially constructible though, so I don't think the overhead would be that much. Could run both through godbolt to see what happens.

@t4c1 (Contributor) commented Nov 13, 2019

I think there should not be any overhead in optimized code. But it can't hurt to check.

@wds15 (Contributor) commented Nov 13, 2019

Uhm... this is a great thread and I am really looking forward to this template magic... Given the complexity and size of the discussion, would it make sense to write a brief design doc and file it in the design repo?

(I really never like when people tell me to write design docs - lots of work - but after having done it things gain in clarity)

Feel free to ignore my comment.

@SteveBronder (Collaborator)

^ What does and doesn't need a design doc is still kind of vague, but I think it fits here. Plus, with the new site we can take the approved design doc and add it there, so contributors have an easier way to see why we do things a certain way.

@andrjohns (Collaborator, Author)

I'm on board with the design doc, especially since the scope of this has changed quite a bit from my initial post (in a good way!). I'd just like to clarify the intended direction of this before I write up a doc about the wrong thing.

In my mind, this issue is now concerned with extending the existing prim/mat functions that take Eigen column vectors as inputs to also work with Eigen row vectors and std::vectors, without the need for additional specialisations for each type (and combination) of inputs. There are going to be several different implementations, given the broad taxonomy of vector-based functions we have at the moment:

  • f(vector) -> scalar
    • e.g. log_sum_exp
  • f(vector) -> vector
    • e.g. log_softmax
  • f(vector) -> matrix
    • e.g. diag_matrix
  • f(vector, scalar) -> vector
    • e.g. head
  • f(vector, vector) -> scalar
    • e.g. dot_product
  • f(vector, vector) -> vector
    • e.g. elt_multiply

(Let me know if I've missed a type in there)

In terms of a design, I'm not sure whether it would be possible to have a flexible implementation that would handle more than one of these types of functions, or whether we need a separate one for each (e.g. apply_vector_unary_reduce, apply_vector_unary, and apply_vector_scalar_unary; although it seems like the naming itself might also get out of hand here)

@bob-carpenter (Contributor) commented Nov 27, 2019 via email

@andrjohns (Collaborator, Author)

I would very strongly prefer that we do not extend matrix operations to std::vector

That sounds good to me. It was going to be a bit of a mess determining whether to treat them as column or row vectors.
