Discussion:
The proper way to compose function returning expressions
Joel Falcou
2012-04-23 20:01:43 UTC
Let's say we have a bunch of functions like sum and sqr defined on a
proto domain to return expressions with tags sum_ and sqr_ in this
domain. One day we want to make a norm2(x) function, which is basically
sum(sqr(x)).

My feeling is that I should be able to write it using sqr and sum
expressions.
Alas, it seems this results in dangling references, crashes and some sad pandas.

Then I remembered proto::deep_copy, but I have a worry: x is usually a
terminal holding a huge matrix-like value, and I just don't want this
huge matrix to be copied.

What's the correct way to handle such a problem? How can I build new
functions returning expressions built from expression composition
without incurring a huge amount of copying?
Eric Niebler
2012-04-23 22:15:48 UTC
Right. The canonical way of doing this is as follows:

#include <boost/proto/proto.hpp>
#include <boost/ref.hpp>
namespace proto = boost::proto;

// Tag types for the two primitive operations
struct sum_ {};
struct sqr_ {};

namespace result_of
{
    template<typename T>
    struct sum
      : proto::result_of::make_expr<sum_, T>
    {};

    template<typename T>
    struct sqr
      : proto::result_of::make_expr<sqr_, T>
    {};

    // The inner sqr node is a non-reference type, so it is stored by value
    template<typename T>
    struct norm2
      : sum<typename sqr<T>::type>
    {};
}

template<typename T>
typename result_of::sum<T &>::type const
sum(T &t)
{
    // boost::ref tells make_expr to hold t by reference
    return proto::make_expr<sum_>(boost::ref(t));
}

template<typename T>
typename result_of::sqr<T &>::type const
sqr(T &t)
{
    return proto::make_expr<sqr_>(boost::ref(t));
}

template<typename T>
typename result_of::norm2<T &>::type const
norm2(T &t)
{
    // Only t is held by reference; the intermediate sqr_ node is copied
    // into the sum_ node, so nothing dangles after the return.
    return
        proto::make_expr<sum_>(proto::make_expr<sqr_>(boost::ref(t)));
}

int main()
{
    sum(proto::lit(1));
    sqr(proto::lit(1));
    norm2(proto::lit(1));
}


As you can see, the norm2 is not implemented in terms of the sum and sqr
functions. That's not really ideal, but it's the only way I know of to
get fine grained control over which parts are stored by reference and
which by value.

You always need to use make_expr to build expression trees that you
intend to return from a function. That's true even for the built-in
operators. You can't ever return the result of expressions like "a+b*42"
... because of the lifetime issues.
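
To make the lifetime issue concrete without pulling in Proto, here is a minimal sketch with toy types. Everything in it (the `sketch` namespace, `Term`, `Sqr`, `Sum`, `dangling_shape`, `check_norm2`) is hypothetical and only mimics the shape of the problem: a parent that stores an intermediate node by reference would point at a temporary once the function returns, whereas a parent that stores the node by value and only the payload by reference is safe to return.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

namespace sketch {

// Toy terminal (not a Proto type): refers to caller-owned data, never copies it.
struct Term {
    const std::vector<double>& data;
    double at(std::size_t i) const { return data[i]; }
    std::size_t size() const { return data.size(); }
};

// Child may be a value type (safe to return) or a reference type (may dangle).
template <class Child>
struct Sqr {
    Child child;
    double at(std::size_t i) const { double v = child.at(i); return v * v; }
    std::size_t size() const { return child.size(); }
};

template <class Child>
struct Sum {
    Child child;
    double eval() const {
        double s = 0.0;
        for (std::size_t i = 0; i < child.size(); ++i) s += child.at(i);
        return s;
    }
};

// Safe composition: the intermediate Sqr node is stored by value inside Sum;
// only the payload is referenced.
inline Sum<Sqr<Term>> norm2(const std::vector<double>& v) {
    return Sum<Sqr<Term>>{ Sqr<Term>{ Term{v} } };
}

// The dangerous shape, shown as a type only: returning this from a function
// would leave `child` referring to a destroyed temporary Sqr node.
using dangling_shape = Sum<const Sqr<Term>&>;
static_assert(std::is_reference<decltype(dangling_shape::child)>::value,
              "intermediate node would be held by reference");

// Usage sketch: the vector is not copied, and the returned tree can be evaluated.
inline void check_norm2() {
    std::vector<double> v{1.0, 2.0, 3.0};
    auto e = norm2(v);
    assert(&e.child.child.data == &v);  // still the caller's vector
    assert(e.eval() == 14.0);           // 1 + 4 + 9
}

} // namespace sketch
```

The design choice mirrors the make_expr version above: small intermediate nodes are cheap to copy, so only the terminal payload is kept by reference.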

You can't use deep_copy for the reason you mentioned.

I once had a function proto::implicit_expr, which you could have used
like this:

template<typename T>
typename result_of::norm2<T &>::type const
norm2(T &t)
{
    return proto::implicit_expr(sum(sqr(t)));
}

implicit_expr() returns an object that holds its argument and is
convertible to any expression type. The conversion is implemented by
trying to implicitly convert all the child expressions, recursively. It
sort of worked, but I never worked out all the corner cases, and
documenting it would have been a bitch. Perhaps I should take another
look. Patches welcome. :-)
--
Eric Niebler
BoostPro Computing
http://www.boostpro.com
Joel Falcou
2012-04-24 05:17:48 UTC
I think this is an important issue to solve as far as Proto grokability
goes.
One of my coworkers on NT2 tried to do just this (the norm2 thingy) and
he got puzzled by the random crashes.

I think we should at least document the issue (I can write that and
submit a patch for the doc) and maybe resurrect this implicit_expr. Do
you have any remnant of code lying around so I don't start from scratch?
Eric Niebler
2012-04-24 20:31:10 UTC
Post by Joel Falcou
I think this is an important issue to solve as far as Proto grokability
goes.
Agreed. It would be very nice to have. But you still have to know when
to use it.
Post by Joel Falcou
One of my coworkers on NT2 tried to do just this (the norm2 thingy) and
he got puzzled by the random crashes.
I think we should at least document the issue (I can write that and
submit a patch for the doc) and maybe resurrect this implicit_expr. Do
you have any remnant of code lying around so I don't start from scratch?
The implicit_expr code lived in a detail namespace in past versions of
proto. You can find it if you dig through subversion history. I'm not
going to do that work for you because the code was broken in subtle ways
having to do with the consistency of terminal handling. Repeated
attempts to close the holes just opened new ones. It really should be
left for dead. I'd rather see what you come up with on your own.
Mathias Gaunard
2012-04-25 20:41:50 UTC
The issue Joel had in NT2 was probably unrelated to this. In NT2 we hold
all expressions by value unless the tag is boost::proto::tag::terminal.
This was done by modifying as_child in our domain.

I strongly recommend doing this for most proto-based DSLs. It makes auto
foo = some_proto_expression work as expected, and allows expression
rewriting of the style that was shown in the thread without any problem.

There is probably a slight compile-time cost associated with it, though.
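
The capture policy described above can be sketched outside of Proto with a small trait. This is a toy illustration under stated assumptions, not NT2's code and not Proto's as_child machinery: `node`, `capture`, `make`, the tag types and `check_capture` are all hypothetical names. Terminals are captured by const reference, every other node by value, so composing helper functions and storing the result with auto both stay valid as long as the terminal itself is alive.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

namespace toy {

struct terminal_tag {};
struct sqr_tag {};
struct sum_tag {};

// Toy expression node (not a Proto type).
template <class Tag, class Child>
struct node {
    using tag = Tag;
    Child child;
};

// A terminal owns its (potentially huge) payload.
using matrix = node<terminal_tag, std::vector<double>>;

// Capture policy: terminals by const reference, all other nodes by value.
template <class Expr>
struct capture {
    using type = typename std::conditional<
        std::is_same<typename Expr::tag, terminal_tag>::value,
        const Expr&, Expr>::type;
};

template <class Tag, class Expr>
node<Tag, typename capture<Expr>::type> make(const Expr& e) {
    return node<Tag, typename capture<Expr>::type>{e};
}

template <class Expr>
auto sqr(const Expr& e) -> decltype(make<sqr_tag>(e)) { return make<sqr_tag>(e); }

template <class Expr>
auto sum(const Expr& e) -> decltype(make<sum_tag>(e)) { return make<sum_tag>(e); }

// Plain composition: the temporary sqr node is copied into the sum node,
// so only the terminal is referenced after the return.
template <class Expr>
auto norm2(const Expr& e) -> decltype(sum(sqr(e))) { return sum(sqr(e)); }

// Usage sketch: the stored tree refers to m itself, not to a copy of it.
inline void check_capture() {
    matrix m{{3.0, 4.0}};
    auto e = norm2(m);
    static_assert(std::is_same<decltype(e),
                  node<sum_tag, node<sqr_tag, const matrix&>>>::value,
                  "inner node by value, terminal by reference");
    assert(&e.child.child == &m);
}

} // namespace toy
```

The cost of this policy is one copy of each small intermediate node per nesting level; whether a compiler removes those copies is exactly the open question in the replies that follow.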
Eric Niebler
2012-04-26 16:02:18 UTC
Interesting. I avoided this design because I was uncertain whether the
compiler would be able to optimize out all the copies of the
intermediate nodes. You're saying NT2 does it this way and doesn't
suffer performance problems? And you've hand-checked the generated code
and found it to be optimal? That would certainly change things.
Mathias Gaunard
2012-04-26 16:35:36 UTC
NT2 processes large amounts of data per expression, so construction time
is not very important. It's the time to evaluate the tree at a given
position that matters (which only really depends on proto::value and
proto::child_c<N>, which are always inlined now).

We also have another domain that does register-level computation, where
construction overhead could be a problem. The last tests we did with
this were a while ago and used the default Proto behaviour. That
particular domain didn't get sufficient testing to draw real conclusions
about the Proto overhead.
Eric Niebler
2012-04-26 16:38:15 UTC
In that case, I will hold off making any core changes to Proto until I
have some evidence that it won't cause performance regressions.

Thanks,
Jeffrey Lee Hellrung, Jr.
2012-04-26 16:40:21 UTC
Post by Eric Niebler
Interesting. I avoided this design because I was uncertain whether the
compiler would be able to optimize out all the copies of the
intermediate nodes. You're saying NT2 does it this way and doesn't
suffer performance problems? And you've hand-checked the generated code
and found it to be optimal? That would certainly change things.
FWIW, some years back, I had created a rather simple expression template
engine, tried storing all intermediate nodes by value, and, IIRC, was
surprised to find (via putting print statements into the copy constructor)
that MSVC8 elided all the intermediate copying.

And, because of this, I certainly took notice that, when Proto came out, by
default, all intermediate Proto nodes were held by reference.

But...I could be remembering incorrectly :(

- Jeff
