C++ Feature Whitelist/Blacklist

Mon Nov 7 21:25:30 AEDT 2016

On Thu, Nov 3, 2016 at 5:03 AM, Patrick Williams <patrick at stwcx.xyz> wrote:
>
> On Wed, Nov 02, 2016 at 05:02:27PM -0700, Brendan Higgins wrote:
> >     * it seems that Patrick and I have general agreement on (just not
> > necessarily exact
> >       strength of rule):
> >         - Variables of class type with static storage duration
(including
> > constants),
> >           plain old data types (POD) are allowed
>
> I'm not convinced we have agreement on this one.
>
> > > I would
> > > prefer we start with industry recognized experts opinion[3] and then
> > > document deviations specific to OpenBMC as they come up.
> >
> >
> > I agree with trying to follow generally accepted practices and it seems
> > like the CppCoreGuidelines seem to encode that pretty well, but it is
not
> > really the same as a style guide and I see no reason why that precludes
us
> > from writing one.
>
> There is obviously a major cultural difference here (between Google and
> "everyone else").  What Google calls a style guide is a combination of
> what others call two different documents: a coding style guide and a
> coding conventions (or guidelines) document.  Style guides describe the
> location of spaces, names of variables, etc.  Coding guidelines give you
> the "Do's and Don'ts".  The QT project, for example, has called them
> exactly that: "coding style"[1] and "coding conventions"[2].
>
> I honestly think that the difference between the two is partially to
> blame for why we have been talking past each other previously in the
> need for a "style guide".
>

Agreed, and I thought we discussed this in our meetings last week.  What
we'd
like to define is more than just style.  I don't care what we call it
though.
A style guide, language guidelines, rules.  But having these conventions and
rules around helps programmers contribute, especially programmers who are
just
getting started with OpenBMC.

> > I would, however, caution against citing the opinions of
> > "industry recognized experts" as sole justification for something. If
the
> > opinion itself is cited with good reasoning/good supported evidence
that's
> > fine, but it is important not to discount someone because he disagrees
with
> > someone important. I am not saying you did that anywhere, all of the
> > arguments you made below either make clear arguments that stand on their
> > own or cite well thought out self-contained arguments, just something to
> > keep in mind.
>
> Also be aware that this is not some weak "appeal to authority" here.
> We're talking about Bjorn and Herb.  In case you are not familiar, they
> are the creator of C++ and the convener of the standards body
> respectively.  Discounting their opinion is like saying Linus doesn't
> know what he is talking about.  The balance is surely tipped in their
> favor coming into any discussion and you better have some pretty
> significant argument against it.
>

I'm not sure why sarcasm is necessary here.  It's not conducive to a
productive
discussion.

Yes, we're all familiar with Bjorn and Herb and their roles in the
standards and
the C++ language.  Doesn't mean they're never wrong.  There's no
discounting of
their opinion happening.  Just simply asking for good supported reasoning.
You
asked for the same when we brought up virtual functions.  Even in the
CppCoreGuidelines abstract, it states these rules may be too strict, may
need
changing or more exceptions.  It even asks for feedback:  "Please try to
verify
or disprove rules! In particular, we'd really like to have some of our rules
backed up with measurements or better examples."  It's a work in progress.
I don't know about you, but I've been taught never to just follow rules
unquestioningly, but to ask questions.

So far, it sounds like we mostly agree on the following:

> > Re: Implicit conversion constructors and operators
> > Re: Multiple Inheritance

I don't think it hurts to call it out though.  Bear in mind, this guide
should also
help for future developers contributing to OpenBMC, and their experience
levels
may vary.

> > Re: Variables of class type with static storage duration (including
> > constants)
> >
-----------------------------------------------------------------------------------------------------
> >
> > This proposal is a key example on why it is hard to quantify these types
> > > of rules.  Everyone agrees with a general principle of "reducing
> > > globals", but as spelled out this proposal would prohibit
> > > initializer-list construction, std::string, and the pattern within our
> > > IPMI plugins of "register on load".  These are all reasonable
exceptions
> > > that require little justification in my mind.
> >
> >
> > It sounds like we agree in principle. My proposal would prohibit
> > std::string constants, but it would not prevent their general use.
> > Initializer list construction would not be prohibited for c-style arrays
> > and not its use in general; you would not be able to use it for
> > constructing variables of class type with static storage, but it
outright
> > prohibits that anyway. As written currently, yes the ipmid plugin
> > registration code would not conform, but it would be possible to write
it
> > in a way that does conform; this is actually something I am working on
with
> > my changes to ipmid, but that is a discussion for elsewhere. (I would
like
> > to be clear that this does not discourage global pointers, but in
general
> > these also would require a good justification).
>
> The only time there is unspecified behavior in the standard is when you
> are using a symbol defined in another compilation unit in your own
> compilation unit during the initialization phase of your object.
>

Are you saying this won't happen or happens infrequently?  Or that
developers won't accidentally write code that triggers this?

> I do not understand why you would want to prohibit std::string constants
> and any other initializer-list constructed data structures when they are
> not the issue.  We are often generating a data structure from machine
> description files and using the initializer-list syntax to prime those
> structures.  Having to do it outside an initializer-list requires a
> special "globalDataInit()" function which is a needless and less
> efficient hoop to jump through.

Initialize-list constructed data structures are not prohibited in general,
only for class types with static storage.

>
> In my view, I.22 already has everything covered.  It also has a better
> description of the problem areas in the "Enforcement" heading
> (non-constexpr functions, extern objects).
>
> >
> > Re: Implicit conversion constructors and operators
> > -------------------------------------------------------------------
> >
> > This would prohibit 'unique_ptr<T>::operator bool()' and
> > > 'string::string(const char*)', which are part of the STL and exist
> > > for concise syntax.  I do agree with this in most cases but there are
> > > cases where they both obvious and crisper.  A non-obvious conversion
> > > should be prohibited.
> >
> >
> > Maybe I should have called out more clearly that I do not plan on
> > prohibiting widely used parts of the styleguide that do these types of
> > things. In anycase, the smart pointers are special in that they maintain
> > the API that exists for raw pointers and it is common practice for
people
> > to check for a nullptr by simply treating it as a boolean value.
> > Nevertheless, the vast majority of custom APIs should not use this
feature,
> > and you seem to agree with this point in general anyway.
>
> I think I agree, it just feels odd to point out.  Who writes an operator
> overload "just because"?  Everyone who writes one thinks that the
> rationale for it is obvious.
>
> > Re: Multiple Inheritance
> > --------------------------------
> >
> > > Composition is fairly common practice in C++ which uses multiple
> > > inheritance.  The sdbusplus bindings, as an example, have a specific
> > > template to aid in the composition of N dbus interfaces into a single
> > > C++ class. [4]
> >
> >
> > Your argument seems to be arguing for allowing multiple inheritance *of
> > interfaces*, which I meant to allow. I said "unless at most one of the
> > parent classes is an interface i.e. only has pure virtual methods", it
> > should have read "unless at most one of the parent classes is *not* an
> > interface i.e. only has pure virtual methods". You can inherit from as
many
> > interfaces as you want, but only one (or no) non-interface. Technically,
> > your sdbusplus bindings are not interfaces, but I could see an exception
> > for that, because they are fairly special constructs. *It quacks like an
> > interface*, from the standpoint of an implementor. Also, the CCG
sections
> > you cited agree with this standpoint. C.135 specifically calls out
> > interfaces and C.136 seemed to want to discourage the behaviour in the
> > reasoning section.
>
> Right.  There is an obvious overload of the word "interface" in this
> case.  Someone reviewing what you wrote is coming at it with the Java
> language "interface" mindset: a pure-virtual class.  The DBus Interface
> bindings are clearly not pure-virtual but they are "interface-like".
>
> Again, multiple inheritance isn't something that I've seen people even
> attempt typically unless they know what they are doing.  So it seems odd
> to point it out but then exempt "obvious" usage.
>
> > Re: Non-const-non-r-value references
> > ---------------------------------------------------
> >
> > > I find this especially unnecessary, not recommended anywhere outside
of
> > > the GSG, and having a crippling effect to the language.  C++ is not a
> > > procedural / OO programming language; it is also a functional
> > > programming language.  The bulk of the STL is to facilitate a
> > > functional programming style and many of the C++11 features (lambdas,
> > > constexpr functions, etc.) take large inspiration from FP.
Eliminating
> > > l-value reference makes FP techniques exceptionally harder and
> > > syntactically clumsier.
> >
> >
> > Again, I explicitly allowed use where necessary to maintain consistency
our
> > utilize the STL.
>
> The logical conclusion of this exemption is to actively encourage use of
> non-const l-value references, which is surely not what you intend
> (though, I do).
>
>     - As an author of an API, I do not know how users are going to
>       organize their own data.
>     - As an author of an API, I should do nothing to preclude reasonable
>       programming patterns.
>     - All container iterators work by default on l-value references.
>     - Functors and lambdas to the STL algorithms work easiest on l-value
>       references.
>     - Therefore, I should conclude that my users could have a collection
>       of things and could desire to use functional programming style,
>       and I should make life easier for them by taking an l-value
>       reference.
>
> > But in general, const references, r-value references, and
> > pointers together fulfill the same purposes and make it easier to
> > understand what the code is doing. Const reference implies nothing will
> > happen to the data, it will not (outside of exceptional circumstances)
> > store that reference and will not modify it; basically it stays in the
> > function. No special semantics are required when calling the function
> > because there is nothing to consider. R-value reference implies that the
> > data will be consumed, you do not own it anymore and it does it in a
> > beautifully explicit way, std::move, which means that you know the data
> > does not belong to you anymore when you call the function. Providing a
> > pointer requires you to either be working with a pointer which is clear
> > because it has different semantics, or you have to pass a pointer
> > explicitly; either way, you know that the function is free to edit what
you
> > passed in; it should not store it because it does not own it (otherwise
use
> > a smart pointer), but is free to edit it in the context of the function
> > call.
>
> I understand what you and the GSG are saying.  I just disagree with it.
>
> In C, everything is a pointer.  I could just write all of my code to not
> use references at all and instead use 'T*' and 'const T*'.  You now
> still have no idea when you read 'foo(&bar, &baz)'.
>
> In C++ there is always const_cast, so even if something is passed by
> const it _could_ be manipulated.  (Not to say this is good practice in
> any way.  See ES.50).
>
> It is a matter of public record[3] that I have reviewed on roughly
> 50-100kLOC a year for the last 5 years and I still don't buy into this
> reviewer argument.  If someone is calling an API and I don't know what
> that API is doing, I don't give them a +1 no matter how "obvious" it may
> seem.
>
> The rough summary of my viewpoint is:
>     1) This syntax simply lulls you into a false sense of security as
>        the reviewer.
>     2) This syntax greatly diminishes aspects of modern C++ practices.
>     3) This syntax is not recommended anywhere outside of Google.
>
> > I do not see it standing in the way of FP at all, in FP you are
> > usually dealing with immutable data types which you should use as much
as
> > possible i.e. const references, in the case that is expensive you have
> > r-value references and pointers.
>
> FP style does not always mean immutable.  A toy example:
>
>     vector<int> stuff = ...;
>     auto times2 = [](auto& x) { x *= 2; }
>     for_each(stuff, times2);
>
> > Although, the CCG does say it is allowed,
> > it admits that non-const references can lead to bugs see the note under
> > F.17.
>
> It is a pretty weak argument for you to use a sectioned titled "For
> 'in-out' parameters, pass by reference to non-const" as evidence.
>
> The "error" they are talking about is a reassertion of F.16: "For 'in'
> parameters, pass ... by reference to const".  Non-const implies it could
> be changed.
>
> >
> > A very common recommendation in "Exceptional Modern C++" is to define a
> > > function like:
> > >     template <typename T> foo(T&&)
> > > This function operates on both l-value and r-value references
> > > ("universal references" as Meyers calls them).
> >
> >
> > I think I am missing something here. l-value and r-value references are
two
> > different things and are treated differently by the compiler. I tried
the
> > following:
> >
> > int add(int&& l, int&& r) {
> >   return l + r;
> > }
> >
> > int main() {
> >   int l = 1;
> >   int r = 2;
> >   std::cout << "add(1, 2) = " << add(l, r) << std::endl;
> >   return 0;
> > }
> >
> > and got the following compile error:
> >
> > test.cc:10:42: error: cannot bind ‘int’ lvalue to ‘int&&’
> >    std::cout << "add(1, 2) = " << add(l, r) << std::endl;
> >
> > which is precisely what I expected to happen. I think I am missing part
of
> > your example.
>
> What I wrote and what you tested are two entirely different things.  I
> wrote a template; you tested a non-template.  Templates and autos fall
> under type-deduction rules.  Read up on those and reference collapsing.
>
> foo(10) => foo(int&&)
> foo(x) => foo(int& &&) => foo(int&)
>
> >
> > Re: Exceptions
> > --------------------
> >
> > > This was the "old way" in C++ and the standards body specifically
> > > deprecated what you are suggesting.  I'm sure you could read their
> > > rationale.
> >
> >
> > Sure, I would be happy to. I tried googling around and the only recent
> > update I could find relating to how the compiler treats exceptions was
the
> > addition of noexcept. Could you share this?
>
> See deprecation of 'throw(typeid, ...)'. :
> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2010/n3051.html
>
> (I haven't actually read the reasoning myself, I just know it is how it
> was previously done and is now deprecated.)
>
> >
> > > In GCC, at least, function attributes can only be defined as part of
> > > declaration and not definition.  That means you would have to forward
> > > declare every function that returns your error type.
> >
> >
> > That's true. Clang allows you to put the attribute on the type
definition,
> > I forgot gcc does not allow that. Still, if the function was part of a
> > public interface, you would have to forward declare anyway. In anycase,
my
> > point was that there are alternatives to exceptions where the compiler
will
> > actually bug you if you ignore it.
> >
> > > In every large project I have been involved with, an integer quickly
> > > becomes insufficient to express "an error".  The projects all make
their
> > > own "error object".  According to Titus' talk, Google has one of these
> > > internally as well.  So that means we're now talking about passing
> > > around a heavy object by value or a pointer to one (which is always at
> > > risk of leaking)?  What does this error look like?
> >
> >
> > Also true, hypothetically we would use some sort of error type.
Probably an
> > error enum and an optional string message. Google has an open source
one we
> > could use, or we could write our own. In any case, it does not need to
be
> > heavy weight, as long as the field list is small the object will be
small.
> > For user defined types that fit inside of a few machine types, gcc and
> > clang will frequently pass them via register instead of stack. In any
case,
> > I would expect the error object to be returned so it would be subject to
> > return value optimization.
> >
> > > Having a error return type likely eliminates any RVO and again plays a
> > > huge diminishing role in using functional programming methods.  The
two
> > > approaches for output variables become:
> > >     1) Fill-in out parameter reference.  [anti-FP]
> > >     2) Return a tuple<error, real_return>.
> > >
> >
> > The approach we use in this case is called StatusOr<T>; it wraps the
type
> > being returned and the associated Status object (what we call the error
> > object). This is actually a really common pattern outside of C++ and
> > parallels the error monad type found in a lot of functional programming
> > languages (we do not actually support the fmap operation, but the usage
> > pattern is semantically equivalent).
>
> StatusOr requires everything you are returning to have a [public] default
> constructor available to StatusOr and copy operations.
>
> >
> >
> > > >         - Exceptions are not typed
> > >
> > > Exceptions are strongly typed in the exact same type system as any
> > > reference.  I don't know what this statement means.
> >
> >
> > Sorry, what I meant by this is exceptions are unchecked in C++ so you
can
> > throw any type; hence, there is no way to enforce what kinds of
exceptions
> > are thrown and that those exceptions are caught in functions that do not
> > forward them as in Java.
>
> So we are capable of writing a guideline that we should not use globals
> but we are incapable of writing a guidelines that requires all
> exceptions being derived from std::exception?  Oh, wait... E.14 already
> implies this.

Snarkiness aside. If the exception E thrown in function foo() is derived
from
std::exception.  How does this help enforce that bar() who calls foo()
either
catches or rethrows E?

My experience with exceptions is that make code difficult to read, that
could
just be a personal preference thing.  With an enumeration of status codes
defined for a function, the calling function knows all the possibly return
values.
With exceptions, I never know what exceptions might be thrown by code a few
layers deep.  And if a submodule defines a new exception, there's not good
way
to ensure upper level modules catch that exception.  It may hit a default
handler
but it won't be properly handled.

Why are you concerned with ROV?   It sounds like a premature optimization.
If
 the use of a status code or object, enhances the
readability/maintainability of
the code, I think it's worthwhile to consider.

>
> > Re: Virtual Functions
> > ----------------------------
>
> > Interfaces, and thus virtual functions are required for good unit
> > testing and mocking,
>
> I agree with the statement "Interfaces are required for good unit
> testing".  I do not agree with the "and thus virtual functions".  You
> can also solve this at link time with two alternative implementations of
> the class.
>
> I'm frankly amazed G-Mock doesn't allow this.  The only rationale I can
> come up with is that "big" processors tend to have branch predictors on
> virtual function tables while "small" processors do not, so for the
> environments Google typically targets there really isn't any thought
> put into the cost of virtual.
>
> Though, marking everything virtual also eliminates any sort of 'inline',
> so I am somewhat perplexed even on that aspect as the basis.
>
>
> Back to Big Picture
> ----------------------------
>
> I do feel we are seriously missing a broader philosophical question.  In
> what ways do you feel the CppCoreGuidelines do not address all of your
> coding guidelines suggestions?  Rather than write an entirely new
> document why would we not simply point to that and then extend it if we
> feel it is insufficient in a few key areas.  As you saw in my previous
> email, nearly everything we agree on comes from the CCG already and
> nearly everything we disagree on is a contradiction from the CCG.

Our intention isn't to come up with a new document.  In our summit
meetings, you agreed we could propose a coding guide as you didn't
have anything spelled out and you didn't want to tackle it.  So we put
together an initial proposal and we're discussing it.  If your decision is
to say that CCG is the final say for all coding practices in openBMC
then we'll have to figure out what that means for us.

> I do not want to hear that the CCG is "too long".  If you take out the
> references it is in the same order of magnitude as the GSG.  It is also
> organized in a way where each section is a straight-forward statement to
> be followed, so anyone unfamiliar or uninterested can simply read the
> TOC to understand all of the rules (ex. "I.2 Avoid global variables.",
> E.6 Use RAII to prevent leaks").

"You do not want to hear"...I don't understand why you state is in such
a manner or assume we'd make this point.

> [1] http://wiki.qt.io/Qt_Coding_Style
> [2] http://wiki.qt.io/Coding_Conventions
> [3] https://github.com/open-power/hostboot
> --
> Patrick Williams

----------
Nancy
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.ozlabs.org/pipermail/openbmc/attachments/20161107/3b2d191c/attachment-0001.html>