Code Generation in Rust vs. C++26

brevzin.github.io

129 points by steveklabnik 4 days ago

In case the author of the article is reading...

> This is because Rust’s attribute grammar can’t support a callable here.

The grammar of attributes supports it fine, it's just that Serde chooses to not use it. I'm not sure if it's because Serde started before it was allowed or if it's a stylistic preference or what.

For example, my crate SNAFU allows you[1] to use attributes containing types (`Error`) and expressions (`Box::new`):

    #[derive(Debug, Snafu)]
    #[snafu(source(from(Error, Box::new)))]
    struct ApiError(Box<Error>);

[1]: https://docs.rs/snafu/latest/snafu/derive.Snafu.html#transfo...

steveklabnik a day ago

This is also my error, and the author has been informed. He just hasn't updated the post yet, but will.
I forgot that this got stabilized. It's easy to lose track of everything sometimes!
EDIT: oops forgot to reply to this:
> I'm not sure if it's because Serde started before it was allowed or if it's a stylistic preference or what.
This was stabilized post 1.0, and serde is (as you know) older than Rust 1.0. That's why I thought that it wasn't possible. https://doc.rust-lang.org/1.0.0/reference.html#attributes says
> An identifier followed by the equals sign '=' and a literal, providing a key/value pair
Of course, we didn't even have stable proc macros at that point. I tried to dig into the history here for a bit, but it was taking too long and I didn't manage to find the exact point at which this came to be.

OptionOfT a day ago

I'd love to read more on how you implemented this. I hope I don't sound lazy, but can you point me to a starting location to read up on it?
Maybe it's something I can backport to serde.
- shepmaster a day ago
  
  Code-wise, it's not too painful [1], the problem is that you need to enable more features for syn. By default, syn doesn't compile in support for parsing arbitrary types / expressions, which does increase the time / space needed.
  Since syn is a pretty fundamental crate, I've a feeling that Serde probably doesn't want to turn on this feature for minimal gain, but that's pure speculation on my part.
  [1]: https://github.com/shepmaster/snafu/blob/1dbba9514e2abfdff01...
  [2]: https://github.com/shepmaster/snafu/blob/1dbba9514e2abfdff01...

jepler 2 days ago

A modest proposal: stop adopting new digraphs like [:, ^^, [[.

Unicode has at least 50 sets of pairing punctuation just waiting for use...

       \N{MEDIUM LEFT-POINTING ANGLE BRACKET ORNAMENT} U+276c 
       \N{HEAVY LEFT-POINTING ANGLE BRACKET ORNAMENT} U+2770 
       \N{LIGHT LEFT TORTOISE SHELL BRACKET ORNAMENT} U+2772 
       \N{MEDIUM LEFT CURLY BRACKET ORNAMENT} U+2774

and multiple blocks of mathematical operators including

      ⊥ \N{UP TACK} U+22a5 ⊥
      ⊦ \N{ASSERTION} U+22a6 ⊦
      ⊧ \N{MODELS} U+22a7 ⊧
      ⊨ \N{TRUE} U+22a8 ⊨
      ⊩ \N{FORCES} U+22a9 ⊩

so you could just write the much clearer and more distinctive ⹗derive<serde::Serialize>⹘ and so on for the other new multi-punctuation sequences.

Yes, by the time C++37 comes around it might be necessary to petition Unicode to add additional code points. This wouldn't be a problem, for two reasons:

First, Unicode is accustomed to adding new code points and might even be willing to pre-allocate some entire pages to the C++ committee.

Second, the existing ways to modify emoji could apply to designated mathematical modifiers as well. For instance, ∅ would denote a distinct future C++ operator symbol from ∅ or ∅ (sadly, as of 2024, hacker news can't render "pale woman "A-type" empty set" or the other empty set symbol variations I lovingly entered in this paragraph).

These sequences are highly preferable to ASCII sequences like [: because the ZWJs allow supporting editors to correctly render them as a single glyph occupying a single terminal cell and without using font ligature hacks.

Filligree 2 days ago

Your serde example isn’t even rendering in my browser. It’s also missing from my keyboard; how do you propose I type it?
- greenavocado a day ago
  
  This isn't extreme enough. We need to use APL specific keyboards and write in APL for maximum clarity
  https://xpqz.github.io/learnapl/intro.html
- zarzavat a day ago
  
  Lean (and iirc Mathematica) use backslash escapes: you type \symbolname and the symbol is inserted by your editor.
  You can also imbue the backslash escape sequence with the same meaning as the unicode, so that in the event that the editor didn’t make this replacement it would still mean the same thing.
  - shwouchk a day ago
    
    Julia also. Mathematica uses literal “escape” sequences, i.e. you start a symbol by pressing the Esc key and finish with another Esc (aside from a bunch of bindings for commonly used things).
- layer8 a day ago
  
  The most general solution is to configure yourself a Compose key [0]. Useful for typing anything that doesn’t directly figure on your keyboard.
  [0] https://en.wikipedia.org/wiki/Compose_key
- 98469056 a day ago
  
  That's the joke
  - dooglius a day ago
    
    It got me, I've seen this sort of suggestion made un-ironically enough that it seemed believable
  - mikepurvis a day ago
    
    I also assumed initially that it was serious, but upon reflection, "a modest proposal" is the educated man's /s.
fanf2 a day ago

One of the fun things about the Unicode bidi algorithm is that it flips brackets. You always use ( as an open round bracket, but if your script is rtl it appears like ). In order to support this feature, Unicode has a list of all known kinds of reversible paired brackets.
https://www.unicode.org/Public/UNIDATA/BidiBrackets.txt
For extra fun, C++ < angle brackets > are, of course, not brackets.
eximius a day ago

Is this satire? It has to be, right?
- omoikane a day ago
  
  This reminds me of Bjarne Stroustrup's proposal on overloading whitespace, and around page 4 he suggested using the full Unicode character set:
  https://www.stroustrup.com/whitespace98.pdf
  This was an April Fool's joke:
  https://www.stroustrup.com/whitespace.html
  I am not sure about the parent comment.

steveklabnik 4 days ago

Serde is a fantastic part of the Rust ecosystem, and is often credited as a reason people reach for Rust. This power and convenience coming to C++ should be a cause for celebration, in my mind. I am sad that Rust is missing an opportunity to do similar here, and hope that someone will pick the proposal back up someday.

Barry was kind enough to share a draft of this with me, and he inserted this based on some of my feedback:

> newer Rust has something called derive macro helper attributes which will make this easier to do.

Apparently I am mistaken about this, and basically every Rust procedural macro does what serde does here. I find the documentation for this a bit confusing. I've emailed him to let him know, and please consider this mistake mine, not his!

j-pb 2 days ago

> Serde ... is often credited as a reason people reach for Rust.
I've heard this narrative a lot, but I'd challenge it. I tend to actively disable serde feature flags and don't use libraries with mandatory serde dependencies.
It seems to be one of those things that are good enough™, while still falling short of being actually good.
- Rust's type system is strong enough to build extremely powerful zerocopy serialization, but people don't explore that space because of serde
- macros and especially proc-macros in Rust are horrible, and are only feasible because of the syn crate
- syn itself is essentially a hack that has become so entrenched that there is little incentive to improve Rust's metaprogramming story
- serde adds a ton of compile time whenever used because of the heavy macro usage
- the author of serde and syn has been at the center of a ton of drama, being at the root of the keynote incident that essentially stalled comptime reflection (Zig is eating Rust's lunch in that department) and caused a huge setback on Rust's inclusivity, shipping binary blobs in serde to make a point and push an RFC (essentially abusing the power he has with serde), being the final decision maker of the latest core::time breakage, all leave a bitter aftertaste for someone who doesn't apparently have any official position in the Rust Foundation with the appropriate accountability, yet holds an enormous amount of hidden power (and being ex-palantir also makes me a lot less inclined to give him the benefit of doubt, that these were all unfortunate misunderstandings)
I think the overall impact of serde and syn, including the author's behaviour, has been a net negative for the Rust ecosystem, technically, personally, and marketing-wise (the Rust community is perceived as being unprofessional and overly dramatic).
Having written this down like that makes me realise that I don't just avoid serde, I boycott it.
- steveklabnik a day ago
  
  That proc macros have significant drawbacks, and you personally weigh those drawbacks as having a lot of significance, doesn’t change that for many people, the juice is worth the squeeze. 375M downloads don’t lie.
  I agree with you that there are significant drawbacks, which is why I’d love to see reflection happen in Rust.
  > being at the root of the keynote incident
  I don’t believe this to be true, based on what I know from both public and private information. I think a lot of people don’t like dtolnay due to the Palantir thing, and decided to try and inflate his involvement.
  But even from just public information: blame lies with the decision makers. By his own account, Josh Triplett suggested a demotion, and Sage Griffin set that into motion. They’ve both since apologized. I do not believe they are solely to blame, but if you want to point fingers at individuals, they’re who I would describe as “in the center.”
  Overall the blame lies with Rust’s continual failure to do governance in a way that’s responsive and legible to its users, and its habit of ignoring the written rules to just do whatever whoever has the soft power wants. This isn’t a unique story in open source, of course.
  > someone who doesn't apparently have any official position in the Rust Foundation
  The Rust Foundation does not govern most of the things you’re upset with. He is a member of the libs-api team in the Project, though.
  > being ex-palantir
  I’ll be the first to say “Fuck Palantir” but being upset at people for where they worked a long time ago just doesn’t sit right with me. Do you think that anyone who’s ever worked anywhere objectionable should be branded with that decision for life? Many people don’t like me for working at Cloudflare. Is there no possibility for change in your mind?
  If we truly want to play that game, you’re gonna have to exclude like half of the industry.
  - j-pb a day ago
    
    > 375M downloads don’t lie.
    That de-facto standardisation (and resulting monopoly due to the network effect) is exactly why serde should either be part of std or be replaced with a more heterogeneous ecosystem.
    You conveniently ignore the other incidents. Had it only been the keynote incident (including the non-apology 3 months later after he was ousted by someone else, and the therein included misrepresentation of ThePhD's opinion regarding his own talk) I would have brushed it off, but there is a pattern to it. The palantir thing is just the cherry on top because it is indicative of a questionable moral compass to me.
    A compass which can of course change, but if I had such a controversial former employer I'd distance myself from them after having such a change of heart (which to my knowledge hasn't happened).
    Too long of a "Series of Unfortunate Events" to have this much power consolidated with a single BDFL for Rust's serialization and metaprogramming story.
    > If we truly want to play that game, you’re gonna have to exclude like half of the industry.
    Completely unrelated, but maybe that's what it takes to grow as a field. As a German I have a lot less business with palantir, but I'm reminded of the situation after WW2 where we had to essentially fill major political and bureaucratic roles with literal Nazis because we didn't have anyone else. The German internal intelligence service has been involved in a lot of crazy neo-Nazi terrorist attacks to this day. Go figure.
    
    steveklabnik a day ago
    
    > That de-facto standardisation
    Sure, that's a fine opinion to have, but you were challenging my "people use serde" assertion. All I'm saying is that people do in fact use it.
    > You conveniently ignore the other incidents.
    Yeah, I submitted this post two days ago, but it hit the front page now. I was not exactly prepared to see comments here, I was on my morning walk, and typed that out on my phone. It was already getting too long.
    > including the non-apology 3 months later after he was ousted by someone else
    A "non-apology" because he wasn't the one at fault! And he was "ousted" because people were crying for his head, specifically, because of the Palantir thing. Not because he had any real thing to do with it. Yet people seem to want to blame him regardless of the actual facts.
    But sure if you want to get into the other things, now that I'm at a computer:
    > shipping binary blobs in serde to make a point and push an RFC
    It is incredibly normal to ship binary blobs in other ecosystems. pjmlp will post on basically every Rust thread that Rust isn't a serious language until it can ship binary libraries. People care about compile times, and a binary serde would help with that.
    With ineffectual Rust leadership, sometimes just doing the thing is the only way to actually get things done. I don't see responding to the needs of your users in spite of other people dragging their heels as "abuse of power."
    > being the final decision maker of the latest core::time breakage
    Do you have a citation on this one? I agree that this probably shouldn't have happened, but given that it took five minutes to recognize that it happened, type "cargo update -p time", and then move on with my life, it's not a huge deal to me. But Rust decisions aren't usually made by a single person, so that he personally somehow made the call would surprise me.
    > (which to my knowledge hasn't happened).
    As someone who is reasonably well known on the internet, it's still wild to me how much people expect others to put out in public about their personal lives. Asking for someone to publicly denounce an employer from years ago is just weird. People are free to keep their private life private.
    Furthermore, even if he did do that, it's pretty clear that the folks who talk like this:
    > Too long of a "Series of Unfortunate Events" to have this much power consolidated with a single BDFL for Rust's serialization and metaprogramming story.
    wouldn't even accept the apology. It's been years of casting every single thing into the worst light, in bad faith. I am glad that he is seemingly unfazed by what I would describe as a borderline harassment campaign.
    
    j-pb a day ago
    
    > Sure, that's a fine opinion to have, but you were challenging my "people use serde" assertion.
    That wasn't my intention at all. I was challenging the narrative that serde has been one of the best things that happened to Rust since sliced bread/borrow checking.
    I'm gonna reuse a reply to a different comment because I feel it captures the gist of my opinion, and because the dtolnay drama was one point among many, and is again generating more emotional effort than it's worth:
    But my point is that his crates (and thus he) hold too much power, even if there had been 0 incidents. That's not his failure per se, but it's Rust's failure if a single person can hold this much power, and be involved with this many shit-storms, without someone stepping up and being like "maybe we should put the decisions for this essential infrastructure on more shoulders". Even if he was the Rust messiah himself, I wouldn't be comfortable with a Bus factor of one, for something that "is often credited as a reason people reach for Rust."
    We should stop putting serde on a pedestal and call it what it is, an ok serialization library, that falls short of the potential that Rust has in that space.
    We should also call syn by its name: a lisp-style macro system to polyfill the shortcomings of Rust's macros, that has become Rust's de-facto macro system for anything that goes beyond the simplest declarative macros, at 600.000.000 downloads.
    No matter how you and I feel about the guy behind those things, it seems obvious to me that we should strive to replace serde with something better, and either replace the need for syn with a system for compile time introspection (I don't have high hopes for this anymore), or pave the cow paths and integrate the existing API into core (as a side effect also cutting down on compile times).
    
    jujube3 a day ago
    
    Feel free to step up and support your own serialization library. You too can have this "power"
    Note: Your budget will be $0, your reward will be people hurling insults at you.
    
    j-pb 19 hours ago
    
    That's not how network effects work.
    
    steveklabnik a day ago
    
    > That wasn't my intention at all.
    I see, I misunderstood you, my apologies.
    > That's not his failure per se, but it's Rust's failure if a single person can hold this much power
    I do agree that bus factor is important. Unfortunately, this seems to happen in a lot of ecosystems: https://xkcd.com/2347/
    
    Ygg2 a day ago
    
    > But my point is that his crates (and thus he) hold too much power, even if there had been 0 incidents. That's not his failure per se, but it's Rust's failure if a single person can hold this much power, and be involved with this many shit-storms, without someone stepping up and being like "maybe we should put the decisions for this essential infrastructure on more shoulders". Even if he was the Rust messiah himself, I wouldn't be comfortable with a Bus factor of one, for something that "is often credited as a reason people reach for Rust."
    The man has thanklessly supported a set of libraries that evolve with basically every minor Rust version (syn/quote) for 6 years with minor hiccups, and you want to Bus him? Over holding a seat at the libs table? That's not a BDFL, that's a seat at the round table.
    Hic Rhodus, hic salta! You try doing the same.
- aabhay 2 days ago
  
  The fact that most crates are compiled with serde as a feature flag, and the fact that you can raise a PR on most things that aren’t, are both great cases for Rust having an awesome ecosystem. Just as with async support, serialization support is a free market of ideas.
- mardifoufs a day ago
  
  >(Zig is eating Rust's lunch in that department)
  I mean, has it? I find the way it works in Zig rather clunky, and unless I'm missing something (and I know comptime reflection isn't the same), I just usually end up needing or wanting something closer to c++ templates or even consteval/constexpr. Again, I'm pretty sure this is a total skill issue and just a lack of imagination from my side, but I don't see what I could do with zig that I can't with c++ (I'm specifically referring to comptime here). I know the difference in theory, as c++ doesn't have actual reflection, but I wonder what's the practical difference in use cases.
  (In fact I used to think that not having a separate metaprogramming language was awesome, as I really disliked the language inside a language that are c++ templates but I'm not sure I agree anymore. Especially with additions like constexpr/eval)
- Ygg2 a day ago
  
  I'd like to challenge your theories.
  > syn itself is essentially a hack that has become so entrenched that there is little incentive to improve Rust's metaprogramming story
  It's not a hack; it's a way to expose compiler internals by way of 3rd party crates.
  I think Rust developers don't want to maintain it, because it would lead to compiler calcification, via backwards compatibility guarantees.
  > Rust's type system is strong enough to build extremely powerful zerocopy serialization
  Unclear what you mean, but other than syn and quote you don't have a way to reflect and do code gen, outside of build script. Which also use it.
  Note there are alternatives like miniserde and nanoserde, but no library for type reflection.
  > the author of serde and syn has been at the center of a ton of drama
  By ton you mean two. The first drama was that he expressed interest in JeanHeyd Meneide's project (because of reduced build times) but thought it not the best presentation for RustConf. He wasn't the only one. But his vote led to JeanHeyd burning all bridges with the Rust project.
  The second is speeding up serde by converting some crates to binary blobs. This broke reproducibility of some builds, even if it's not a SemVer violation.
  > being the final decision maker of the latest core::time
  [Citation needed] I've seen https://github.com/rust-lang/rust/issues/127343#issuecomment...
  That's just notes from the meeting. Do you have proof he was the deciding vote?
  > and being ex-palantir
  Ok. This is a straight-up conspiracy theory. It implies everyone working for Palantir is a Mossad-level spook.
  > the Rust community is perceived as being unprofessional and overly dramatic
  The Rust community can be overly dramatic and unprofessional without dtolnay. He wasn't the one harassing the Actix developer via GitHub.
  ---
  The problem with your theory is that it's disjointed and too simple. It makes sense in pieces but not as a whole.
  Dtolnay needs to want comptime reflection for faster compilation, but also sabotage it. He needs to promote binary blobs but also leave the source. He needs to sabotage time for ??? And make the Rust community seem deranged, when they are doing a fine job on their own.
  - j-pb a day ago
    
    It might be that I'm shooting the messenger here, and that he was just the one that closed the issue.
    But my point is that his crates (and thus he) hold too much power, even if there had been 0 incidents. That's not his failure per se, but it's Rust's failure if a single person can hold this much power, and be involved with this many shit-storms, without someone stepping up and being like "maybe we should put the decisions for this essential infrastructure on more shoulders". Even if he was the Rust messiah himself, I wouldn't be comfortable with that Bus factor.
    No grand conspiracy needed. Being ex-palantir just makes me more inclined to believe, if he acts like a jerk, that he might actually be a jerk; nothing more, nothing less.
  - cstrahan a day ago
    
    >> Rust's type system is strong enough to build extremely powerful zerocopy serialization
    > Unclear what you mean, but other than syn and quote you don't have a way to reflect and do code gen, outside of build script. Which also use it.
    > Note there are alternatives like miniserde and nanoserde, but no library for type reflection.
    Reflection/comptime/codegen are all orthogonal to zero copy (de)serialization.
    To briefly(ish) describe zero copy by way of comparison, consider JSON: there's no way to parse a JSON object into your language's data structures without (among other things) copying any strings you come across (rather than directly borrowing a slice of the original buffer). Why not? Consider escapes: you must first unescape the string, which entails a new allocation. But this doesn't stop at strings: if you have an array, you must first parse that array (usually accumulating a copy of the elements in a Vec<Object> or similar, which in turn means that the whole array was effectively copied), rather than providing a "cursor" into the original buffer. Parsing JSON requires traversing the entire buffer and copying everything you come across.
    Protocol buffers works much the same way: because structures are variable length (and in fact, scalars are too -- integers are stored as base 128 varints), if you want the Nth element of an array, you have no choice but to parse all the preceding elements (rather than nudging a cursor's offset by N×sizeof(Elem) -- you can't do that because the size of any element is unknown until after you've parsed it). Because you want O(1) indexing after parsing, the logical implication is that whenever you parse a protobuf message, the protobuf library parses (and copies) the entire thing, and any array/repeated field ends up as a new array allocation in your language (e.g. Vec<Elem>).
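To make the indexing cost concrete, here is a minimal sketch of base-128 varint decoding using only the standard library. The function names `decode_varint` and `nth_varint` are hypothetical and are not taken from any protobuf implementation; the input is assumed to be a plain run of back-to-back varints.

```rust
/// Decode one base-128 varint from the front of `buf`.
/// Returns the value and the number of bytes consumed, or `None` if the
/// input ends early or runs past the 10 bytes a u64 can need.
pub fn decode_varint(buf: &[u8]) -> Option<(u64, usize)> {
    let mut value: u64 = 0;
    for (i, &byte) in buf.iter().enumerate().take(10) {
        // The low 7 bits carry payload, least-significant group first.
        value |= u64::from(byte & 0x7f) << (7 * i);
        // A clear high bit marks the last byte of this varint.
        if byte & 0x80 == 0 {
            return Some((value, i + 1));
        }
    }
    None
}

/// Fetch element `n` from a run of back-to-back varints.
/// There is no way to jump straight to it: each earlier element has to be
/// decoded just to learn where the next one starts.
pub fn nth_varint(mut buf: &[u8], n: usize) -> Option<u64> {
    for _ in 0..n {
        let (_, used) = decode_varint(buf)?;
        buf = &buf[used..];
    }
    decode_varint(buf).map(|(value, _)| value)
}
```

For instance, `decode_varint(&[0xAC, 0x02])` yields `Some((300, 2))`, and reaching element 2 of `[0x01, 0xAC, 0x02, 0x7F]` means decoding elements 0 and 1 first.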
    Contrast with something like flatbuffers or Cap'n Proto: the code generated from your schema file gives you structures that (more or less) have two fields: a reference to a buffer, and an offset into that buffer. When you do something like person.age(), the offset of the age field (which is constant) is added to the offset of this person record, and that combined offset is used to index into the buffer (something like buffer.read_u32(offset)). Using a library like this gives you performance similar to what you'd have dereferencing array indices and struct fields in plain old data types in your language of choice. You don't pay in memory and processor time parsing everything up front, you only pay for the scalar memory reads you actually use (and a little bit of quasi-pointer chasing, not unlike what would happen with native structs).
    Put another way: a zero copy (de)serialization protocol is one where the on-disk format is the same as the (readily usable) in-memory format. This rules out things like string escaping (just store the original bytes), variable sized records/integers, variable length arrays stored inline with records (otherwise that would make records themselves variable length), storing pointers in records (because those pointers will be meaningless when read from disk; instead: store offsets), etc.
    You can read more about zero copy as it relates to Rust here:
    https://manishearth.github.io/blog/2022/08/03/zero-copy-1-no...
    Here's the Wikipedia article:
    https://en.wikipedia.org/wiki/Zero-copy
    Examples of zero copy (de)serialization libraries:
    https://github.com/rkyv/rkyv
    https://github.com/google/flatbuffers
    https://github.com/capnproto/capnproto
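The "reference to a buffer plus an offset" shape described above can be sketched in a few lines of Rust. This is only an illustration under stated assumptions: the 8-byte record layout, `PersonRef`, and `person_at` are all made up here and do not reflect the actual wire format or generated code of flatbuffers, Cap'n Proto, or rkyv.

```rust
/// Hypothetical fixed-size record layout, assumed for this sketch only:
///   bytes 0..4  age, u32 little-endian
///   bytes 4..8  id,  u32 little-endian
pub const PERSON_SIZE: usize = 8;
const AGE_OFFSET: usize = 0;
const ID_OFFSET: usize = 4;

/// A view of one record: a borrowed buffer plus an offset into it.
/// Nothing is parsed or copied when the view is created.
#[derive(Clone, Copy)]
pub struct PersonRef<'a> {
    buf: &'a [u8],
    offset: usize,
}

/// Read a little-endian u32 at `offset`; the caller guarantees the bounds.
fn read_u32(buf: &[u8], offset: usize) -> u32 {
    let b = &buf[offset..offset + 4];
    u32::from_le_bytes([b[0], b[1], b[2], b[3]])
}

impl<'a> PersonRef<'a> {
    /// Record offset plus the constant field offset, then one scalar read.
    pub fn age(&self) -> u32 {
        read_u32(self.buf, self.offset + AGE_OFFSET)
    }

    pub fn id(&self) -> u32 {
        read_u32(self.buf, self.offset + ID_OFFSET)
    }
}

/// O(1) access to record `n`: because every record has the same size,
/// its position is simple arithmetic. Returns `None` if the record would
/// not fit inside the buffer.
pub fn person_at<'a>(buf: &'a [u8], n: usize) -> Option<PersonRef<'a>> {
    let offset = n.checked_mul(PERSON_SIZE)?;
    let end = offset.checked_add(PERSON_SIZE)?;
    if end > buf.len() {
        return None;
    }
    Some(PersonRef { buf, offset })
}
```

With a 16-byte buffer holding two such records, `person_at(&buf, 1)` does no parsing at all; the only work happens when `age()` or `id()` is called on the returned view.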
    
    Ygg2 a day ago
    
    > Put another way: a zero copy (de)serialization protocol is one where the on-disk format is the same as the (readily usable) in-memory format.
    Ok, but for it to be useful in a (Rust) program it has to be converted to a Rust type. At some point you'll have to do a conversion. Whether it's UTF-16 to String or string "false" to `bool`.
    The reflection, code gen, etc. is the answer how you convert the values auto-magically.
    
    cstrahan a day ago
    
    > The reflection, code gen, etc. is the answer how you convert the values auto-magically.
    I don't think anyone (j-pb included) is saying anything to the contrary.
    Here's what you wrote:
    >> Rust's type system is strong enough to build extremely powerful zerocopy serialization
    > Unclear what you mean, but other than syn and quote you don't have a way to reflect and do code gen, outside of build script. Which also use it.
    But your response doesn't logically follow from the text you quoted (so I figured you weren't familiar with zero copy). This isn't j-pb saying that Rust's type system could be used to forgo proc-macros -- j-pb isn't saying anything about proc-macros in the text you quoted.
    To be clear, these two points from j-pb's original comment are two separate, orthogonal issues:
    > - Rust's type system is strong enough to build extremely powerful zerocopy serialization, but people don't explore that space because of serde
    > - macros and especially proc-macros in Rust are horrible, and are only feasible because of the syn crate
    
    j-pb 18 hours ago
    
    Exactly! Serde is good enough that few people look for alternative ways of doing things (i.e. zerocopy), and the network effect incentivizes just integrating with it, to be compatible with all the other crates. At the same time (but orthogonal to the previous point) it serves as one of the first contact points and prime example of how to do metaprogramming in Rust for new people (syn/quote and derive proc macros). This combination of being one of the first things you touch, its age, and the fact that quote/syn IS a much better API than what proc macros natively expose has, in my opinion, a lot of influence over (old and new) Rust developers' metaprogramming mindshare, and pushes the language down a specific garden path/design space. (What I'm trying to champion here is that this should be done more deliberately, much like jQuery eventually caused the creation of new querySelector web APIs.)
    
    Ygg2 6 hours ago
    
    > - Rust's type system is strong enough to build extremely powerful zerocopy serialization, but people don't explore that space because of serde
    That's what I have a problem with. Pure Zero-copy parsers aren't explored because 99.9% of the time you have to escape and/or convert data to be useful.
    Let's say we create a localization library that's zero-copy. Great, now whenever we call a field, since it's zero copy and might contain an escaped value, we need to invoke the escape function on it. So every call of field actually has the overhead of a method call. You know what doesn't have that overhead? Converting it once and serving it constantly. But that's not zero-copy, as per the explanation given.
    It's not due to serde, it's because people don't want or need a pure zero copy parser.
    Zero-copy parsers, as you explained, make a lot of sense if you have a packet of data, do some mapping, evaluating and send it over the wire ASAP. That's not the same use case as storing data for longer time like settings, localization, serialization, etc.
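The trade-off between unescaping on every access and converting once can be made concrete with `std::borrow::Cow`, which borrows when no conversion is needed and allocates only when it is. A minimal sketch, assuming a simplified backslash escape set; `unescape` is a hypothetical helper and not part of serde or any library mentioned in this thread.

```rust
use std::borrow::Cow;

/// Resolve backslash escapes in a raw field.
/// If the field contains no backslash it is returned as a borrow of the
/// input (no copy); otherwise a new String is allocated once.
/// The escape set here (`\n`, `\t`, anything else stands for itself) is a
/// simplification chosen for this sketch.
pub fn unescape<'a>(raw: &'a str) -> Cow<'a, str> {
    if !raw.contains('\\') {
        return Cow::Borrowed(raw);
    }
    let mut out = String::with_capacity(raw.len());
    let mut chars = raw.chars();
    while let Some(c) = chars.next() {
        if c != '\\' {
            out.push(c);
            continue;
        }
        match chars.next() {
            Some('n') => out.push('\n'),
            Some('t') => out.push('\t'),
            // Any other escaped character (such as `"` or `\`) stands for itself.
            Some(other) => out.push(other),
            // A lone trailing backslash is kept as-is.
            None => out.push('\\'),
        }
    }
    Cow::Owned(out)
}
```

Calling this on every field access pays the scan (and possibly an allocation) each time, which is the overhead described above; calling it once and storing the result gives up the "pure" zero-copy property only for fields that actually contain escapes.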
    > - macros and especially proc-macros in Rust are horrible, and are only feasible because of the syn crate
    Macros by example don't use the syn crate; they are more like regex for syntax than anything.
    Proc macros were always considered a kind of temporary feature that became a backbone of Rust. Pretty sure it was used a lot in some Servo components, plus it's kinda needed for many other things.
    Additionally, syn isn't the only way to write a proc macro. You're 100% free to roll your own; you just have to account for all the edge cases, all the minor nits in the language, etc. So you can write your own buggy version of syn, or you can use syn.
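As an illustration of the "regex for syntax" point, here is a small macro-by-example that needs neither syn nor a proc macro. The `record!` macro and the `Person` type are hypothetical, written only for this sketch; the macro matches a restricted struct-like pattern and generates both the struct and a list of its field names.

```rust
/// Match `struct Name { field: Type, ... }` token by token and emit the
/// struct plus a `field_names` function. This is pure pattern matching on
/// the input tokens; the macro has no type information to work with.
macro_rules! record {
    (struct $name:ident { $($field:ident : $ty:ty),* $(,)* }) => {
        #[derive(Debug, Default)]
        pub struct $name {
            $(pub $field: $ty),*
        }

        impl $name {
            /// Field names in declaration order, captured via `stringify!`.
            pub fn field_names() -> &'static [&'static str] {
                &[$(stringify!($field)),*]
            }
        }
    };
}

record!(struct Person { name: String, age: u32 });
```

Here `Person::field_names()` gives back `["name", "age"]`. Anything the pattern does not anticipate (attributes on fields, generics, visibility modifiers) simply fails to match, which is part of why more general derives tend to be written as proc macros.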
  - mardifoufs a day ago
    
    I mean, not everyone working with FSB-related organizations is a Mossad-level spook either, but it would still be weird to not be suspicious about it. And it sure wouldn't be weird to not want to associate with said person. Not sure why that doesn't apply to organizations that are super connected to the CIA and a lot of other intelligence/security agencies.
    Though I agree that I wouldn't mind as much if I was american, it's just my "foreign" perspective.
the_mitsuhiko 2 days ago

I still feel that it was a big mistake in retrospect that we ended up with serde style macros all over the place instead of getting introspection early on. The end result is ungodly amounts of quite convoluted code that in parts is not even compatible with each other. It also greatly restricts what you can do, because you cannot be conditional on the thing you're generating for.
I'm not sure if the C++ solution proposed is the right abstraction, but I feel very confident in saying that Rust's is not the gold standard that one should strive for.
- steveklabnik 2 days ago
  
  Full agreement. Proc macros are an accident of history in many ways. I would encourage new language authors to go straight towards reflection instead.
  - cogman10 a day ago
    
    Really makes me appreciate Java/kotlin's annotation processors. They are much cleaner/clearer APIs for writing compile time generated software and the entire ecosystem has benefited from them.
    
    Twisol a day ago
    
    I, uh, cannot in good conscience call Java's annotation processing APIs "clean" or "clear"; the way they interact with the multi-round processing model, and in particular make it extremely hard to build well-behaved processors that can cope with parts of the compile-time code graph not existing until later rounds, has baffled me for a long time.
    Despite their definite flaws, though, I have to agree: compile-time code generation via annotation processing is something I think we should do more of (and invest more time into better tools for it).
    
    cogman10 a day ago
    
    > in particular make it extremely hard to build well-behaved processors that can cope with parts of the compile-time code graph not existing until later rounds, has baffled me for a long time.
    I'm not exactly sure how you can really make this particular problem better when working with compile time code generation. But what I'd say WRT cleaner and clearer: Java and Kotlin both expose much higher level type information than is available in Rust's proc macros. That's mostly what I meant by cleaner and clearer. Without needing to pull in weirdo not-quite-third-party libraries, you can reflect on and generate for all sorts of different type information.
    It seems to me that something like this should be possible with rust given its early transformation into high level bytecode. But I could see why the rust devs have pushed back on doing that as it'd really lock in features they might not want to lock in.
VyseofArcadia a day ago

> Serde is a fantastic part of the Rust ecosystem, and is often credited as a reason people reach for Rust.
From what little I've seen of it, from a user perspective it seems like a pretty bog standard (de)serialization library. What is so special about it that people would reach for Rust to use it instead of an equivalent library in their language of choice?
- steveklabnik a day ago
  
  There isn’t anything super novel about Serde, for sure. But there’s three important things about it, in my mind:
  1. It is extremely convenient and pleasant to use.
  2. It’s kind of like pandoc: due to its design, it’s not just a serialization library for a specific format, but instead a platform for other libraries. Want to take in some JSON and produce TOML? No issues. Those two libraries are already interoperable. See 1 again.
  3. Due to being an extremely old library, as well as a good one, everyone supports it where appropriate. This ubiquity, combined with the flexibility of 2, is a big part of what makes 1 true as well.
  This doesn’t mean that Serde is perfect, mind you. But it is very good. And with Rust’s reputation as a difficult language, I think people are sometimes surprised when things are actually convenient.
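  To make point 2 concrete, here is a minimal sketch, in C++ to match the other half of the article, of the general shape: a type describes its fields once, a format consumes that description, and neither knows about the other. `Config`, `JsonLike`, and `TomlLike` are hypothetical names made up for illustration; this is not Serde's actual API, and it skips string escaping entirely.

```cpp
#include <sstream>
#include <string>

// Hypothetical data type: it lists its fields but knows nothing about any format.
struct Config {
    std::string name;
    int port;

    template <typename Format>
    void serialize(Format& fmt) const {
        fmt.field("name", name);
        fmt.field("port", port);
    }
};

// Hypothetical JSON-ish format: knows how to write fields, nothing about Config.
// Assumption: no escaping of quotes or control characters is attempted.
struct JsonLike {
    std::ostringstream out;
    bool first = true;

    void field(const char* key, const std::string& value) {
        separator();
        out << '"' << key << "\": \"" << value << '"';
    }
    void field(const char* key, int value) {
        separator();
        out << '"' << key << "\": " << value;
    }
    std::string str() const { return "{" + out.str() + "}"; }

private:
    // Emits ", " before every field except the first one.
    void separator() {
        if (!first) out << ", ";
        first = false;
    }
};

// Hypothetical TOML-ish format: one "key = value" line per field.
struct TomlLike {
    std::ostringstream out;

    void field(const char* key, const std::string& value) {
        out << key << " = \"" << value << "\"\n";
    }
    void field(const char* key, int value) {
        out << key << " = " << value << '\n';
    }
    std::string str() const { return out.str(); }
};
```

  Adding a third format means writing one more struct with `field` overloads, and every type with a `serialize` member works with it unchanged. That decoupling is the part I mean by "platform".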
- josephg a day ago
  
  I think the point here is that C++ notably doesn't have an equivalent to serde. It's very hard to make an equivalent library in C or C++ without some external codegen step, manually writing parsing code, or some horrible macros.
  Mind you, it sounds like this will change with C++26. Good times!
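  For a taste of the "horrible macros" option, here is a minimal X-macro sketch. `POINT_FIELDS`, `Point`, and `describe` are hypothetical names invented for this example, and the output format is arbitrary. The field list is written once and replayed twice: once to declare the members, once to print them.

```cpp
#include <sstream>
#include <string>

// Hypothetical field list: each entry is X(type, name).
#define POINT_FIELDS(X) \
    X(int, x)           \
    X(int, y)

// First replay: declare one data member per entry.
struct Point {
#define DECLARE_FIELD(type, name) type name;
    POINT_FIELDS(DECLARE_FIELD)
#undef DECLARE_FIELD
};

// Second replay: print "name=value;" for every entry.
inline std::string describe(const Point& p) {
    std::ostringstream out;
#define PRINT_FIELD(type, name) out << #name << '=' << p.name << ';';
    POINT_FIELDS(PRINT_FIELD)
#undef PRINT_FIELD
    return out.str();
}
```

  It works, but the struct definition is no longer readable as a struct, and every tool that touches the code has to see through the preprocessor first.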
tux3 2 days ago

As a followup, could the helper attributes feature be implemented to work how you thought it did, enough that most proc macros might want to start using it? =)
The author does have a good point when they note that the C++ compiler does the parsing and the "serde intermediate representation" for them, it's pretty nice to not have to bring your own parser for everything!
Could we give proc macro authors fast compiler-backed helpers that they can use as building blocks for their proc macros?
- mplanchard a day ago
  
  Having just written a proc macro, I'd even be happy with some higher level abstractions on top of syn, if anyone knows of any.
- steveklabnik 2 days ago
  
  I do not know! I would prefer more energy be put into revitalizing the reflection proposal than doing that, though of course open source is always about “who wants to do the work” and I won’t be doing either.
nrclark 2 days ago

Tangential, but does anybody else get real npm vibes from the rust ecosystem?
Something about the “every productive project depends on this one external package” situation really makes me uneasy. And there are language features like async that can’t even really be used without going to crates.io for a bunch of stuff that really ought to be in the stdlib.
- bryanlarsen a day ago
  
  withoutboats mentioned trying to get something like https://github.com/zesterer/pollster into the standard library. I think that's a great idea. They don't want to put an executor into the standard library because that would "pick a winner" before the ideas are settled. But pollster allows async crates to be useful to sync code, is obviously not a "winner" and its inclusion in the standard library would force crates to be properly executor-agnostic.
- steveklabnik a day ago
  
  npm, being one of the most successful package management systems in history, was an express inspiration for Cargo, sure. It’s also not without flaw, and so Cargo and npm do differ in some key ways.
  Software engineers love to talk about how code re-use is good, and keeping code small and simple is good, and then somehow get upset when a lot of modular, small, reusable code is produced and widely shared.
  - jpc0 a day ago
    
    > ... modular, small ...
    I think that's where the issue lies. It's either so small it would take 5 seconds to actually type or it is neither modular nor small... Very rarely is it one of those things, never mind both of them at the same time.
  - pjmlp a day ago
    
    And sadly it shows, given the amount of crates some projects depend on, plus having to wait for the same crate to be re-compiled multiple times, due to how it is referenced across the whole dependency graph.
secondcoming 2 days ago

I'm not convinced that serialisation should be a first class citizen of a language. Mainly due to issues with unintended breaking changes causing carnage downstream. Protobuf/FlatBuffers et al. are much safer.
Especially with C++ where doing something like serialising a bool raises questions about implementation defined behaviour such as:
- what bit pattern represents 'true'
- what is sizeof(bool)
- can something serialised with gcc be deserialised with clang
- steveklabnik 2 days ago
  
  None of this is specific to serialization. This is compile time meta programming. All of those questions are handled by the code that implements serialization, not the language feature that enables that code to be written.
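  As a rough illustration of "handled by the code that implements serialization": a library can pin the wire format itself, so neither `sizeof(bool)` nor the in-memory bit pattern ever reaches the output. `write_bool` and `read_bool` are hypothetical helpers, and the one-byte 0/1 encoding is just an assumption chosen for the sketch.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical wire format: a bool is always exactly one byte, 0 or 1,
// regardless of how the compiler represents bool in memory.
inline void write_bool(std::vector<std::uint8_t>& out, bool value) {
    out.push_back(value ? std::uint8_t{1} : std::uint8_t{0});
}

// Reads the byte at `pos`; any non-zero byte is treated as true.
// std::vector::at throws std::out_of_range if `pos` is past the end.
inline bool read_bool(const std::vector<std::uint8_t>& in, std::size_t pos) {
    return in.at(pos) != 0;
}
```

  Bytes written this way by code built with gcc read back the same with clang, because the format is defined by the library rather than by the compiler.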
- cryptonector a day ago
  
  Serde supports many serialization standards. These issues don't apply.

sixthDot 2 days ago

> Now, C++ does have one code generation facility today: C macros. It’s just a very poor and primitive one. Poor because of their complete lack of hygiene to the point where you could be accidentally invoking macros without knowing it (and standard library implementations guard against that), and primitive because even remarkably simple things conceptually (like iteration or conditions) require true wizardry to implement. That said, there are still plenty of problems today for which C macros are the best solution — which really says something about the need for proper code generation facilities.
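
For anyone who hasn't run into the "accidentally invoking macros" problem in the quote: a small sketch of it and of the usual guard. The `min` macro here is a stand-in written for this example, and `smaller` is a hypothetical function name.

```cpp
#include <algorithm>

// Stand-in for a function-like macro that some unrelated header might define.
#define min(a, b) (((a) < (b)) ? (a) : (b))

inline int smaller(int a, int b) {
    // Writing std::min(a, b) here would be rewritten by the macro into
    // std::(((a) < (b)) ? (a) : (b)) and fail to compile.
    // Wrapping the name in parentheses means `min` is not directly followed
    // by '(', so the preprocessor leaves it alone.
    return (std::min)(a, b);
}

#undef min  // keep the stand-in macro from leaking into later code
```

The parenthesized call is the same kind of defensive spelling that library code uses so that a stray macro cannot hijack the name.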

The D language has both: introspection (__traits, is, etc.) plus hygienic code generation (mixin, static if, static foreach, etc.).

germandiago 2 days ago

Plus few people use it and the ecosystem is not stellar I would say. And that is the only reason why I do not use it, probably.
It scores very high in code style and convenience, I can see there expertise in day-to-day coding actually, in the patterns it enables.
A pity, actually :(
- sixthDot a day ago
  
  I agree, it is a pity. D should have done better. Why the heck do people ignore it? They prefer talking about zig, odin, c3, v, nim, crystal, etc.? I occasionally look at their bug trackers and see bugs that were fixed years ago in D ;)
  - germandiago 16 hours ago
    
    D is incredibly balanced as in "people who know how to write code".
    It has a very pragmatic view of things, actually and does things that do work.
    I think the main reason D did not catch on is that its leadership (mainly Walter Bright, I guess) is very focused on the core technology: compiler writing and the like, and how to write software.
    This ignored things such as ecosystem, tooling or the availability of the compiler in a polished way for multiple systems. I think the part that has been ruining D is the fact that they had a very good language (good) but they did not take care of the rest of the aspects. Also, they have been quite poor at marketing it.
    I also saw complaints quite often for broken code or underspecified behavior. All these things together add up and can make people who just "want to get things done" run away.
    It is a pity because the core of the language is beautiful and useful: you can do very advanced generic programming, good OOP, avoid the GC when necessary, slices are well thought out, introspection and code generation are stellar...
    Like I said... a pity.
    Walter Bright is a genius in my humble opinion. But a genius in his area. I think he should pay a bit more attention to other aspects, because the core is really robust, it would be about improving it, marketing it, making it more stable, making it more widely available and improving the ecosystem.
    For example, last time I heard, D language for Android was not ready... you cannot afford that nowadays if you want to get through, I think priorities after desktop should be mobile...
    Idk. All in all, the perception, unless they change something, is against them compared to newer languages such as Rust (which I really do not like that much, even if I know its merits).
    So, the only way I see to make this perception change would be with a more targeted roadmap and an announcement that has impact in the marketing sense and a commitment that it is going to happen. The problem with this is that it needs resources and direction, and projects like this are in large part community-driven, so it is not easy to deal with something like this.
steveklabnik 2 days ago

Andrei Alexandrescu is involved in this work, they are aware of D for sure.
oneshtein 2 days ago

It's possible to use other macro languages with C/C++. For example, cvstrac uses a «translate» tool to replace `@` with `cgi_printf`.
- int_19h a day ago
  
  So long as it's a separate facility that is not aware of C++ syntax and semantics, you get the same problems wrt hygiene and lack of ability to resolve symbols.
- MiguelX413 a day ago
  
  It's significantly less portable to do so.

Joker_vD a day ago

Ah, the problem of detecting the last (or the first) element and treating it specially:

        bool first = true;
        [:expand(nonstatic_data_members_of(^^T)):] >> [&]<auto nsdm>{
            if (not first) {
                *out++ = ',';
                *out++ = ' ';
            }
            first = false;

            out = std::format_to(out,
                                 ".{}={}",
                                 identifier_of(nsdm), m.[:nsdm:]);
        };

I really wish there were a for/for-each variant that straightforwardly supported interspersed actions, something like this:

        [:expand(nonstatic_data_members_of(^^T)):] >> std::make_tuple([&]<auto nsdm>{
            out = std::format_to(out,
                ".{}={}",
                identifier_of(nsdm), m.[:nsdm:]);
        }, [&]{
           *out++ = ',';
           *out++ = ' ';
        });

An actual for-loop could also get a simple extension (which is quite easy to compile efficiently) but I can't figure out an intuitive enough syntax:

       for (int i = 0; i < 101; i++) {
           std::cout << i;
       } then {
           std::cout << ", ";
       }

is a bit fishy because the "then" block (can't call it "else", obviously) technically happens before the main loop body (i.e. inside it, the values of i would be 1, 2, ..., 100) but writing it before the main body would be even more confusing. Any suggestions? I'd really like to figure this feature out because I personally find it more useful than the "else block, but for the loops" feature.

pbsd a day ago
Normal for loops can't really make that work, they're too general, but range for loops plausibly could. Something like
```
    for(auto&& e : range) {
        std::print("{}", e);
        join {
            std::print(", ");
        }
    }
```
where join {} is effectively syntax sugar for if(std::next(__first) != __last) {}.
The fmt library makes this sort of task easier by providing the join adaptor; this example would become
```
    fmt::print("{}", fmt::join(range, ", "));
```
wrs a day ago

My favorite way of handling this particular case (in any language) is to express the elements as a sequence (preferably a lazy one that doesn’t actually allocate) and use whatever the stdlib calls “join” (my favorite name is Haskell’s “intercalate”) to insert the delimiters. I haven’t used C++ in decades, but Rust typically makes this pretty easy to write, and low-cost.
saurik 10 hours ago

It happens before the main loop body but it doesn't happen before the first loop, and it happens after the increment, so I frankly read it initially with skeptical eyes but then felt like it was more intuitive than it was being made out to be? (And FWIW, I am someone who has used for-else on occasion, yet have never found it intuitive.)

leni536 a day ago

  int i = 0;
  while (true) {
    cout << i;
    ++i;
    if (i == 101) { break; }
    cout << ", ";
  }

Joker_vD a day ago
That's too much branching.
```
    int i = 0;
    goto main_part;
    do {
        cout << ", ";
    main_part:
        cout << i;
    label_for_continue: // yeah, what about "continue"?
        i++;
    } while (i <= 100);
```
That's what most for-loops get transformed to during the translation anyhow (with the condition check on the bottom), but writing this by hand? Ugh.
Edit: actually, looking at Godbolt, it seems that gcc compiles your suggestion into exactly this shape, while clang splits the first iteration out, into something equivalent to
```
    cout << 0;
    for (int i = 1; i < 101; i++) {
        cout << ", ";
        cout << i;
    }
```
Hmm.

cherryteastain 2 days ago

For the specific domain of serialization/deserialization, the reflect-cpp [1] library (C++20) can serialize/deserialize arbitrary structs to/from several formats including json and yaml without any sort of tagging or traits

[1] https://github.com/getml/reflect-cpp

adzm 2 days ago

I'm still trying to figure out how it manages to get the member names at compile time!
Turns out it uses source_location and parses the string all at compile time!
https://www.reddit.com/r/cpp/comments/18b8iv9/c20_to_tuple_w...
- gpderetta 2 days ago
  
  In the grand tradition of C++, the compiler already knows everything the programmer wants, it just has to be coerced with increasingly convoluted incantations to give it away.

weinzierl 2 days ago

"Introspection — the ability to ask questions about your program during compilation"

Having lived in a Java bubble for some time compile-time introspection sounded like an oxymoron to me when I first heard it. Now I realize, there are worlds where introspection is understood to be at compile-time with such matter-of-factness, that it's not even worth mentioning.

steveklabnik a day ago

Yes, this is a good point. In a C++ context, RTTI already exists, and so for the intended audience, they would already understand that this is about compile time reflection. But sometimes when things hit a broader audience, there’s opportunity for misunderstanding.
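
A small sketch of the distinction, using only what C++ already has today rather than the C++26 feature: type traits answer questions while compiling, RTTI answers them while running. `Shape`, `Circle`, and `is_circle` are hypothetical names for the example.

```cpp
#include <type_traits>
#include <typeinfo>

struct Shape {
    virtual ~Shape() = default;  // a virtual member makes the type polymorphic
};
struct Circle : Shape {};

// Compile time: the compiler answers these; no object ever exists.
static_assert(std::is_base_of<Shape, Circle>::value, "Circle derives from Shape");
static_assert(std::is_polymorphic<Shape>::value, "Shape has a virtual member");

// Run time: RTTI looks at the dynamic type of an actual object.
inline bool is_circle(const Shape& shape) {
    return typeid(shape) == typeid(Circle);
}
```

The existing traits can only answer a fixed set of yes/no questions; the reflection in the article extends the compile-time side to things like enumerating members.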

coldcode 2 days ago

As an aside, having used C++ in the early 90s when it was first available in stand-alone compilers (rather than a preprocessor), it boggles my mind that we are seeing C++26.

bluGill 2 days ago

Once a language becomes popular it never really goes away. COBAL has long been on the list of things to never write code in because it is so bad. (COBAL was one of the first languages, and what makes it bad sounded good at the time; we only know those were bad ideas because we tried them extensively enough to find out. As such I cannot fault COBAL for being bad.)
C++ is also extremely popular, so it will take years to go away even if something better exists. I'm not convinced Rust is better: it has some good points that I find interesting, but there are also some trade-offs that need to be made, so I'm not sure it is better.
- Narishma a day ago
  
  Did you mean COBOL?
  - bluGill 13 hours ago
    
    Probably, spelling is not my strong point.

thechao 2 days ago

Is it me, or is the C++ reflection syntax goofily obtuse on purpose? Years ago, in grad school, I modified GCC to add operator overloading "for the rest of C++":

    operator if (C, T, E);
    operator ; (A, B);
    operator {} (A);
      :

And just used pattern matching, expression templates, and an unjustified belief in the good will of the compiler to cover reflection (introspection & generation).

I refuse to believe the first thing I thought of ~20 years ago is somehow more elegant than this proposal...

steveklabnik 2 days ago

This blog post doesn’t show the final syntax, there is another paper in progress that simplifies things.
- thechao 2 days ago
  
  Jaakko (Järvi), Gaby (dos Reis), and especially Bjarne would foist syntax on us graduate students just to see how we'd "naturally" respond to language feature proposals quite early on in the feature development cycle. That'd provide immediate feedback, because syntax can become a pretty unworkable constraint & torpedo a great idea. That's why Bjarne's original proposal for unified initializers was changed: none of us could figure out WTF the code was saying.

aabhay 2 days ago

I don’t understand the argument. Rust proc macros have some level of introspection. I can retrieve the type information from the AST. The reason you can’t always know whether something has a trait is because that constraint isn’t implemented at the type level. For example what if a trait is a subtrait of something inside the proc macro?

While I don’t have evidence, I’m guessing that true introspection would require turing completeness, allowing your code to never compile.

steveklabnik a day ago

I don’t know what argument you think is being made here, this is just an explanation of some features. It’s not making an argument.
- aabhay a day ago
  
  It seems to be making the argument that C++ 26 has new reflection features that outperform Rust code gen due to the unavailability of high quality proc macro features.
  - steveklabnik a day ago
    
    I see. I genuinely think the author is just trying to show off a new C++ feature, not argue that it's better than the Rust feature is. Of course, since they are different, there is comparison necessary, and proc macros aren't perfect. But it's really just about explaining why this is a cool feature, not trying to say Rust is bad.
    
    wrs a day ago
    
    Unless I misread, there is an assertion that Rust macros have to implement a lot of “compiler” logic themselves (whether or not through a separate crate) that is [will be] unnecessary with the new C++ feature. So not “bad”, but more clunky for the annotation implementor.
    
    steveklabnik a day ago
    
    For sure. That's just a statement of fact, though. This is why folks are interested in Rust having reflection as well.

xxljam 2 days ago

The approach in the C++ examples seems harder to read, and I'm not sure whether it's more powerful than the Rust one.

Also, I'm wondering if both are more powerful than C# source generators. I think they maybe are but not much in practice, at the same time being harder to use and debug. In simple terms, source generators are just libraries used by the compiler that generate new source code files that are then added to the compilation.

tempodox a day ago

As an aside, it's nice to see how Rust is giving C++ a run for its money.

I'm looking forward to these new capabilities.