Implement bit field sanity checks #20848
Thanks for your pull request and interest in making D better, @rikkimax! We are looking forward to reviewing it, and you should be hearing from a maintainer soon.
Please see CONTRIBUTING.md for more information. If you have addressed all reviews or aren't sure how to proceed, don't hesitate to ping us with a simple comment.
Bugzilla references
Your PR doesn't reference any Bugzilla issue. If your PR contains non-trivial changes, please reference a Bugzilla issue or create a manual changelog.
Testing this PR locally
If you don't have a local development environment set up, you can use Digger to test this PR:
dub run digger -- build "master + dmd#20848" |
Good news: all the analysis is doing its job. That makes me happy. Bad news: I'm not sure what to do about the tests, since this is generating an error, not a warning. @WalterBright I'm ok with a warning over an error, what do you want to do about this? EDIT: done. |
Okay, think I got the test suite sorted out. I've checked with @schveiguy, he is happy with what problems it's finding. This should solve our concerns. |
Thank you.
prints 4 on one machine and 8 on another
The approach I was contemplating is that if a different layout was produced when compiling for Microsoft vs non-Microsoft, then issue the error. Otherwise, I'd expect a number of false positives. The implementation code here does not appear to do that. This will also make it sensitive to |
Shouldn't all C scopes have the C linkage set? Yes it won't consider previous alignment, that is ok. It doesn't matter where the start byte offset is. If you do care about start byte offset, you can pack it with |
I don't see a reason to check against |
Okay looks like the C linkage is set for ImportC: https://github.com/dlang/dmd/blob/master/compiler/test/runnable/bitfields.c#L120 Never caused an error, and I'm pretty certain it should have. |
It alters the behavior of anonymous fields.
The solution I was anticipating was if the layout is different between platforms, then and only then flag it. That means emulating the layout algorithm.
This is the crux of the issue. Why does one not care about field alignment portability, yet care about bitfield layout portability? Nobody in the C world seems to care, why should we? (Numerous google searches turned up nothing significant, other than Linus saying if you do care, use explicit shift and mask.) |
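For readers unfamiliar with the explicit shift-and-mask style mentioned above, here is a minimal C sketch. It assumes a hypothetical 32-bit word with a 3-bit `mode` field in bits 4..6; the names `MODE_SHIFT`, `MODE_MASK`, `get_mode` and `set_mode` are illustrative and do not come from this PR.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: a 3-bit "mode" field stored in bits 4..6 of a 32-bit word. */
#define MODE_SHIFT 4u
#define MODE_MASK  (UINT32_C(0x7) << MODE_SHIFT)

/* Extract the field: mask off the other bits, then shift down. */
static inline uint32_t get_mode(uint32_t reg)
{
    return (reg & MODE_MASK) >> MODE_SHIFT;
}

/* Store a new value: clear the field, then OR in the shifted value. */
static inline uint32_t set_mode(uint32_t reg, uint32_t mode)
{
    return (reg & ~MODE_MASK) | ((mode << MODE_SHIFT) & MODE_MASK);
}

/* Usage: writing a value and reading it back yields the same value. */
static inline void mode_roundtrip_check(void)
{
    assert(get_mode(set_mode(0u, 6u)) == 6u);
}
```

Because the bit positions are spelled out in the source, the result does not depend on how any particular compiler lays out bitfields; the cost is the extra boilerplate per field.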
Yes, I saw them in the test suite.
struct F {
    ubyte :1;
    ubyte v:1;
}
That's an error in extern D. You can't introspect it. We did discuss it, and honestly, it's kinda insanity to allow this in extern D.
Indeed a very admirable goal that I thought was possible from what you've been saying. So to a certain extent, we just gotta go: "use your CI to detect all combinations you care about". No matter what we do, it's gonna fail at some point.
Keep in mind that the C folk don't have this guarantee that this PR adds. Therefore they have no choice but to go: "don't use language bitfields". We have it, therefore we can rely on it and use it in a lot more places. |
Nothing accounts for all compilers. We explicitly only account for the "associated C compiler".
Right, because in the last 35 years C has made no movement towards such a guarantee. They evidently do not care. There are millions of C users and google search turns up next to nothing on the issue. Why is it a monumental problem for D?
They do have a choice. It's trivial to write portable bitfields in C. In D it is, too, or one can use std.bitmanip. This whole thing is an exercise in wasting time. The
Frankly, I suspect I am the only D user who even wants to use bitfields, or people would be asking "why don't we merge it?" This has gone on for several years now. So why not merge the bitfields? I will use them, everyone else can go on doing what they do, and everyone will be happy!
P.S. I'm sorry to unload on you, this is not remotely your fault. I appreciate the efforts you have made here.
P.P.S. Bitfields would make the DMD front end source code significantly easier to read and maintain. And it will mesh perfectly with the gdc and ldc backends. It's just sad that we can't use it. |
If these are the only 2 C layouts we know of, this seems like a reasonable mechanism. |
I would like to use bitfields for the new GC. But they have to be precisely laid out. Having a guarantee they stay as expected would be very beneficial. (Note, the new GC currently only compiles with SDC, so some work needs to happen to get access to bitfields). |
Send me the layout you want, and I will send back the precise bitfield declaration for it. |
I know you're feeling frustrated about this, and yes I do want them too. But it does add a level of unknown movability that the compiler can do without a way for me to just slap on
And yes, this is 100% on the C designers' todo list to improve C's bitfields. I know it's on JeanHeyd Meneide's; I asked a while back. They have a rather long todo list though! |
Here is what I suggest: We merge this and turn on bitfields. In a couple of years time, we'll know if it is doing its job or not. If not, we can swap it for something more layout engine-aware, which is something you are having trouble with anyway. I'll even do the PR to turn it on! |
@rikkimax I'm curious what the C proposal is. (There are a lot of C proposals, very few make it into the Standard. A proposal doesn't mean the wider community is interested in it.) |
No proposals currently that I am aware of and I did check their documents listing. It's certainly not at the top of anyone's todo list. But regardless, I don't want to be debugging someone's code and it turns out that a simple bit width count wasn't enough to understand a packed struct. The prospect of that does not bring joy! |
I understand your point. It's great to have a mechanical check that finds errors so one doesn't have to debug. I've used linters and very pedantic error checkers. I soon abandoned them all because the errors given were false positives and the recommended fixes were unattractive. There's a sweet spot in there somewhere, where the real errors are found without being an annoying nag. It's hard to know where that spot is; the best we can do is rely on experience as a guide. (I've also abandoned entire languages (i.e. Pascal) because of the overbearing error checking.) |
I know how to do it correctly. I'm saying, it would be nice if the compiler is also helping me by making sure I don't make a mistake (or anyone else working on the code). "Send it to Walter if you want it to work" doesn't scale. Note, we wouldn't be having this discussion at all if some C compilers hadn't decided to do clever tricks to pack bits. Requiring the specification of every bit alleviates any of these problems. |
@schveiguy I appreciate your concerns. C programmers have been wrangling bitfields for 30 years, and there's a complete lack of enthusiasm there for making it spec defined rather than implementation defined. There's also no interest in dealing with the implementation-defined aspect of struct layouts, and not a single D programmer has complained about that. As mentioned before, there are 3 use cases for bit fields
There would only be concern about (3). But the compiler cannot know this, and so for (1) and (2) it will be annoying the user with errors that are not errors, and further annoying the reader with the addition of ugly syntax. I suspect also that (3) will be a minority use of bitfields. The user can use std.bitmanip for them, or (as Linus has suggested) use explicit shift and mask. Bitfields should be incorporated into D. If any significant problems develop, we can address that then, maybe as a warning. But I seriously doubt they will surface. Fixing problems where the cure is worse than the disease is not a good path (I've seen this with other languages). |
As most of my D programming work is about reverse engineering and strange architectures (and porting related C code), I am constantly complaining about that. I would love it if I could define a struct just once and have it work no matter where I compile it. That consistency takes a lot of annoying support code. Half my structs are littered with I already understand that nobody cares about that, but please don't pretend programmers that work with these things and have opinions about them don't exist.
I have ported at least a dozen C libraries and used several times as many. (1) is completely irrelevant. No sane library interface uses them, and they're even easier to fix up while porting than multi-dimensional arrays are. You have mentioned an "associated C compiler" countless times in these discussions, but that's a myth. Any particular program can involve multiple C compilers, and with closed-source software, you may not even be able to identify which ones! I don't even see any DMD options to select said compiler! Of course, C compilers don't have that option either.
Of course (3) will be a minority use. With bitfields being made intentionally useless for that, why WOULD anyone use them? If we had bitfields good enough for (3), we wouldn't even NEED "associated C compiler" implementations of bitfield layouts, because we could just emulate them in user code!
What disease is this feature curing, again? |
Perhaps not on its own. But clang does have the notion of whether it's doing MSVC based codegen. It's used to differentiate against MinGW. It's supplied as part of the arch for the target. |
and 2. My concern is about 2. I want bitfields to be laid out in an exact fashion, because the layout is important. I don't want that to change on platforms because the C compiler decided something else. Now, I know how to do it. You just specify all bits with the appropriate underlying type. This isn't hard to do. What if we have a UDA FWIW, pretty much the only time I have seen bitfields is (3). It's not uncommon. Here is a giant project that does just that, for a ton of embedded devices: https://github.com/espressif/esp-idf/blob/c5865270b50529cd32353f588d8a917d89f3dba4/components/soc/esp32s2/register/soc/i2c_struct.h#L14 |
Better UDA name: |
I don't see why an attribute is needed for other fields. Both C and D have packing support which covers that use case. It is only bit fields that haven't got a solution to it. |
Show me a layout you want, and I'll return a bitfield declaration that matches it, guaranteed. |
I understand that you want to make this a technical problem, but it is not a simple "provide alternative solution" kind of problem. My concerns are very human-oriented. When assisting other people. You do not see the entire code base. You may not even be in the right binary, let alone struct. Identification in this scenario may be impossible without compiler help, and it will happen. Us long-timers will be the ones paying the price. We will be making the exact same recommendation as C users do. It's not worth you using them, it's a bad feature. This is an awful conclusion, I don't want this for D. |
@Herringway with
If not matching what C does is important for your application, std.bitmanip sounds like the best solution for you? The "associated C compiler" is the one that the D implementation is designed to mirror. That includes mirroring the implementation-defined behavior of the C compiler for both alignment and bitfield layout. Mixed C/D programs exist, DMD itself is a prime example. If I started using bitfields in the compiler today, the gdc/ldc backends will continue to work without modification. If no sane C library uses bitfields in its interface, then bitfield layout is a non-issue for D. The dmd option to select an associated C compiler is the
The "disease" is having error messages for perfectly correct code, and the ugly syntax required to turn off the messages. BTW, some people on the forum have asked me how to achieve particular layouts in a portable manner, and I have obliged. I extend the same offer to you, with the proviso that you ask in the forum so that others can see how it's done. It isn't tricky nor hard. |
@rikkimax if bitfields do not work for you or the people you assist (I appreciate you helping them!) then do not use bitfields. I am familiar with Linus' rationale on them. I don't mind if you adopt it and recommend that people not use bitfields. I, however, wish to use bitfields which will make my code easier to write, read and understand. I've been using them for 40+ years and never had an issue with using them. (Although with implementing them I relied on help from @ibuclaw.) |
No, they don't. I do not want align(1) or pack(1). What I want is an error if I accidentally, or some random architecture that I wasn't aware of, decides to place a pad in my struct. |
ARM and AArch64 are bi-endian. But that's beside the point - big endian integers are common in file formats and network contexts (where it's called network order instead). For any feature like this to be useful to me, it needs to guarantee an exact bit layout no matter what operating system or CPU is involved and give me the same values without a ton of boilerplate. I think this would require even more, since I can't use my BigEndian/LittleEndian templates.
This is my current approach. Does not play nicely with named constructors, unfortunately.
Nearly all D programs are mixed C/D programs. They depend on a libc. I've written some of the few programs that aren't mixed C/D (Terrible experience, do not recommend. If anyone reading this cares, there's dlang/project-ideas#108)
This is an unreasonable abstraction. Each OS has multiple associated C compilers.
I was asking what disease C bitfields in D were meant to cure.
You vastly underestimate the insanity of existing hardware interfaces. You do NOT want to be doing this for long. (Show me a sane interface for SNES PPU OAM entries. Integers with non-contiguous bits! woohoo!) More importantly, I just don't have the patience to wade through that toxic environment and wait for you to answer. It's far easier and much more pleasant to just hack together something else with bit shifting and/or std.bitmanip (and it's not particularly easy or pleasant...). I just wish a better option was available. I have a lot of other problems with C bitfields, but I think this line of discussion has gotten off-topic enough already. I have no use for this feature and see no path to changing that, so I won't comment any further on this. |
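On the big-endian point raised in this comment, the following is a small C sketch of byte-order-independent access to a 32-bit big-endian (network order) value. The helper names `load_be32` and `store_be32` are made up for illustration and are not part of any library discussed here.

```c
#include <assert.h>
#include <stdint.h>

/* Read a 32-bit big-endian value from a byte buffer. It behaves the same on
   little- and big-endian hosts because it never reinterprets memory as an integer. */
static inline uint32_t load_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Write a 32-bit value to a byte buffer in big-endian order. */
static inline void store_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Usage: a stored value reads back unchanged. */
static inline void be32_roundtrip_check(void)
{
    uint8_t buf[4];
    store_be32(buf, UINT32_C(0xCAFEBABE));
    assert(load_be32(buf) == UINT32_C(0xCAFEBABE));
}
```

This is the kind of boilerplate the comment objects to: it is portable, but every field needs its own accessor pair.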
I understand that. The problem is when the programmer needs the pad, he gets a false positive error message. |
@Herringway Big endian - __traits can introspect a struct, and determine all the members and their size and locations. Bswap each one should do it. You shouldn't need to look into the bitfields themselves, just the enclosing member.
I don't see any practical way to support arbitrary bitfield layouts on the same machine. It seems crazy to me to write a C compiler for a machine that does not lay things out the way the dominant C compiler does. However, you can explicitly lay them out yourself:
The sum of the sizes is 32, do you have any C compilers that won't lay this out as specified?
for a portable layout,
That gives a nice visual look at exactly what you're trying to do. I don't know any C compiler that would do something different. |
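The declarations referred to in this comment did not survive in the text above, so the following is not the original snippet. It is a hypothetical C sketch of the same idea: every field shares one underlying type, every bit (including padding) is declared, and the widths sum to 32. The struct name `Flags` and its field names are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical fully specified layout: 4 + 12 + 8 + 8 = 32 bits,
   all fields declared with the same underlying type. */
struct Flags {
    uint32_t kind     : 4;
    uint32_t index    : 12;
    uint32_t reserved : 8;   /* padding bits are declared, not left to the compiler */
    uint32_t length   : 8;
};

/* Compile-time check that the fields filled exactly one 32-bit unit. */
_Static_assert(sizeof(struct Flags) == sizeof(uint32_t),
               "Flags must occupy exactly one 32-bit storage unit");

/* Usage: fields hold their values independently of each other. */
static inline void flags_check(void)
{
    struct Flags f = {0};
    f.kind = 9;
    f.index = 0xABC;
    assert(f.kind == 9 && f.index == 0xABC && f.length == 0);
}
```

The size check confirms only that no extra storage unit was introduced; the order of the bits within the unit remains implementation-defined in C.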
@Herringway You vastly underestimate the insanity of existing hardware interfaces. You do NOT want to be doing this for long. (Show me a sane interface for SNES PPU OAM entries. Integers with non-contiguous bits! woohoo!) I am currently dealing with the AArch64 instruction set, which has pretty wild and wacky instruction encodings. I agree it is fairly tedious work to construct them, including non-contiguous encodings. Fortunately, I only have to do each encoding once! |
I've been thinking of what @WalterBright keeps saying, about not understanding why we don't care about field alignment and I think I have an answer. Here is how I model it typically:
This is a very pragmatic view on alignment, where the padding is owned by the preceding variable. This is due to only needing to pick padding in one of two cases:
This is not how compilers work; the padding is owned by the following variable.
You assign the preceding variable's padding by setting the alignment of the next variable. If
Consider:
struct Foo {
    ubyte field1;
    ushort bf:15; // offsetof = 2
}
But if I were to slap
struct Foo {
    ubyte field1;
    align(1) ushort bf:15; // offsetof = 1
}
What this is showing is that alignment of fields is a solved problem. We don't need to consider it here. It does not match the needs of the average user, and if they need to consider it they can. |
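The same effect can be sketched in C for comparison, under two assumptions: plain `uint16_t` members stand in for the bitfield (C does not allow `offsetof` on a bit-field), and `#pragma pack`, a widely supported compiler extension rather than standard C, stands in for D's `align(1)`. The struct names are illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Default layout: the compiler pads after field1 so that value is 2-byte aligned. */
struct Padded {
    uint8_t  field1;
    uint16_t value;
};

/* Packed layout: member alignment is forced to 1, so no padding is inserted. */
#pragma pack(push, 1)
struct Packed {
    uint8_t  field1;
    uint16_t value;
};
#pragma pack(pop)

/* Usage: on common ABIs where uint16_t is 2-byte aligned, the offsets differ. */
static inline void offsets_check(void)
{
    assert(offsetof(struct Padded, value) == 2);
    assert(offsetof(struct Packed, value) == 1);
}
```

The padding byte in `Padded` appears only because of the alignment of the member that follows it, which is the ownership model described in the comment.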
That's not a good test case. If you want to see problems, then do
Note that align(1) results in unaligned reads, which are expensive on all platforms, but on ones that don't support unaligned reads, it's converted into multiple instructions to do aligned reads and then bit-shift. align(1) is not what is desired. We want to ensure the compiler doesn't abuse the padding to adjust the layout. I.e., if you know there should be 1 byte of padding, you have to declare it. If you know it should stick the bits in that byte, then you have to declare that. |
It's a good thing all types in D are fixed size then (sans size_t/pointer), so layout is always predictable! Unless I'm missing something, in all examples I've seen so far, the width of a bit-field crosses the alignment boundary of a struct. |
is it? Do 64-bit types get aligned to 64 bits on all 32-bit architectures? I thought there were some instances where the layout is not the same. |
For all fundamental types (ints, floats, characters, boolean - not complex types), the alignment defaults to the largest power of two that divides the size of the object. For field alignments (because these can differ from data alignments), it's not going to be a D friendly target if the biggest alignment is less than 64 bits. |
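The rule stated above ("the largest power of two that divides the size of the object") can be computed with a single bit trick. This C sketch uses a hypothetical helper name, `natural_alignment`; it illustrates the stated rule only and does not model any particular ABI's field-alignment exceptions.

```c
#include <assert.h>
#include <stddef.h>

/* Largest power of two that divides size (returns 0 for size 0).
   size & -size isolates the lowest set bit; ~size + 1 is the
   two's-complement negation written out for an unsigned type. */
static inline size_t natural_alignment(size_t size)
{
    return size & (~size + 1u);
}

/* Usage: an 8-byte object aligns to 8, a 12-byte object to 4. */
static inline void natural_alignment_check(void)
{
    assert(natural_alignment(8) == 8);
    assert(natural_alignment(12) == 4);
}
```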