That is incorrect: Windows never adopted the LP64 model. Only pointers were widened to 64-bit; long remained 32-bit. The long datatype should be avoided in cross-platform code.
uint64_t is a bit verbose; many projects typedef it to u64.
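A minimal sketch of that convention (the short names are project-local typedefs, not part of the C standard):

    #include <stdint.h>

    /* Common project-local shorthand for the standard fixed-width types. */
    typedef uint8_t  u8;
    typedef uint16_t u16;
    typedef uint32_t u32;
    typedef uint64_t u64;
    typedef int64_t  i64;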
I always understood the native types to be the "probably most efficient" choice, for when you don't actually care about the width. For example, you'd choose int for a loop index variable which is unlikely to hit width constraints because it's the "probably most efficient" choice. If you're forced to choose a width, you might choose a width that is less efficient for the architecture.
Is that understanding correct? Historically or currently?
Either way, I think I now agree that unspecified widths are an anti-feature. There's value in having explicitly specified limits on loop index variables. When you write "for (int32_t i = 0; ...)", it makes you think a bit: "hey, can this overflow?" And now your overflow analysis will hold for all arches, because you thought about the width that is actually in use (32 bits, in this case). It keeps program behavior consistent and easier to reason about, on all arches.
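For example (a trivial sketch; the function and bound are made up):

    #include <stdint.h>

    /* The index width is explicit, so the overflow bound (INT32_MAX)
       is the same on every architecture this compiles for. */
    int64_t sum_first_n(int32_t n) {
        int64_t total = 0;  /* accumulate in 64 bits to avoid overflow */
        for (int32_t i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }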
That's my thinking, but I'd be interested to hear other perspectives.
This itself is a platform-specific property, and is thus non-portable (not in the sense that your code won't run, but in the sense that it might be worse for performance than just using a known small integer when you can).
You can go look up how 32-bit protected mode got hacked on top of the 16-bit segmented virtual memory that the 286 introduced. The Global Descriptor Table is still with us in 64-bit long mode.
So it's not PAE that is particularly hacky; it's a broader thing with x86.
ooh, found a link to a UNIX Open Group white paper on that discussion and reasoning why LP64 should be/was chosen:
https://unix.org/version2/whatsnew/lp64_wp.html
And per Raymond Chen, why Windows picked LLP64: https://devblogs.microsoft.com/oldnewthing/20050131-00/?p=36... and https://web.archive.org/web/20060618233104/http://msdn.micro...
For some history of why ILP32 was picked for the 1970s 16- to 32-bit transition of C + Unix System V (Windows 3.1 and Mac OS were LP32), see John Mashey's 2006 ACM Queue piece, particularly the section "Early Days": https://queue.acm.org/detail.cfm?id=1165766
No peanut gallery comments from OS/400 guys about 128-bit pointers/object handles/single store address space in the mid-1990s please! That's not the same thing and you know it! (j/k. i'll stop now)
Unfortunately I've read articles where people quite a bit more respected than me said, in a nutshell, "no, x32 does not make a difference", which is contrary to my experience. But I could only provide numbers, and the reply was "those are your numbers in your case, not mine".
The Amazon Linux kernel did not support x32 syscalls the last time I tried, so you can't provide images for more compact lambdas.
Alas, my SmartOS test system is gone, or I would show you.
smartos$ uname -a
SunOS smartos 5.11 joyent_20240701T205528Z i86pc i386 i86pc Solaris
Core system stuff:
smartos$ file /usr/bin/ls
/usr/bin/ls: ELF 32-bit LSB executable, Intel 80386, version 1 (Solaris), dynamically linked, interpreter /usr/lib/ld.so.1, not stripped
smartos$ file /bin/sh
/bin/sh: symbolic link to ksh93
smartos$ file /bin/ksh93
/bin/ksh93: ELF 64-bit LSB executable, x86-64, version 1 (Solaris), dynamically linked, interpreter /usr/lib/amd64/ld.so.1, not stripped
And then the pkgsrc stuff:
smartos$ which ls
/opt/local/bin/ls
smartos$ file /opt/local/bin/ls
/opt/local/bin/ls: symbolic link to /opt/local/bin/gls
smartos$ file /opt/local/bin/gls
/opt/local/bin/gls: ELF 64-bit LSB executable, x86-64, version 1 (Solaris), dynamically linked, interpreter /usr/lib/amd64/ld.so.1, not stripped
Sure, that doesn't change pointer sizes, but it would have reduced the impact of the different 64-bit data models, like Unix LP64 vs Windows LLP64.
(1) DX: typing "int" feels more natural and less clunky than choosing some arbitrary size.
(2) Perf: if you don't care about the size, you might as well use the native size, which is supposed to be faster.
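C's <stdint.h> actually exposes both sides of (2); a small sketch (the sizes printed are whatever the target ABI picks):

    #include <stdint.h>
    #include <stdio.h>

    int main(void) {
        int32_t      exact = 0;  /* exactly 32 bits, everywhere */
        int_fast32_t fast  = 0;  /* "fastest type of at least 32 bits";
                                    the ABI picks the native-friendly width */
        printf("int32_t: %zu bytes, int_fast32_t: %zu bytes\n",
               sizeof exact, sizeof fast);
        return 0;
    }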
In Java, people do use the hardware-independent 4-byte ints and 8-byte longs. I guess (1) matters more, or people think the JVM will figure out the perf issue and that it'll be possible to micro-optimize if a profile points out a problem.
I don't think this is a reasonable take. Beyond ABI requirements and how developers use int over short, there are indeed requirements where the size of an integer value matters a lot, especially as this has a direct impact on data size and vectorization. To frame your analysis, I would recommend you take a peek at the not-so-recent push for hardware support for IEEE 754 half-precision float/float16 types.
I don't see the relation to fp16; I don't think anyone is pushing for `float` to refer to fp16 (or fp64 for that matter) anywhere. `long double` is already bad enough.
I think you got it backwards. There are platform-specific ints because different processors have different word sizes. Programming languages then adjust their definitions for these word sizes because they are handled naturally by specific processors.
So differences in word sizes exist between processors. Either programming languages support them, or they don't. Also, there are specific needs to handle specific int sizes regardless of CPU architecture. Either programming languages support them, or they don't.
And you end with "platform-specific integer widths" because programming languages do the right thing and support them.
Some more discussion: https://news.ycombinator.com/item?id=41768144
There are endless actual pictures of processors from both eras. Using real images here would have been as fast, possibly faster, than writing a prompt and generating this image.
I see someone else commented that it's probably due to copyright/licensing. I agree there too. That's a shame. So, because of usage policies we end up with AI-generated pictures that aren't real, aren't accurate, and are usually off-putting in some way. Great.
When all you have is a hammer...
Click "tools" then "usage rights", pick "creative commons", pick an image.
Now search for "core cpu" and pick a second image.
Yeah that sure was hard and time consuming!
Step 1: go to Wikimedia Commons
A practical use: bit fields can be convenient, e.g. 32-bit indexes with the high bit storing the color in a red-black tree. And if the tree nodes need dynamically sized items, those could live in separate 32-bit-addressable memory pools.
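A minimal sketch of that node layout (field names are made up; exact bit placement in bit fields is implementation-defined, so a real implementation might use an explicit mask instead):

    #include <stdint.h>

    /* Nodes live in a pool and refer to each other by 32-bit index
       instead of by pointer; the color steals one bit from a field. */
    typedef struct {
        uint32_t key;
        uint32_t left;           /* index into the node pool */
        uint32_t right;          /* index into the node pool */
        uint32_t parent : 31;    /* index into the node pool */
        uint32_t is_red : 1;     /* red-black color packed into the same word */
    } rb_node;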
Just like the instruction pointer, which implicitly increments as code executes, there are some dedicated data-pointer registers. There's a dedicated ALU for advancing/incrementing them, so you can have interesting access patterns for your data.
Rather than loops needing to load data, compute, store data, and loop, you can just compute and loop. The SSRs give the cores a DSP-like level of performance. So so so neat. Ship it!
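To illustrate the idea in plain C (just a sketch of the concept; the actual SSR programming model is target-specific):

    /* A conventional axpy loop: every iteration issues explicit loads
       and a store alongside the actual arithmetic. */
    void axpy(long n, double a, const double *x, double *y) {
        for (long i = 0; i < n; i++) {
            y[i] = a * x[i] + y[i];  /* load x[i], load y[i], fma, store y[i] */
        }
    }

    /* With stream semantic registers, x[i] and y[i] are bound to registers
       that auto-advance each iteration, so the loop body shrinks to just
       the fused multiply-add, with the memory traffic handled implicitly. */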
(Also, what was the name of the x86 architecture some Linux distros were shipping with 32-bit instructions & address space, but using the new x86-64 registers?)