> Zig still can't proper handle UTF-8 strings [1] in 2022
There's plenty of discussion on this subject in basically every HN thread about Zig: the stdlib has UTF-8 and WTF-8 validation code, and ziglyph implements the full Unicode spec.
https://github.com/jecolon/ziglyph
You might not like how it's done, but it's factually incorrect to state that Zig can't handle Unicode.
> In a `recent` interview[2], he claims that Zig is faster than C and Rust, but he refers to extremely short benchmarking that has almost no value in the real world.
From my reddit reply to this same topic:
This podcast interview might not be the best showcase of the practical implications of Zig's take on safety and performance. If you want something with more meat, I highly recommend Andrew's recent talk from Handmade Seattle, where he shows the work being done on the Zig self-hosted compiler.
https://media.handmade-seattle.com/practical-data-oriented-d...
There's lots of bit fiddling that can't be fully proven safe statically, but the result is a compiler capable of compiling Zig code stupidly fast, and that's even before factoring in incremental compilation with in-place binary patching, with which we're aiming for sub-millisecond rebuilds of arbitrarily large projects.
> The ecosystem for zig is insignificant now and a stable release would help the language.
I hope you don't mind if we don't take this advice, given the overall tone of your post.
Why does something as basic as uppercasing a string or decoding latin1 require a third-party library? I would expect that to be part of stdlib in any language. Also, why does that third-party library come with its own string implementation? What if my dependency X uses zigstr but dependency Y prefers zig-string <https://github.com/JakubSzark/zig-string>? Basically all languages designed in the past 30 years have at least basic and correct-for-BMP Unicode support built-in/as part of stdlib. Why doesn’t Zig?
That's not "simple". Rust doesn't do either of those two tasks with just the stdlib, either!
- latin1 is dead and should be in no stdlib in 2022
- uppercasing requires the current Unicode tables, so, a largish moving target that you probably don't want to embed in small programs.
Latin-1 is actually the first 256 code points from Unicode. So, you can do that in Rust by casting u8 (the Latin-1 bytes) into char (Unicode scalar values). That's unintuitive perhaps because of course in C that wouldn't do anything useful since the char type isn't Unicode, but in Rust that's exactly what you wanted.
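A quick sketch of that claim: since Latin-1 maps byte-for-byte onto the first 256 Unicode code points, decoding it in Rust is just a byte-to-char cast (the sample bytes here are illustrative):

```rust
fn main() {
    // Latin-1 bytes for "café": 0xE9 is 'é' in both Latin-1 and Unicode (U+00E9).
    let latin1: &[u8] = &[0x63, 0x61, 0x66, 0xE9];

    // Each Latin-1 byte is numerically equal to its Unicode scalar value,
    // so `b as char` is a correct, lossless decoder.
    let decoded: String = latin1.iter().map(|&b| b as char).collect();

    assert_eq!(decoded, "café");
    println!("{decoded}");
}
```

Note this is exactly the cast that would be wrong in C (where `char` carries no encoding), but in Rust `char` is defined as a Unicode scalar value, so the identity mapping is the correct decoder.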
In this environment you might very well not need actual uppercase/lowercase but only the ASCII subset. Accordingly, Rust provides that too, and it's far less to carry around than the Unicode case rules. And since the ASCII case change can always be performed in situ (if you can modify the data), Rust also offers an in-place variant if that's what you want.
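To illustrate the ASCII-only variants in Rust's stdlib: non-ASCII characters pass through untouched, and the byte-buffer version works in place with no allocation and no Unicode tables:

```rust
fn main() {
    // Allocating variant: only ASCII letters change; 'é' is left alone.
    let s = "héllo".to_ascii_uppercase();
    assert_eq!(s, "HéLLO");

    // In-place variant on a mutable byte buffer.
    let mut buf = *b"zig and rust";
    buf.make_ascii_uppercase();
    assert_eq!(&buf, b"ZIG AND RUST");

    println!("{s}");
}
```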
Those are all valid points. At the moment, I believe Zig has decided to leave full Unicode support out of the stdlib because they don't want language releases to depend on Unicode updates.
The "rules" of Unicode change over time with updates to the Unicode standard. One big example is the grapheme-breaking algorithm, which has been updated over time to support things like the family emoji and other compositions.
correct-for-BMP-but-not-otherwise is simply a bug (and cultural chauvinism). And almost all such implementations aren't even correct for the BMP, because uppercasing Unicode is far from "basic".
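To see why even BMP uppercasing isn't "basic": case mapping can change a string's length, which is one reason it drags in the full Unicode tables. Rust's stdlib `str::to_uppercase` handles these cases:

```rust
fn main() {
    // The German sharp s (U+00DF, in the BMP) uppercases to two characters.
    assert_eq!("ß".to_uppercase(), "SS");

    // The 'ffi' ligature (U+FB03, also BMP) expands to three characters.
    assert_eq!("ﬃ".to_uppercase(), "FFI");

    // So per-code-point, in-place uppercasing is impossible in general:
    // char::to_uppercase returns an iterator, not a single char.
    assert_eq!('ß'.to_uppercase().count(), 2);

    println!("ok");
}
```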
> you get a compiler capable of compiling Zig code stupidly fast, and that's even without factoring in incremental compilation with in-place binary patching, with which we're aiming for sub-millisecond rebuilds of arbitrarily large projects
That sounds great! But at the same time people in other threads here are talking about 1-3 second compilation times for Advent of Code solutions (which I presume are smallish). Can you summarise where that really fast compiler comes from, to save me searching through that talk video? Is this something that everyday users will be able to use in typical workflows?
Long story short, we're currently working on a self-hosted implementation of the compiler, and what people are using now is the old C++ implementation. As soon as the new compiler is feature-complete enough, we'll start shipping it and we expect much better compilation speeds, which will improve even further for debug builds once the native (i.e., non-LLVM) backends catch up as well.
Have you, like, seen the release notes for 0.9.0?
https://ziglang.org/download/0.9.0/release-notes.html