(1) Start off by making changes to a program that someone/some book gave us as a basis and blindly stumble through some changes and compiler/interpreter/runtime errors.
(2) Take a step back, absorb what we've learned from the errors.
(3) Read more about the principles of the programming language/OS/API/operating environment.
(4) Internalize concepts behind the language/OS/API/operating environment.
(5) Goto (1).
It's only after many iterations through this process that I can now see similarities among different processors (CPU/DSP/GPU/etc), operating systems, APIs, programming languages. So many classes of problems can be solved effectively without ever understanding these details, so it makes sense for those new programmers not to learn the details at first. IMO, this is at least an intermediate-level text.
Hah, just today I started to read "Haskell programming from first principles"[1] which also targets non-programmers but builds on lambda calculus instead of assembly.
Would anyone actually recommend that a beginner learn assembly first?
The C book you recommended is the exact same one my mum bought for me years ago in late elementary school; I've never seen it mentioned on HN or elsewhere before so I thought I'd also vouch for its effectiveness.
Now that I think of it, I should give my mum a call :)
I've used Little Man Computer successfully to teach even very young kids (elementary school age) the basics of computer architecture and a very basic assembly language. Some reading materials and a Java-based LMC simulator here: http://www.yorku.ca/sychen/research/LMC/
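For anyone who hasn't seen LMC, here is a minimal sketch of the machine in Python. It assumes the commonly taught three-digit instruction set (1xx ADD, 2xx SUB, 3xx STA, 5xx LDA, 6xx BRA, 7xx BRZ, 8xx BRP, 901 INP, 902 OUT, 000 HLT). The name run_lmc, the max_steps guard, and the example program are my own and are not taken from the linked simulator. As a simplification, the accumulator is a plain integer that may go negative instead of modelling a separate negative flag.

```python
def run_lmc(program, inputs, max_steps=10_000):
    """Run a Little Man Computer program and return the list of outputs."""
    memory = list(program) + [0] * (100 - len(program))  # 100 mailboxes
    inbox = list(inputs)   # values the "little man" reads with INP
    outputs = []           # values written with OUT
    acc = 0                # accumulator
    pc = 0                 # program counter

    for _ in range(max_steps):  # guard against programs that never halt
        instr = memory[pc]
        pc += 1
        opcode, addr = divmod(instr, 100)

        if instr == 0:                 # HLT
            break
        elif opcode == 1:              # ADD: acc += mailbox
            acc += memory[addr]
        elif opcode == 2:              # SUB: acc -= mailbox
            acc -= memory[addr]
        elif opcode == 3:              # STA: mailbox = acc
            memory[addr] = acc
        elif opcode == 5:              # LDA: acc = mailbox
            acc = memory[addr]
        elif opcode == 6:              # BRA: unconditional branch
            pc = addr
        elif opcode == 7:              # BRZ: branch if acc is zero
            if acc == 0:
                pc = addr
        elif opcode == 8:              # BRP: branch if acc is zero or positive
            if acc >= 0:
                pc = addr
        elif instr == 901:             # INP: read next input into acc
            acc = inbox.pop(0)
        elif instr == 902:             # OUT: append acc to the output
            outputs.append(acc)
        else:
            raise ValueError(f"unknown instruction {instr} at mailbox {pc - 1}")

    return outputs


# Hypothetical example program: read two numbers, output their sum.
# INP, STA 99, INP, ADD 99, OUT, HLT
ADD_TWO = [901, 399, 901, 199, 902, 0]
print(run_lmc(ADD_TWO, [2, 3]))  # prints [5]
```

The whole machine fits in one loop of fetch, decode, execute, which is roughly why it works as a first exposure to architecture.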
I just took a look at the "LOOK INSIDE!" sample available on amazon.com. A few observations:
The front cover says it's been "Updated for C11", but all the sample programs use "main()" rather than "int main(void)". The form without an explicit return type has been invalid since C99.
Page 11:
"In 1983, ANSI created the X3J11 committee to set a standard version of C. This became known as ANSI C. The most recent version of ANSI C, C11, was formally adopted in 2011."
That's loosely correct, but in fact the 1990, 1999, and 2011 editions of the C standard were published by ISO, not by ANSI (and later adopted by ANSI). Furthermore, it ignores the fact that the phrase "ANSI C" is commonly but incorrectly used to refer to the 1989/1990 version of the language (see gcc's "-ansi" option, for example).
Page 16:
"These are functions:
main() calcIt() printf() strlen()
and these are commands:
return while int if float"
No, those aren't "commands", they're keywords. I can't think of any reasonable meaning of the word "command" that would include "int".
Page 26:
The sample program listing is not properly indented.
Page 36:
"\b moves the cursor back a line".
No, it moves the cursor back a column.
A book for beginning C programmers should not have these kinds of errors after three editions.
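On the \b point, a small sketch of what a typical terminal does with a backspace character. It is written in Python, where "\b" is the same character (code 8) as in C. The function render_line is a hypothetical stand-in for the terminal, not a real API, and it assumes the usual behaviour of moving the cursor left one column without erasing anything.

```python
def render_line(text):
    """Simulate how one terminal line ends up after printing text."""
    cells = []  # characters currently shown on the line
    col = 0     # cursor column
    for ch in text:
        if ch == "\b":
            # Backspace: move left one column, stay on the same line,
            # and do not erase the character under the cursor.
            col = max(col - 1, 0)
        else:
            if col < len(cells):
                cells[col] = ch      # overwrite what was there
            else:
                cells.append(ch)
            col += 1
    return "".join(cells)


print(render_line("abc\bd"))  # prints abd
```

The "d" lands on top of the "c" because the cursor moved back one column; nothing moves up a line.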
If you start with assembly I would suggest not picking x86. I would suggest a RISC or virtual assembly like LLVM. x86 just has too much historic weirdness.
Like scores of other C64-generation kids, my second language (after BASIC) was 6502 assembly. I think it's suitably straightforward, although the special addressing mode for the zero page might count as a weirdness.
I've read a few chapters of this and have found it to be well written and a great introduction to assembly programming. It was written for 32-bit but so far everything works just fine on my 64-bit Linux system. Also it was interesting to learn alongside C. It helped to see the workings of a machine from those two vantage points.
Also of note is the author's story of writing the book and giving it away as a labor of love. Very nice.
Learning assembly was probably the biggest thing that made computers click in my head. Since then, I've always thought that learning assembly as your first programming language would be a great place to start. Of course, I'm sure this doesn't apply to everyone.
In CS you're usually started off with something like C, Java, or Python. But those languages are so HUGE (in comparison) that it can be really overwhelming. Meanwhile, with assembly, you have far fewer items in your toolbox and you have no abstractions between what you write and what's being executed.
After googling, I found emu8086 [0], which I think is what I used to debug the tiny apps I was writing. It was great! You could step back and forward in time, and you could inspect every detail.
I think this is tricky. I was exposed to 8086 assembly having only previously seen a couple flavors of BASIC, and I have to say that it mostly went over my head. It was hard for me to see the bigger picture, the useful functionality that we were painstakingly working toward with our little instructions moving bytes around and taking different paths based on flags. I remember system calls (which are covered early on in the OP) seeming especially confusing and magical. On the other hand, when I dove deeper into (m68k) assembly a few years later, after getting some C and C++ under my belt, it was an amazing eureka moment that made both the earlier assembly exposure and what I had learned of higher level languages click into place. So I do think learning assembly early on is great, but I'm not sure it's such a panacea to learn it first.
Assembly was my second language (BASIC being my first), but it was not the x86 line I learned first (it was my second assembly language). I can name half a dozen architectures that are probably better for learning than the x86, some of them relevant today.
Probably in order: 6502---not my favorite, but tons of material and emulators exist. The most popular 8 bit CPU and one of the most simple ones.
6809---my favorite 8-bit and it lends itself fairly well towards higher level languages. A nice register set (2 8bit general purpose registers, four 16bit index registers) and a regular instruction set makes it easy to program.
8080 (or Z80)---if you are going to start down the dark side of x86 line, it's probably best to start at the beginning. A mess of registers, most of which are somewhat general purpose but each has a specific use.
68000---a very nice 32-bit architecture with enough regularity to make it easy to write compilers for, with enough instructions to make it fun to write assembly code. 8 32bit data registers, 8 32bit index registers.
VAX---another nice 32-bit architecture with a very regular instruction set. 16 32bit general purpose registers.
MIPS---one of the original RISC architectures that's still in use. Perhaps a bit too spare on instructions. 32 32bit registers.
ARM---Kind of reminds me of a RISCish 68000 with unique features that make it, again, fun to program at the assembly level.
The x86 line is ... baroque due to its history (you can think of it as an overglorified 8080 that gained power over time, much like the MCP in TRON if you think about it) of ossified layers. It's hard to learn because it's almost all exceptions to a non-regular instruction format.
6809 is vastly underrated I think. It is super simple, but also has everything you need to program efficiently. I think if I were to introduce people to assembly language programming first, I would pick that one. Especially being 8 bit with 16 bit addressing gives you a good segue into byte ordering.
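To make the byte ordering point concrete: a 16-bit address has to be split across two 8-bit memory cells, and the CPU decides which half comes first. The 6809 stores the high byte first (big-endian), while the 6502 stores the low byte first (little-endian). A minimal sketch in Python, where store_address is a made-up helper name and 0x1234 is just an example value:

```python
def store_address(address, byteorder):
    """Return the two bytes a 16-bit address occupies in 8-bit memory."""
    return address.to_bytes(2, byteorder)


addr = 0x1234
print(store_address(addr, "big").hex())     # prints 1234 (6809-style order)
print(store_address(addr, "little").hex())  # prints 3412 (6502-style order)
```

Same value, same two bytes, opposite order in memory.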
6502 was my first (and only really significant) experience with assembly. It is so minimal as to be awkward at times, but as you say with the huge numbers of emulators and instructional materials around, it's hard to discount it.
Personally, I would skip x86 altogether unless you actually want to do x86 programming ;-). I've never looked at ARM, but your description makes me want to play with it. The added bonus is that it is still relevant today.
Great anecdotes! I still have a real soft spot for 68000. It's definitely the IS that made this whole "how computers work" thing make sense to me, and I still often imagine what my programs are doing in terms of what they'd be doing in the 68000 world, even though I know it's actually x86_64 (or ARM, on my phone).
Question for you, if you happen to come back to this: what's the best documentation on ARM that you've found? Every time I try to find some, it seems like I'm supposed to pay for a license, and I get confused.
I've been looking to buy "Introduction to 64 Bit Assembly Programming for Linux and OS X: Third Edition - for Linux and OS X". It's an introduction to x86-64 assembly, which I thought made sense for me to learn, because it is the architecture all my computers use.
You say x86 is not a good choice to learn, but this book purports to make assembly for this architecture accessible. If you stand by your opinion, are there even CPUs readily available that support the alternative architectures you mention?
Also, can anyone recommend this or any other book or learning resource?
As a first language, or even as a first assembly language. One can certainly learn it, and it may even come across as half decent if you skip some of the less used instructions (like LOOP, REP, LODS, STOS, AAD, AAS---basically, the really archaic instructions that aren't really used or optimized). But with nearly 40 years of backwards compatibility, it's a monster of an architecture.
I'm not sure it's fair to group x86 with x86_64 - the latter is much more pleasant (or less unpleasant). It certainly is a huge instruction set, and if you really want to know the ins and outs, it's going to be a long journey. But for playing around, it's actually kind of fun IMNHO. I don't think 32 bit x86 really qualifies as fun. Maybe fun, as in trying to solve a complicated murder mystery fun, but not so much as in playing with water colours fun.
I had the same epiphany. For a long time, I regarded computers as deeply magical, and didn't understand why some programming languages had the features they had and some didn't. (At the time, I only knew Python and some superficial C.)
Then I read a book that explained how assembly works and how a compiler would implement C's features with assembly. It all clicked after that. Too bad I've forgotten the name of the book.
One of his two nameservers, ns2.missoulaweb.com (66.235.178.171), is non-responsive. The other one works but it's listed second. No round-robin. That can slow down the DNS lookup, as the first listed nameserver will likely be tried first, adding to the "slowness" of a website.
Dead nameservers are relatively rare among sites that get posted to HN, since many are high traffic and use CDNs (that rely on DNS kludges). But it is still a regular occurrence. No one ever talks about it. But it's one of the DNS kludges that's continually slowing things down behind the scenes.
There is that term "link rot" for URLs that do not work anymore. And "bit rot" for source code. I propose a new one: "NS rot".
Does it make sense to learn x86 assembly now, or is x64 assembly simpler? All the machines I use are x64 and the binaries I generate are 64-bit. (I realize it's a superset, but I suppose I want to learn the smallest useful subset, since it's a big topic...)
Learn the x86 instructions that are supported in x86-64 and ignore those that aren't. When you need a specific extension, learn it then. For example, I haven't needed to use the AES instructions yet, so I've yet to learn them.
Can anyone recommend the best resource for learning x64 then? I did computer architecture and assembly language 15+ years ago in school but haven't really needed to touch assembly language as a professional programmer.
So I know the basic idea of registers and addressing modes and pipelines and stuff, but nothing specific to any architecture. My goal is basically to be able to write faster C and C++ programs and analyze the cost of constructs like smart pointers, virtual dispatch, templates, etc.
I'd be very interested in such a resource as well. At the moment I'm just staring at the output of 'objdump -S', trying to make sense of things and find what's different between two versions. For smaller things, Matt Godbolt's compiler explorer is very helpful as well. But I'm still having trouble figuring out what exactly is happening and why some things are required, etc.
Blogger.com blogs such as this one ("programminggroundup") can be obtained in a single file containing all posts. Text or minimal HTML file. No gratuitous Javascript and no comment spam.
Books generally collect lots of upvotes, but few comments, since you generally have to read the book first before commenting.
Controversial news articles, whose entire content can be summed up in the headline, collect both upvotes and comments. The ranking algorithm punishes submissions with many comments, to push those submissions off the front page.
If those upvotes are fast enough, it will have a very high score, and it will shoot straight onto the top of HN.
~(Score) = votes / time
Also, people open links, upvote if they like it, and then continue reading (most of the time). That way if their reading is interrupted, they can check their "saved stories" to get back to reading later.