> When they ran the resulting code, they found it was 40% faster and used half the memory. This points to a persistent myth that Fried has heard: Python 3 is slower than Python 2. That may have been true for earlier releases of Python 3, but it is definitely not true now, he said.
I wonder why that is? My theory is that it has to do with the new Unicode encoding; if 99% of your strings are ASCII compatible, the new encoding will be twice as efficient. Seems unlikely that would account for all of it though.
Optimizations of the dict class (both memory- and CPU-wise) in Python 3.5+ are pretty important, as it's used a lot internally. E.g. class members and methods are stored in a dict. This basically affects all Python code.
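A quick illustration of how literally that's meant (the class and names here are just for demonstration): both instance attributes and methods are reachable through ordinary dicts, so every attribute access ends up as one or more dict lookups.

```python
class Point:
    """A plain class; its attributes and methods live in dicts."""
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def magnitude(self):
        return (self.x ** 2 + self.y ** 2) ** 0.5

p = Point(3, 4)

# Instance attributes are stored in a per-instance dict...
print(p.__dict__)                      # {'x': 3, 'y': 4}
# ...and methods live in the class's dict (exposed as a mappingproxy).
print('magnitude' in Point.__dict__)   # True
print(p.magnitude())                   # 5.0
```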
The dictionary implementation changes are the biggest speed up I can think of. They touch nearly everything in the language.
One of the interesting side effects of this is that in 3.7+ dictionaries are now ordered by insertion, which I'm sure will help fix tons and tons of subtle non-deterministic bugs (and probably expose a few too).
Although I'm not 100% sure on the performance characteristics of the new Unicode implementation, the Unicode-first string handling is a massive improvement when it comes to non-English languages. So IMO it's worth a hefty performance hit, based solely on its own merits. In practice, it hasn't seemed to make a meaningful difference to me. So I'd wager it's good enough, unless you're hitting some use-case-specific bind, in which case it's probably time to leverage the C bindings anyway.
> is in 3.7+ dictionaries are now ordered by insertion,
3.6
> The Unicode-first string handling is a massive improvement when it comes to non-English languages. So IMO it's worth a hefty performance hit, based solely on its own merits. In practice, it hasn't seemed to make a meaningful difference to me. So I'd wager it's good enough, unless you're hitting some use-case-specific bind, in which case it's probably time to leverage the C bindings anyway.
That's because everyone had to use Unicode strings on Python 2 already, so the en/decoding and Unicode handling overhead was already there. Applications not using Unicode strings were mostly just buggy or didn't work outside ASCII. The major difference is that Python 3 string code is a lot less fragile than Python 2 code, because it either works, and does so for non-English text too, or it doesn't. Meanwhile, Python 2 code would appear to work fine until you started to drop in those sweet umlauts and got exceptions all over the place.
Well, it's like that by accident in 3.6, because CPython 3.6 does it this way and no spec requires it. In 3.7 it's officially a part of the spec, thus required by all Python 3.7-compatible implementations and not just CPython.
>> is in 3.7+ dictionaries are now ordered by insertion,
>3.6
It was an implementation detail in 3.6, 3.7 made it part of the language spec.
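For illustration, here is the behavior in question (present in CPython 3.6 as an implementation detail, guaranteed by the language from 3.7):

```python
d = {}
d['banana'] = 3
d['apple'] = 1
d['cherry'] = 2

# Iteration order is insertion order, guaranteed from 3.7 on.
print(list(d))   # ['banana', 'apple', 'cherry']

# Deleting and re-inserting a key moves it to the end.
del d['banana']
d['banana'] = 3
print(list(d))   # ['apple', 'cherry', 'banana']
```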
Which is good; I can finally get rid of the metaclass I hacked together for my spark (an Earley parser that used to be used in the Python build process to parse ASDL) shenanigans.
Do you have a link for the rationale behind this?
So it is not a hashmap anymore? Or, in addition to the hashmap, it stores the insertion index? I wonder if it is a good idea to have this be part of the language spec, as they might want to use a faster (non-deterministic) implementation later at some point, which would then break lots of code again.
The point of putting it in the language spec is to promise that they won't change back, even if they do later dream up a faster nondeterministic implementation.
The rationale is that even if they clearly said "we might make this non-sorted again later", in practice too much code would end up relying on the new behaviour (whether by accident or because people didn't pay attention to the warning).
So if changing the behaviour is likely to become impractical anyway, they might as well make it a promise so people can benefit from the guarantee.
They made essentially the same choice when they changed sort() to an algorithm that happened to be stable, about 15 years ago.
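The sort() guarantee works the same way: stability was originally a property of the Timsort implementation, and is now promised by the language, so entries with equal keys keep their original relative order.

```python
records = [('alice', 3), ('bob', 1), ('carol', 3), ('dave', 1)]

# sorted() is guaranteed stable: among equal keys, original order
# is preserved, so bob stays ahead of dave and alice ahead of carol.
by_score = sorted(records, key=lambda r: r[1])
print(by_score)  # [('bob', 1), ('dave', 1), ('alice', 3), ('carol', 3)]
```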
It's still a hashmap. The idea is to essentially use a separate array for the values and only map hashes to indices into that array. That way your hashmap-part of the data structure is far more compact, so you can use lower load factors without increasing memory consumption too much, which often makes lookups faster (because fewer collisions happen at lower load factors, because the same number of keys is spread over more slots).
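A toy sketch of that layout (heavily simplified: fixed table size, linear probing, no resizing or deletion; the real CPython version stores the sparse indices as packed 8/16/32/64-bit ints):

```python
class CompactDict:
    """Toy model of the compact dict: a sparse index table maps hash
    slots to positions in a dense, insertion-ordered entries array."""

    def __init__(self, n_slots=8):          # n_slots must be a power of 2
        self.indices = [None] * n_slots     # sparse: hash slot -> entry index
        self.entries = []                   # dense: (hash, key, value) tuples

    def _slot(self, key):
        # Linear probing: stop at an empty slot or a matching key.
        mask = len(self.indices) - 1
        i = hash(key) & mask
        while (self.indices[i] is not None
               and self.entries[self.indices[i]][1] != key):
            i = (i + 1) & mask
        return i

    def __setitem__(self, key, value):
        i = self._slot(key)
        if self.indices[i] is None:
            self.indices[i] = len(self.entries)
            self.entries.append((hash(key), key, value))
        else:
            self.entries[self.indices[i]] = (hash(key), key, value)

    def __getitem__(self, key):
        i = self._slot(key)
        if self.indices[i] is None:
            raise KeyError(key)
        return self.entries[self.indices[i]][2]

    def keys(self):
        # Iteration walks the dense array: insertion order falls out for free.
        return [k for _, k, _ in self.entries]

d = CompactDict()
d['a'] = 1
d['b'] = 2
d['a'] = 10          # overwrite keeps the original position
print(d.keys())      # ['a', 'b']
print(d['a'])        # 10
```

Only the small index table needs the extra empty slots for a low load factor; the bulky (hash, key, value) entries stay densely packed, which is where the memory saving comes from.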
That was the reason why they didn't make it a language feature in 3.6. However, after some careful consideration (there are some smart people making these optimizations and language spec choices, I trust they are doing the right thing), they decided to have 3.7 guarantee ordering.
After you have used it for a while, it is a great feature. I would now be happy to take a small dict performance hit in order to preserve the ordering guarantee.
Ah, I missed that little detail. Good for me, I suppose, since I almost immediately started to write code depending on this behaviour in a Python 3.6 code base :)
Just to be clear, I wasn't talking about the change making Unicode strings universal in 3.0. I was talking about the change to the internal string representation in 3.3, described by PEP 393.
The change to dictionaries seems a lot more likely to make a difference, although it was introduced much later.
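PEP 393's adaptive representation is easy to observe with sys.getsizeof: the same number of code points costs 1, 2, or 4 bytes each, depending on the widest character in the string (exact totals vary by build, so only the ordering is asserted here):

```python
import sys

ascii_s  = 'a' * 1000           # all code points < 128  -> 1 byte each
bmp_s    = '\u03c0' * 1000      # Greek pi, fits in 16 bits -> 2 bytes each
astral_s = '\U0001f40d' * 1000  # snake emoji, needs 32 bits -> 4 bytes each

for s in (ascii_s, bmp_s, astral_s):
    # Same length, very different memory footprint.
    print(len(s), sys.getsizeof(s))
```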
Anything improving dict performance will have significant impact because lots of things in Python are implemented using dicts under the hood (including namespaces).
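For instance, namespaces really are dicts you can poke at directly (the module name here is made up for the demo):

```python
import types

# A module's attributes are just entries in its __dict__:
mod = types.ModuleType('demo')
mod.answer = 42
print(mod.__dict__['answer'])   # 42

# The same goes for the current module's global namespace:
x = 42
print(globals()['x'])           # 42
```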
Unfortunately, one area where Python 3 is definitely slower is the startup time (specifically module importing). This can be very noticeable on embedded systems as I discovered after porting our codebase. The remaining language improvements still make this worth it though.
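One common mitigation for import-dominated startup (a general pattern, not something from the article; the function and path here are hypothetical) is to defer expensive imports until first use:

```python
# A module-level "import slow_module" is paid at startup even if the
# feature needing it never runs. Deferring it moves the cost to first use.

def parse_config(path):
    # Deferred import: loaded only the first time this function runs;
    # subsequent calls hit sys.modules and are cheap.
    import json
    with open(path) as f:
        return json.load(f)
```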
I always felt just a tiny bit of xenophobia, or at least ignorance, in those complaints. Yes, your strings have the beauty of containing no more than 127 different characters. But isn't it quite obvious that there are people "out there" who don't only speak English?
Being from Europe, my strings are still 99% ASCII compatible. But 1% is about four orders of magnitude too big for a failure rate. Every single string that goes through an application and isn't ISO-specified (bank account etc.) will sooner or later be hit with Unicode. Just wait till bank accounts allow emoji identifiers.
> He wanted to tell the "story about how I and couple of engineers used our free time, with no authority whatsoever, and made Python 3 the dominant version at Facebook."
Isn't it great when engineers upgrade your codebase in their free time.
I'm actually fairly certain it was "within working hours" free time. Facebook doesn't have 20% time or anything like that, but, in practice, there's lots of extra time for whatever you want. This was my experience as an Engineer at Facebook and I certainly "pay it forward" as a manager.
Probably worth adding that we ultimately created a Python foundation team, of which Jason is a member.
I've always had the drive to do things like this in my spare time. I'd love to get paid for solving technical problems for an employer. I think, besides the obvious, I'd make a good candidate at a place like Facebook.
More companies should encourage hiring developers who would solve problems in their spare time. When I applied for an apprenticeship at 8th Light, they had me write a tic tac toe game, and I wrote it in 4 or 5 different languages, just for fun. I think that's why they hired me.
Man, it would be good to have an employer. Any takers?
This is the quote you focus on? Free time here is probably defined as time in between their main projects, or when code is building, before meetings, etc.
I always take the opportunity to identify some long-term-beneficial changes which would look good on my CV, and run with them even if it means investing my free time before I can present the potential to the business.
Ha! The 2013-era linter he mentions that required “from __future__” imports was actually one of my bootcamp tasks! Hilarious and kind of gratifying to stumble across it in a random HN post all these years later.
Facebook code kinda has three layers:
1. clients (Objective-C for iOS, Java for Android, JS for Web and React Native)
2. application logic (Hack)
3. services (C++ and Python are the most prominent and well-supported, but there's a long tail of languages like Java, Haskell, OCaml, Ruby, etc.)
The prominent languages in each layer have pretty substantial code-bases.
More importantly, there's an engineering culture of eschewing "code ownership" and encouraging "code upgradability" so that motivated folks like Jason can drive these large changes.
First they made HHVM for PHP and then they made the Hack language for HHVM. Hack is based on PHP. I don’t know how much non-Hack PHP they still have but it is my impression that they turned pretty much all of their PHP code into Hack code.
Yes, and the PHP 7 typing system is heavily influenced by Hack. However, Hack goes further, with support for more typehint features plus a typechecker daemon.
Much of Instagram's application code is written in Python. Python also powers some service infrastructure. For example, FB's release automation system is written in Python.
I had to google and double-check, because the first Jason Fried I have in my mind is the founder and CEO of Basecamp, of Ruby on Rails fame. And I thought, when did he become a Python guy?
Turns out there is another person at Facebook with the same name.
Besides what other people have mentioned, FB is also one of the biggest Mercurial (which is built in Python) users out there and has a few developers working on it exclusively.
FWIW larger companies don't hire "x coders" where x is some language. They look for CS fundamentals, mastery of programming paradigms, and then you work in whatever language you need / makes sense given the task.
Depends. I'm currently coding almost entirely in Python, the guy behind me is using almost solely C++, and the person to my right is working solely in Java right now. Last week I was split heavily between Python and JavaScript, but will likely write very little JS in the near future. At that point, another person on my team was working heavily in Go, although that is also finished now.
It's not just "what your team works on", but what fits the task and what the existing tooling is in. If you're only working on a single product/project ("I develop the Android app for XYZ product"), you likely work in the same language or two as the rest of your team. But that isn't always the case.
Usually, yes. But it does not mean that you should use a screwdriver to drive a nail just because that's what others are doing. They may be working with screws.
In more practical terms, you are not going to write a cronjob in Java just because that's what the main system is written in.
The statement is true for many teams in the best tech companies.
Many of the "crappy" companies do look for "X programmers" even when there is no need, just because of the tech they are currently working on at that specific point in time. Even when they are large (especially when they are large, it would seem).
Just yesterday I was talking with my teacher about how he was tired of students (and parents) asking if he would add x and y languages to the curriculum.
Well, unfortunately many (both tech and non-tech) companies still want to hire developers with "x years of y language, n years of m framework" and so on. So there is some reason to ask for specific languages in the curriculum, from that purely practical point of view. Not everyone goes to work at a top place.
If you are at school, university, whatever that is, you need to be exposed to concepts. Those will rarely change. For instance, Lisp already had garbage collection and object orientation decades before they were available in mainstream languages, and the basic concepts still have not changed.
Once you understand them, learning a new programming language is done in a matter of days (libraries and common idioms take more time). You are basically transferring concepts you already know to a slightly different implementation, and there will be hardly anything entirely new. More so for "mainstream" languages.
Then you need practical experience, which does not require teachers to be involved.
I agree with this. But unfortunately, my experience is that industry tends not to agree with us. They're like 'cool, you know fundamentals, now do this take-home project that requires x programming language with m framework'. This isn't too nice for new hires that learned just fundamentals and not the tooling and frameworks industry expects new grads to have. Industry seems to expect CS degrees to be vocational degrees.
> But unfortunately my experience is industry tends to not agree with us. They are like 'cool you know fundamentals, now do this take home project that requires x programming language with m framework'. This isn't too nice for new hires that learn just fundamentals and not the tooling and frameworks industry expects new grads to have.
I think the idea is that you should be able to pick up the language and framework as you go, if you do actually have the concepts down. If your generalised knowledge isn't to the point where you can apply it to some arbitrary framework/codebase in some arbitrary (imperative) language and start being productive more or less immediately (with documentation open in another tab), then your generalised knowledge just isn't at the level where people care about it, yet.
edit: That generalised knowledge they're expecting absolutely might go beyond what you're shown in school (it depends on the school). If you draw the line of "fundamentals" at "what was in my curriculum", then that's an arbitrary line you're drawing, and it's not the line that most of the industry is using (even if people are using their own coursework as their meter). If you consider (eg, in a web dev context) HTTP requests and MVC "fundamental", then you should be able to pick up Rails or Django or whatever without much trouble. Whether or not your school had classes on that doesn't really matter to anyone.
I think you missed my point. Companies I've seen don't care if you have fundamental knowledge of algorithms and data structures. They are testing you on specific frameworks via interview questions and take home projects. They are looking for industry experts in certain frameworks. These companies seem to equate CS programs with vocational schools or bootcamps and are only looking for coders that already know certain frameworks. Your comment justifies what I already believe and understand, but I am pointing out some companies seem to value bootcamp-tier knowledge more than fundamental knowledge and discriminate against graduates that focused heavily in theory but little in frameworks.
> Companies I've seen don't care if you have fundamental knowledge of algorithms and data structures. They are testing you on specific frameworks via interview questions and take home projects
No, I got that. My point was that no one cares that you completed an algorithms and data structures course; they care that you know the fundamentals of (the subgenre they're interested in of) software development. You're just using a definition of "fundamentals" that excludes the (entirely language- and framework-independent) concepts they care about that you don't know.
A take home project using a specific language and framework does not require you to know either to complete it before you start; if you know the concepts, you can pick them up as you're writing it. If you've never seen MVC before (and the knowledge you have doesn't allow you to grok it fast), then in a web dev context I would argue that that is a fundamental concept you are missing, which was my point.
Interview questions can be different, but you didn't mention them in the post I responded to.
My definition of fundamentals is the more traditional CS algorithms and data structures, not "fundamentals" specific to some genre of software development outside of traditional CS (the ones startup companies or what have you are interested in). The fundamentals taught in a traditional CS program may exclude the fundamentals that startups use day-to-day, and the fundamentals taught in traditional CS programs may be useless to startups.
So, while a student may have strong CS theory, they may lack industry-specific "fundamentals", and companies may discriminate against those candidates because they lack those industry-specific skills (which otherwise could easily be picked up).
I argue that traditional fundamentals as defined above are more important than industry-specific tooling or frameworks, since the tooling or frameworks change but the core theory does not. I would like industry to not discriminate against graduates who come from a strong theory background but a weak industry-specific background, since these people should be able to pick up whatever tooling or frameworks on the job.
> companies may discriminate against those candidates because they lack that industry specific skills (that otherwise could easily be picked up).
What does "otherwise" mean, here? The context we were discussing is that they asked you to quickly pick it up in a take-home assignment, and you couldn't pick it up that quickly. I don't see how this is them being shortsighted; it sounds like you're just not delivering on the promise of your "fundamentals".
> the tooling or frameworks change but the core theory does not.
I 100% agree: someone who learnt MVC with Smalltalk in 1979 won't have any trouble picking up Rails' or Django's or whatever's version today. The concepts are timeless.
edit: to boil it down:
> these people should be able to pick up whatever tooling or frameworks on the job.
Yes, and if they can, they won't have trouble picking it up in the process of writing a take-home assignment. If they can't do that, then they just failed a litmus test.
> you couldn't pick it up that quickly. I don't see how this is them being shortsighted; it sounds like you're just not delivering on the promise of your "fundamentals".
I never failed any take home assignments, so this isn't about me. I am speaking more generally and not about me personally. My point is fundamentals don't change, frameworks do.
Edit: For some reason I can't reply to your response of this question. But no problem. I see your point and I do agree with what you're saying. Thanks for clarifying.
> I never failed any take home assignments, so this isn't about me. I am speaking more generally and not about me personally.
It was generic "you", the hypothetical person who was having trouble with take home assignments but not interview questions that you were talking about initially. All of my points were about that hypothetical person, and not you personally. Sorry for the misunderstanding.
> My point is fundamentals don't change, frameworks do.
Yes, that's what I was addressing. Companies aren't asking about frameworks when they give you a take home question, they're asking you to learn the framework if need be, which will be easy if you know the fundamentals. I'm not sure where you're getting lost on this point.
I think the deeper point is the difference between a career and a job.
Most companies have very short-term thinking and are looking to hire for the task at hand (to be fair to them, it's not always good to over engineer or over invest). If you do get a job, the problem can be in 5-10 years when your company's needs change or you want to move up in your career.
It may not apply equally to tech or could be a generational thing, but generally trade schools taught specific job skills while colleges taught concepts (and usually omitted a lot of the specific operational skills). I've heard my dad describe it as a familiarity with the wider discipline and the knowledge of where to go for a deeper understanding, when needed. A lot of the baby boomers I know went back to school later for their Masters or PhD because that was a limiting factor for their career advancement.
Less and less often companies are investing long-term into employees. It's up to the individual. That being said, not everyone has to be a lead or a CEO. You can have a long and fulfilling career being a really good mechanic or code monkey. It's such a well earned stereotype to see people go into management and really miss the day-to-day coding work.
> Industry seems to expect CS degrees to be vocational degrees.
Yes, I think you hit the nail on the head here. Unfortunately, in some places software developers are treated not so much as engineers as more like advanced technicians.
E.g. when I was at FB I joined a C++ team while having written about 200 lines of C++ before in my life — we also had proper C++ experts in the team, and they cared more about experience in the problem domain. All of these conversations were held _after_ I was hired though — the pipeline is a mixture of hiring experts for specific positions and "hire people that seem good enough, and let them find their own teams"
Similar with me: background mostly in C/C++/Pascal, etc. I joined Google and had to write Java. What surprised me most were killing the app due to GC stress, dependency injection, checked exceptions, and using various containers (map, set, etc.) as an interface without assuming a concrete implementation (or maybe not so). I didn't graduate well from this "program", but give full respect to Java now. Back to C++ land for me, and it's been good.
If it's what you enjoy and you're not tied into a different job for other reasons, go do it. I'm not sure what Facebook's requirements are like, but there are tons of places that would hire someone with Python skills at a decent salary.
The best advice I can think of is to get involved with open-source. You'll learn a lot by working with your peers (some of which may be even at Facebook and other companies) and you'll figure out what you need to learn. You'll also end up with code you can show, and evidence of being able to work in a team and complete actual tasks. This is huge for prospective employers.
I don’t believe it’s kosher to provide a subscriber link to everyone on HN. I believe it is intended to be used to communicate with a few colleagues, not to be posted on a news aggregator site.
Per LWN.net:
“The "subscriber link" mechanism allows an LWN.net subscriber to generate a special URL for a subscription-only article. That URL can then be given to others, who will be able to access the article regardless of whether they are subscribed. This feature is made available as a service to LWN subscribers, and in the hope that they will use it to spread the word about their favorite LWN articles.
If this feature is abused, it will hurt LWN's subscription revenues and defeat the whole point. Subscriber links may go away if that comes about.”
Sharing a single subscriber link and having it hit the front page of HN is unquestionably good for LWN and its revenue. Which is, as your quote says, the whole point.
Python is really slow, but luckily most Python code can be optimized by improving the interpreter... maybe Facebook will release a better interpreter (there are already some faster interpreters; not sure why Python doesn't adopt them)...
There was an article somewhere recently, of someone proposing changes to CPython for just that reason. Guido had already resisted several attempts at the kind of thing he was doing, from a desire to keep the CPython implementation simple.