Yep. I think the idea that the benchmark is determinative is just as deluded as the notion that it should be unbreakable.
Benchmarks are on the honor system. Even the tightest benchmark can be cheated. If the benchmark is so secret and air-gapped that it can't be cheated by models, it can be cheated by its own authors. You can't use benchmarks to gate out cheating.
If you don't have the honor system in mind when you're reading scores, you're wasting your time. Is it some unknown outfit with wild claims? Is it connected to Epstein, Russia, the real estate "industry", or sleazeballing in general? Do they have a previous history of ratgaming the numbers? Replace its scores with asterisks and move on.
This is an odd choice of a thread for a laundry list of complaints about AI and about a person that, say what you will, is nowhere near the list of planetary "really bad guys". Even if we limit it to tech, the list starts with someone way richer, then goes through four or five way-shadier people.
If you're OK with victim-shaming here, doesn't it say more about you than Altman? What does it say about your viewpoint?
> about a person that, say what you will, is nowhere near the list of planetary "really bad guys". Even if we limit it to tech, the list starts with someone way richer, then goes through four or five way-shadier people.
You really don't need to go that high up the ladder to find members of the 'list of planetary really bad guys'. Sam Altman is single-handedly responsible for starting the current DRAM crunch - that too based on an untenable economic framework. He's also an enthusiastic participant in the AI bubble that threatens to cause a massive global economic depression when it pops. He's also involved in the cabal that wrecks the labor market (wages) by hyping up the 'AI will replace labor' narrative. On top of all that, he and his ilk are on a building spree of data centers that will guzzle huge amounts of energy and dump tonnes of extra CO2 into the atmosphere, as if there's no tomorrow. This wrecks all the hard efforts of millions of others before him to rein in the damage caused by climate change. Needless to say, all of these have pretty deleterious effects on the economy, biosphere and the welfare of ordinary people, including loss of innumerable lives.
But does he care? He is one of those people who simply ignore the trail of serious damage and enormous suffering they leave in their wake, because they don't see anything beyond money - more money than they can spend in a hundred lifetimes! Nobody needs a justification to see him as one of those 'planetary bad guys'.
> What does it say about your viewpoint?
As someone else here said, it goes without saying that lobbing a Molotov cocktail at anyone is a no-no. I don't support physical violence in any form. Having said that,...
> If you're OK with victim-shaming here
It's sad that the aristocratic society didn't learn anything from the murder of Brian Thompson. The 'victim' had caused thousands of preventable deaths per year, and his death saved thousands by forcing the industry to deal with the problem. Suddenly, even the pacifists (like me) are left wondering if the death was unethical. If true justice existed, the state would have stopped them from their crimes (aka professions), if not outright executed them for the lives lost. Whom will you choose when they pit their own lives against thousands of innocent lives? You can't claim victimhood after putting yourself in that position.
I read the New Yorker article like most people here. I didn't find anything incendiary enough in it to provoke a Molotov attack. I wouldn't put it past him to have arranged it himself, given how much he lies and what he stands to gain from it. But let's assume that the attack is real and is connected to the report. The reply seems overly dramatic and self-righteous, given that the attack was against his iron gate! He's milking the situation to indulge in virtue signaling, sympathy farming and gaslighting the critics. This is one hell of a victim pose! But I have no sympathy to spare if it distressed him so much. He shouldn't be able to sleep anyway, if only he had a conscience. Advocating sympathy for the unsympathetic super-privileged is a bit tone-deaf under such circumstances. Evidently, nobody is in a mood to oblige such manipulation.
I understand the temptation to Streisand this, but for the love of, please don't. S1/2 were the best show I've seen on TV. It would be a crime against good taste.
> He hasn't kept ahead of the destruction of the dollar very well.
You can't price dollars in gold to measure value. Gold doesn't measure value better than the dollar at any point in time, let alone over time. Just use the price index for one currency, or the relative price indexes across currencies.
I broadly agree with you. However, during the classic gold standard years, gold did have pretty stable purchasing power in the long run (as measured by, e.g., your favourite inflation index).
However, since the world largely went off the gold standard, the purchasing power of gold has been a lot more volatile.
Suits in agriculture don't drive the combine either, a farmer does. The other 99% of pre-automation farmers went on to other jobs. They happened to be better jobs than farming, but that's not necessarily always the case.
Same background as you, and same exact experience as you. Opus and Gemini have not come close to Codex for C++ work. I also run exclusively on xhigh. Its handling of complexity is unmatched.
At least until next week when Mythos and GPT 6 throw it all up in the air again.
Yep, I think the lede might be buried here and we're probably cooked (assuming you mean SWEs, but the writing has been on the wall for 4 months.)
I guess I'm still excited. What's my new profession going to be? Longer term, are we going to solve diseases and aging? Or are the ranks going to thin from 10B to 10000 trillionaires and world-scale con-artist misanthropes plus their concubines?
I need to start a SaaS for getting people to start doing lunges and squats so they can carry others around on their backs. I need a founding engineer, a founding marketer, and 100m in hard currency.
If wealth becomes too captured at the top, the working class can no longer be profitably exploited - squeezing blood from a stone.
When that happens, the ultra wealthy dynasties begin turning on each other. Happens frequently throughout history - WWI the last example.
Your options become choosing a trillionaire to swear fealty to and fight in their wars hoping your side wins, or I guess trying to walk away and scrape out a living somewhere not worth paying attention to.
Or, I suppose, revolution, but the last one with persistent success was led by Mao and required throwing literally millions of peasants against walls of rifles. Not sure it'd work against drones.
I kind of think that these threads are destined to fossilize quickly. Most every syllogism about LLMs from 2024 looks quaint now.
A more interesting question is whether there's really a future for running a coding agent on a non-highest setting. I haven't seen anything near "Shall I implement it? No" in quite a while.
Unless perhaps the highest-tier accounts go from $200 to $20K/mo.