Anthropic has simply been focused on coding/terminal work for longer, and their Pro-tier model is coding-focused, unlike the GPT and Gemini Pro-tier models, which have been optimized for science.
Their whole "training the LLM to be a person" technique probably contributes to its pleasant conversational behavior, and making its refusals less annoying (GPT 5.2+ got obnoxiously aligned), and also a bit to its greater autonomy.
Overall they don't have any real moat, but they are more focused than their competition (and their marketing team is slaying).
Autonomy for agentic workflows has nothing to do with "replying more like a person", you have to refine the model for it quite specifically. All the large players are trying to do that, it's not really specific to Anthropic. It may be true however that their higher focus on a "Constitutional AI"/RLAIF approach makes it a bit easier to align the model to desirable outcomes when acting agentically.
You think it has nothing to do with it. Even they have only a loose understanding of how trying to treat Claude like a real being ultimately affects the way the model acts.
For example, Claude has a "turn evil in response to reinforced reward hacking" behavior which is a fairly uniquely Claude thing (as far as I've seen anyhow), and very likely the result of that attempt to imbue personhood.
Anthropic has shared that API inference has a ~60% margin. OpenAI's margin might be slightly lower since they price aggressively but I would be surprised if it was much different.
Yeah but the argument people make is that when the music stops cost of inference goes through the roof.
I could imagine that when the music stops, advancement of new frontier models slows or stops, but that doesn't remove any current capabilities.
(And to be fair the way we duplicate efforts on building new frontier models looks indeed wasteful. Tho maybe we reach a point later where progress is no longer started from scratch)
Ironically, apples are one of the fruits where tree ripening isn't a big deal for a lot of varietals. You should have used tomato as the example, the difference there is night and day pretty much across the board.
If humans can prove that bespoke human code brings value, it'll stick around. I expect that the cases where this will be true will just gradually erode over time.
If humans can prove that bespoke human code brings value, it'll stick around.
Value to who, and which type of value?
Will that defining value be purely economic in nature? Will it be purely defined by mega-corps, and their perception of value? The market moves, the money flows to those who control its direction.
We already see it today, with some firms literally forcing people to use LLM coding tools. The stories abound of people simply being forced to use whatever it spits out. Value is often defined by cost, and code maintainability 5 years later isn't an immediate, quarter-profit induced concern.
It feels like you're glossing over a lot of my point.
In terms of new coding languages, it's rare to see one gain any traction. Rust is the only recent one I know of, and it's had a larger backer behind it since its release 15 years ago. It was supported in house for almost a decade before even seeing the light of day.
Will that happen now? And created (or at least managed) by a human? If not, what would a new language look like? Would it be human maintainable? Understood?
To me, this goes right back to the classic "buying a car with the hood welded shut". No way to maintain, repair, or even evaluate the quality of the thing under you.
Fusion isn't a good example. Self driving cars are a battle between regulation and 9's of reliability; if we were willing to accept self driving cars that crashed as much as humans, they'd be here already.
Whatever models suck at, we can pour money into making them do better. It's very cut and dry. The squirrely bit is how that contributes to "general intelligence" and whether the models are progressing towards overall autonomy due to our changes. That mostly matters for the AGI mouthbreathers though, people doing actual work just care that the models have improved.
Just because Bob doesn't know e.g. Rust syntax and library modules well, doesn't mean that Bob can't learn an algorithm to solve a difficult problem. The AI might suggest classes of algorithms that could be applicable given the real world constraints, and help Bob set up an experimental plan to test different algorithms for efficacy in the situation, but Bob's intuition is still in the driver's seat.
Of course, that assumes a Bob with drive and agency. He could just as easily tell the AI to fix it without trying to stay in the loop.
But if Bob doesn't know Rust syntax and library modules well, how can he be expected to evaluate the generated Rust code? Bugs can be very subtle and not obvious, and Rust has some constructs that are very uncommon (or don't exist) in other languages.
Human nature says that Bob will skim over and trust the parts that he doesn't understand as long as he gets output that looks like he expects it to look, and that's extremely dangerous.
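To make "subtle" concrete, here's a minimal sketch of the kind of thing I mean. The function names and the one-byte header scenario are made up for illustration, not taken from any real codebase. Both versions compile cleanly with plain rustc and give the same answer on small inputs, so a skim of the output wouldn't tell them apart:

```rust
use std::convert::TryFrom;

// Hypothetical helper: pack a payload length into a one-byte header field.
// The `as` cast compiles cleanly and silently truncates out-of-range values.
fn header_len_lossy(len: u32) -> u8 {
    len as u8
}

// Same intent, but an out-of-range length surfaces as an error
// instead of a wrong number.
fn header_len_checked(len: u32) -> Result<u8, std::num::TryFromIntError> {
    u8::try_from(len)
}

fn main() {
    // In range: both versions agree.
    println!("{}", header_len_lossy(200)); // 200
    println!("{:?}", header_len_checked(200)); // Ok(200)

    // Out of range: the cast wraps modulo 256, the checked version fails.
    println!("{}", header_len_lossy(300)); // 44
    println!("{}", header_len_checked(300).is_err()); // true
}
```

Unless Bob already knows that `as` truncates integers instead of failing, the first version looks fine and passes every test that uses small payloads.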
Then perhaps Bob should have it use functional Scala, where my experience is that if it compiles and looks like what you expect, it's almost certainly correct.
Actions are bad, but they're free (to start) and just good enough that they're useful to set up something quick and dirty, and tempt you to try and scale it for a little while.
Which, as far as I've found, means Forgejo. Haven't found any others. And even Forgejo Actions says that it's mostly the same as GitHub Actions syntax, meaning you still have to double-check that everything still works the same. It probably will, but if you don't know what the corner cases are then you have to double-check everything. Still, it's probably the best migration option if you rely on GHA.
Team scale doesn't tend to impact this that much, since as teams grow they naturally specialize in parts of the codebase. Shared libs can be hotspots, I've heard horror stories at large orgs about this sort of thing, though usually those shared libs have strong gatekeeping that makes the problem more one of functionality living where it shouldn't to avoid gatekeeping than a shared lib blowing up due to bad change set merges.