CuriouslyC · 2026-04-08T19:55:23 1775678123

Anthropic has mostly just been focused on coding/terminal work for longer, and their pro-tier model is coding-focused, unlike the GPT and Gemini pro-tier models, which have been optimized for science.

Their whole "training the LLM to be a person" technique probably contributes to its pleasant conversational behavior, makes its refusals less annoying (GPT 5.2+ got obnoxiously aligned), and also contributes a bit to its greater autonomy.

Overall they don't have any real moat, but they are more focused than their competition (and their marketing team is slaying).

zozbot234 · 2026-04-08T20:19:41 1775679581

Autonomy for agentic workflows has nothing to do with "replying more like a person"; you have to refine the model for it quite specifically. All the large players are trying to do that, so it's not really specific to Anthropic. It may be true, however, that their greater focus on a "Constitutional AI"/RLAIF approach makes it a bit easier to align the model to desirable outcomes when acting agentically.

CuriouslyC · 2026-04-08T22:46:54 1775688414

You think it has nothing to do with it. Even they have only a loose understanding of how trying to treat Claude like a real being ends up affecting the way the model acts.

For example, Claude has a "turn evil in response to reinforced reward hacking" behavior which is a fairly uniquely Claude thing (as far as I've seen anyhow), and very likely the result of that attempt to imbue personhood.

CuriouslyC · 2026-04-08T19:45:37 1775677537

To be fair, starting with a toy model to get a first order approximation then building on it is kind of how theoretical science is done.

CuriouslyC · 2026-04-06T14:43:16 1775486596

I foresee people shopping in masks, with phone off, using cash as a protest, and poor people being black market designated shoppers.

CuriouslyC · 2026-04-05T13:15:57 1775394957

Anthropic has shared that API inference has a ~60% margin. OpenAI's margin might be slightly lower since they price aggressively but I would be surprised if it was much different.

bigfishrunning · 2026-04-05T13:49:48 1775396988

Is that margin enough to cover the NRE of model development? Every pro-AI argument hinges on the models continuing to improve at a near-linear rate

tonfa · 2026-04-05T14:43:57 1775400237

Yeah but the argument people make is that when the music stops cost of inference goes through the roof.

I could imagine that when the music stops, advancement of new frontier models slows or stops, but that doesn't remove any current capabilities.

(And to be fair the way we duplicate efforts on building new frontier models looks indeed wasteful. Tho maybe we reach a point later where progress is no longer started from scratch)

phist_mcgee · 2026-04-06T01:12:32 1775437952

Gross margin
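
To make the gross-vs-operating distinction concrete, here is a quick sketch with made-up numbers. Only the ~60% figure comes from upthread; the revenue and NRE values are purely hypothetical and are not anyone's actual financials.

```python
# Hypothetical illustration: a healthy gross margin on inference can coexist
# with an overall loss once model development (NRE) is counted.

def gross_margin(revenue: float, cost_of_serving: float) -> float:
    """Share of revenue left after the direct cost of serving it."""
    return (revenue - cost_of_serving) / revenue


def operating_profit(revenue: float, cost_of_serving: float, nre: float) -> float:
    """What is left after also paying for non-recurring engineering."""
    return revenue - cost_of_serving - nre


revenue = 100.0          # hypothetical API revenue (arbitrary units)
cost_of_serving = 40.0   # chosen so the gross margin matches the ~60% upthread
nre = 80.0               # hypothetical training/R&D spend, an assumption

gm = gross_margin(revenue, cost_of_serving)
op = operating_profit(revenue, cost_of_serving, nre)

print(f"gross margin: {gm:.0%}")
print(f"operating profit: {op:+.1f}")
```

With these invented inputs the inference business looks fine on its own, while the company as a whole is still losing money, which is the point of specifying "gross".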

CuriouslyC · 2026-04-05T13:13:50 1775394830

Ironically, apples are one of the fruits where tree ripening isn't a big deal for a lot of varietals. You should have used tomato as the example, the difference there is night and day pretty much across the board.

If humans can prove that bespoke human code brings value, it'll stick around. I expect that the cases where this will be true will just gradually erode over time.

b112 · 2026-04-06T07:57:32 1775462252

> If humans can prove that bespoke human code brings value, it'll stick around.

Value to who, and which type of value?

Will that defining value be purely economic in nature? Will it be purely defined by mega-corps, and their perception of value? The market moves, the money flows to those who control its direction.

We already see it today, with some firms literally forcing people to use LLM coding tools. The stories abound, of simply being forced to use whatever it spits out. Value is often defined by cost, and code maintainability 5 years later isn't an immediate, quarterly-profit-driven concern.

It feels like you're glossing over a lot of my point.

In terms of new coding languages, it's rare to see new ones gain any traction. Rust is the only recent one I know of, and it's had a large backer behind it since its release 15 years ago. It was supported in house for almost a decade before even seeing the light of day.

Will that happen now? And created (or at least managed) by a human? If not, what would a new language look like? Would it be human maintainable? Understood?

To me, this goes right back to the classic "buying a car with the hood welded shut". No way to maintain, repair, or even evaluate the quality of the thing under you.

CuriouslyC · 2026-04-05T13:07:56 1775394476

Fusion isn't a good example. Self-driving cars are a battle between regulation and 9's of reliability; if we were willing to accept self-driving cars that crashed as much as humans, they'd be here already.

Whatever models suck at, we can pour money into making them do better. It's very cut and dried. The squirrelly bit is how that contributes to "general intelligence" and whether the models are progressing towards overall autonomy due to our changes. That mostly matters for the AGI mouthbreathers though; people doing actual work just care that the models have improved.

CuriouslyC · 2026-04-05T13:03:56 1775394236

Just because Bob doesn't know e.g. Rust syntax and library modules well doesn't mean that Bob can't learn an algorithm to solve a difficult problem. The AI might suggest classes of algorithms that could be applicable given the real world constraints, and help Bob set up an experimental plan to test different algorithms for efficacy in the situation, but Bob's intuition is still in the driver's seat.

Of course, that assumes a Bob with drive and agency. He could just as easily tell the AI to fix it without trying to stay in the loop.

bigfishrunning · 2026-04-05T13:44:44 1775396684

But if Bob doesn't know Rust syntax and library modules well, how can he be expected to evaluate the generated Rust code? Bugs can be very subtle and not obvious, and Rust has some constructs that are very uncommon (or don't exist) in other languages.

Human nature says that Bob will skim over and trust the parts that he doesn't understand as long as he gets output that looks like he expects it to look, and that's extremely dangerous.

ndriscoll · 2026-04-05T14:02:39 1775397759

Then perhaps Bob should have it use functional Scala, where my experience is that if it compiles and looks like what you expect, it's almost certainly correct.

bigfishrunning · 2026-04-05T14:32:54 1775399574

Sure, but Bob is very unlikely to do that unless his AI tool of choice suggests it.

CuriouslyC · 2026-03-26T15:18:10 1774538290

Actions are bad, but they're free (to start) and just good enough that they're useful to set up something quick and dirty, and tempt you to try and scale it for a little while.

knocte · 2026-03-26T18:05:09 1774548309

Exactly. Any GitHub alternative needs to consume the same GitHub Actions syntax OOTB, I'm afraid.

rmunn · 2026-03-27T02:55:53 1774580153

Which, as far as I've found so far, means Forgejo. Haven't found any others. And even Forgejo Actions says that it's mostly the same as GitHub Actions syntax, meaning you still have to double-check that everything still works the same. It probably will, but if you don't know what the corner cases are then you have to double-check everything. Still, it's probably the best migration option if you rely on GHA.

suslik · 2026-03-27T06:06:28 1774591588

Gitea also, I think.

CuriouslyC · 2026-03-23T15:13:01 1774278781

Two issues:

1. People don't want to switch frameworks. Even though you can pull prompts generated by DSPy and use them elsewhere, it feels weird.

2. You need to do some up-front work to set up some of the optimizers, which a lot of people are averse to.

CuriouslyC · 2026-03-22T17:28:00 1774200480

Team scale doesn't tend to impact this that much, since as teams grow they naturally specialize in parts of the codebase. Shared libs can be hotspots; I've heard horror stories at large orgs about this sort of thing. Usually, though, those shared libs have strong gatekeeping, so the problem is more one of functionality living where it shouldn't (to avoid the gatekeeping) than of a shared lib blowing up due to bad change set merges.