> An ongoing [United States] military investigation has determined that the United States is responsible for a deadly Tomahawk missile strike on an Iranian elementary school, according to U.S. officials and others familiar with the preliminary findings.
That article is based on anonymous sources ("according to [people] familiar with the preliminary findings").
It doesn't mean it's wrong, but it's not an official confirmation by the US government, and it only speaks to responsibility for the strike, not to the various claims of "killed children".
Those sources don't say anything about casualties, or the presence of children. The NYT does its best to make it sound like they do ("responsible for a deadly strike"), but so far the only source for how deadly it is, remains the IRGC. And the NYT happily quotes their claim that the death toll was "at least 175 people".
For what it's worth, I personally believe the US is responsible for the strike. I also think the IRGC is lying about casualties, but there's no way to know for sure, and a US investigation probably won't tell us more on that point.
You’re acting like the U.S. government is a monolithic good faith actor right now. The current administration’s behavior is qualitatively different than past administrations.
Do you also believe this administration will ever officially confirm Renee Good and Alex Pretti were not domestic terrorists?
It’s hard to interpret your points charitably here.
So I read the entire TFA, where do you see “quotes [from] those in the know who believe this should have been eliminated as a target”? I saw no such quotes about the school in TFA. Maybe I missed it.
> there was precisely one mis-strike in 1000s of sorties
How did you verify this? Because I’ll remind you, the U.S. administration denied responsibility for some time before owning up to this due to public pressure. Absent public pressure, I guess we would’ve had zero mis-strikes.
> so this already is a low error rate
As a father of similarly aged daughters, I can’t express enough how grotesque and disturbing the term “error rate” is here.
We targeted and killed young children. Plain and simple.
> However, you have made a very, very strong assumption that these targets were not carefully evaluated.
Let’s take the opposing assumption that this target was carefully evaluated then. Please reason through the implications now?
> So I read the entire TFA, where do you see “quotes [from] those in the know who believe this should have been eliminated as a target”? I saw no such quotes about the school in TFA. Maybe I missed it.
TFA is from The Guardian while GP you responded to specifically called out the NYT analysis. These are different things. Maybe reading the GP's suggested source would leave you with a different set of questions?
I will try to respond to all these independent threads, but we can't continue all of them at once.
> “These aren’t just nameless, faceless targets,” he said later. “This is a place where people are going to feel ramifications for a long time.” The targeting cycle had been fast enough to hit 50 buildings and too fast to discover it was hitting the wrong ones.
> The air force’s own targeting guide, in effect during the Iraq war, said this was never supposed to happen. Published in 1998, it described the six functions of targeting as “intertwined”, with the targeteer moving “back” to refine objectives and “forward” to assess feasibility. “The best analysis,” the manual stated, “is reasoned thought with facts and conclusions, not a checklist.”
> A former senior government official asked the obvious question: “The building was on a target list for years. Yet this was missed, and the question is how.”
---
> Please reason through the implications now?
It was a mistake. My girls are about to enter this level of school, as well (cool parent card). A mistake/error/tragedy can all accurately be used to describe this. It's horrible it happened. All I'm saying is that no process is perfect. It is not excusable, but it is unfortunately understandable how it happened in this situation.
> 1000s
1000s is fairly easily understood. 1/1000 is inferred because, as you say, "public pressure" sprang up immediately after this one bombing. Iran regularly posts pictures and videos online, and human rights orgs are clamoring to find evidence. Either we are really good at suppressing the world except for this one case, or there aren't that many schools being bombed. We cannot be simultaneously horrible at picking targets and suppressing evidence and also great at it in every other case. Planet Labs themselves provided the pictures - they are freely available.
Yes maybe the machine lumbers on, stomping on kids, or maybe we've learned our lesson and are now perfect, but this seems like the kind of mistake that can happen, and it seems likely that the analysts involved here are now benched and I wouldn't be surprised if some corrections are happening internally. These are human beings, despite what the article would have you believe, that are doing the best they can.
> we targeted and killed young children
We killed young kids, but not on purpose. We targeted a building and intent matters. I refuse to believe anyone in the decision chain would move forward if they believed kids were going to be killed. If you do - how can you? Why would they?
We're going to quickly get into hypotheticals here. There are a lot of open threads, and believe me, I hate violence against children in the fullest sense of the word. We can leave it at that.
It feels like an appreciation for hypotheticals or givens is missing here. One can simultaneously be against the war and the bombing in general, and also accept it as a given and then think about a certain situation being understandable within that given.
I can't answer why they would do it, but I don't think it's unusual for these people to knowingly strike civilian targets that they believe will have children present. In the famous Pete Hegseth leaked Signal chat, they were discussing bombing a residential apartment building in the middle of the night because they thought a single target was there visiting his girlfriend. Obviously that carries a high risk of killing children, and in that particular case the Secretary of Defense and Vice President were intimately involved and celebrated after learning that the building had collapsed. If those at the very top are willing to move forward with bombing civilians asleep in a residential building, I have to believe that everyone below them in the chain of command is expected to follow their lead.
This is very different from targeting civilians as a goal in itself, which is what it would have had to be if this was not just negligence, but intentional, as GP suggested. Parent correctly points out that there's both no political incentive for that, and that it's not realistic from a psychological point of view, given reasonable assumptions about human nature.
The claim I'm responding to is "I refuse to believe anyone in the decision chain would move forward if they believed kids were going to be killed." I agree it's unusual for anyone in the US military to drop a bomb primarily because they want to kill some children. I think it is not unusual for people involved in bombing campaigns to anticipate killing children and move forward anyway.
> This is very different from targeting civilians as a goal in itself
Targeting a single person which might be a valid target had war been declared, while also intentionally striking many civilians around them, is the same as targeting those civilians. You knew the bomb you dropped was going to kill them, and you pressed the button. It makes no difference who the primary "target" is.
Otherwise, countries would just bomb all the civilians and all their infrastructure and medical facilities and schools with the excuse that they heard from an unnamed source that there was a combatant nearby, like israel does in Palestine.
No evidence has shown up suggesting there was some sort of compelling target in the school. As foul as Trump and Hegseth may be, they aren't cartoon character villains. The Occam's razor explanation is that this was an intelligence failure and a tragic mistake.
There are no cartoon villains in general, that's the point GP is making by using the word "cartoon". Let's use some common sense, it's not like Trump and Hegseth got together and sneaked in the school on the list of targets just because they liked the idea of children being killed. It's naive to suggest this is a possibility worth considering.
Yeah, going to have to go ahead and disagree with you there, boss. The man Hegseth, in all his 'no quarter' bravado, is only affirming his own mother's claim that he is a piece of shit. Respectfully, of course, I would not put it past him to kill some kids for a political or terrorism reason (the parents).
just because you assume that trump and hegseth aren't cartoonishly evil, doesn't mean they aren't. looking at america's actions for a long time, the occam's razor explanation is that america is cartoonishly evil. the reason you struggle with that is about emotions, not logic. and i get it.
Your first two quotes are about targeting in the Iraq War; specifically how the breakdown in careful analysis, precipitated by the new systems, led to the exact mis-targeting they were trying to solve. That’s what the entire article is about.
And your third quote is from an ex-official commenting on the event after the school strike happened.
These quotes contradict your original point, i.e., they show how careful analysis has been designed out of the system.
> We killed young kids, but not on purpose. We targeted a building and intent matters. I refuse to believe anyone in the decision chain would move forward if they believed kids were going to be killed. If you do - how can you? Why would they?
This sounds incredibly naive. For starters, plausible deniability due to diffuse responsibility is a thing.
“Of course we don’t target schools and kill children, this was a system error.” But the message gets sent regardless and meanwhile we have people arguing back-and-forth over grains of sand because they took an action with deliberate plausible deniability.
Of course they didn’t intend to kill the children, they only intended to disperse the strikers by setting their tents on fire. It was simply a mistake.
> I refuse to believe anyone in the decision chain would move forward if they believed kids were going to be killed. If you do - how can you? Why would they?
Because they’re openly callous and contemptuous of anyone they don’t consider a heritage American? Because the admin has already abused children to lure out parents in their anti-immigrant push?
And that’s before getting into the Epstein file allegations and if he raped and killed kids already.
I’m gonna throw it back on you, how can you believe that this admin cares if foreign kids die?
we are speaking of politicians who make a habit of bluster, who like "shows of force", and who are openly contemptuous of the lives of those who don't agree with or look like them
some of them believe that it is their religious duty to start this war and make it heinous enough to start ww3 and bring forth the return of jesus christ
I think you are ascribing a level of systems thinking and care about consequences which one cannot simply assume is there
if you were to, say, start with an assumption that some of the actors have the mental patterns and world model of an angsty, self-centered teenager, or younger, then you might draw different conclusions
You have evidence in front of you on a weekly basis of these people being that evil and that stupid, and we’re coming up on 2 years of that playing out.
It's incredible that, after all the grotesque stories about rape, torture and murder of children, men and women during the Iraq war, active support of genocide (and tens of thousands of children murdered by Israel, on purpose), prisoner rape and child imprisonment, and a "secretary of war" and president publicly admitting to war crimes and saying things like "negotiate with bombs", you still "refuse to believe" that anyone in the decision chain would do anything like this.
My comment is to say the US has proven how brutal they are consistently through all the wars of aggression they have waged in the past several decades. They do not see their "enemies" as human. I can't fix anything unfortunately.
The terrorists that struck the World Trade Center targeted a building too.
If we aren't going to have a military doctrine that cares about who's in the building, we will be treated the same by our enemies. I don't think we want that.
If I recall we saw two planes. We did not see any individual as such in the planes, did we? We saw some passports; not sure that this proves much at all. We also had WTC 7 going down and the strike on the other building (was it in Washington) but not much aside from this.
I am not saying the-cake-is-a-lie, everything was fabricated, mind you. What I am saying is that IF we are going to draw any conclusions, we need to look at what we have, and then find explanations and projections for what is missing. For instance, any follow-up question, such as the damage to a building, can be calculated by a computer, so that is not a problem. The problem, though, is whether one can buy into what a government shows or presents to the viewer IF one cannot trust that government. Hitler also used a fake narrative to sell the invasion of Poland, for example: https://en.wikipedia.org/wiki/Gleiwitz_incident
That does not mean everything else is a false flag or fake, per se, but I do not automatically trust any allegation made by any government. You can look back in history and wonder about attempts to sell explanations, such as Warren Commission and a magic bullet switching directions multiple times. Again, that can be calculated via computers, so that's not an issue per se; the issue is if they made claims that are factually incorrect and/or incomplete.
> Either we are really good at suppressing the world except for this one case or there aren't that many schools being bombed. We cannot be simultaneously horrible at picking targets and suppressing evidence and also great at it in every other case.
...is a logical fallacy (false dichotomy). It presumes a level of intent that isn't necessarily present.
For an example of how these might coexist, I'd encourage reading The Toxoplasma of Rage, a long essay that frequently comes up here.
The idea is that rage is its own, self-replicating emotion, and given the medium of the Internet, it's possible that some memes have no purpose other than self-perpetuation. A story about a girls' school being blown up is self-replicating: it gets people riled up enough to share it. A story about a random factory, or some dead person's house, or an empty patch of desert is not really. It's entirely possible that attacks on these happened hundreds of times in the Iran war, but if it did, I would never know about it. I probably wouldn't care about it. Those are not stories that go viral, they don't have enough emotional valence to make people care. And the media knows this, and so they don't bother to seek them out or run them.
And in fact, a handful of illegal targets get hit each day, according to HRANA. HRANA is an Iranian human rights org that was banned in Iran during an election and has since been re-established in the US. They are a reliable source.
If you scroll down to the "Facilities Protected Under International Humanitarian Law" section, you will see a list of non-military targets. That part is never empty in these reports.
How much did this cost? Has there ever been an engineering focus on performance for Liquid?
It’s certainly cool, but the optimizations are so basic that I’d expect a performance engineer to find these within a day or two with some flame graphs and profiling.
He used Pi as the harness but didn't say which underlying model. My stab-in-the-dark guess would be no more than a few hundred dollars in token spend (for 120 experiments run over a few days, assuming Claude Opus 4.6 without the benefits of the Claude Max plan).
So cheaper than a performance engineer for a day or two... but the Shopify CEO's own time is likely a whole lot more expensive than a regular engineer!
> I want to see good/interesting work where the model is going off and doing its thing for multiple hours without supervision.
I'd be hesitant to use that as a way to evaluate things. Different systems run at different speeds. I want to see how much it can get done before it breaks, in different scenarios.
I never claimed Opus 4.5 can one-shot things? Even human-written software takes a few iterations to add/polish new features as they come to mind.
> And you clearly “broke” the model a few times based on your prompt log where the model was unable to solve the problem given with the spec.
That's less due to the model being wrong and more due to myself not knowing what I wanted because I am definitely not a UI/UX person. See my reply in the sibling thread.
Apologies, I may have misinterpreted the passage below from your repo:
> This crate was developed with the assistance of Claude Opus 4.5 initially to answer the shower thought "would the Braille Unicode trick work to visually simulate complex ball physics in a terminal?" Opus 4.5 one-shot the problem, so I decided to further experiment to make it more fun and colorful.
Also, yes, I don’t dispute that human written software takes iteration as well. My point is that the significance of autonomous agentic coding feels exaggerated if I’m holding the LLM’s hand more than I have to hold a senior engineer’s hand.
That doesn’t mean the tech isn’t valuable. The claims just feel overstated.
If you click the video that line links to, it one-shot the original problem as very explicitly defined as a PoC, not the entire project. The final project shipped is substantially different, and that's the difference between YOLO vibecoding and creating something useful.
There's also the embarrassing corner physics bugs present in that video, which was something that required a fix in the first few prompts.
Weird, I broke Opus 4.5 pretty easily by giving some code, a build system, and integration tests that demonstrate the bug.
CC confidently iterated until it discovered the issue. CC confidently communicated exactly what the bug was, a detailed step-by-step deep dive into all the sections of the code that contributed to it. CC confidently suggested a fix that it then implemented. CC declared victory after 10 minutes!
The bug was still there.
I’m willing to admit I might be “holding it wrong”. I’ve had some successes and failures.
It’s all very impressive, but I still have yet to see how people are consistently getting CC to work for hours on end to produce good work. That still feels far fetched to me.
Almost always they start by having connections that hire them (old colleagues, former friends, etc.), building out those connections (conference talks, doing really good work, writing high quality blogs), and then if you're lucky, some word of mouth.
Good points - admittedly, I didn’t put enough effort into building connections through different pipelines back when I was contracting. Upwork and a few personal connections were my sole sources.
It just felt really difficult to do both the engineering work while trying to do customer development at the same time.
The fact that OP has been able to do this for so long, while supporting a family, piqued my interest.
While I do have some experience with vibe coding, this could've been done by my wife who has little tech knowledge. That's the scary part.
My flow was to open 3 terminals, ask AI to work on some feature in each, check how it looks in the frontend and if it didn't look/work quite right, I asked it again. Once I deemed the feature OK, I just cleared the context and went on to a new task. The 3 terminals ate through claude $20 within 10-15 minutes.
I wonder if this breaks at some point when the codebase is more complex/large, or does not. If it doesn't, then the future is scary, because everyone can recreate many SaaS products within hours. What's the moat for Todoist, for example? Without AI, it would've taken quite some effort, know-how and time to get something similar up and running. I reckon that with the $100 plan, I could have made it almost identical to it. Perhaps I could even create mobile app builds as well (React Native, perhaps).
What stops me from then offering this for 1/5 the cost of the real app?
And that's established apps. Imagine how easy/trivial it is to clone something that's new, and that was possibly vibe coded itself. E.g. someone posts to HN "Show HN: I made xyz". It looks great, it works great, it has a great idea. Then we take LLMs and recreate it within 4 hours. Poof! Instantly, there's no reason to pay for it.
That's what I find depressing, though - having a great idea and using an LLM to create a great product will not be enough. People will be able to clone everything. At least that's what my little experience with Claude tells me. And now let's just wait one more year and see how good Claude Code 2.0 and co will be. I reckon sooner rather than later, zero tech knowledge will be needed to get apps up and running.
That's why it's time to pivot to some other role in the near future ;)
> It's that we're paying more for objectively worse service than we had a decade ago.
> I'm not asking for magic, I'm asking where went the reliability we already had, at the prices we're already paying.
My god thank you! My partner and I have been talking about this for the past 2 years in the context of food service and delivery service industry.
Greater than 50% of all our restaurant orders are straight up wrong or missing items, whether it’s from local places, chains, or fast food restaurants.
The unreliability is staggering, especially because we’re paying so much more!
It’s gotten so bad that we’re done with certain services and establishments for good now, or we make sure to QC before leaving the restaurant to ensure everything is in the bag.
Even more ironic, this happened a couple weeks ago at Texas Roadhouse — the same restaurant I worked in decades ago as a teenager, so I remember the process we had to go through for to-go orders.
First, we’d take the order over the phone. We’d repeat the order back to the customer to confirm everything (1st QC). When the food came up in the window, we’d pack the food in bags, crossing off every item on the receipt before stapling it to the bag (2nd QC). When the customer came to pick up their food, we’d have to take every box out of the bag, show the customer the food, and confirm that everything they expected in their order was there (3rd QC).
No customer. Ever. Left with an incorrect order. Simple.
That process is gone now. We paid more and came home missing my partner’s meal. Wtf.
I hear a lot of stories like this, but the question I always come back to is: what incentive is there to discard the working system? In your case it's the three-step QC process - why is that just gone now?
The more I look into these systematic changes the less sense it makes.
I don't know anything about Texas Roadhouse but in general I'd say a lot of processes got sloppy after technological changes repeatedly added complexity.
Decades ago, when OP worked there, I'm guessing Texas Roadhouse only took takeout orders over the phone and in person, and didn't receive that many. It was less common to order takeout from sit-down restaurants. There was one procedure, the steps made logical sense, it could be implemented entirely within the restaurant without a lot of IT. And it worked, and it probably didn't take that much staff time on a normal night.
Now, you can still order by phone, but I see you can also at least order online for pickup, order via UberEats for delivery, and order via DoorDash for pickup or delivery. They've likely added these various modes over time, and I'm sure each has its own subtly different procedure reflecting various IT systems nobody in the restaurant has any control over.
The three-part QC process might still work for phone orders but those are probably rare. Orders picked up by the actual end customer could use a two-part QC process, verifying the items against the receipt and presenting them for visual inspection. But orders getting picked up by a delivery person can practically only get a quick check as they're loaded in the bag, because the delivery person is in a hurry and won't want to stand there and help check the receipt against what's in the bag. They also may not be able to effectively do so if they don't know the menu (for instance, there are several sides that could be "steamed vegetables" at a quick glance https://www.texasroadhouse.com/location/457-countrysideil/di...) and aren't sufficiently fluent in English.
Rather than have a complex flowchart for the overworked staff maximizing QC for every case, it's very easy to default to the minimum which works in every case, which is hurriedly comparing the menu items that come out of the kitchen to what's on the receipt as they're loaded into the bag. It's very easy to get this wrong, especially if you're overworked and distracted and loading multiple bags at once.
Uhm, you actually just proved their point if you run the numbers.
For simplicity’s sake we’ll assume DeepSeek 671B on 2 RTX 5090 running at 2 kW full utilization.
In 3 years you’ve paid $30k total: $20k for system + $10k in electric @ $0.20/kWh
The model generates 500M-1B tokens total over 3 years @ 5-10 tokens/sec. Understand that’s total throughput for reasoning and output tokens.
You’re paying $30-$60/Mtok - more than both Opus 4.5 and GPT-5.2, for less performance and less features.
And like the other commenters point out, this doesn’t even factor in the extra DC costs when scaling it up for consumers, nor the costs to train the model.
Of course, you can play around with parameters of the cost model, but this serves to illustrate it’s not so clear cut whether the current AI service providers are profitable or not.
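For what it's worth, the arithmetic above can be sketched in a few lines. All inputs are the assumed numbers from this comment (2x RTX 5090 at $20k, 2 kW, $0.20/kWh, 5-10 tok/s over 3 years), not measured figures:

```python
# Back-of-the-envelope self-hosting cost model. Every input is an assumption
# from the comment above, not a benchmark.
SECONDS_PER_YEAR = 365 * 24 * 3600

hardware_usd = 20_000   # assumed 2x RTX 5090 system
power_kw = 2.0          # assumed full-utilization draw
usd_per_kwh = 0.20
years = 3

# Electricity over the whole period, then total cost of ownership.
energy_usd = power_kw * usd_per_kwh * 24 * 365 * years   # ~$10.5k
total_usd = hardware_usd + energy_usd                    # ~$30.5k

for tps in (5, 10):  # assumed DeepSeek 671B throughput range
    tokens = tps * SECONDS_PER_YEAR * years
    print(f"{tps} tok/s -> {tokens/1e6:.0f}M tokens, "
          f"${total_usd / (tokens/1e6):.0f}/Mtok")
```

Running it reproduces the ballpark above: roughly 473M-946M total tokens and a cost in the low tens of dollars per million tokens at the high end of throughput, rising past $60/Mtok at the low end.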
NVIDIA's 8xB200 gets you 30k tps on DeepSeek 671B; at maximum utilization that's 1 trillion tokens per year. At a dollar per million tokens, that's $1 million.
The hardware costs around $500k.
Now, ideal throughput is unlikely, so let's say you get half that. It's still 500B tokens per year.
Gemini 3 Flash is like $3/million tokens and I assume it's a fair bit bigger, maybe 1 to 2T parameters. I can sort of see how you can get this to work with margins as the AI companies repeated assert.
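A quick sanity check of those figures (the 30k tps throughput, $500k node cost, and $1/Mtok price are this thread's assumptions, not vendor specs):

```python
# Rough revenue-vs-hardware sketch for an assumed 8xB200 inference node.
SECONDS_PER_YEAR = 365 * 24 * 3600

tps = 30_000            # assumed aggregate DeepSeek 671B throughput
hardware_usd = 500_000  # assumed node cost
usd_per_mtok = 1.0      # assumed selling price

ideal_tokens = tps * SECONDS_PER_YEAR   # ~0.95T tokens/year at 100% util
realistic_tokens = ideal_tokens / 2     # assume half of ideal utilization

revenue_usd = realistic_tokens / 1e6 * usd_per_mtok
print(f"ideal:     {ideal_tokens/1e12:.2f}T tokens/yr")
print(f"realistic: {realistic_tokens/1e9:.0f}B tokens/yr, "
      f"${revenue_usd:,.0f}/yr at $1/Mtok")
```

Even at half utilization the node grosses on the order of $470k/year against $500k of hardware, which is why the margin question hinges so heavily on utilization, power, and DC overhead rather than the raw token math.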
Cool, that potential 5x cost improvement just got delivered this year. A company can continue running the previous generation until EOL, or take a hit by writing off the residual value - either way they’ll have a mixed cost model that puts their token cost somewhere in the middle between previous and current gens.
Also, you’re missing material capex and opex costs from a DC perspective. Certain inputs exhibit diseconomies of scale when your demand outstrips market capacity. You do notice electricity cost is rising and companies are chomping at the bit to build out more power plants, right?
Again, I ran the numbers for simplicity’s sake to show it’s not clear cut that these models are profitable. “I can sort of see how you can get this to work” agrees with exactly what I said: it’s unclear, certainly not a slam dunk.
Especially when you factor in all the other real-world costs.
Google runs everything on their TPUs, which are substantially less costly to make and use less energy to run. While I'm sure OpenAI and others are bleeding money by subsidizing things, I'm not entirely sure that's true for Google (despite it actually being easier for them if they wanted to).