BrentOzar's comments

> The range only needs to cover the period between mandated brakes.

I was confused there for a second until I realized you meant "breaks."


I have a hard time getting excited about this when they have such an atrocious record of handling pull requests in VS Code already: https://github.com/microsoft/vscode/pulls


It looks to me like they close nearly 30 PRs every day. That's kind of amazing.

I'm no fan of Microsoft but that's a massive maintenance burden. They must have multiple people working on this full time.


If you examine the merged PRs, the overwhelming majority are from Microsoft employees. Meanwhile, community contributions sit and rot.


I thought they just open sourced this? Was there enough time to start reviewing community contributions?


How would I be able to examine the PRs to verify this?


There aren't that many Microsoft employees; it took me a couple of minutes to memorize the team.

Of course the majority are from Microsoft; they do seem to merge a fair number of PRs from the community, though.


Look at the first comment in the PR; it will have a badge: "This user is a member of the Microsoft organization". Alternatively, look at the release notes on the website; any non-Microsoft contributions are listed at the bottom.


Right, but how would you do that for thousands of PRs to see whether there's a bias? I assume it's a ton of work.


Why not sample 20 and see if you can spot a trend?


Because I studied statistics.
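To put a number on the sampling point: with n = 20, even a lopsided observed split carries a wide confidence interval. A quick normal-approximation sketch (the 90% share is assumed for illustration):

```shell
# 95% CI half-width for a sample proportion, normal approximation.
# With n = 20 and an observed share of 0.9, the margin is about
# +/- 13 percentage points -- enough to say "mostly Microsoft",
# not enough to pin down the actual share.
awk 'BEGIN { p = 0.9; n = 20; printf "%.3f\n", 1.96 * sqrt(p * (1 - p) / n) }'
# prints 0.131
```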


You can get the JSON with something like:

gh pr list --repo microsoft/vscode --state merged --limit 1000 --json author,mergedAt,title > 1kprs.json

Then you can do:

jq -r '.[] | [.author.login, .author.name] | @tsv' 1kprs.json | sort | uniq -c | sort -nr

And see that there are only 63 authors, and that > 90% of the merged PRs are from Microsoft (which, fair enough, it's their product).
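If you want the share as a single number, here's a sketch of that last step on inline toy data (the logins are made up; in real use, feed 1kprs.json instead of the here-doc and swap in the logins you've identified as Microsoft members — both are assumptions here):

```shell
# Percentage of merged PRs whose author is in a given set of logins.
# "alice"/"bob" stand in for logins identified as Microsoft members.
jq '[.[].author.login] as $all
    | ($all | map(select(. == "alice" or . == "bob")) | length)
      * 100 / ($all | length) | floor' <<'EOF'
[{"author":{"login":"alice"}},
 {"author":{"login":"carol"}},
 {"author":{"login":"alice"}}]
EOF
# prints 66
```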

I think the signal is strong enough that you can legitimately reach the same conclusion by Mk 1 eyeball.

NOTE: I'm not criticising; it's a Microsoft-driven project and I am fine with that. They _do_ merge things from "random" contributors (yay), but those are overwhelmingly things that a) fix a bug while b) being less than 5 lines of code. If that is your desired contribution, then things will go well for you, and arguably they do well at accepting such. But they are very unlikely to accept a complex or interesting new feature from an outsider. All of this seems totally reasonable and expected.


Yet the Settings UI is still a nonsensical mess.


How is it a nonsensical mess? It’s clean, searchable, and allows falling back to JSON.

I get it might not be perfect, but "nonsensical mess" is maybe in bad faith here.


Valid question. There are some areas where the text of the setting's state is the opposite of what the checkbox shows. I'll try to dig up a screenshot.

Historically, setting syntax colors has sucked too, but I don't remember the current state of that.


That's because it's Microsoft's Trademarked version of Open Source.

All the good FOSS vibes, without any of the hard FOSS work...


Out of Apple, Google, and Microsoft, Microsoft is _by far_ the most active and open to open source and contributions.


I hate this analogy. Just because something is open source doesn’t mean its maintainers are obliged to commit or comment on every pull request, which takes development time. If that notion really bothers you, you are free to fork VSCode and close all 600 pull requests on your fork.


Agree. OSS is hard work and not obligatory.


It's a common theme across most (all?) Microsoft "Open Source" repos. They publish the codebase on GitHub (which implies a certain thing on its own), but accept very little community input or contributions - if any.

These repos will usually have half a dozen or more Microsoft employees with "Project Manager" titles and the like - extremely "top heavy". All development, decision making, roadmapping, and more are done behind closed doors. PRs go dormant for months or years... Issues get some sort of cursory "thanks for the input" response from a PM... then crickets.

I'm not arguing all open source needs to be a community effort and accept contributions. But let's be honest - this is deliberate on Microsoft's part. They want the "good vibes" of being open source friendly, but corporate Microsoft still isn't ready to embrace open source. I.e., it's fake open source.


I've looked at a bunch of the popular JS libraries I depend on and they are all the same story, hundreds of open PRs. I think it's just difficult to review work from random people who may not be implementing changes the right way at all. Same with the project direction/roadmap, I'd say the majority of open source repos are like that. People will suggest ideas/direction all day and you can't listen to everyone.

Not sure for VSCode, but for .NET 9 they claim: "There were over 26,000 contributions from over 9,000 community members! "


f. o. r. k. Everything costs money, waaaay more than a $5 "buy me a coffee". Every PR MS closes costs them thousands of dollars.


Why is that bad? Seems like a perfectly valid approach for an Open Source project, SQLite is doing the same.


I'm not sure I see the problem. The number of merged PRs looks on the high side for a FOSS project.

https://github.com/microsoft/vscode/pulls?q=is%3Apr+is%3Aclo...


I've had a lot of PRs merged. If you don't create an issue or the issue already says it doesn't suit their vision then it won't get merged. It also helps to update the PR in November/December, even if there are no merge conflicts, as that's when they "clean up" and try to close as many as possible.


Author here - whoa, didn’t expect this to hit HN. Here for any questions, but I think the post speaks for itself.


Summary of the video: 24 hours before the fatal midair collision between a helicopter and an American Airlines CRJ, a similar event was prevented by TCAS (traffic collision avoidance system) because the plane was above 1,000 feet altitude. TCAS shuts off below that.


So a trial run then?...


It sounds like you're not training it with your existing code base, and that you're running it with relatively small contexts. Have you done any custom LLM training on your code base, and what model are you using?


Who is training models on a code base? That's an extraordinary use case.


Engineers that want to drastically improve their AI tooling outcomes so they can iterate at scale?


Have you or anyone you work with trained a model for code base familiarity?


Reminds me of Go being released without a package manager because at Google they didn't need one, lol. What codebase size do you have?


> SQL Server is one of the few commercial DB's that does real nested transactions

Not sure where this myth keeps coming from, but no, it does not:

https://www.sqlskills.com/blogs/paul/a-sql-server-dba-myth-a...


Interesting. Is this still up to date? The link you posted is 15 years old.


Yep, nothing's changed there around transactions.


Because it's more about trends than current rankings.


> Anyone knows which movie was that?

Strange Days: https://en.wikipedia.org/wiki/Strange_Days_(film)


Which, amusingly, required development of a customized camera to film:

> The film's SQUID scenes, which offer a point-of-view shot, required multi-faceted cameras and considerable technical preparation.[5] A full year was spent building a specialized camera that could reproduce the effect of looking through someone else's eyes.[5] Bigelow revealed that it was essentially "a stripped-down Arri that weighed much less than the smallest EYMO and yet it would take all the prime lenses."

It's an unfairly forgotten film. Much like Blade Runner, it suffers from a clunky plot but has quite smart world building.


Film geeks still talk about Strange Days. It has a cult following.


Thanks so much! I had tried in vain to find this movie before; it's truly forgotten. I loved it at the time; it made a serious impression on 17yo me. I'll see if I can watch it again after so many years.


Out of curiosity I pasted your comment into ChatGPT and asked it which movie you were referring to and it got it correct.

I find GPT quite useful for those “tip of your tongue” type queries, and have used it to name movies and actors quite a few times.


Ha, I don't know why it didn't occur to me to ask AI :D. Will remember that next time.


Wow, what a loop to close. I've also been wondering randomly about this movie with no idea what it was called. I remember this film; it left an impression on me. I'll even bring it up to people from time to time. For some reason it didn't occur to me to have a GPT try to guess it.

What’s funny is that there are others out there that are thinking the same thing regarding that film. Cheers!


Can’t you send PDFs to a Kindle? Been a decade since I used one, but I vaguely remember that being a thing.


Yes, you can, but PDF isn't the best format to read on a Kindle, and a PDF with two columns is even harder to read.


PDFs are perfect on the Kindle Scribe, so long as they are not color. This is my primary use for the device. On smaller Kindles, PDFs are probably a pain to read, though.


> no trouble uploading all of that data in less than a week

When you're doing e-discovery, deadlines are often measured in days - not just for the upload time, but for the analysis and finding the needle in the haystack.


Also, gotta think of what else is using the corporate internet pipe; you can't drown it in one AWS upload for days.


I'd imagine with LLMs today, discovery work is probably done on the cloud by bots.


It'd be a great way to get sued for negligence. You can't even assume the counterparty has correctly put everything into discovery for you. What you don't know is what gets you into trouble.

An example from the Karen Read case: the police, somehow, uploaded a video that had been put through a "mirror filter" and thus showed a vehicle in the opposite orientation from reality. Is your LLM going to notice that?


Do you know of a single attorney who has been held liable for negligence for using an LLM to help accelerate their document research work?


It's already common grounds for a malpractice suit. In a lot of cases these will be handled by the attorney's insurance and will probably be settled.

You really shouldn't take the absence of evidence as any sort of evidence itself.


Using an LLM is “common grounds” for a malpractice suit? Come on, the technology hasn’t even been around that long. Without corroborating evidence, why should anyone believe you?


Failure to properly perform discovery is already common grounds for a malpractice suit. I don't care if you believe me. You seem to have your mind made up anyways.


Yes, you can be held liable for failing to properly perform discovery. But the general case isn’t what we’re talking about here. It’s the specific case of using an LLM to assist with it.

> You seem to have your mind made up anyways.

I haven't made up my mind about anything; it's you who claimed that using an LLM "is a great way to get sued for negligence." It's a fundamental rule of debate that the person who makes the argument bears the burden of supporting it.

You seem to be making the implicit assumption that using an LLM to assist with the process will probably be found to constitute negligence. Again, why should anyone believe you, especially if it hasn’t happened yet? Your argument is just FUD, pure and simple.

As an attorney I can tell you these questions just aren't that simple. You can get sued for anything. But that's not really all that important. What matters is whether LLMs would do a worse job of performing document review than human review would. The answer to that question will depend on the specific facts of the case and the current state of the art.

We simply don't know yet what the error rate of using an LLM is, and the tech is improving rapidly. One should expect enterprising attorneys to test them out experimentally to build trust. For example, they can easily be tested against human review on small document corpora.


A few years ago there was definitely document-processing automation and query-based filtering, but still a lot of human work.

I assume you're right that AI now does some of the work, but I doubt it does all of it. Also, how reliable would the AI be? You'd hate to be missing critical evidence at trial because you trusted the AI fully and it missed something.

Discovery data includes audio, video, social site data, as well as the usual documents and emails.


Yeah that’s the quickest way to go bankrupt. Imagine trusting the current LLMs to do that and the prompts involved. No one is going to trust that.

