Hacker News | baruch's comments

It is mathematically impossible for a proper hash function (one whose output space is smaller than its input space) not to have collisions. The proof uses the pigeonhole principle: https://en.wikipedia.org/wiki/Pigeonhole_principle
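To make the pigeonhole argument concrete, here's a quick sketch (the 8-bit "hash" truncating SHA-256 to one byte is a made-up stand-in, purely for illustration): feed 257 distinct inputs into a 256-value output space and a collision is guaranteed.

```python
import hashlib

def hash8(data: bytes) -> int:
    # Toy 8-bit hash: first byte of SHA-256 (illustration only).
    return hashlib.sha256(data).digest()[0]

# 257 distinct inputs, only 256 possible outputs:
# by the pigeonhole principle at least two must collide.
seen = {}
collision = None
for i in range(257):
    h = hash8(str(i).encode())
    if h in seen:
        collision = (seen[h], i)
        break
    seen[h] = i

assert collision is not None  # guaranteed, not just likely
```

In practice the first collision shows up long before input 257 (see the birthday problem), but 257 is the point where it stops being probabilistic and becomes certain.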

ok damn, I did not know this, obviously. Thanks.

I guess I've never actually had this problem because was always hashing things that were static, or specialty cases like password hashes where the salt obviously guarantees uniqueness.


It's very unlikely to get collisions there, but still not impossible. Whenever you map data of arbitrary length (infinitely many possibilities) to a fixed length, collisions are possible.

Doesn't even have to be arbitrary length.

Whenever you map into a smaller space, you get collisions. The bigger space doesn't have to be infinite.


With a password you may be mapping into a smaller space or a bigger space, because what you want is to get them all to the same length. But yes, in some cases you may be mapping into a smaller space; I hadn't thought of that, although I also think it is unlikely.

But there it doesn't matter anyway, because the password is combined with the email to identify the user, so in practice passwords will never collide even if they could in theory.


For passwords: the input _space_ is bigger. That doesn't say anything about the length of any particular password.

> But there it doesn't matter anyway, because the password is combined with the email to identify the user, so in practice passwords will never collide even if they could in theory.

For passwords, you are not worried so much about two users accidentally getting the same hash; you are worried about people finding a pre-image that hashes to the same output as your user's password.
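A toy sketch of what that pre-image search looks like (truncating SHA-256 to 16 bits so brute force is feasible; the password and the guess format are made up for illustration):

```python
import hashlib

def tiny_hash(data: bytes) -> bytes:
    # Toy 16-bit "hash": first two bytes of SHA-256 (illustration only).
    # A real password hash has a far larger output, making this search infeasible.
    return hashlib.sha256(data).digest()[:2]

target = tiny_hash(b"hunter2")  # hash stored for the user's password

# Brute-force a second pre-image: any input with the same tiny hash.
preimage = None
for i in range(2**20):
    candidate = b"guess-%d" % i
    if tiny_hash(candidate) == target:
        preimage = candidate
        break

assert preimage is not None and preimage != b"hunter2"
```

The attacker doesn't need to recover the original password, just any input that hashes to the same value, which is why the output space has to be large enough that this search is hopeless.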


>For passwords, you are not worried so much about two users accidentally getting the same hash

Right, I was thinking I've probably never personally had a situation where a collision would have affected anything, but then I thought of one: when I had to do image hashing it was a potential problem.


Let's consider a hash table with an allocation of 1MB, which is about 2^20 bytes. Assume also that each entry occupies a byte. Assuming that the hash function's values are distributed randomly, the probability of there being a collision with only 1000 entries is approximately 38% = 1 - (2^20)! / ((2^20 - 1000)! * (2^20)^1000). See the "Birthday Problem".
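The ~38% figure can be checked numerically; working in log space avoids evaluating the huge factorials directly (N and k below match the comment's numbers):

```python
import math

N = 2**20   # number of slots
k = 1000    # number of entries

# P(no collision) = N! / ((N - k)! * N^k) = product of (1 - i/N) for i in 0..k-1.
# Sum logs instead of multiplying, to keep the computation stable.
log_p_no_collision = sum(math.log1p(-i / N) for i in range(k))
p_collision = 1 - math.exp(log_p_no_collision)
print(f"{p_collision:.0%}")  # prints 38%
```

The common approximation 1 - exp(-k*(k-1)/(2*N)) gives essentially the same answer at these sizes.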

You can ask an LLM to create a GitHub Action for that. The action can fail if the rebase fails, and you can either fix it yourself or ask an LLM to do it for you.
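A minimal sketch of such a workflow (all names, the branch, and the trigger are assumptions, not a tested setup):

```yaml
# Hypothetical workflow: fails the check when the PR branch no longer
# rebases cleanly onto main, so a human (or an LLM) can go fix it.
name: rebase-check
on: [pull_request]
jobs:
  rebase:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0   # full history, needed for the rebase
      - name: Try the rebase
        run: |
          git config user.email "ci@example.invalid"
          git config user.name "CI"
          git rebase origin/main   # non-zero exit fails the job
```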


Isn't that handled pretty well these days with sub agents? They can research the code without polluting the context.


Somewhat. The sub-agents (sub-Claudes) don't share their parent's context window, nor can they talk to one another. That said, I'm sure there's a lot out there that tries to solve this.


There is a global setting you can use to prevent the Claude.ai MCPs from being used in your Claude Code.


Thank you so much.


I work with D, and LLMs do very well with it. I don't know if it could be better, but it does D well enough. The only problem is working on a complex system that cannot all be held in context at once.


I based my opinion on this recent thread, https://forum.dlang.org/thread/bvteanmgrxnjiknrkeyg@forum.dl...

The discussion seems to imply it kind of works, but not without a few pain points.


The complaints are against the open-weight LLMs, which I haven't tried much. I mostly use Claude, as that's what the company is paying for. They don't pay for laptops with GPUs or locally hosted LLMs, so I can't test those.

It's not like it knows perfect D; it does make mistakes, and I don't work on a C++ or Rust project to compare its behavior. Generating templates from scratch is a bit of a challenge, but given that we have plenty of examples in our code, with some prodding it manages to write well enough.


Is it possible to put such a driver for NVMe under igb_uio or another UIO interface? I have an app that uses raw NVMe devices, and being able to test strange edge cases would be a real boon!


Ages ago, working on an embedded system, we did something similar by running gdbserver on the embedded machine and gdb on the server, with a script collecting periodic stack traces to get a sampling profiler.


The company still has $20B of cash(?) on its books; it can pay dividends to its shareholders (investors), and they get their payout. The company can go down the drain afterwards. If it can still make money with its remaining assets, that's only a nice small bonus.

So the only ones getting shafted are the employees.


I suppose the firm could simply roll the $20 billion into a long-term asset. It's not a big deal to anyone except employees if the asset never pays out. Departed employees would not be privy to how the money is eventually extracted from the now-shell company 20 years hence.


We do storage systems and use DPDK in the application; when the network IS the bottleneck, it is worth it. Saturating two or three 400Gbps NICs is possible with DPDK and the right architecture that makes the network the bottleneck.


Once there is a business around this and people are making money, the businesses will maintain a lobby to keep doing it and even expand the operation.

