> I reboot, log into Epic and GOG, and start downloading The Outer Worlds, a game from 2019 I’ve been playing a bit lately. It runs fine with Proton, and I can even sync my saves from the cloud. I play it for a few minutes with my trackball, remember I hate gaming on a trackball, and plug my gaming mouse back in. It works fine as long as I’m in the game, but outside the game, mouse clicks stop working again. It makes sense — the bug is on the desktop, not in games — but it’s very funny to have a gaming mouse that only works for gaming.
What is it with mice and OSes?
Windows is the only OS I can seem to configure for low-latency, high-accuracy, linear mouse movement, and it's not for lack of effort.
I struggled for several years to do SWE work on a Mac and no 3rd party program could get it working the way it does on Windows. I tried Linear Mouse and many others. I eventually gave up, went against the prevailing (90%) culture where I work, and exchanged my Mac for a Windows laptop. I haven't measured it, but I feel more productive simply because I can click what I want to click marginally faster.
Is something in Mac drivers performing non-linear mapping? Why?
Based on the quote above it seems like Linux hasn't even gotten up to par with Mac for mice.
The best litmus test for an OS for me is whether I could play an RTS or FPS competitively with it, even though I haven't played either for years.
I see the reasoning for accountants keeping future liabilities off of the balance sheet. I do this myself in multiple contexts.
Still, when making decisions about whether to take out or grant a loan (personal or business) I need to consider future "value" and cash flows. To someone running a business this is probably more important than the balance sheet. So I think the interest recording criticism is valid but relatively minor in the context of the whole article.
It's not keeping future liabilities off balance sheet. It's marking them at their current value. Same thing for assets. Nobody wants to see a balance sheet where 30 year government bonds are written down at the sum of all interest payments to be received plus the principal. If you did that, you'd have balance sheets jumping all around the place as companies just managed cash on a day to day basis.
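A rough sketch of the arithmetic behind that point, assuming a hypothetical 30-year bond with a face value of 1000, a 4% annual coupon, and a 4% discount rate (all numbers and function names are illustrative, not from the article). It compares the undiscounted sum of every payment with the discounted present value that a balance sheet would be closer to:

```python
def undiscounted_total(face, coupon_rate, years):
    """Sum of all coupon payments plus principal, with no discounting."""
    return face * coupon_rate * years + face


def present_value(face, coupon_rate, years, discount_rate):
    """Discount each annual coupon and the final principal back to today."""
    coupon = face * coupon_rate
    pv_coupons = sum(
        coupon / (1 + discount_rate) ** t for t in range(1, years + 1)
    )
    pv_principal = face / (1 + discount_rate) ** years
    return pv_coupons + pv_principal


# Hypothetical bond: face 1000, 4% coupon, 30 years.
total = undiscounted_total(1000, 0.04, 30)
at_par = present_value(1000, 0.04, 30, 0.04)
at_higher_rate = present_value(1000, 0.04, 30, 0.05)
print(total, round(at_par, 2), round(at_higher_rate, 2))
```

With these made-up numbers the undiscounted total is 2200, while the discounted value at a matching 4% rate is just the 1000 face value, and it drops below 1000 if the discount rate rises above the coupon rate.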
The vast majority of the article is trash. It's wrong in many situations. The only reason the accounting issue was brought up is it's early, and so incredibly stupid that it renders the rest of the thing untrustworthy. The rest is bad. If you don't think so, you don't know the subject and are learning from bad sources.
One way would be to use a polycentric form of law where each individual could determine what or which form of law to live under that protects them by voluntarily entering some sort of protective group. Or choose none at all and merely protect their natural rights on their own.
A key difference here from democracy is that merely living in one place doesn't lock you into a specific legal system.
Of course these also suffer one of the same weaknesses as democracy, e.g. if certain groups disagree with you they can just kill you if they're able. We see this in the US, for instance: if a guy named Randy Weaver cuts a shotgun 1/4" shorter than what the 'people' say your right to bear arms includes, and then for contested reasons doesn't show up for court, a man named Lon Horiuchi can snipe his wife dead while she is holding a child and then get promoted and go on to do similar things at Waco.
It’s illuminating and sad to piece together the picture of what you actually want. You mention dictatorship being better than democracy and then talk about “polycentric law” which seems to basically be sovereign citizen stuff. And then you trot out Weaver, who was specifically under siege for refusing to appear on charges of dealing illegal arms to white supremacists.
You don’t have a problem with democracy. You’re just a white supremacist. You want a white dictator so you don’t have to worry about the voting rights of minorities or whites who might be sympathetic to minorities.
It must be tough trying to illegally deal arms at an Aryan Nations meeting and not know which guys are really white supremacists and which are undercover agents.
Weaver never 'tried' to be an arms dealer. The ATF approached him and asked him to do it. The ATF was the only 'white supremacist' that we even suspect Weaver might have discussed gunsmithing with. And as far as I know, he's never even been convicted of doing so. The only thing Weaver was convicted of AFAIK is not showing up for court, for something he is still presumed innocent of (edit: actually, fully acquitted of).
It's quite possible he didn't even cut the shotgun too short. The barrel of the gun is supposedly in ATF archives somewhere, but no one seems to know where it is. Very convenient.
And that’s shitty and entrapment. Which is why the charges were dismissed.
I wonder how you see the government do something bad and think a dictatorship is the solution.
> The ATF was the only 'white supremacist' that we even suspect Weaver might have discussed gunsmithing with.
Is there something backing this claim? Like, was there testimony somewhere from the rest of the Aryan Nations group that he never discussed this with anyone else?
You’re also putting “white supremacist” in quotes like that’s questionable. Maybe the ATF agent was not a white supremacist but this was at an Aryan Nations meeting and Weaver was a self described white separatist.
>Is there something backing this claim? Like, was there testimony somewhere from the rest of the Aryan Nations group that he never discussed this with anyone else?
OK I stand corrected, you still suspect it, even though the only evidence that has turned up is that he dealt to ATF agents after they asked him to do it, and he was actually acquitted of that. So you suspect he did this with others without evidence, just like I might suspect he is Bigfoot or DB Cooper since I cannot prove that negative.
>I wonder how you see the government do something bad and think a dictatorship is the solution.
I do not. I am not in favor of any monopolistic form of government, so that eliminates both democracy and dictatorships.
> One way would be to use a polycentric form of law where each individual could determine what or which form of law to live under that protects them by voluntarily entering some sort of protective group. Or choose none at all and merely protect their natural rights on their own.
I don't have much in the way of critique or judgement to offer on this political philosophy, just an observation: it sounds tribal or even pre-civilization. Out of curiosity I asked an LLM what present day countries most closely implement it. It came back with Somalia and a label: anarcho-libertarianism, with the caveat that it isn't an exact match. Historical examples were also interesting. I'm curious whether you think that's a good example or not.
If the world had more unsettled land I think your ideal would be a lot easier to implement. The U.S. was born out of people fed up with their current situation (legal or otherwise) deciding to start something new. The fact that it's made up of 50 states, each with their own set of laws and relatively high internal mobility, suggests that it's already a mild compromise away from pure democracy and toward your ideal.
To me the purest form of your ideal seems unstable, especially in the face of power imbalances and conflicting choices, and I suspect it would inevitably evolve into something else. As far as I can tell history supports that view.
Yes it was sort of done in Somalia. That's how xeer law works -- there is a great book by Dutch lawyer Michael van Notten[0] that explains how polycentric law works in Somalia. It was found by most objective measures to be more stable and prosperous than democracy there [1] -- the researchers called it 'anarchy' but actually the period of 'anarchy' in Somalia wasn't so much anarchy as a decentralized legal system.
This allowed Somalia to be one of the few regions in sub-Saharan Africa that had a fairly smooth negotiation between interacting with various tribes while preventing any majority tribe from crushing the minority tribes. Tribes could still live in the same regions and practice their own laws while allowing feuds to be appealed up intertribal 'courts.' Thus even if it was just a guy and a camel and another guy and a camel, you still had law and you could even dish out the consequences yourself but still be held accountable up the chain.
I do agree the 50 states was an interesting and helpful idea. Under the constitutional form of the federal government, which narrowly restrains the federal government via the 10th amendment, there was a lot more room for states to 'compete' yet free travel and trade between the states.
You could probably get a lot closer to a hybrid of ideals by pulling the powers of the federal government way back into what the constitution authorizes. It wouldn't be polycentric law but it would make the monopoly far less onerous, as the cost of moving between jurisdictions is pretty cheap. There were a lot of challenges with racism and sexism in early USA but overall the restraints on the federal government were very good at giving the states a close approximation of polycentric law. Most of this started to get crushed in the very early 1900s and completely crushed by the 30s, although the civil war's elimination of any notion of a right to secession pretty much sealed the deal that the feds could gain an iron grip and the states couldn't check those powers by seceding so they had no real teeth to stop it.
>To me the purest form of your ideal seems unstable
Yes this is the story of the history of man. Hardly any theoretically pure form of governance has been able to exist in the history of man, let alone be stable in that form.
> Neither was slavery. Was that OK too? And to clarify (though it’s worrying this point needs to be made), I mean morally.
It may well have been morally OK to most people (see: moral relativism), and since you're implying it wouldn't have been OK to you, it's worth pointing out that you probably wouldn't have done anything about it in the relevant time periods.
If you're an American you don't even need to try that hard to make moral relativism visceral: was the displacement (and far worse) of Native American tribes "OK"? I'd say no, but it isn't morally urgent enough to me or the 99%+ of Americans who are unwilling to pack their bags and return the entirety of two continents to the native descendants.
This thought process is pretty baffling to me, and this is at least the second time I've encountered it on HN.
What's the value of a secret benchmark to anyone but the secret holder? Does your niche benchmark even influence which model you use for unrelated queries? If LLM authors care enough about your niche (they don't) and fake the response somehow, you will learn on the very next query that something is amiss. Now that query is your secret benchmark.
Even for niche topics it's rare that I need to provide more than 1 correction or knowledge update.
I have a bunch of private benchmarks I run against new models I'm evaluating.
The reason I don't disclose isn't generally that I think an individual person is going to read my post and update the model to include it. Instead it is because if I write "I ask the question X and expect Y" then that data ends up in the train corpus of new LLMs.
However, one set of my benchmarks is a more generalized type of test (think a parlor-game type thing) that actually works quite well. That set is the kind of thing that could be learnt via reinforcement learning very well, and just mentioning it could be enough for a training company or data provider company to try it. You can generate thousands of verifiable tests - potentially with verifiable reasoning traces - quite easily.
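The actual test isn't disclosed here, so as a stand-in, this is a minimal sketch of what "thousands of verifiable tests" can look like in general. The sorting task, `generate_cases`, and `verify` are all hypothetical and only illustrate the pattern: a seeded generator produces prompts, and a checker confirms answers mechanically.

```python
import random


def generate_cases(n, seed=0):
    """Generate n prompts whose correct answers can be checked by code."""
    rng = random.Random(seed)  # seeded so the set is reproducible
    cases = []
    for _ in range(n):
        numbers = [rng.randint(1, 99) for _ in range(5)]
        cases.append(
            {
                "prompt": f"Sort these numbers in ascending order: {numbers}",
                "numbers": numbers,
            }
        )
    return cases


def verify(case, answer):
    """Return True if a parsed model answer (a list of ints) is correct."""
    return answer == sorted(case["numbers"])


cases = generate_cases(1000, seed=1)
print(len(cases), cases[0]["prompt"])
```

Because the checker is just code, the same loop could also score answers automatically, which is the property that makes this kind of test easy to train against once it is known.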
Ok, but then your "post" isn't scientific by definition since it cannot be verified. "Post" is in quotes because I don't know what you're trying to do, but you're implying some sort of public discourse.
1. The purpose of the benchmark is to choose what models I use for my own system(s). This is extremely common practice in AI - I think every company I've worked with doing LLM work in the last 2 years has done this in some form.
> To me it's in the same spirit as claiming to have defeated alpha zero but refusing to share the game.
This is an odd way of looking at it. There is no "winning" at benchmarks, it's simply that it is a better and more repeatable evaluation than the old "vibe test" that people did in 2024.
I see the potential value of private evaluations. They aren't scientific but you can certainly beat a "vibe test".
I don't understand the value of a public post discussing their results beyond maybe entertainment. We have to trust you implicitly and have no way to validate your claims.
> There is no "winning" at benchmarks, it's simply that it is a better and more repeatable evaluation than the old "vibe test" that people did in 2024.
Then you must not be working in an environment where a better benchmark yields a competitive advantage.
> I don't understand the value of a public post discussing their results beyond maybe entertainment. We have to trust you implicitly and have no way to validate your claims.
In principle, we have ways: if nl's reports consistently predict how public benchmarks will turn out later, they can build up a reputation. Of course, that requires that we follow nl around for a while.
My question was "What's the value of a secret benchmark to anyone but the secret holder?"
The root of this whole discussion was a post about how Gemini 3 outperformed other models on some presumably informal question benchmark (a "vibe test"?). When asked for the benchmark, the response from the op and someone else was that secrecy was needed to protect the benchmark from contamination. I'm skeptical of the need in the op's cases and I'm skeptical of the effectiveness of the secrecy in general. In a case where secrecy has actual value, why even discuss the benchmark publicly at all?
The point is that it's a litmus test for how well the models do with niche knowledge _in general_. The point isn't really to know how well the model works for that specific niche.
Ideally of course you would use a few of them and aggregate the results.
I actually think "concealing the question" is not only a good idea, but a rather general and powerful idea that should be much more widely deployed (but often won't be, for what I consider "emotional reasons").
Example: You are probably already aware that almost any metric that you try to use to measure code quality can be easily gamed. One possible strategy is to choose a weighted mixture of metrics and conceal the weights. The weights can even change over time. Is it perfect? No. But it's at least correlated with code quality -- and it's not trivially gameable, which puts it above most individual public metrics.
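A minimal sketch of the concealed-weights idea, assuming each metric is already normalized to the range 0 to 1 with higher meaning better. The metric names and the `make_hidden_scorer` helper are made up for illustration; the only point is that callers get a score function but never see the weights.

```python
import random


def make_hidden_scorer(metric_names, seed):
    """Build a scoring function whose weights are never exposed to callers.

    The seed plays the role of the secret; rotating it changes the weights
    over time.
    """
    rng = random.Random(seed)
    raw = [rng.random() for _ in metric_names]
    total = sum(raw)
    # Normalize so the weights sum to 1 and scores stay in the 0..1 range.
    weights = {name: r / total for name, r in zip(metric_names, raw)}

    def score(metrics):
        return sum(weights[name] * metrics[name] for name in metric_names)

    return score


# Hypothetical metric names, each normalized to 0..1.
names = ["test_coverage", "lint_cleanliness", "low_complexity"]
scorer = make_hidden_scorer(names, seed=2024)
snapshot = {"test_coverage": 0.8, "lint_cleanliness": 0.9, "low_complexity": 0.5}
print(scorer(snapshot))
```

Improving every metric can never lower the score, but pumping up a single metric has an unknown payoff, which is what makes it harder to game than any one public number.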
It's hard to have any certainty around concealment unless you are only testing local LLMs. As a matter of principle I assume the input and output of any query I run in a remote LLM is permanently public information (same with search queries).
Will someone (or some system) see my query and think "we ought to improve this"? I have no idea since I don't work on these systems. In some instances involving random sampling... probably yes!
This is the second reason I find the idea of publicly discussing secret benchmarks silly.
I learned in another thread there is some work being done to avoid contamination of training data during evaluation of remote models using trusted execution environments (https://arxiv.org/pdf/2403.00393). It requires participation of the model owner.
> By this logic, literally nobody anywhere has free speech.
Nobody anywhere has freedom of speech. And a majority of people don't really think about what it means and don't want it in the purest form despite what they say.
Two examples of "free speech" that are protected in the U.S. under the first amendment:
1. Overt racism (less threat of imminent violence).
2. Nazi apparel.
Say the wrong word or show the wrong symbol in certain settings and you'll quickly understand what I mean. Furthermore I'm confident > 50% of U.S. citizens would find you in the wrong and would support whatever happens to you without much consideration of legality.
Freedom of speech is an ideal with no successful implementation and I don't think that's a bad thing. I prefer to live in the real world where saying stupid shit has consequences and people think just a little bit more carefully about what they say.
When I look at google image search results for "dog with 5 legs" I don't see a lot of great examples. The first unequivocal "dog with 5 legs" was an illustration. Here was my conversation with Chat GPT.
> How many legs does this dog have?
"The dog in the image has four legs."
> look closer.
" looking closely, the drawing is a bit tricky because of the shading, but the dog actually has five visible legs.
Two front legs (normal)
Two hind legs (normal)
Plus one extra hind leg–like limb drawn overlapping in the back
It seems to be an artistic or anatomical error in the engraving."
Sounds like they used GenAI to make them. The "Editor" models (Seedream, Nano-Banana) can easily integrate a fifth limb to create the "dog with awkward walking animation".
Nobody does this. It would be an insane proposition. The vulnerability is going to die very shortly into your attempt to capitalize on it. Businesses have startup costs they have to pay off.
Wouldn't that require, if true, that new revenue streams around exploits aren't generally pursued? It seems like new scams, and variations on old ones around new methods, come about on a somewhat regular basis. And as with any business, there is going to be some speculative work around new "product offerings", so to speak. I'm with you on the idea that they are less valuable, as 'spec work, than something that enhances existing revenue streams in a more predictable way.
I would think so. Who says that's not happening now? It seems reasonable that evolutionary pressure can be strong enough to have a significant impact in 1-2 generations (for example due to the introduction of a new environmental threat) or weak enough to take thousands of generations.