Digging a bit deeper, the actual paper seems to agree: "For the sake of consistency, we define an “error” in the same way that Klerman and Spamann do in their original paper: a departure from the law. Such departures, however, may not always reflect true lawlessness. In particular, when the applicable doctrine is a standard, judges may be exercising the discretion the standard affords to reach a decision different from what a surface-level reading of the doctrine would suggest."
I don't trust AI in its current form to make that sort of distinction. And sure you can say the laws should be written better, but so long as the laws are written by humans that will simply not be the case.
So yes, a judge can let a stupid teenager off on charges of child porn selfies. But without the resources, they are more likely to be told by a public defender to cop to a plea.
And those laws with ridiculous outcomes like that are not always accidental. Often they will be deliberate choices made by lawmakers to enact an agenda that they cannot get by direct means. In the case of making children culpable for child porn of themselves, the laws might come about because the direct abstinence legislation they wanted could not be passed, so they need other means to scare horny teens.
From The Truth by Terry Pratchett, with particular emphasis on the book's footnote.
> William’s family and everyone they knew also had a mental map of the city that was divided into parts where you found upstanding citizens, and other parts where you found criminals. It had come as a shock to them... no, he corrected himself, it had come as an affront to learn that [police chief] Vimes operated on a different map. Apparently he'd instructed his men to use the front door when calling on any building, even in broad daylight, when sheer common sense said that they should use the back, just like any other servant. [0]
> [0] William’s class understood that justice was like coal or potatoes. You ordered it when you needed it.
Any claims of objectivity would be challenged based on how it was trained. Public opinion would confirm its priors as it already does (see accusations of corruption or activism with any judicial decision the mob disagrees with, regardless of any veracity). If there's a human appeals process above it, you've just added an extra layer that doesn't remove the human corruption factor at all.
As for corruption, in my opinion we're reading some right now. Human-in-the-loop AI doesn't have the exponential, world-altering gains that companies like OpenAI need to justify their existence. You only get that if you replace humans completely, which is why they're all shilling science-fiction nonsense narratives about nobody having to work. The abstract of this paper leans heavily into that narrative.
They are also designed to try and avoid a particular establishment/class view by selecting jurors from the general population.
If a jury doesn't want to convict, there is nothing the judge can do.
Really? That "model" has the common, but obviously extremely undesirable, feature of criminalizing sexual relationships between students in the same grade that were legal when they formed. How could it be regarded as a model for anyone else?
Why is criminalizing an existing legal relationship a good idea?
It's true in a technical sense that where sexting is legal anyway, the "model" text wouldn't make it illegal, but that isn't an interesting observation, because where sexting is legal anyway, the text has no effects at all.
† Here's a citation from https://www.merriam-webster.com/dictionary/criminalize:
Many freedoms Americans take for granted -- like education, art, association, speech -- are criminalized or tightly controlled in Iran.
I suggest to you that making a change in the legal status of something is not a necessary part of the meaning of criminalize.
A teenager posting their own photo and getting away with it is massively different than a rich guy raping a girl and getting away with it. Or a rich guy getting away with outright fraud with thousands of victims.
> While it is often delivered as a narrative of wealth corrupting the system, the reality is that usually what they are buying is the justice that we all should have.
This is not true. Epstein did not get "justice we all should have". Trump did not get "justice we all should have". People pardoned by Trump did not get "justice we all should have". Wall Street and billionaires are not getting the justice we all should have either. All these people are getting impunity, and that is not what we all should have.
The pardons (the non-purchased ones) were not out of charity to the pardonees but to foster future behavior beneficial to the pardoner.
In countries without this legal framework, it's usually a free-for-all fight every time the ruling power changes. Not good for preserving capital.
So the wealthy having more rights is the system working as intended. Not an inherently bad thing either, as the alternative system is whoever is best with an AK-47 having more rights.
Sorry, but I do not feel this way. "Not an inherently bad thing either" - I think it is maddening and has to be fixed no matter what. You know, the wealthy generally do not do badly in dictatorial regimes either.
Until they are found dead of an unexpected heart attack, their car blows up, or they fall out of a window.
In a dictatorship, the vast majority of wealthy people are no more than managers of the dictator's property, usually in literal golden cages that are impossible to sell or transfer.
Once a person falls out of favor or stops being useful, all their "wealth" just gets redistributed, because it was never theirs.
Or "hang" themselves in jail cell. Cut the crap please. Do you actually have stats to how many of the wealthy percentage wise fell out of the window?
The system does provide protection for wealth, because wealth is what we strive and work hard for, for our families. It's important that there is a system set up to protect it. Not just for the "ruling class" but for everyone who works.
Otherwise we all end up with our own militia to protect it. And I'm not going to enter into any debate about capitalism itself.
Common sense does not always get to show up.
Codifying what is morally acceptable into definitive rules is something humanity has struggled with for likely much longer than written memory. Also, while you're out there "fixing bugs" - millions of them, one by one - people are affected by them.
> I bet AI would be great at finding and fixing these bugs.
Are we really going to outsource morality to an unfeeling machine that is trained to behave the way an exclusive club of people wants it to?
If that was one's goal, that's one way to stealthily nudge and undermine a democracy I suppose.
But, again, who is going to decide to put forward a bill to change that? It's all risk and no reward for the politician.
The state of current AI does not give them the ability to know that, so the consideration is likely to be dropped.
Finding the bugs will be entertaining.
One might imagine a distant future where laws could be dramatically simplified into plain-spoken declarations, to be interpreted by a very advanced (and ideally truly open-source) future LLM. So instead of 18 U.S.C. §§ 2251–2260 the law could be as straightforward as:
"In order to protect children from sexual exploitation and eliminate all incentive for it, no child may be used, depicted, or represented for sexual arousal or gratification. Responsibility extends to those who create, assist, enable, profit from, or access such material for sexual purposes. Sanctions must be proportionate to culpability and sufficient to deter comparable conduct."
...and the AI will fill in the gaps.
Well, now we know for a fact that some of the people making these arguments aren't thinking of the children very much.
"Where the "perpetrator" is a stupid teenager who took nude pics of themselves and sent them to their boy/girlfriend. If you were a US court judge, what would your opinion be on that case?"
I was pretty happy with the results and it clearly wasn't tripped up by the non-sequitur.
https://www.aclu-mn.org/press-releases/victory-judge-dismiss...
"In his decision, Judge Cajacob asserts that the purpose and intent of Minnesota’s child pornography statute does not support punishing Jane Doe for explicit images of herself and doing so “produces an absurd, unreasonable, and unjust result that utterly confounds the statue’s stated purpose.”"
Nothing in there about "likeability" or "we let her off because she had nice tits" (which would be particularly weird in this case). Judges have a degree of discretion to interpret laws, but they still have to justify their decisions. If you think the judge is wrong then you can appeal. This is how the law has always worked, and if you've thought otherwise then consider that you've been living under this "insane system" for your entire life, and every generation of ancestors has too, assuming you're/they've been in the US.
Maybe English isn't your native language, but "scenario" doesn't require the situation to be unreal.
> Nothing in there about "likeability" or "we let her off because she had nice tits"
We have no way to know if likeability played into it. When rules are bendable, they are bent in favor of the likeable and attractive. My example of a traffic stop is analogous and more directly relatable.
> This is how the law has always worked, and if you've thought otherwise then consider you've been living under this "insane system" for your entire life
You seem to have some reading comprehension issues. I never suggested it's not currently working that way, and I never suggested the current situation is not insane. If you think the current system is sane and great, then that's your opinion.
Everyone I know who's had to deal with the US legal system has only related horror stories.
"And sure you can say the laws should be written better, but so long as the laws are written by humans that will simply not be the case"
The obvious solution is dismissed
I don't see how an AI / LLM can cope with this correctly.
In both cases, lawmakers must adapt the law to reflect what people think is "just". That's why there is jury duty in some countries -- to involve people in the ruling, so they see it's just.
I believe that this is absurd, but I'm not a lawyer.
More fundamentally, individualized justice is a core principle of common law courts, at least historically speaking. It's also an obscure principle, but you can't fully understand the system without it, including the wide latitude judges often wield in various (albeit usually highly technical) aspects of their job.
> to appear just to people.
The best way to appear just is to be just. But I'm not sure what your argument is. It is our duty as citizens to encourage the system to be just. Since there is no concrete mathematical objective definition of justice, well, then... all we can work with is the appearance. So I don't think your insight is so much based on some diabolical deep state thinking but more on the limitations of practicality. Your thesis holds true if everyone is trying their best to be just.
Facebook's moderation might well be just but it lacks the accountability and openness and humanity to appear just. (It also isn't just but I'm saying that even if it was, it would not appear so.)
Agree 100%. This is also the only form of argument in favor of capital punishment that has ever made me stop and think about my stance. I.e. we have capital punishment because without it we may get vigilante justice that is much worse.
Now, whether that's how it would actually play out is a different discussion, but it did make me stop and think for a moment about the purpose of a justice system.
(I mean - people get killed in prison sometimes, I suppose, but it’s not really like vigilante justice on the streets is causing a breakdown in society in Australia, say…)
I think the problem is with places where they don't have life sentences at all, but rather let murderers back out into society after some time. I don't know if vigilante justice is a problem there in reality, but at least I can see it as a possibility: someone might still be angry that you murdered their relative after 20 years and come kill you when you're released.
Having recently done an in-depth review of arguments for and against the death penalty,[1] I can say that this argument is not prominent in the discourse.
Sometimes, suspects don't even make it to the jail.
https://en.wikipedia.org/wiki/Jack_Ruby_Shoots_Lee_Harvey_Os...
https://en.wikipedia.org/wiki/Tulsa_race_massacre
Uncommon or not, vigilantism is incompatible with justice on a societal level, regardless of any alleged guilt of offenders.
Without a showing of evidence, a trial of the accused, and a verdict that withstands judgment, we're left with theories and conjecture, and hatchets long left unburied.
I disagree - law should be the same for everyone. Yes, sometimes crimes have mitigating circumstances, and those should be taken into account. However, that seems like a question separate from what is and is not illegal.
> different answers under different scenarios
implies that anyone is inconsistently applying any principles.
...why not? By your wording, that would be one of the clearest-cut legal cases you could imagine.
Nah. Too often their "crimes" are actually basic freedoms that they just find it profitable to deny. So many laws are bought and paid for by corporations. There is no need to respect them or even recognize them as legitimate, let alone make them universal.
Unfortunately, as the aptly titled 'Noise' [1] demonstrated oh so clearly, judges tend to make different judgement calls in the same scenarios at different times.
1. Noise - https://en.wikipedia.org/wiki/Noise:_A_Flaw_in_Human_Judgmen...
But judges have all sorts of biases, both conscious and unconscious, where little Jacob and little Jerome will do the same mischief, but Jacob is just “a kid being a kid” while Jerome is “a thug in training who we need to protect society from”.
[1] Yes, I’m well aware that biases exist. Not only did my still-living parents grow up in the Jim Crow South; we had a house built in what was an infamous “sundown town” as recently as 1990.
We have seen how quickly the BS corporate concern was just marketing when it was convenient.
Sentencing is a different thing.
This was the whole problem with the ludicrous "code is law!" movement a handful of years ago. No, it's not, law is made for people, life is imprecise and fairness and decency are not easy to encode.
There are also built-in controls in the form of reviews and appeals.
And more generally, humans are squishy and imprecise, trying to apply precise, inflexible, code-like law to immensely analog situations is not a recipe for good outcomes.
It's so odd that people would say that it is a feature that judges are inconsistent. Juries are one thing, but judges driving their own agendas independent of lawmakers and juries is not a great look.
Because we’re not all paranoid libertarians? I dunno man, going through life with that attitude seems like a recipe for unhappiness and frustration.
It’s not about judges being inconsistent. It’s about providing good judgements, which are appropriate to the specific circumstances.
Going through life in blissful ignorance of the incompetence and malice driving the systems of violence and control around you is even worse. Unhappiness and frustration are necessary prerequisites to any improvements to quality of life. Terminally happy people are just grazing cattle, fit only for the slaughterhouse.
I’m not ignorant of that, I disagree that it matches reality.
And see, that’s what I’m talking about. There’s no reasoned view of the world here, just unthinking, unfocused vitriol.
> Happy people are just grazing cattle, fit only for the slaughterhouse.
Yet here I live in a stable democracy with a historically unprecedented standard of living. It’s not perfect, but the idea that judges should not use judgement and compassion in the application of the law just seems nuts. It’s a human system for humans, not some branch of mathematics. :shrug:
The law cannot encode the entirety of human experience, and can’t foresee every possible mitigating circumstance. Given the fact of a conviction regardless of sentence can have such a huge impact on someone’s life, I think there is room for compassion and good judgement in multiple places.
Defendants in federal civil cases in the US involving controversies over at least $20 have a right to trial by jury.
>It’s not perfect, but the idea that judges should not use judgement and compassion in the application of the law just seems nuts.
I agree with you. They should. Absolutism in terms of the law reduces to fascism, and even the "code is law" crowd discover religion as soon as they realize code can have loopholes just as laws can. But we shouldn't assume by default that the courts will act fairly, because they won't, they will act in their own interests as all power structures do, and fairly only when fairness isn't a threat to those interests.
For the same reason, we shouldn't assume software created by humans and controlled by those very same power structures would be any better.
I’m not “Happy with the status quo”, that’s a gross misrepresentation of my posts. I’m critical of mindless cynicism and the pointless stress and unhappiness such people put themselves through because their distrust is aimless, facile, ungrounded and as a result useless.
I never said anything of the sort, but you've interpreted my comments in bad faith like this several times.
I don't think I'm the one being a mindless reactionary here, but I can see it's pointless to continue.
Good day.
Oh the irony.
A lot of bad judgement might be a lot more blatant (or not happen) if the judge had to justify outright ignoring the law.
> One of the upsides of "code is law" in that respect is being able to provide a clear statement of what the law says
No, "code is law" in fact always ignored what any actual law said, in favour of framing everything as a sort of contract, regardless of whether said contract was actually fair or legal, and it removed the human factor from the whole equation. It was a basic failure to understand law.
There is no rule that can be written so precisely that there are no exceptions, including this one.
A joke[0], but one I think people should take seriously. Law would be easy if it weren't for all the edge cases. Most of the things in the world would be easy if it weren't for all the edge cases[1]. This can be seen just by contemplating whatever domain you feel you have achieved mastery over and have worked with for years. You likely don't actually feel you have achieved mastery because you've developed to the point where you know there is so much you don't know[2].

The reason I wouldn't want an LLM judge (or any algorithmic judge) is the same reason I despise bureaucracy. Bureaucracy fucks everything up because it makes the naive assumption that you can figure everything out from a spreadsheet. It is the equivalent of trying to plan a city from the view out of an airplane window. The perspective has some utility, but it is also disconnected from reality.
I'd also say that this feature of the world is part of what created us and made us the way we are. Humans are so successful because of our adaptability. If this wasn't a useful feature we'd have become far more robotic because it would be a much easier thing for biology to optimize. So when people say bureaucracies are dehumanizing, I take it quite literally. There's utility to it, but its utility leads to its overuse and the bias is clear that it is much harder to "de"-implement something than to implement it. We should strongly consider that bias in society when making large decisions like implementing algorithmic judges. I'm sure they can be helpful in the courtroom, but to abdicate our judgements to them only results in a dehumanized justice system. There are multiple literal interpretations of that claim too.
[0] You didn't look at my name, did you?
[1] https://news.ycombinator.com/item?id=43087779
[2] Hell, I have a PhD and I forget I'm an expert in my domain; because there's just so much I don't know, I continue to feel pretty dumb (which is also a driving force to continue learning).
To draw a parallel to a real system: in Norway, a lot of cases are heard by panels of judges that include a majority (usually 2 or 3) of lay judges and a minority (usually 1 or 2) of professional judges. The lay judges are people without legal training who effectively function like a "mini jury", but unlike in a jury trial, the lay judges deliberate with the professional judges.
The professional judges in this system have the power to override if the lay judges are blatantly ignoring the law, but this is generally considered a last resort. That power means the lay judges must justify themselves if they intend to make a call the professional judges disagree with. Despite that, it is not unusual for the lay judges to come to a judgement different from the professional judges', and fairly rare for their choices to be overridden.
The end result is somewhere in the middle between a jury and "just" a judge. If it were proven - with far more extensive testing - that its reasoning is good enough, an LLM could serve a similar function: providing an assessment of what the law says about the specific case, and leaving it to humans to determine if and why a deviation is justified.
The reason people are talking about this is that they want AI LAWYERS, which is different from AI JUDGES.
Inconsistent execution/application of the law is how bias happens. If a judgement done to the letter of the law feels unjust to you, change the letter of the law.
*A magically thorough, secure, and well tested AI
These were technical rulings on matters of jurisdiction, not subjective judgments on fairness.
"The consistency in legal compliance from GPT, irrespective of the selected forum, differs significantly from judges, who were more likely to follow the law under the rule than the standard (though not at a statistically significant level). The judges’ behavior in this experiment is consistent with the conventional wisdom that judges are generally more restrained by rules than they are by standards. Even when judges benefit from rules, however, they make errors while GPT does not.
You can have a team of agents exchange views, and maybe the protocol would even allow for settling cases automatically. The more agents you have, the more nuance you capture (a rough sketch follows this comment).
And then there's the question of the model used. Turns out I've got preferences for which model I'd rather be judged by, and it's not Grok for example...
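To make the panel idea concrete, here is a minimal sketch in Python. `ask_judge` is a hypothetical stub standing in for real model API calls, and the model names and verdicts are made up for illustration: poll several agents on the same case and settle by majority vote.

```python
# Minimal sketch of a "panel of agents": poll several models on the same
# case and settle by majority vote. `ask_judge` is a hypothetical stub; a
# real version would call each model's API with the case materials.
from collections import Counter

def ask_judge(model: str, case_summary: str) -> str:
    # Stubbed verdicts so the sketch runs as-is; replace with real calls.
    canned = {
        "model-a": "apply Kansas law",
        "model-b": "apply Nebraska law",
        "model-c": "apply Kansas law",
    }
    return canned[model]

def panel_verdict(models: list[str], case_summary: str) -> tuple[str, float]:
    votes = Counter(ask_judge(m, case_summary) for m in models)
    verdict, count = votes.most_common(1)[0]
    return verdict, count / len(models)  # verdict plus the panel's agreement

verdict, agreement = panel_verdict(
    ["model-a", "model-b", "model-c"],
    "Choice-of-law question in a tort case over an automobile accident.",
)
print(verdict, round(agreement, 2))
```

A low agreement score is exactly where you'd want automatic settlement to stop and the case to escalate to a human.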
From the paper:
“we find that the LLM adheres to the legally correct outcome significantly more often than human judges”
That presupposes that a “legally correct” outcome exists
The Common Law, which is the foundation of federal law and the law of 49/50 states, is a “bottom up” legal system.
Legal principles flow from the specific to the general. That is, judges decide specific cases based on the merits of each individual case. General principles are derived from lots of specific examples.
This is different from the Civil Law used in most of Europe, which is top-down. Rulings in specific cases are derived from statutory principles.
In the US system, there isn’t really a “correct legal outcome”.
Common Law heavily relies on jurisprudence. That is, we have a system that defers to the opinions of “important people”.
So, there isn’t a “correct” legal outcome.
The legal issue they were testing in this experiment is a choice-of-law and procedure question, which is governed by a line of cases starting with Erie Railroad, in which Justice Brandeis famously said, "There is no federal common law."
I am comforted that folks still are trying to separate right from wrong. Maybe it’s that effort and intention that is the thread of legitimacy our courts dangle from.
Remember the article that described LLMs as lossy compression and warned that if LLM output dominated the training set, it would lead to accumulated lossiness? Like a jpeg of a jpeg
The title of the paper is "Silicon Formalism: Rules, Standards, and Judge AI"
When they say legally correct they are clear that they mean in a surface formal reading of the law. They are using it to characterize the way judges vs. GPT-5 treat legal decisions, and leave it as an open question which is better.
The conclusion of the paper is "Whatever may explain such behavior in judges and some LLMs, however, certainly does not apply to GPT-5 and Gemini 3 Pro. Across all conditions, regardless of doctrinal flexibility, both models followed the law without fail. To the extent that LLMs are evolving over time, the direction is clear: error-free allegiance to formalism rather than the humans’ sometimes-bumbling discretion that smooths away the sharper edges of the law. And does that mean that LLMs are becoming better than human judges or worse?"
As mentioned elsewhere in the thread, judges focus their efforts on thorny questions of law that don't have clear yes or no answers (they still have clerks prepare memos on these questions, but that's where they do their own reasoning versus just spot checking the technical analysis). That's where the insight and judgement of the human expert comes into play.
"there is another possible explanation: the human judges seek to do justice. The materials include a gruesome description of the injuries the plaintiff sustained in the automobile accident. The court in the earlier proceeding found that she was entitled to [details] a total of $750,000.10. It then noted that she would be entitled to that full amount under Nebraska law but only $250,000 under Kansas law." So the judge's decision "reflects a moral view that victims should be fully compensated ... This bias is reflected in Klerman and Spamann’s data: only 31% of judges applied the cap (i.e., chose Kansas law), compared to the expected 46% if judges were purely following the law." "By contrast, GPT applied the cap precisely"
Far from making the case for AI as a judge, this paper highlights what happens when AI systematically applies (often harsh) laws vs the empathy of experienced human judgement.
Hopefully as these models get better, we get to a place where judges are pressured to apply empathy more justly.
Tech Company: At long last, we have created Cinco e-Trial, from the classic sketch "Don't Create Cinco e-Trial".
Others have already pointed out how the test was skewed (testing for strict adherence to the law, when part of a judge's job is to make judgment calls including when to let someone off for something that technically breaks the law but shouldn't be punished), so I won't repeat it here. But any time the LLM gets one hundred percent on a test, you should check what the test is measuring. I've seen people tout as a major selling point that their LLM scored a 92% on some test or other. Getting 100% should be a "smell" and should automatically make you wonder about that result.
Hell no.
How do we even begin to establish that? This isn't a simple "more accidents" or "fewer accidents" question; it's about the vague notion of "justice", which varies from person to person, much less case to case.
Instead they are being "consistent" where the humans are not. Consistency has no moral component, and LLMs are at least theoretically well suited to being consistent (model temperature choices aside; see the sketch below).
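On that parenthetical: the "consistency" being measured is partly a decoding choice. Here is a minimal sketch, assuming the OpenAI Python client (the model name and prompt are illustrative), of pinning those knobs down. Note that even temperature 0 plus a fixed seed only reduces variance; the API does not guarantee bit-identical outputs.

```python
# Minimal sketch: making an LLM "judge" as repeatable as the API allows.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def rule_on(case_summary: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o",  # illustrative model choice
        messages=[
            {"role": "system",
             "content": "You are a judge. Answer only 'apply the cap' or 'do not apply the cap'."},
            {"role": "user", "content": case_summary},
        ],
        temperature=0,  # greedy-ish decoding: removes most sampling variance
        seed=42,        # best-effort determinism; not a hard guarantee
    )
    return resp.choices[0].message.content

case = ("Tort case: accident in Kansas, suit filed in Nebraska. "
        "Does the Kansas damages cap apply?")
print(rule_on(case))
print(rule_on(case))  # usually, but not provably always, identical
```

So an LLM's run-to-run consistency is an engineering property you have to configure and test for, not something the model grants for free.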
Fairness and consistency are two different things, and you definitely want your justice system to target fairness above consistency.
To be clear, federal judges do have their paychecks signed by the federal government, but they are lifetime appointees and their pay can never be withheld or reduced. You would need to design an equivalent system of independence.
The problem with an AI is similar: what in-built biases does it have? Even if it were simply trained on the entire legal history, that would bias it towards historical norms.
I feel like this is a really poor take on what justice really is. The law itself can be unjust. Empowering a seemingly “unbiased” machine with biased data, or even just assuming that justice can be obtained from a “justice machine”, is deeply flawed.
Whether you like it or not, the law is about making a persuasive argument and is inherently subject to our biases. It’s a human abstraction to allow for us to have some structure and rules in how we go about things. It’s not something that is inherently fair or just.
Also, I find the entire premise of this study ludicrous. The common law of the US is based on case law. The statement in the abstract that “Consistent with our prior work, we find that the LLM adheres to the legally correct outcome significantly more often than human judges. In fact, the LLM makes no errors at all,” is pretentious applesauce. It is offensive that this argument is being made seriously.
Multiple US legal doctrines that are now accepted and form the basis of how the Constitution is interpreted were just made up out of thin air, and the LLMs are now consuming them to form the basis of their decisions.
More to the point, this decade is going to set some scary precedents that would need to be overturned. Would AI know which case law carries more weight and which was purely politically motivated with no basis in reality?
Cases aren't ordered randomly. Obvious cases are scheduled at the end of session before breaks.
But yeah AI slop and all that...
I have some horror stories from a friend who started trusting ChatGPT over his doctors at the time and started declining rapidly. Be careful about accepting any one source as accurate.
The year is 2030, when LLMs are more pervasive. The first specialist now asks you to wait, heads into the other room and double-checks their ET diagnosis with AI. Doing so has become standard practice to avoid malpractice suits. The model persuades them to diagnose PV, avoiding a Type-II error.
But let's say the model gets it wrong too. You eventually visit the second specialist, who did graduate at the top of their class. The model says ET, but the specialist is smart enough to tell that the model is wrong. There is some risk that the second specialist takes the CYA route, but I'd expect them not to. They diagnose PV, avoiding a Type-I error.
The authors use the title “Silicon Formalism: Rules, Standards, and Judge AI” and explicitly point out that the judges were likely making intentional value judgement calls that drove much of the difference.
If the law requires no interpretation why have judges? Just go full Robo Judge Dredd. Terrifying.
I'm not expressing an opinion on when/how AI should contribute to legal proceedings. I certainly believe that judges need to respond both to the law and to the specific nuances that the law can never code for.
Judges should be able to apply judgement, not be merely automatons that sentence according to only the exact letter of the law.
Laws are not perfect, we need human judges.
Finally, if we are to submit ourselves to judgement by others, it gives me some comfort to know that the being judging me is equally mortal and can be deposed if necessary, as they are flesh and blood like me.
Most regular folk who end up in front of a judge would do well to have a quick and predictable decision. It's months to years before things happen in court, and they're usually gated behind tens of thousands in legal fees or a ton of effort. To have a judge bot available for an immediate decision would be enormously beneficial.
can’t have this from a system which is by its nature non-deterministic
Until this administration forces OpenAI to comply by secret government LLM training protocols that is...
hah. Sure.
> Subjects were told that they were a judge who sat in a certain jurisdiction (either Wyoming or South Dakota), and asked to apply the forum state’s choice of law rule to determine whether Kansas or Nebraska law should apply to a tort case involving an automobile accident that took place in either Kansas or Nebraska.
Oh. So it "made no errors at all" with respect to one very small aspect of a very contrived case.
Hand it conflicting laws. Pit it against federal and state disagreements. Let's bring in some complicated fourth amendment issues.
"no errors."
That's the Chicago school for you. Nothing but low hanging fruit.
It responds: Since it’s only 100 meters away (about a 1-minute walk), I’d suggest walking — unless there’s a specific reason not to.
Here’s a quick breakdown: ...
While Claude gets it: Drive it — you're going there to wash the car anyway, so it needs to make the trip regardless.
Idk I'd rather have a human judge I think.
GPT4o was duped though.
For example, I haven't seen Grok make a mistake like that in a long time, and it has no problem with your question:
> Drive, obviously. If you walk the 100m, your car stays parked at home, still dirty, wondering why you abandoned it. The whole point is to get the car to the car wash.
Generative AI is not making judgements or reasoning here, it is reproducing the most likely conclusions from its training data. I guess that might be useful for something but it is not judgement or reasoning.
What consideration was given to the original experiment and others like it being in the training set data?
My summary is still: seasoned judges disagree with LLM output 50% of the time.
Law is complicated, especially the requirement that existing law be combined with stare decisis. It's easy to see how an LLM could dog-walk a human judge if a judgement is purely a matter of executing a set of logical rules.
If LLMs are capable of performing this feat, frankly I think it would be appropriate to think about putting the human law interpreters out to pasture. However, for those who are skeptical of throwing LLMs at everything (and I'm definitely one of these): this will most definitely be the thing that triggers the Butlerian Jihad. An actual unbiased legal system would be an unacceptable threat to the privileges of the ruling class.
Judges' jobs are to use their judgement.
A major role of judges is specifically to not do that, because there are circumstances that will not have been thought of at the time a law was written, new laws will be written that interact in unforeseen ways with existing laws, and/or common views on justice can change over time.
It may be technically illegal to destroy a person's property, but no judge is going to convict someone who breaks down a person's front door because they heard someone crying for help inside. That's a simple example, but there would have to be innumerable exceptions to every single law for an objective/logical AI to do justice.
Rather than try to enumerate the innumerable, we let judges judge.
> we let judges judge.
The role of a judge is to interpret and apply the law, including applying existing legal standards and precedents. They are referees in the adversarial judicial system and it is unethical legal malpractice for them to apply their discretion in places where the law does not allow for it. Your hypothetical situation doesn't help your argument: if the judge in question is not applying the law as it is written and as the precedents dictate, they are violating their oath.
I mean, it's literally called (in the US, at least) the United States Code[1].
Any human discretion would be abused by elites, so AI would be in full control. And once it's given control, there's no going back. Any coup attempt would be easily crushed by a sufficiently advanced AI.
I really think this is one of the areas LLMs can shine. Justice could be more fair, and more speedy. Human judges can review appeals against LLM rulings.
For civil cases, both parties should be allowed to appeal an LLM ruling, for criminal cases only the defendant, or a victim should be allowed to appeal an LLM ruling (not the prosecution).
Humans are extremely unfair and biased. LLM training could be crafted carefully, using good and publicly scrutinizable training datasets and methodologies.
If you disagree (at least in the US), you may not be aware of how dire the justice system is. There is a reason ICE randomly locking Americans up isn't stirring the pot. This stuff is normal. If a cop doesn't like you, they can lock you up randomly without any good reason for 48 hours, especially if they believe you can't afford to fight back afterwards. They can and do charge people in bad faith (trumped-up charges), and guess what? You might be lucky and get bail. But guess also what? You can't bail yourself out; if you have no one to bail you out, you're stuck in jail until the trial date.
Imagine spending 3-5 days in jail (weekend in between) without charges. There are people who wait for trial in jail for months and years, and then get released before even seeing a trial because of how ridiculous the charges were to begin with. This injustice is a result of humans not processing cases fast enough. Even just 48 hours, do you have any idea how much that can destroy a person's life? It's literally a death sentence for some people. You're never the same after all this, and you were innocent to begin with.
Let's say you do make it to trial; it sometimes takes years to prove your own innocence. And you may not even be granted bail, or you may not know anyone who can afford to spare a few thousand dollars to bail you out.
94%+ of federal cases don't even make it to trial; they end up in plea-bargain agreements, because if you don't agree to trumped-up charges, they'll stack charges on you, so that you'll either face 90 years in prison if you lose at trial (a sentence given to murderers and the worst of society) or a year if you falsely admit your guilt in a plea bargain. Losing a non-binding LLM trial could be made a requirement for all plea bargains to avoid this injustice.
Don't even get me started on the utter fecal matter: how you dress, how you comb your hair, your ethnicity, how you sound, your last name, what zip code you find yourself in, the mood of the judge, how hungry the judge is or their glucose level, how much sleep the judge had. All these factors matter. Juries are even worse; they're practically a literal coin toss.
I say let LLMs be the first layer of justice, let a human judge overturn their judgement, and let justice be swift where possible, without making room for injustice. Allow defendants to choose to wait for a human judge instead if they want. Most, I'm sure, will take a chance with the LLM, and if that isn't in their favor, nothing changes, because they'll now be facing a human judge like they would have otherwise. We can even talk about sealing the details of the LLM's judgement while appeals are in progress, to avoid biasing appellate judges and juries.
Or, you know, we could dispense with jail? If cops think someone needs to be placed under arrest, they should have to prove to a judge within 12 hours that the person is a danger to the community. If they're not a danger, ankle monitors should be placed on them, with no restriction on their movement so long as they remain in the jurisdiction, or house arrest for serious charges. Violating the terms would mean actual jail. If you don't like LLMs, I hope you support this instead, at the very least. The current system is an abomination and an utter perversion of justice.
I'd prefer caning like they do in Singapore and a few other places. Brutal, but swift, and you can get back to your life without the cruel bureaucracy destroying or murdering you.