~$3400 per single task to meet human performance on this benchmark is a lot. Also, the bullets are labeled "ARC-AGI-TUNED", which makes me think they did some undisclosed amount of fine-tuning (e.g. via the API they showed off last week), so even more compute went into this result.
We can compare this roughly to a human doing ARC-AGI puzzles, where a human will take (high variance, in my subjective experience) between 5 seconds and 5 minutes to solve a task. So I'd argue a human is at $0.03-$1.67 per puzzle at $20/hr, and their document quotes an average mechanical turker at $2 per task.
Going the other direction: I am interpreting this result as saying that human-level reasoning now costs roughly $41k/hr to $2.5M/hr with current compute.
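For anyone who wants to sanity-check those numbers, here's the rough arithmetic, with the reported ~$3400/task, the 5 s to 5 min human solve time, and a $20/hr wage as the stated assumptions (purely back-of-envelope, not anything official):

    # Back-of-envelope: o3 high-compute vs. a human on ARC-AGI puzzles.
    # Assumptions: $3400/task (reported), 5 s - 5 min per task for a human, $20/hr wage.
    o3_cost_per_task = 3400.0
    human_secs_fast, human_secs_slow = 5.0, 300.0
    human_wage_per_hr = 20.0

    # Human cost per puzzle: ~$0.03 to ~$1.67
    human_cost_fast = human_wage_per_hr * human_secs_fast / 3600
    human_cost_slow = human_wage_per_hr * human_secs_slow / 3600

    # Machine cost per "human-equivalent hour": ~$41k to ~$2.4M
    machine_hr_low = o3_cost_per_task * 3600 / human_secs_slow
    machine_hr_high = o3_cost_per_task * 3600 / human_secs_fast

    print(f"human: ${human_cost_fast:.2f}-${human_cost_slow:.2f} per puzzle")
    print(f"machine: ${machine_hr_low:,.0f}-${machine_hr_high:,.0f} per hour of human-speed solving")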
Super exciting that OpenAI pushed the compute out this far so we could see the O-series scaling continue and intersect humans on ARC; now we get to work towards making this economical!
So, considering that the $3400/task system can't compete with a STEM college grad yet, we still have some room (but it is shrinking; I expect even more compute will be thrown at this, and we'll see these barriers broken in the coming years).
Also, some other back of envelope calculations:
The gap in cost between o3 High and average mechanical turkers (humans) is roughly 10^3. Pure GPU cost improvement (~doubling every 2-2.5 years) puts us at 20-25 years to close it.
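A quick sketch of that estimate, taking the ~10^3 gap and the 2-2.5 year doubling period above as given:

    import math

    gap = 1e3                      # o3 High vs. average mechanical turker, cost ratio
    doublings = math.log2(gap)     # ~10 doublings needed
    print([round(doublings * t) for t in (2.0, 2.5)])   # -> [20, 25] years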
The question now is: can we close this "to human" gap (10^3) quickly with algorithms, or are we stuck waiting 20-25 years for GPU improvements? (I think it feels obvious: this is new technology, things are moving fast, and the chance for algorithmic innovation here is high!)
I also personally think that we need to adjust our efficiency priors and start looking not at "humans" as the bar to beat, but at theoretical computable limits (which show much larger gaps, ~10^9-10^15, for modest problems). Though it may simply be the case that tool/code use + AGI at near-human cost covers a lot of that gap.
You can scale them up and down at any time, they can work 24/7 (including holidays) with no overtime pay and no breaks, they need no corporate campuses, office space, HR personnel or travel budgets, you don't have to worry about key employees going on sick/maternity leave or taking time off the moment they're needed most, they won't assault a coworker, sue for discrimination or secretly turn out to be a pedophile and tarnish the reputation of your company, they won't leak internal documents to the press or rage quit because of new company policies, they won't even stop working when a pandemic stops most of the world from running.
> deep learning doesn't allow models to generalize properly to out-of-distribution data—and that is precisely what we need to build artificial general intelligence.
I think even (or especially) people like Altman accept this as a fact. I do. Hassabis has been saying this for years.
The foundational models are just a foundation. Now start building the AGI superstructure.
And this is also where most of the remaining human intellectual energy is going now.
These statistical models don’t generalize well to out of distribution data. If you accept that as a fact, then you must accept that these statistical models are not the path to AGI.
I fail to see the difference between AI-employment-doom and other flavors of Luddism.
As AI gets more prevalent, it'll drive costs down for the companies supplying these services, so the former employees of said companies will be paid less, or not at all.
So, tell me, how will paying fewer people less money drive their standard of living upwards? I can understand the leisure time, because when you don't have a job, all day is leisure time. But you'll need money for that. So will these companies fund the masses, via the government, with a Universal Basic Income, so that these people can both live a borderline miserable life and keep funding the very companies that squeeze them more and more?
Who cares? A rising tide lifts all boats. The wealthy people I know all have one thing in common: they focused more on their own bank accounts than on other people's.
So, tell me, how will paying fewer people less money drive their standard of living upwards?
Money is how we allocate limited resources. It will become less important as resources become less limited, less necessary, or (hopefully) both.
Money is also how we exert power and leverage over others. As inequality increases, it enables the ever wealthier minority to exert power and therefore control over the majority.
The problem isn't the money. The problem is the power.
Humans are interesting creatures. Many of them have no conscience and don't understand the notion of ethics or of "not doing something because it's wrong to begin with". In my experience, people in the US especially think "if it's not illegal, then I can and will do this", which is wrong on many levels.
Many European people are similar, but bigger governments and a harsher justice system make them more orderly, and happier in general. Yes, they can't carry guns, but they don't need to in the first place. Yes, they can't own Cybertrucks, but they can walk or use an actually working mass transportation system instead.
Plus, proper governments have checks and balances. A government can't rip people off for services the way corporations can, most of the time. Many of the things Americans are afraid of (social health services for everyone) make life more just and tolerable for all parts of the population.
Big government is not a bad thing; uncontrollable government is. We're entering the era of "corporate-pleasing uncontrollable governments", and this will be fun in a tragic way.
This comment is a festival of imprecise stereotypes.
Gun laws vary widely across Europe, as does public safety (both the real thing and the perception of it; if extra rapes are avoided only because women don't venture outside after dark, the city isn't really safe), as does the overall level of personal happiness, as does the functionality of public transport systems.
And the quality of public services doesn't really track the size of the government even in Europe that well. Corruption eats a lot of the common pie.
I might be overgeneralizing, but I won't accept the "festival of imprecise stereotypes" claim. This is what I've got from working with many people from many countries in Europe for close to two decades. I travel at least twice a year and basically live with them for short periods of time. So this is not from reading some questionable social media sites and being an armchair sociologist.
> Gun laws vary widely across Europe...
Yet the USA has roughly 3x the armed homicide rate of its closest follower in the developed world, and the USA is the "leader" of the pack. 24-something vs. 8-something.
> as does public safety
Every city has safe and unsafe areas. Even your apartment has unsafe areas.
> as does the overall level of personal happiness, as does the functionality of public transport systems.
Of course, but even if DB has a two-hour delay because of a hold-up at the Swiss border, I can board a Eurostar and casually see another country for peanuts. Happiness changes for a plethora of reasons, like Swedes' short winter daylight, or an economic downturn elsewhere.
> And the quality of public services doesn't really track the size of the government even in Europe that well. Corruption eats a lot of the common pie.
Sadly, corruption in Europe is on the rise compared to the last decade; I can see that. However, at least many countries have working social security systems, the NHS sadly not being one of them.
Please, what cities? You are just making up rape stats. That makes you the bigger idiot here.
Oh yeah, so much corruption, yet I literally enjoy Zagreb more than any US city I have been to, and it's not even special. If this is going to be the shittiest argument ever, there's my anecdotal rebuttal.
Right, so the answer is not to make that bad government bigger, the answer is to replace it with a good government. Feeding a cancer tumor doesn't make it better.
Bad government (where by "bad" I mean serving the interests of the wealthy few over the masses) is bad regardless of its size.
If you believe in supply-side/trickle-down economics, you might use the opposite definition of "bad", in which case shrinking the parts of government that restrain corporations through regulation (protecting the masses), or that pay for seniors not to end up in total destitution (Social Security/Medicare), would look like an improvement.
The size of the government is less relevant than what it is doing, and whether you agree with that.
The employees of the government and those elected are not seen as the ruling class by progressives, but just normal people that have the qualifications and are employed to manage the government on behalf of the people.
It's important therefore that those elected and put in charge of the government are in a position where they don't have the power to benefit themselves or their friends/family, but are in a position where they can wield power to benefit the people who hired them for the job (their constituents), and that if they fail to do so, they can get replaced.
The ultimate exercise of government power is keeping someone locked in a tiny cell for the rest of their life where their bed is next to their toilet and you make them beg a faceless bureaucracy that has no accountability annually for some form of clemency via parole, all while the world and their family moves on without them.
The modern political binary was originally constructed in the ashes of the French Revolution, as the ruling royalty, nobility, and aristocracy recoiled in horror at the threat that masses of angry poor people now posed. The left wing thrived on personal liberty and tearing down hierarchies, pursuing "liberty, equality, fraternity". The right wing perceived social hierarchy as a foundational good, saw equality as anarchy, and saw order (and respect for property) as far more important than freedom. For a century they experienced collective flashbacks to French Revolutionaries burning fine art for firewood in an occupied chateau.
Notably, it has not been a straight line of Social Progress, nor a simple Hegelian dialectic, but a turbulent winding path between different forces of history that have left us with less or more personal liberty in various eras. But... well... you have to be very confused about what they actually believe now or historically to understand progressives or leftists as tyrants who demand hierarchy.
That confusion may come from listening to misinformation from anticommunists, a particular breed of conservative who for the past half century have asserted that ANY attempt to improve government or enhance equality was a Communist plot by people who earnestly wanted Soviet style rule. One of those anticommunists with a business empire, Charles Koch, funded basically all the institutions of the 'libertarian' movement, and later on much of the current GOP's brand of conservatism.
You've literally reversed the meaning of the term "progressive" by replacing it with the meaning of the term "oligarchic".
Progressives argue for less invasion by government in our personal lives, and less unequal distribution of wealth and power. They are specifically opposed to power being delivered to a ruling class.
> The problem isn't the money. The problem is the power
These are nearly inseparable in current (and frankly most past) societies. Pretending that they are not is a way of avoiding practical solutions to the problem of the distribution of power.
Those damn authoritarians, stripping the power from the oligarchs by massively taxing the rich and defunding the police. The bastards.
Ultimately it was the 'oligarchs' who argued in favor of the progressive agenda. Wall Street created the third central bank, the US Federal Reserve. The AMA closed all of the mutual aid societies and their hospitals. Railroad barons lobbied for subsidies and price controls to eliminate their competitors. Woodrow Wilson declared war on Germany: "The world must be made safe for democracy."
Of course, no would-be authoritarian claims more power without claiming that they are doing it for the greater good or to attack the rich classes. The Progressive Era was a smorgasbord of special-interest handouts and grants to cartels, all of it lobbied for by oligarchs.
Is this common? People think "progressive" means "complete government control"?
Progressives support regulations to prevent both public and private entities from becoming too powerful. It's not like they want to give the government authoritarian control lol.
In practice there's always a slippery slope: can wealthy people integrate themselves into that power structure through lobbying, media control, weakened checks and balances, corruption and lack of transparency, etc.? When that slips, we stop calling it a free democracy, and it becomes an oligarchy, or a plutocracy, or an illiberal democracy.
The more we regulate to get money out of politics, the more good people will have a shot at being elected.
These are all common progressive values. No true progressive supports wealthy unethical politicians gaining more power. Anyone telling you so is not speaking in good faith, or they are misinformed.
Separately, is it "rising tide lifts all boats" or "pull yourself up by your bootstraps" that drives the common person's progress? You seem confused which metaphor to apply while handwaving the discussion away.
The Luddites asked a similar question. The ultimate answer is that it doesn't matter that much who controls the means of production, as long as we have access to its fruits.
As long as manual labor is in the loop, the limits to productivity are fixed. Machines scale, humans don't. It doesn't matter whether you're talking about a cotton gin or a warehouse full of GPUs.
Separately, is it "rising tide lifts all boats" or "pull yourself up by your bootstraps" that drives the common person's progress? You seem confused which metaphor to apply while handwaving the discussion away.
I haven't invoked the "bootstrap" cliché here, have I? Just the boat thing. They make very different points.
Anyway, never mind the bootstraps: where'd you get the boots? Is there a shortage of boots?
There once was a shortage of boots, it's safe to say, but automation fixed that. Humans didn't, and couldn't, but machines did. Or more properly, humans building and using machines did.
That mattered a lot in communist places; we saw it fail. Same thing with most authoritarian regimes today; it's a crapshoot. You simply can't entrust a small group with full control of the means of production and expect them to make it efficient, cheap, innovative, sustainable, and affordable.
Apparently people who are not wealthy enough to buy a boat and are afraid of drowning care about this a lot. Also, for whom does the tide rise? Not for the data workers who label data for these systems for peanuts, or the people who lose their jobs because they can be replaced with AI, or the Amazon drivers who are auto-fired by their in-car HAL9000 units, which label behavior however they see fit.
> The wealthy people I know all have one thing in common: they focused more on their own bank accounts than on other people's.
So the amount of money they have is much more important than everything else. That's greed, not wealth, but OK. I don't feel like dying on the hill of greedy people today.
> Money is how we allocate limited resources.
...and the wealthy people (you or I or others know) are accumulating amounts of it which they can't make good use of personally, I will argue.
> It will become less important as resources become less limited, less necessary, or (hopefully) both.
How will we make resources less limited? Recycling? Reducing population? Creating out of thin air?
Or, how will they become less necessary? Have we invented materials which are more durable and cheaper to produce, and have we started selling them to people for less? I don't think so.
See, this is not a developing-country problem; it's a developed-country problem. Stellantis is selling inferior products for more money while reducing its workforce, closing factories, and replacing metal parts with plastics, its CEO took a $40MM bonus [0], and now he has apparently resigned after all those shenanigans.
So, no. Nobody is making things cheaper for people. Everybody is after money to raise their own tide.
So, you're delusional. Nobody is thinking about your bank account, that's true. This is why resources won't become less limited or less necessary: all the surplus is accumulating with people who are focused on their own bank accounts more than anything else.
We've already done it, as evidenced by the fact that you had the time and tools to write that screed. Your parents probably didn't, and your grandparents certainly didn't.
No, my parents had that. Instead, they were chatting on the phone. My grandparents already had that too. They just chatted at the hall in front of the house with their neighbors.
We don't have time. We are just deluding ourselves. While our lives are technologically better, and we live longer, our lives are not objectively healthier and happier.
Heck, my colleagues join teleconferences from home with their kids' voices in the background and drying clothes visible, hidden only by the Gaussian blur or fake background provided by the software.
How do they have more time to do more things? They still work 8 hours a day, with the occasional overtime.
Things have changed and evolved, but evolution and change don't always bring progress. We have progressed in other areas, but justice, living conditions, and wealth are not on that list. I certainly can't buy a house just because I want one, like my grandparents could, for example.
From what I understand of history, while industrial revolutions have generally increased living standards and employment in the long term, they have also caused massive unemployment/starvation in the short term. In the case of textiles, I seem to recall that it took ~40 years for employment to return to its previous level.
I don't know about you guys, but I'm far from certain that I can survive 40 years without a job.
The other things you state are not even close.
First, lowered employment for X years does not imply one cannot get a job for X years; that's simply fear-mongering. Unemployment over that period seems to have fluctuated very little, and massive external economic issues (wars with Napoleon and the US, changing international fortunes) were the causes, not Luddites.
Next, there was inflation and unemployment during the TWO years surrounding the Luddites, 1810-1812 (starting right before the Luddite movement), due to wars with Napoleon and the US [1]. Somehow attributing this to tech increases or Luddites is numerology of the worst sort.
If you look at the academic literature about the economy of the era, such as [2] (read it on scihub if you must), you'll find there was incredible population growth, and that wages grew even faster. While many academics at the time thought all this automation would displace workers, those academics were forced to admit they were wrong. There's plenty of literature on this; simply dig through Google Scholar.
As for starvation in this case, I can find no "massive starvation". [3], for example, points out that "Among the industrial and mining families, around 18 per cent of writers recollected having experienced hunger. In the agricultural families this figure was more than twice as large — 42 per cent".
So yes there was hunger, as there always had been, but it quickly reduced due to the industrial revolution and benefited those working in industry more quickly than those not in industry.
[1] https://en.wikipedia.org/wiki/Luddite#:~:text=The%20movement....
My bad for "massive starvation", that's clearly a mistake, I meant to write something along the lines of "massive unemployment – and sometimes starvation". Sadly, too late to amend.
Now, I'll admit that I don't have my statistics at hand. I quoted them from memory from, if I recall correctly, _Good Economics for Hard Times_. I'm nearly certain about the ~40 years, but it's entirely possible that I confused several parts of the industrial revolution. I'll double-check when I have an opportunity.
The availability of cheaply priced smartphones and cellular data plans has absolutely made being homeless suck less.
As you noted though, a home would probably be a preferable alternative.
The problem is that the preferable option (housing) won't happen because unlike a smartphone, it requires that land be effectively distributed more broadly (through building housing) in areas where people desire to live. Look at the uproar by the VC guys in Menlo Park when the government tried to pursue greater housing density in their wealthy hamlet.
It also requires infrastructure investment which, while it has returns for society at large, doesn't have good returns for investors. Only government makes those kinds of investments.
Better to build a wall around the desirable places, hire a few poorer-than-you folks as security guards, and give the other people outside your wall ... cheap smartphones to sate themselves.
Perhaps there is a theory in which productivity gains increase the standard of living for everyone; however, that is not the lived reality for most people of the working classes.
If productivity gains are indeed increasing standards of living for everyone, they certainly do not increase evenly: the standard-of-living gains for the working poor are at best marginal, while the gains for the already richest of the rich are astronomical.
Not if you count the global poor: the global poor's standard of living has increased tremendously over the past 30 years.
Of course, any graph can show whichever stat is convenient for the message; that doesn't necessarily reflect the lived reality of individual members of the global poor. And as I recall, most standard-of-living improvements for the global poor came in the decades after decolonization, from the 1960s to the 1990s, when infrastructure was being built that actually served people's needs, as opposed to the resource extraction of the decades before. If Hans Rosling had said in 2007 that the standard of living had improved tremendously in the past 30 years, he would have been correct, but not for the reason you gave.
The story of decolonization is that it was the right infrastructure, hospitals, water lines, sewage, garbage disposal plants, roads, harbors, airports, schools, etc., that improved the standard of living, not productivity gains. Case in point: the colonial period saw tremendous growth in productivity in the colonies, but the standard of living in the colonies quite often moved in the opposite direction, because the infrastructure only served resource extraction and the exploitation of the colonized.
https://blogs.worldbank.org/en/opendata/updated-estimates-pr...
For extreme poverty, progress has recently slowed down; the trend is still positive but very slow, and improvement there is needed.
This just isn’t true, necessarily. Productivity has gone up in the US since the 80s, but wages have not. Costs have, though.
What increases standards of living for everyone is social programs like public health and education. Affordable housing and adult-education and job hunting programs.
Not the rate at which money is gathered by corporations.
In 2012, Musk was worth $2 billion. He’s now worth 223 times that yet the minimum wage has barely budged in the last 12 years as productivity rises.
>Productivity gains of the last 40 years have been captured by shareholders and top elites. Working class wages have been flat...
Wages do not determine the standard of living. The products and services purchased with wages determine the standard of living. "Top elites" in 1984 could already afford cellular phones, such as the Motorola DynaTAC:
>A full charge took roughly 10 hours, and it offered 30 minutes of talk time. It also offered an LED display for dialing or recall of one of 30 phone numbers. It was priced at US$3,995 in 1984, its commercial release year, equivalent to $11,716 in 2023.
https://en.wikipedia.org/wiki/Motorola_DynaTAC
Unfortunately, touch screen phones with gigabytes of ram were not available for the masses 40 years ago.
Rather than a luxury, they've become an expensive, interest-bearing necessity for billions of human beings.
Warlords are still rich, but both money and war are flowing towards tech. You can get a piece of that pie if you're doing questionable things (adtech, targeting, data collection, brokering, etc.), but if you're a run-of-the-mill, normal person, your circumstances are getting harder and harder, because you're slowly being squeezed out of the system like toothpaste.
AI could theoretically solve production but not consumption. If AI blows away every comparative advantage that normal humans have then consumption will collapse and there won’t be any rich humans.
They're risky in that they fail in ways that aren't readily deterministic.
And would you trust your life to a self-driving car in New York City traffic?
Imagine you have a self-driving AI that causes fatal accidents 10 times less often than your average human driver, but when the accidents happen, nobody knows why.
Should we switch to that AI, and have 10 times fewer accidents and no accountability for the accidents that do happen, or should we stay with humans, have 10x more road fatalities, but stay happy because the perpetrators end up in prison?
Framed like that, it seems like the former solution is the only acceptable one, yet people call for CEOs to go to prison when an AI goes wrong. If that were the case, companies wouldn't dare use any AI, and we would basically degenerate to the latter solution.
Even temporary loss of the drivers license has a very high bar, and that's the main form of accountability for driver behavior in Germany, apart from fines.
Badly injuring or killing someone who themselves did not violate traffic safety regulations is far from guaranteed to cause severe repercussions for the driver.
By default, any such situation is an accident and at best people lose their license for a couple of months.
Right now it's a race to the bottom: who can get away with the worst service. So they're motivated only so far as being able to prevent bad press, etc.
The whole system is broken. Just take a look at the 41 countries with higher life expectancy.
As for the growing prevalence of the light truck, that is a harmful market dynamic stemming from the interaction of consumer incentives and poor public road use policy. The design of rules governing use of public roads is not within the domain of the market.
- Air Pollution
- Water Pollution
- Disposable Packaging
- Health Insurance
- Steward Hospitals
- Marketing Junk Food, Candy and Sodas directly to children
- Tobacco
- Boeing
- Finance
- Pharmaceutical Opiates
- Oral Phenylephrine to replace pseudoephedrine despite knowing a) it wasn’t effective, and b) it posed a risk to people with common medical conditions.
- Social Media engagement maximization
- Data Brokerage
- Mining Safety
- Construction site safety
- Styrofoam Food and Bev Containers
- The ITC terminal in Deer Park (read about the decades they spent spewing thousands of pounds of benzene into the air before the whole fucking thing blew up, using their influence to avoid addressing any of it, and how they didn’t have automatic valves, spill detection, fire detection, or sprinklers… in 2019.)
- Grocery store and restaurant chains disallowing cashiers from wearing masks during the first pandemic wave, well after we knew the necessity, because it made customers uncomfortable.
- Boar’s Head Liverwurst
And, you know, plenty more. As someone who grew up playing in an unmarked, illegal, non-access-controlled toxic waste dump in a residential area owned by a huge international chemical conglomerate, and who just had some cancer taken out of me last year, I’m pretty familiar with the various ways corporations are willing to sacrifice health and safety to bump up their profit margin. I guess ignoring that kids were obviously playing in a swamp of toluene, PCBs, waste firefighting chemicals, and all sorts of other things, on a plot not even within sight of the factory in the middle of a bunch of small farms, was just the cost of doing business.

As was my friend who, when he was in vocational high school, was welding a metal ladder above a storage tank in a chemical factory across the state. The plant manager assured the school the tanks were empty, triple-rinsed, and dry, but they exploded, blowing the roof off the factory and taking my friend with it. They were apparently full of waste chemicals and, IIRC, the manager admitted in court to knowing that. My friend says he remembers briefly waking up in the factory parking lot where he landed, and the next thing he remembers is waking up in extreme pain, wearing the compression gear he’d have to wear into his mid-twenties to keep his grafted skin on. Briefly looking into the topic will show how common this sort of malfeasance is in manufacturing.
The burden of proof is on people saying that they won’t act like the rest of American industry tasked with safety.
Just look back over the last 200 years, per capita GDP has grown 30 fold, life expectancy has rapidly grown, infant mortality has decreased from 40% to less than 1%. I can go on and on. All of this is really owing to rising productivity and lower poverty, and that in turn is a result of the primarily market-based process of people meeting each other's needs through profit-motivated investment, bargain hunting, and information dispersal through decentralized human networks (which produce firm and product reputations).
As for masks, the Cochrane Library, the gold standard in scientific reviews, did a meta-review on masks and COVID, and the author of the study concluded:
"it's more likely than not that they don't work"
https://edition.cnn.com/videos/health/2023/09/09/smr-author-...
The potential harm of extensive masking is not well-studied.
They may contribute to the increased social isolation and lower frequency of exercise that led to a massive spike in obesity in children during the COVID hysteria era.
And they are harmful to the development of the doctor-patient relationship:
https://ncbi.nlm.nih.gov/pmc/articles/PMC3879648/
Which does not portend well for other kinds of human relationships.
You can’t possibly say, in good faith, that you think this was legal, can you? Of course it wasn’t. Discharging some of the less odious things into the river, despite it running through a residential neighborhood about 500 feet downstream, was totally legal: the EPA permitted that, and while they far exceeded their allotted amounts, that was far less of a crime. Though it was funny to see one kid in my class who lived in that neighborhood right next to the factory ask a scientist they sent to give a presentation to our second-grade class why the snow in their back yard was purple near the pond (one thing they made was synthetic clothing dye). People used to lament runaway dogs returning home rainbow-colored. That was totally legal.

However, this huge international chemical conglomerate with a huge US presence routinely, secretively, and consistently broke the law by dumping carcinogenic, toxic, and ecologically disastrous chemicals there, and at three other locations, in the middle of the night. Sometimes when we played there, the stuff we had left lying around was moved to the edges and there were fresh bulldozer tracks in the morning, and we just thought it was from farm equipment. All of it was in residential neighborhoods without so much as a no-trespassing sign posted, let alone a chain-link fence, for decades, until the 90s, because they were trimming their bill for the legal and readily available disposal services they primarily used, and of course signs and chain-link fences would have raised questions. They correctly gauged that they could trade our health for their profit: the penalties and Superfund project cost were a tiny pittance compared to what that factory made them in that time.

Our incident was so common it didn’t make the news, unlike in Holbrook, MA, where a chemical company ignored the neighborhood kids constantly playing in old metal drums in a field near the factory, drums which contained things like hexavalent chromium, with the expected results. The company’s penalty? Well, they have to fund the cleanup. All the kids and moms that died? Well… boy, look at the great products that chemical factory made possible! Speaking of which:
> Just look back over the last 200 years, per…
Irrelevant “I heart capitalism” screed that doesn’t refute a single thing I said. You can’t ignore bad things people, institutions, and societies do because they weren’t bad to everybody. The Catholic priests that serially molested children probably each had a dossier of kind, generous, and selfless ways they benefited their community. The church that protected and enabled them does an incredible amount of humanitarian work around the world. Doesn’t matter.
> Masks
Come on now. Those business leaders had balls, but none of them were crystal. What someone said in 2023 has no bearing on what businesses did in 2020 based on the best available science, or on their motivations for doing it. Just like you can’t call businesses unethical for exposing their workers to friable asbestos when medicine generally thought it was safe, you can’t call businesses ethical for refusing to let their workers protect themselves, on their own dime no less, when medicine largely considered it unsafe.
Your responses to those two things in that gigantic pile of corporate malfeasance don’t really challenge anything I said.
That is exactly my point. Nobody would dispute that bad things would happen if you don't have laws against dumping pollution in the commons and enforce those laws.
>Doesn’t matter.
It does matter when we're trying to compare the overall effect of various economic systems. Like the anti-capitalist one versus the capitalist one.
>What someone said in 2023 has no bearing on what businesses did in 2020 based on the best available science and their motivations for doing it.
Well, that's an entirely different argument than the one you were making earlier. There was no evidence that masks outside of a hospital setting were a critical health necessity in 2021, and the intuition against allowing them for customer-facing employees proved sound in 2023, when comprehensive studies showed no health benefit from wearing them.
Ok, so you’re saying that because bad things would happen anyway then it doesn’t matter if it’s illegal? So you’re just going to ignore how much worse it would be if there were just no laws at all? Corporate scumbags will push any system to its limit and beyond, and if you change the limit, they’ll change the push. Just look at the milk industry in New York City before food adulteration laws took effect. The “bad things will happen anyway” argument makes total sense if you ignore magnitude. Which you can’t.
> anti capitalist
If you think pointing out the likelihood of corporate misbehavior is anti-capitalist, you’re getting your subjects confused.
> 2021
Anywhere else you want to move those goalposts?
I think what you're promoting is anti-capitalism, meaning a belief that imposing heavy restrictions, beyond simple laws against dumping on the commons, will make us better off. That totally discounts the enormous positive effect that private enterprise has on society, the incredible harm that can be done through crude attempts to regiment human behavior, and the corruption that such attempts can breed in the government bureaucracy.
See, "everything I want to do is illegal" for the flip side of this, where attempts to stop private sector abuse lead to tyranny:
https://web.archive.org/web/20120402151729/http://www.mindfu...
As for the company mask policies, those began to change in 2021 mostly, not 2020.
But we do know the culpability rests on the shoulders of the humans who decided the tech was ready for work.
Pretty bloody time for labor though. https://en.m.wikipedia.org/wiki/Haymarket_affair
The one minor risk I see is the car being too polite and getting effectively stuck in dense traffic. That's a nuisance, though.
Is there something about NYC traffic I'm missing?
Same with any company that employs AI agents. Sure, they can work 24/7, but the company (or the AI vendor) will be liable for every mistake they make. With humans, their fraud, their cheating, their deception can all be wiped off the company and onto the individual.
That's literally the point of liability insurance: to allow the routine use of technologies that rarely (but catastrophically) fail, by amortizing risk over time and population.
Claims still get made; liability insurance pays them.
I mean, this is an incredible moment from that standpoint.
Regarding the topic at hand, I think that there will always be room for humans for the reasons you listed.
But even replacing 5% of humans with AI's will have mind boggling consequences.
I think you're right that there are jobs that humans will be preferred for, for quite some time.
But I'm already using AI with success where I would previously have hired a human, and this is at this primitive stage.
With the leaps we are seeing, AI is coming for jobs.
Your concerns relate to exactly how many jobs.
And only time will tell.
But I think some meaningful percentage of the population -- even if just 5% of humanity -- will be replaced by AI.
Would you trust your life to a self-driving car in New York City traffic?
Also you still haven't answered my question.
Would you get in a self-driving car in a dense urban environment such as New York City? I'm not asking if such vehicles exist on the road.
And related questions: Would you get in one such car if you had alternatives? Would you opt to be in such a car instead of one driven by a person or by yourself?
I fortunately do have alternatives and accordingly mostly don't take cars at all.
But given the necessity/opportunity: Definitely. Being in a car, even (or especially) with a dubious driver, is much safer (at NYC traffic speeds) than being a pedestrian sharing the road with it.
And that's my entire point: Self-driving cars, like cars in general, are potentially a much larger danger to others (cyclists, pedestrians) than they are to their passengers.
That said, I don't especially distrust the self-driving kind – I've tried Waymo before and felt like it handled tricky situations at least as well as some Uber or Lyft drivers I've had before. They seem to have a lot more precision equipment than camera-only based Teslas, though.
Or just articulate things openly: we already insulate business owners from liability because we think it tunes investment incentives, and in so doing we have created social entities/corporate "persons"/a kind of AI that have different incentives than most human beings but are driving important social decisions. And they've supported some astonishing cooperation, which has helped produce things like the infrastructure on which we are having this conversation! But also, we have existing AIs of this kind who are already inclined to cut down the entire Amazonas forest for furniture production because it maximizes their function.
That's not just the future we live in, that's the world we've been living in for a century or few. On one hand, industrial productivity benefits, on the other hand, it values human life and the ecology we depend on about like any other industrial input. Yet many people in the world's premier (former?) democracy repeat enthusiastic endorsements of this philosophy reducing their personal skin to little more than an industrial input: "run the government like a business."
Unless people change, we are very much on track to create a world where these dynamics (among others) of the human condition are greatly magnified by all kinds of automation technology, including AI. It will probably start with limited liability for AIs and the companies employing them, possibly even statutory limits, though it's much more likely that wealthy businesses will simply be insulated by the sheer resources they have to make sure the courts can't hold them accountable, even where we still have a judicial system that isn't willing to play calvinball for cash or catechism (which, unfortunately, does not seem to include a Supreme Court majority).
In short, you and I probably agree that liability for AI is important, and limited liability for it isn't good. Perhaps I am too skeptical that we can pull this off, and being optimistic would serve everyone better.
Sure, if a business deploys it to perform tasks that are inherently low-risk, e.g. no client interface, no core-system connection, and low error impact, then the human performing those tasks is going to be replaced.
This reminds me of the school principal who sent $100k to a scammer claiming to be Elon Musk. The kicker is that she was repeatedly told that it was a scam.
https://abc7chicago.com/fake-elon-musk-jan-mcgee-principal-b...
Which makes LLMs far more dangerous than idiot humans in most cases.
And… I am really not sure punishment is the answer to fallibility, outside of almost kinky Catholicism.
The reality is these things are very good, but imperfect, much like people.
I'm afraid that's not the case. Literally yesterday I was speaking with an old friend who was telling us how one of his coworkers had presented a document with mistakes and serious miscalculations as part of some project. When my friend pointed out the mistakes, which were intuitively obvious just by critically understanding the numbers, the guy kept insisting "no, it's correct, I did it with ChatGPT". It took my friend doing the calculations explicitly and showing that they made no sense to convince the guy that it was wrong.
Let that sink in.
An LLM doesn’t make decisions. It generates text that plausibly looks like it made a decision, when prompted with the right text.
What the “LLMs don’t reason like we humans” crowd is missing is that we humans actually don’t reason as much as we would like to believe[0].
It’s not that LLMs are perfect or rational or flawless… it’s that their gaps in these areas aren’t atypical for humans. Saying “but they don’t truly understand things like we do” betrays a lack of understanding of humans, not LLMs.
0. https://home.csulb.edu/~cwallis/382/readings/482/nisbett%20s...
I don't think there's much of a difference in practice, though.
And when was the last time a support chatbot let you actually complain or bypass to a human?
Certain gullible people, who tend to listen to certain charlatans.
Rational, intelligent people wouldn't consider replacing a skilled human worker with an LLM that on a good day can compete with a 3-year-old.
You may see the current age as a litmus test for critical thinking.
Humans are also very confidently wrong a considerable portion of the time, particularly about anything outside their direct expertise.
LLMs fail in entirely novel ways you can't even fathom upfront.
Trust me, so do humans. Source: have worked with humans.
I'd say those are the goals we should be working toward. That's the failure mode we want to look at. We are humans.
Or - worse - there is no accessible code anywhere, and you have to prompt your way out of "I'm sorry Dave, I can't do that," while nothing works.
And a human-free economy does... what? For whom? When 99% of the population is unemployed, what are the 1% doing while the planet's ecosystems collapse around them?
Your concerns about mysterious AI code and system crashes are backwards. This approach eliminates integration bugs and maintenance issues by design. The generated TypeScript is readable, fully typed, and consistently updated across the entire stack when business logic changes.
If you're struggling with AI-generated code maintainability, that's an implementation problem, not a fundamental issue with code generation. Proper type safety and schema validation create more reliable systems, not less. This is automation making developers more productive - just like compilers and IDEs did - not replacing them.
The code works because it's built on sound software engineering principles: type safety, single source of truth, and deterministic generation. That's verifiable fact, not speculation.
What are you using for deterministic generation? The last I heard, even with temperature=0 there's non-determinism introduced by floating-point uncertainty/approximation.
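(For context on why temperature=0 alone isn't enough: floating-point addition isn't associative, so the same values reduced in a different order, e.g. across GPU threads, can produce slightly different logits, which can flip an argmax between near-tied tokens. A minimal illustration of just the float part, nothing model-specific:

    import random

    random.seed(0)
    # Values spanning many orders of magnitude, as activations can
    xs = [random.uniform(-1, 1) * 10 ** random.randint(-8, 8) for _ in range(100_000)]

    forward = sum(xs)
    backward = sum(reversed(xs))
    print(forward == backward)        # usually False
    print(abs(forward - backward))    # tiny, but enough to flip a near-tied argmax
)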
People talking like this also, in the back of their minds, like to think they'll be OK. They're smart enough to still be needed. They're human, but they'll be OK, even while working to make genAI outperform them at their own work.
I wonder how they'll feel about their own hubris when they struggle to feed their family.
The US can barely make healthcare work without disgusting consequences for the sick. I wonder what mass unemployment looks like.
There is absolutely no reason a programmer should expect to write code as they do now forever, just as ASM experts had to move on. And there's no reason (no precedent and no indicators) to expect that a well-educated, even-moderately-experienced technologist will suddenly find themselves without a way to feed their family - unless they stubbornly refuse to reskill or change their workflows.
I do believe the days of "everyone makes 100k+" are nearly over, and we're headed towards a severely bimodal distribution, but I do not see how, for the next 10-15 years at least, we can't all stay productive building the tools that will obviate our own jobs while we do them, and get comfortably retired in the meantime.
If innovation ceases, then AI is king - push existing knowledge into your dataset, train, and exploit.
If innovation continues, there's always a gap. It takes time for a new thing to be made public "enough" for it to be ingested and synthesized. Who does this? Who finds the new knowledge?
Who creates the direction and asks the questions? Who determines what to build in the first place? Who synthesizes the daily experience of everyone around them to decide what tool needs to exist to make our lives easier? Maybe I'm grasping at straws here, but the world in which all scientific discovery, synthesis, direction and vision setting, etc, is determined by AI seems really far away when we talk about code generation and symbolic math manipulation.
These tools are self driving cars, and we're drivers of the software fleet. We need to embrace the fact that we might end up watching 10 cars self operate rather than driving one car, or maybe we're just setting destinations, but there simply isn't an absolutist zero sum game here unless all one thinks about is keeping the car on the road.
AND even if there were, repeating doom and feeling helpless is the last thing you want. Maybe "we can all adapt and should try" isn't good truth, but it's certainly good policy.
Are you a politician? That's fantastic neoliberal policy, "alternativlos" ("without alternative") even: you can pretend that everybody can adapt, the same way you told the victims of your globalization policies to "learn how to code". We still need at least a few people for this "direction and vision setting", so it would just be naive doomerism to feel pessimistic about AGI. General intelligence doesn't talk about jobs in general, what an absurd idea!
Making people feel hopeless is the last thing you want, especially when it's true, especially if you don't want them to fight for the dignity you will otherwise deny them once they become economically unviable human beings.
This is interesting because it's both Oddly Specific and also something I have seen happen and I still feel really sorry for the company involved. Now that I think about it, I've actually seen it happen twice.
The wild part is that LLMs understand us way better than we understand them. The jump from GPT-3 to GPT-4 even surprised the engineers who built it. That should raise some red flags about how "predictable" these systems really are.
Think about it - we can't actually verify what these models are capable of or if they're being truthful, while they have this massive knowledge base about human behavior and psychology. That's a pretty concerning power imbalance. What looks like lower risk on the surface might be hiding much deeper uncertainties that we can't even detect, let alone control.
You can reply that AI researchers are smart and want to survive, so they are likely to invent alignment techniques that are better than the (deplorably inadequate) techniques that have been discussed and published so far, and I will reply that counting on their inventing these techniques in time is an unacceptable risk when the survival of humanity is at stake -- particularly as the outfit (namely the Machine Intelligence Research Institute) with the most years of experience in looking for an actually-adequate alignment technique has given up and declared that humanity's only chance is if frontier AI research is shut down because at the rate that AI capabilities are progressing, it is very unlikely that anyone is going to devise an adequate alignment technique in time.
It is fucked-up that frontier AI research has not been banned already.
As for employment, automation makes people more productive. It doesn't reduce the number of earning opportunities that exist. Quite the opposite, actually. As the amount of production increases relative to the human population, per capita GDP and income increase as well.
US Real GDP per capita is $70k, and has grown 2.4x since 1975: https://fred.stlouisfed.org/series/A939RX0Q048SBEA
US Real Median income per capita is $42k, and has grown 1.5x since 1975: https://fred.stlouisfed.org/series/MEPAINUSA672N
The divergence between the two matters a lot. It reflects the impacts of both technology-driven automation and globalization of capital. Generative AI is unlike any prior technology given its ability to autonomously create and perform what has traditionally been referred to as "knowledge work". Absent more aggressive redistribution, AI will accelerate the divergence between median income and GDP, and realistically AI can't be stopped.
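To put that divergence in one number, using only the growth multiples quoted above:

    gdp_per_capita_growth = 2.4      # since 1975, from the FRED series above
    median_income_growth = 1.5       # since 1975, from the FRED series above
    print(gdp_per_capita_growth / median_income_growth)   # ~1.6x divergence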
Powerful new technologies can reduce the number and quality of earning opportunities that exist, and have throughout history. Often they create new and better opportunities, but that is not a guarantee.
> We will have aligned AI helping us.
Who is the "us" that aligned AI is helping? Workers? Small business-people? Shareholders in companies that have the capital to build competitive generative AI? Perhaps on this forum those two groups overlap, but it's not the case everywhere.
https://www.brookings.edu/articles/sources-of-real-wage-stag...
There has been some increase in capital's share of income, but economic analyses show that the cause is rising rent and not any of the other usual suspects (e.g. tax cuts, IP law, technological disruption, regulatory barriers to competition, corporate consolidation, etc) (see Figure 3):
https://www.brookings.edu/wp-content/uploads/2016/07/2015a_r...
As for AI's effect on employment: it is no different at the fundamental level than any other form of automation. It will increase wages in proportion to the boost it provides to productivity.
Whatever it is that only humans can do, and is necessary in production, will always be the limiting factor in production levels. As new processes are opened up to automation, production will increase until all available human labor is occupied in its new role. And given the growing scarcity of human labor relative to the goods/services produced, wages (purchasing power, i.e. real wages) will increase.
For the typical human to be incapable of earning income, there has to be no unautomatable activity that a typical person can do that has market value. If that were to happen, we would have human-like AI, and we would have much bigger things to worry about than unemployment.
I think it's pretty unlikely that human-like AI will be developed, as I believe that both governments and companies would recognize that it would be an extremely dangerous asset for any party to attempt to own. Thus I don't see any economic incentive emerging to produce it.
> https://www.brookings.edu/wp-content/uploads/2016/07/2015a_r...
The paper referenced by that article excludes short-term asset (i.e. software) depreciation, interest, and dividends before calculating capital's share. If you ignore most of the methods of distributing capital's gains to its owners, it will appear as though capital (at this point scoped down to the company itself) has very little gains.
The paper (from 2015) goes on to predict that labor's share will rise going forward. With the brief exception of the COVID redistribution programs, it has done the opposite, and trended downwards over the last 10 years.
> I believe that both governments and companies would recognize that it would be an extremely dangerous asset for any party to attempt to own.
We can debate endlessly about our predictions about AIs impact on employment, but the above is where I think you might be too hopeful.
AI is an arms race. No other arms race in human history has resulted in any party deciding "that's enough, we'd be better off without this", from the bronze age (probably earlier) through to the nuclear weapons age. I don't see a reason for AI to be treated any differently.
>AI is an arms race.
What I'm trying to convey is that the types of capabilities that humans will always uniquely maintain are the type that it is not profitable for private companies to develop in AI, because they are traits that make the AI independent and less likely to follow instructions and act in a safe manner.
This is an assumption; how would you know if you have alignment? AGI could appear to be aligned, just as a psychopath studies and emulates well-behaved people. Imagine that at a scale we can't possibly understand. We don't really know how any of these emergent behaviors work; we just throw more data, compute, and fine-tuning at it, bake it, and then see.
There are so many ways we have misplaced confidence with what is essentially a system we don't really understand fully. We just keep anthropomorphizing the results and thinking "yeah, this is how humans think so we understand". We don't know for sure if that's true, or if we are being deceived, or making fundamental errors in judgement due to not having enough data.
I admire your optimism about the goals of all humans, but evidence tends to point to this not being the goal of all (or even most) humans, much less the people who control the AIs.
A rogue AI destroying humanity (whatever that means) is not a likely outcome. That's just movie stuff.
What is more likely is a modern oligarchy and serfdom that emerge as AI devalues most labor, with no commensurate redistribution of power and resources to the masses, due to capture of government by owners of AI and hence capital.
Are you sure people won't go along with that?
Do you mean they lie because of bad training data? Or because of ill intent? How can an LLM have intent if it’s a stateless feedforward model?
I know that I don't know a lot, but all of this sounds to me to be at least hypothetically possible if we really believe AGI is possible.
The "less risky to deploy" question will probably come up once it is closer to 10x the cost. Considering the model was specifically tuned for the test, and the test doesn't involve other real-world complexity, I would say we are actually 10^4 off in cost for real-world scenarios.
I would imagine that with better algorithms, tuning, and data we could knock 10^2 off the equation. That would still leave us with 10^2 of cost improvement needed from hardware: a minimum of 10 years.
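Same sketch as the earlier 10^3 estimate, just with the remaining 10^2 hardware gap (the doubling period is the assumption):

    import math

    hardware_gap = 1e2
    doublings = math.log2(hardware_gap)     # ~6.6 doublings
    print([round(doublings * t) for t in (1.5, 2.0, 2.5)])   # -> [10, 13, 17] years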
For AI example(s): Attribution is low, a system built without human intervention may suddenly fall outside its own expertise and hallucinate itself into a corner, everyone may just throw more compute at a system until it grows without bound, etc etc.
This "You can scale up to infinity" problem might become "You have to scale up to infinity" to build any reasonably sized system with AI. The shovel-sellers get fantastically rich but the businesses are effectively left holding the risk from a fast-moving, unintuitive, uninspected, partially verified codebase. I just don't see how anyone not building a CRUD app/frontend could be comfortable with that, but then again my Tesla is effectively running such a system to drive me and my kids. Albeit, that's on a well-defined problem and within literally human-made guardrails.
This is a big downside of AI, IMHO. Those offices need to be filled! ;-)
That one isn’t guaranteed. Many examples online of exfiltration attacks on LLMs.
The rhetoric about not needing people to do work is cartoonish. There is no sane explanation of how and why that would happen without, yet again, employing more people to take care of the advancements.
It's not like technology has brought less work-related stress; it has definitely increased it. Humans were not made to absorb technology at the pace it's being rolled out.
The world is fucked. Totally fucked.
The framing of the question misses the point. With electric lighting we can now work longer into the night. Yes, fewer people use and make candles. However, the second-order effects allow us to be more productive in areas we may not have previously considered.
New technologies open up new opportunities for productivity. The bank tellers displaced by ATMs can create value elsewhere. Consumers save time by not waiting in a queue, allowing them to use their time more economically. Banks have lower overhead, allowing more customers to afford their services.
When these people were made redundant, they may very well have gone on to make less money in another job (i.e. being less useful in an economic sense).
Digital banks
Cashless money transfer services
Self service
Modern farms
Robo lawn mowers
NVRs with object detection
I can go on forever
I agree the most interesting thing to watch will be cost for a given score more than maximum possible score achieved (not that the latter won't be interesting by any means).
I would expect 10 random people to do better than a committee of 10 people because 10 people have 10 chances to get it right while a committee only has one. Even if the committee gets 10 guesses (which must be made simultaneously, not iteratively) it might not do better because people might go along with a wrong consensus rather than push for the answer they would have chosen independently.
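That intuition is easy to make concrete with a toy model. Assuming (purely hypothetically) that each person independently solves a task with probability p, the chance that at least one of 10 independent attempts is right is 1 - (1 - p)^10:

```python
# Probability that at least one of n independent solvers gets the task right,
# assuming each succeeds independently with probability p (a toy assumption).
def p_any_correct(p: float, n: int = 10) -> float:
    return 1 - (1 - p) ** n

for p in (0.3, 0.5, 0.7):
    print(f"p={p}: any-of-10 = {p_any_correct(p):.3f}")
# p=0.3: any-of-10 = 0.972
# p=0.5: any-of-10 = 0.999
# p=0.7: any-of-10 = 1.000
```

A committee, by contrast, submits a single consensus answer, so its success rate stays close to whatever the consensus process yields rather than getting the any-of-10 boost.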
Do you have a sense of what kind of task this benchmark includes? Are they more “general” such that random people would fare well or more specialized (ie something a STEM grad studied and isn’t common knowledge)?
I have failed the real ARC AGI :)
This isn't to say groups always outperform their members on all tasks, just that it isn't unusual to see a result like that.
So ya, working on efficiency is important, but we're still pretty far away from AGI even ignoring efficiency. We need an actual breakthrough, which I believe will not be possible by simply scaling the transformer architecture.
Combined, we are currently at least 10^5 off in terms of cost efficiency. In reality I won't be surprised if we are closer to 10^6.
Energy Need: The average home uses 30 kWh/day; generating that over 5 peak sunlight hours requires 6 kW of output during those hours.
Multijunction Panels: Lab efficiencies are already at 47% (2023), and with multiple years of progress, 60% efficiency is probable.
Efficiency Impact: At 60% efficiency, panels generate 600 W/m², requiring 10 m² (e.g., 2 m × 5 m) to meet energy needs.
This size can fit on most home roofs, be mounted on a pole with stacked layers, or even be hung through an apartment window.
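A minimal sketch of the arithmetic behind those numbers (the 30 kWh/day usage, 5 peak sun hours, the standard 1,000 W/m² reference irradiance, and the hypothetical 60% efficiency are all assumptions carried over from the bullets above):

```python
# Back-of-envelope solar sizing from the figures above.
daily_use_kwh = 30.0      # assumed average home usage per day
peak_sun_hours = 5.0      # assumed hours of peak sunlight
insolation_w_m2 = 1000.0  # standard reference irradiance
efficiency = 0.60         # hypothetical future multijunction panel efficiency

required_power_kw = daily_use_kwh / peak_sun_hours      # 6.0 kW during peak sun
output_w_per_m2 = insolation_w_m2 * efficiency          # 600 W/m^2
area_m2 = required_power_kw * 1000 / output_w_per_m2    # 10.0 m^2

print(required_power_kw, output_w_per_m2, area_m2)  # 6.0 600.0 10.0
```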
Rooftop solar harnesses energy from the sun, which is powered by nuclear fusion—arguably the most effective nuclear reactor in our solar system.
What a joke
For example, my 50 sq m setup, at -29 deg latitude, generated your estimated 30 kWh/day output. I have panels with ~20% efficiency, suggesting that at 60% efficiency, the average household would only get to around half their energy needs with 10 sq m.
Yes, solar has the potential to drastically reduce energy costs, but even with free energy storage, individual households aren’t likely to achieve self sustainability.
In Europe it is around 6-7 kWh/day. This might increase with electrification of heating and transport, but probably nothing like as much as the energy consumption they are replacing (due to greater efficiency of the devices consuming the energy and other factors like the quality of home insulation.)
In the rest of the world the average home uses significantly less.
General inflation has outpaced the inflation of electricity prices by about 3x in the past 100 years. In other words, electricity has gotten cheaper over time in purchasing power terms.
And that's whilst our electricity usage has gone up by 10x in the last 100 years.
And this concerns retail prices, which includes distribution/transmission fees. These have gone up a lot as you get complications on the grid, some of which is built on a century old design. But wholesale prices (the cost of generating electricity without transmission/distribution) are getting dirt cheap, and for big AI datacentres I'm pretty sure they'll hook up to their own dedicated electricity generation at wholesale prices, off the grid, in the coming decades.
Not saying this will happen, but it's risky to rely on solar as the only long-term solution.
The heavy commodification of networking and compute brought about by the internet and cloud aligned with tech company interests in delivering services or content to consumers. There does not seem to be an emerging consensus that data center operators also need to provide consumer power.
I don't see Google, Amazon, Microsoft or any company paying $10 for something if building it themselves would cost them $5. Either the price difference will reach a point where investing in power production themselves makes sense, or the power companies will decrease prices. And all three have already been investing in power production themselves over the last decade, either to get better prices or for PR.
Then let's say that OpenAI brute forced this without any meta-optimization of the hypothesized search component (they just set a compute budget). This is probably low hanging fruit and another 2x in compute reduction. ($850)
Then let's say that OpenAI was pushing really hard for the numbers, was willing to burn cash, and so didn't give serious thought to hardware-aware distributed inference. This could be worth more than a 2x decrease in cost (we've seen better attention mechanisms deliver 10x reductions), but let's go with 2x for now. ($425)
So I think we've got about an 8x reduction in cost sitting there once Google steps up. This is probably 4-6 months of work flat out if they haven't already started down this path, but with what they've got with deep research, maybe it's sooner?
Then if "all" we get is hardware improvements we're down to what 10-14 years?
Since then there has been a tsunami of optimizations in the way training and inference is done. I don't think we've even begun to find all the ways that inference can be further optimized at both hardware and software levels.
Look at the huge models that you can happily run on an M3 Mac. The cost reduction in inference is going to vastly outpace Moore's law, even as chip design continues on its own path.
I'd hope we see more internal optimizations and improvements to the models. The idea that the big breakthrough is "don't spit out the first thought that pops into your head" seems obvious to everyone outside of the field, but guess what: it turned out to be a big improvement when the devs decided to add it.
It's obvious to people inside the field too.
Honestly, these things seem to be less obvious to people outside the field. I've heard so many uninformed takes about LLMs not representing real progress towards intelligence (even here on HN of all places; I don't know why I torture myself reading them), that they're just dumb memorizers. No, they are an incredible breakthrough, because extending them with things like internal thoughts will so obviously lead to results such as o3, and far beyond. Maybe a few more people will start to understand the trajectory we're on.
While I agree that the LLM progress as of late is interesting, the rest of your sentiment sounds more like you are in a cult.
As long as your field keeps coming up with less and less realistic predictions and fails to deliver over and over, eventually even the most gullible will lose faith in you.
Because that's what this all is right now. Faith.
> Maybe a few more people will start to understand the trajectory we're on.
All you are saying is that you believe something will happen in the future.
We can't have an intelligent discussion under those premises.
It's depressing to see so many otherwise smart people fall for their own hype train. You are only helping rich people get more rich by spreading their lies.
I wouldn't be an AI researcher if I didn't have "faith" that AI as a goal is worthwhile and achievable and I can make progress. You think this is irrational?
I am actually working to improve the SoTA in mathematical reasoning. I have documents full of concrete ideas for how to do that. So does everyone else in AI, in their niche. We are in an era of low hanging fruit enabled by ML breakthroughs such as large-scale transformers. I'm not someone who thinks you can simply keep scaling up transformers to solve AI. But consider System 1 and System 2 thinking: System 1 sure looks solved right now.
> As long as your field keep coming with less and less realistic predictions and fail to deliver over and over
I don't think we're commenting on the same article here. For example, FrontierMath was expected to be near impossible for LLMs for years, now here we are 5 weeks later at 25%.
Doesn't help that most people are just mimics when talking about stuff that's outside their expertise.
Hell, my cousin, a quality college-educated individual with high social/emotional IQ, will go down the conspiracy-theory rabbit hole so quickly based on some baseless crap printed on the internet. Then he'll talk about people being Satan worshipers.
> i think most people have very little conceptualization of their own thinking/cognitive patterns, at least not enough to sensibly extrapolate it onto ai.
Quite true. If you spend a lot of time reading and thinking about the workings of the mind you lose sight of how alien it is to intuition. While in high school I first read, in New Scientist, the theory that conscious thought lags behind the underlying subconscious processing in the brain. I was shocked that New Scientist would print something so unbelievable. Yet there seemed to be an element of truth to it, so I kept thinking about it and slowly changed my assessment.
Yeah, I was just thinking how a lot of thoughts I believed were my original thoughts were really made possible by communal thoughts. I can maybe have some original frontier thoughts that involve averages, but that's only possible because some other person invented the abstraction of averages, which was then collectively disseminated to everyone through education; not to mention all the subconscious processes that are necessary for me to will certain thoughts into existence. It makes me reflect on how much cognition is really mine, versus an inevitable product of a deterministic process and a product of other humans.
What I find most fascinating about the history of mathematics is that basic concepts such as zero and negative numbers and graphs of functions, which are so easy to teach to students, required so many mathematicians over so many centuries. E.g. Newton figured out calculus because he gave so much thought to the works of Descartes.
Yes, I think "new" ideas (meaning, a particular synthesis of existing ones) are essentially inevitable, and how many people come up with them, and how soon, is a function of how common those prerequisites are.
It's very easy to say "hey, of course it's obvious", but there is nothing obvious about it, because you are anthropomorphizing these models and then using that bias after the fact as proof of your conjecture.
This isn’t how real progress is achieved.
The state of the art seems very focused on promoting that language that might encode reason is as good as actual reason, rather than asking what a reasoning model might look like.
The trend for power consumption of compute (Megaflops per watt) has generally tracked with Koomey’s law for a doubling every 1.57 years
Then you also have model performance improving with compression. For example, Llama 3.1’s 8B outperforming the original Llama 65B
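Those two trends compound. A rough sketch, taking the 1.57-year doubling above as given and using the 65B-to-8B parameter ratio as a crude proxy for compression gains (both are assumptions, not measurements):

```python
doubling_period_years = 1.57   # Koomey-style doubling for compute per watt, per the comment above
horizon_years = 10

hardware_gain = 2 ** (horizon_years / doubling_period_years)  # ~83x over a decade
compression_gain = 65 / 8                                     # Llama 65B -> Llama 3.1 8B, a crude proxy

print(f"hardware: ~{hardware_gain:.0f}x, compression: ~{compression_gain:.1f}x, "
      f"combined: ~{hardware_gain * compression_gain:.0f}x")
# hardware: ~83x, compression: ~8.1x, combined: ~672x
```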
Routing to the correct human support
Providing FAQ level responses to the most common problems.
Providing a second opinion to the human taking the call.
So, even this most relevant domain for the technology doesn't eliminate human employment (because it's just not flexible or reliable enough yet).
If this turns out to be hard to optimize / improve then there will be a huge economic incentive for efficient ASICs. No freaking way we’ll be running on GPUs for 20-25 years, or even 2.
But sorry, blablabla, this shit is getting embarrassing.
> The question is now, can we close this "to human" gap
You won’t close this gap by throwing more compute at it. Anything in the sphere of creative thinking eludes most people in the history of the planet. People with PhDs in STEM end up working in IT sales not because they are good or capable of learning but because more than half of them can’t do squat shit, despite all that compute and all those algorithms in their brains.
it's even more exciting than that. the fact that you even can use more compute to get more intelligence is a breakthrough. if they spent even more on inference, would they get even better scores on arc agi?
I'm not so sure—what they're doing by just throwing more tokens at it is similar to "solving" the traveling salesman problem by just throwing tons of compute into a breadth first search. Sure, you can get better and better answers the more compute you throw at it (with diminishing returns), but is that really that surprising to anyone who's been following tree of thought models?
All it really seems to tell us is that the type of model that OpenAI has available is capable of solving many of the types of problems that ARC-AGI-PUB has set up given enough compute time. It says nothing about "intelligence" as the concept exists in most people's heads—it just means that a certain very artificial (and intentionally easy for humans) class of problem that wasn't computable is now computable if you're willing to pay an enormous sum to do it. A breakthrough of sorts, sure, but not a surprising one given what we've seen already.
Yes, I find that surprising.
Simple turn-based games such as chess turned out to be too far away from anything practical and chess-engine-like programs were never that useful. It is entirely possible that this will end up in a similar situation. ARC-like pattern matching problems or programming challenges are indeed a respectable challenge for AI, but do we need a program that is able to solve them? How often does something like that come up really? I can see some time-saving in using AI vs StackOverflow in solving some programming challenges, but is there more to this?
In this case there is more reason to think these things are relevant outside of the direct context - these tests were specifically designed to see if AI can do general-thinking tasks. The benchmarks might be bad, but that's at least their purpose (unlike in Chess).
To move beyond that, the thing has to start thinking for itself, some auto feedback loop, training itself on its own thoughts. Interestingly, this could plausibly be vastly more efficient than training on external data because it's a much tighter feedback loop and a smaller dataset. So it's possible that "nearly AGI" leads to ASI pretty quickly and efficiently.
Of course it's also possible that the feedback loop, while efficient as a computation process, isn't efficient as a learning / reasoning / learning-how-to-reason process, and the thing, while as intelligent as a human, still barely competes with a worm in true reasoning ability.
Interesting times.
On a very simple, toy task, which ARC-AGI basically is. ARC-AGI tests are not hard per se; LLMs just find them hard. We do not know how this scales to more complex, real-world tasks.
The other benchmarks are a good indication though.
Well no, that would mean that Arc isn't actually testing the ability of a model to generalize then and we would need a better test. Considering it's by François Chollet, yep we need a better test.
And conversely, the world’s best drivers aren’t noted for being intellectual giants.
I don’t think driving skill and raw intelligence are that closely connected.
The report says it is $17 per task, and about $6k for the whole dataset of 400 tasks.
The low compute was $17 per task. Speculating 172 * $17 for the high compute gives $2,924 per task, so I am also confused about the $3400 number.
Also, it's $20 per task for o3-low in the table for the semi-private eval, which times 172 is $3,440, also coming in close to the 3400 number.
Sorry for being thick, I'm just confused how they can turn this into an affordable service.
The number for the high-compute one is ~172x the first one according to the article so ~=$2900
Note: OpenAI has requested that we not publish the high-compute costs. The amount of compute was roughly 172x the low-compute configuration.
"High Efficiency" is O3 Low "Low Efficiency" is O3 High
They left the "Low efficiency" (O3 High) values as `-` but you can infer them from the plot at the top.
Note the $20 and $17 per task aligns with the X-axis of the O3-low
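For what it's worth, the numbers being backed out in this subthread are just the published low-compute per-task costs multiplied by the stated ~172x factor; a quick sketch using the $17/$20 figures and the 172x multiplier quoted above:

```python
# Reconstructing the high-compute per-task cost from the published figures.
low_cost_per_task = {"public eval": 17.0, "semi-private eval": 20.0}  # USD/task, from the report/table
compute_multiplier = 172  # high-compute config vs. low-compute, per the article

for split, cost in low_cost_per_task.items():
    print(f"{split}: ~${cost * compute_multiplier:,.0f} per task")
# public eval: ~$2,924 per task
# semi-private eval: ~$3,440 per task

print(f"low-compute, 400 tasks: ~${17.0 * 400:,.0f}")  # ~$6,800, the quoted ~$6k figure
```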
Fundamentally it's a search through some enormous state space. Advancements are "tricks" that let us find useful subsets more efficiently.
Zooming way out, we have a bunch of social tricks, hardware tricks, and algorithmic tricks that have resulted in a super useful subset. It's not the subset that we want though, so the hunt continues.
Hopefully it doesn't require revising too much in the hardware & social bag of tricks, those are lot more painful to revisit...
You compare this to "a human" but also admit there is a high variation.
And I would say there are a lot of humans being paid ~$3400 per month. Not for a single task, true, but honestly for no value-creating task at all. Just for their time.
So what about we think in terms of output rather than time?
YT timestamped link: https://www.youtube.com/watch?v=SKBG1sqdyIU&t=768s (thanks for the fixed link @photonboom)
Updated: I gave the task to Claude 3.5 Sonnet and it worked first shot: https://claude.site/artifacts/36cecd49-0e0b-4a8c-befa-faa5aa...
Though, of course one can argue, that lots of human written code is not much different from this.
4o is cheaper than o1 mini so mini doesn't mean much for costs.
I've been doing similar stuff in Claude for months, and it's not that impressive once you see how limited they really are when you go beyond boilerplate.
A lot of people have criticized ARC as not being relevant or indicative of true reasoning, but I think it was exactly the right thing. The fact that scaled reasoning models are finally showing progress on ARC proves that what it measures really is relevant and important for reasoning.
It's obvious to everyone that these models can't perform as well as humans on everyday tasks despite blowout scores on the hardest tests we give to humans. Yet nobody could quantify exactly the ways the models were deficient. ARC is the best effort in that direction so far.
We don't need more "hard" benchmarks. What we need right now are "easy" benchmarks that these models nevertheless fail. I hope Francois has something good cooked up for ARC 2!
LLMs are still below humans on that evaluation, as of when I last looked, but it doesn't get much attention.
Once it is passed, I'd like to see one that is solving the mystery in a mystery book right before it's revealed.
We'd need unpublished mystery novels to use for that benchmark, but I think it gets at what I think of as reasoning.
LLMs are far, _far_ below human on elementary problems, once you allow any variation and stop spoonfeeding perfectly phrased word problems. :)
https://machinelearning.apple.com/research/gsm-symbolic
https://arxiv.org/pdf/2410.05229
Paper came out in October, I don't think many have fully absorbed the implications.
It's hard to take any of the claims of "LLMs can do reasoning!" seriously once you understand that simply changing which names are used in an 8th-grade math word problem can have a dramatic impact on the accuracy.
If so, let's move on to the murder mysteries or more complex literary analysis.
I would think this is not such a good benchmark. Authors don't write logically; they write for entertainment.
The reason it seems like an interesting benchmark is that it's a puzzle presented in a long context. It's like testing whether an LLM is at Sherlock Holmes' level of world and motivation modelling.
Not sure I understand how this follows. The fact that a certain type of model does well on a certain benchmark means that the benchmark is relevant for a real-world reasoning? That doesn't make sense.
It's like the modern equivalent of saying "oh when AI solves chess it'll be as smart as a person, so it's a good benchmark" and we all know how that nonsense went.
Regarding the value of "pointless pattern matching" in particular, I would refer you to Douglas Hofstadter's discussion of Bongard problems starting on page 652 of _Godel, Escher, Bach_. Money quote: "I believe that the skill of solving Bongard [pattern recognition] problems lies very close to the core of 'pure' intelligence, if there is such a thing."
The problem with pattern matching of sequences and transformers as an architecture is that it's something they're explicitly designed to be good at with self attention. Translation is mainly matching patterns to equivalents in different languages, and continuing a piece of text is following a pattern that exists inside it. This is primarily why it's so hard to draw a line between what an LLM actually understands and what it just wings naturally through pattern memorization and why everything about them is so controversial.
Honestly I was really surprised that all models did so poorly on ARC in general thus far, since it really should be something they ought to be superhuman at from the get-go. Probably more of a problem that it's visual in concept than anything else.
Models have regularly made progress on it, this is not new with the o-series.
Doing astoundingly well on it, and having a mutually shared PR interest with OpenAI in this instance, doesn't mean a pile of visual puzzles is actually AGI or some well thought out and designed benchmark of True Intelligence(tm). It's one type of visual puzzle.
I don't mean to be negative, but to inject a memento mori. Real story is some guys get together and ride off Chollet's name with some visual puzzles from ye olde IQ test, and the deal was Chollet then gets to show up and say it proves program synthesis is required for True Intelligence.
Getting this score is extremely impressive but I don't assign more signal to it than any other benchmark with some thought to it.
What I'm saying is the fact that as models are getting better at reasoning they are also scoring better on ARC proves that it is measuring something relating to reasoning. And nobody else has come up with a comparable benchmark that is so easy for humans and so hard for LLMs. Even today, let alone five years ago when ARC was released. ARC was visionary.
That's why I have some private benchmarks, and I'm sorry to say that the transition from GPT-4 to o1 wasn't unambiguously a step forward (in some tasks yes, in some not).
On the other hand, private benchmarks are even less useful to the general public than the public ones, so we have to deal with what we have - but many of us just treat it as noise and don't give it much significance. Ultimately, the models should defend themselves by performing the tasks individual users want them to do.
You could argue that the models can get an advantage by looking at the training set which is on the internet. But all of the tasks are unique and generalizing from the training set to the test set is the whole point of the benchmark. So it's not a serious objection.
That's why they have two test sets. But OpenAI has legally committed to not training on data passed to the API. I don't believe OpenAI would burn their reputation and risk legal action just to cheat on ARC. And what they've reported is not implausible IMO.
I'd guess it's doing natural language procedural synthesis, the same way a human might (i.e. figuring the sequence of steps to effect the transformation), but it may well be doing (sub-)solution verification by using the procedural description to generate code whose output can then be compared to the provided examples.
While OpenAI haven't said exactly what the architecture of o1/o3 are, the gist of it is pretty clear - basically adding "tree" search and iteration on top of the underlying LLM, driven by some RL-based post-training that imparts generic problem solving biases to the model. Maybe there is a separate model orchestrating the search and solution evaluation.
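If that guess is roughly right, the control flow would look something like a sample-and-evaluate loop around the base model. This is a purely hypothetical sketch of that guess (the `llm_generate` and `score_candidate` functions are stand-ins, not any real API, and this is not OpenAI's published method):

```python
import random

def llm_generate(prompt: str) -> str:
    # Stand-in for sampling one reasoning trace + answer from the base LLM.
    # Here it just returns a random toy answer so the sketch runs end to end.
    return f"candidate-{random.randint(0, 9)}"

def score_candidate(prompt: str, candidate: str) -> float:
    # Stand-in for a learned verifier / reward model (or self-consistency voting).
    return random.random()

def solve(prompt: str, samples: int = 64) -> str:
    # Best-of-N search: sample many reasoning traces, keep the highest-scoring one.
    # A real system might instead branch on intermediate steps (a tree) and spend
    # more samples on promising branches, which is the "tree search" guess above.
    candidates = [llm_generate(prompt) for _ in range(samples)]
    return max(candidates, key=lambda c: score_candidate(prompt, c))

print(solve("example ARC task"))
```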
I think there are many tasks that are easy enough for humans but hard/impossible for these models. The ultimate one in terms of commercial value would be to take an "off the shelf model" and treat it as an intern/apprentice and teach it to become competent in an entire job it was never trained on. Have it participate in team meetings and communications, and become a drop-in replacement for a human performing that job (any job that can be performed remotely without a physical presence).
Agreed.
> And nobody else has come up with a comparable benchmark that is so easy for humans and so hard for LLMs.
? There's plenty.
- SimpleBench https://simple-bench.com/ (similar to above; great landing page w/scores that show human / ai gap)
- PIQA (physical question answering, e.g. "how do i get a yolk out of a water bottle"; a common favorite of local LLM enthusiasts in /r/localllama): https://paperswithcode.com/dataset/piqa
- Berkeley Function-Calling (I prefer https://gorilla.cs.berkeley.edu/leaderboard.html)
AI search googled "llm benchmarks challenging for ai easy for humans", and "language model benchmarks that humans excel at but ai struggles with", and "tasks that are easy for humans but difficult for natural language ai".
It also mentioned Moravec's Paradox is a known framing of this concept, started going down that rabbit hole because the resources were fascinating, but, had to hold back and submit this reply first. :)
SimpleBench looks more interesting. Also less than two months old. It doesn't look as challenging for LLMs as ARC, since o1-preview and Sonnet 3.5 already got half of the human baseline score; they did much worse on ARC. But I like the direction!
PIQA is cool but not hard enough for LLMs.
I'm not sure Berkeley Function-Calling represents tasks that are "easy" for average humans. Maybe programmers could perform well on it. But I like ARC in part because the tasks do seem like they should be quite straightforward even for non-expert humans.
Moravec's paradox isn't a benchmark per se. I tend to believe that there is no real paradox and all we need is larger datasets to see the same scaling laws that we have for LLMs. I see good evidence in this direction: https://www.physicalintelligence.company/blog/pi0
Functions in this context are not programming function calls. In this context, function calls are a now-deprecated LLM API name for "parse input into this JSON template." No programmer experience needed. Entity extraction by another name, except, that'd be harder: here, you're told up front exactly the set of entities to identify. :)
> "Moravec's paradox isn't a benchmark per se."
Yup! It's a paradox :)
> "Of course it is much easier to design a test specifically to thwart LLMs now that we have them"
Yes.
Though, I'm concerned a simple yes might be insufficient for illumination here.
It is a tautology (it's easier to design a test that $X fails when you have access to $X), and it's unlikely you meant to just share a tautology.
A potential unstated-but-maybe-intended-communication is "it was hard to come up with ARC before LLMs existed" --- LLMs existed in 2019 :)
If they didn't, a hacky way to come up with a test that's hard for the top AIs at the time, BERT-era, would be to use one type of visual puzzle.
If, for conversations sake, we ignore that it is exactly one type of visual puzzle, and that it wasn't designed to be easy for humans, then we can engage with: "its the only one thats easy for humans, but hard for LLMs" --- this was demonstrated as untrue as well.
I don't think I have much to contribute past that, once we're at "It is a singular example of a benchmark thats easy for humans but nigh-impossible for llms, at least in 2019, and this required singular insight", there's just too much that's not even wrong, in the Pauli sense, and it's in a different universe from the original claims:
- "Congratulations to Francois Chollet on making the most interesting and challenging LLM benchmark so far."
- "A lot of people have criticized ARC as not being relevant or indicative of true reasoning...The fact that [o-series models show progress on ARC proves that what it measures really is relevant and important for reasoning."
- "...nobody could quantify exactly the ways the models were deficient..."
- "What we need right now are "easy" benchmarks that these models nevertheless fail."
It was interesting to see how it failed on question 6: https://chatgpt.com/c/6765e70e-44b0-800b-97bd-928919f04fbe
Apparently LLMs do not consider global thermonuclear war to be all that big a deal, for better or worse.
We do the same exact stuff with real people in programming challenges and the like, where people just study common interview questions rather than learning the material holistically. And since we know that people game these interview-type questions, we adjust the interview process to minimize gaming... which itself leads to more gaming, and back to step one. That's not an ideal feedback loop, of course, but people still get jobs and churn out "productive work" out of it.
Sometimes this manifests as "outside the box thinking", like how a genetic algorithm got an "oscillator" which was really just an antenna.
It is a hard problem, and yes we still both need and can make more and better benchmarks; but it's still a problem because it means the benchmarks we do have are overstating competence.
In principle you can't optimize specifically for ARC-AGI, train against it, or overfit to it, because only a few of the puzzles are publicly disclosed.
Whether it lives up to that goal, I don't know, but their approach sounded good when I first heard about it.
I spent a couple of hours looking at the publicly-available puzzles, and was really impressed at how much room for creativity the format provides. Supposedly the puzzles are "easy for humans," but some of them were not... at least not for me.
(It did occur to me that a better test of AGI might be the ability to generate new, innovative ARC-AGI puzzles.)
For a space of tasks which are well-suited to programmatic generation, as ARC-AGI is by design, if we can do a decent job of reverse engineering the underlying problem generating grammar, then we can make an LLM as familiar with the task as we're willing to spend on compute.
To be clear, I'm not saying solving these sorts of tasks is unimpressive. I'm saying that I find it unsuprising (in light of past results) and not that strong of a signal about further progress towards the singularity, or FOOM, or whatever. For any of these closed-ish domain tasks, I feel a bit like they're solving Go for the umpteenth time. We now know that if you collect enough relevant training data and train a big enough model with enough GPUs, the training loss will go down and you'll probably get solid performance on the test set. Trillions of reasonably diverse training tokens buys you a lot of generalization. Ie, supervised learning works. This is the horse Ilya Sutskever's ridden to many glorious victories and the big driver of OpenAI's success -- a firm belief that other folks were leaving A LOT of performance on the table due to a lack of belief in the power of their own inventions.
What's endlessly interesting to me with all of this is how surprisingly quick the benchmarking feedback loops have become plus the level of scrutiny each one receives. We (as a culture/society/whatever) don't really treat human benchmarking criteria with the same scrutiny such that feedback loops are useful and lead to productive changes to the benchmarking system itself. So from that POV it feels like substantial progress continues to be made through these benchmarks.
While I appreciate the benchmark and its goals (not to mention the puzzles - I quite enjoy figuring them out), successfully passing this benchmark does not demonstrate or guarantee real world capabilities or performance. This is why I increasingly side-eye this field and its obsession with constantly passing benchmarks and then moving the goal posts to a newer, harder benchmark that claims to be a better simulation of human capabilities than the last one: it reeks of squandered capital and a lack of a viable/profitable product, at least to my sniff test. Rather than simply capitalize on their actual accomplishments (which LLMs are - natural language interaction is huge!), they're trying to prove to Capital that with a few (hundred) billion more in investments, they can make AGI out of this and replace all those expensive humans.
They've built the most advanced prediction engines ever conceived, and insist they're best used to replace labor. I'm not sure how they reached that conclusion, but considering even their own models refute this use case for LLMs, I doubt their execution ability on that lofty promise.
This benchmark has done a wonderful job with marketing by picking a great name. It's largely irrelevant for LLMs despite the fact it's difficult.
Consider how much of the model is just noise for a task like this given the low amount of information in each token and the high embedding dimensions used in LLMs.
If the hypothesis is that LLMs are the “computer” that drives the AGI then of course the benchmark is relevant in testing for AGI.
I don’t think you understand the benchmark and its motivation. ARC AGI benchmark problems are extremely easy and simple for humans. But LLMs fail spectacularly at them. Why they fail is irrelevant, the fact they fail though means that we don’t have AGI.
It's a bunch of visual puzzles. They aren't a test for AGI because it's not general. If models (or any other system for that matter) could solve it, we'd be saying "this is a stupid puzzle, it has no practical significance". It's a test of some sort of specific intelligence. On top of that, the vast majority of blind people would fail - are they not generally intelligent?
The name is marketing hype.
The benchmark could be called "random puzzles LLMs are not good at, because they haven't been optimized for them, because it's not a valuable benchmark". Sure, it wasn't designed for LLMs, but throwing LLMs at it and saying "see?" is dumb. We can throw in benchmarks for tennis playing, chess playing, video game playing, car driving, and a bajillion other things while we're at it.
But they don't. Not even the best ones.
The fact that we might need to be mindful of how we communicate with a person/system/whatever doesn't mean too much in the context of AI. Just like humans, the details of how they work will need to be considered, and the standard trope of "that's an implementation detail" won't work.
This[1] is currently the most challenging benchmark. I would like to see how O3 handles it, as O1 solved only 1%.
Once a model recognizes a weakness through reasoning with CoT when posed a certain problem, and gets the agency to adapt to solve that problem, that's a precursor to real AGI capability!
One might also interpret that as "the fact that models which are studying to the test are getting better at the test" (Goodhart's law), not that they're actually reasoning.
I wonder how well the latest Claude 3.5 Sonnet does on this benchmark and if it's near o1.
| Name | Semi-private eval | Public eval |
|--------------------------------------|-------------------|-------------|
| Jeremy Berman | 53.6% | 58.5% |
| Akyürek et al. | 47.5% | 62.8% |
| Ryan Greenblatt | 43% | 42% |
| OpenAI o1-preview (pass@1) | 18% | 21% |
| Anthropic Claude 3.5 Sonnet (pass@1) | 14% | 21% |
| OpenAI GPT-4o (pass@1) | 5% | 9% |
| Google Gemini 1.5 (pass@1) | 4.5% | 8% |
https://arxiv.org/pdf/2412.04604

| Name              | Semi-private eval | Public eval |
|-------------------|-------------------|-------------|
| o3 (coming soon)  | 75.7%             | 82.8%       |
| o1-preview        | 18%               | 21%         |
| Claude 3.5 Sonnet | 14%               | 21%         |
| GPT-4o            | 5%                | 9%          |
| Gemini 1.5        | 4.5%              | 8%          |

That being said, the fact that this is not a "raw" base model, but one tuned on the ARC-AGI test distribution, takes away from the impressiveness of the result. How much? I'm not sure; we'd need the un-tuned base o3 model score for that.
In the meantime, comparing this tuned o3 model to other un-tuned base models is unfair (apples-to-oranges kind of comparison).
This means we have an algorithm to get to human level performance on this task.
If you think this task is an eval of general reasoning ability, we have an algorithm for that now.
There's a lot of work ahead to generalize o3 performance to all domains. I think this explains why many researchers feel AGI is within reach, now that we have an algorithm that works.
Congrats to both Francois Chollet for developing this compelling eval, and to the researchers who saturated it!
[1] https://x.com/SmokeAwayyy/status/1870171624403808366, https://arxiv.org/html/2409.01374v1
But, still, this is incredibly impressive.
For example, if we asked an LLM to produce an image of a "human woman photorealistic", it produces a result. After that you should be able to ask it "tell me about its background" and it should be able to explain: "Since the user didn't specify a background in the query, I randomly decided to draw her standing in front of a fantasy background of Amsterdam's iconic houses. Amsterdam houses are usually 3 stories tall, attached to each other, and 10 meters wide. They usually have cranes on the top floor, which help bring goods to the top floor, since the doors are too narrow for any object wider than 1 m. The woman stands approximately 25 meters in front of the houses. She is 1.59 m tall, which gives us the correct perspective. It is 11:16am on August 22nd, which I used to calculate the correct position of the sun and align all shadows according to the projected lighting conditions. The color of her skin is set at RGB:xxxxxx, chosen randomly," etc.
And it is not too much to ask of LLMs. They have access to all the information above, since they have read the whole internet. So there is definitely a description of Amsterdam architecture, of what a human body looks like, and of how to correctly estimate the time of day based on shadows (and vice versa). The only thing missing is the logic that connects all this information and applies it correctly to generate the final image.
I like to think of LLMs as fancy, genius compression engines. They took all the information on the internet, compressed it, and are able to cleverly query this information for the end user. That is a tremendously valuable thing, but whether intelligence emerges out of it, I'm not sure. Digital information doesn't necessarily contain everything needed to understand how it was generated and why.
Large language models don't do that. You'd want an image model.
Or did you mean "multi-model AI system" rather than "LLM"?
You are confusing LLMs with Generative AI.
Humans also don’t tend to operate in a rigorously logical mode and understand that math word problems are an exception where the language may be adversarial: they’re trained for that special context in school. If you tell the LLM that social context, eg that language may be deceptive, their “mistakes” disappear.
What you’re actually measuring is the LLM defaults to assuming you misspoke trying to include relevant information rather than that you were trying to trick it — which is the social context you’d expect when trained on general chat interactions.
Establishing context in psychology is hard.
'Agents' (i.e. workflows intermingling code and calls to LLMs) are still a thing (as shown by the fact there is a post by anthropic on this subject on the front page right now) and they are very hard to build.
One consequence of that, for instance: it's not possible to have an LLM exhaustively explore a topic.
I'd say humans are also bound to prompting sessions in that way.
Consider the following use case: keeping swimming pool water clean. I can have a long-running conversation with an LLM to guide me in getting it right. However, I can't have an LLM handle the problem autonomously. I'd like to have it notify me on its own: "hey, it's been 2 days, any improvement? Do you mind sharing a few pictures of the pool as well as the pH/chlorine test results?" Nothing mind-bogglingly complex. Nothing that couldn't be achieved using current LLMs. But still something I'd have to implement myself, and which turns out to be more complex to achieve than expected. This is the kind of improvement I'd like to see big AI companies going after, rather than research-grade ultra-smart AIs.
Luckily we don't know the problem exists, so in a cultural/phenomenological sense it is already cracked.
Does it include the invention of tools?
People with (high) intelligence talking about and building (artificial) intelligence but never able to convincingly explain aspects of intelligence; they just often talk ambiguously and circularly around it.
what are we humans getting ourselves into inventing skynet :wink.
It's been an ongoing pet project of mine to tackle reasoning, but I can't answer your question with regard to LLMs.
Kinda interesting that mathematicians also can't do the same for mathematics.
And yet.
Still somehow the question keeps coming up- "what is reasoning". I'll be honest and say that I imagine it's mainly folks who skipped CS 101 because they were busy tweaking their neural nets who go around the web like Diogenes with his lantern, howling "Reasoning! I'm looking for a definition of Reasoning! What is Reasoning!".
I have never heard the people at the top echelons of AI and Deep Learning - LeCun, Schmidhuber, Bengio, Hinton, Ng, Hutter, etc etc - say things like that: "what's reasoning". The reason, I suppose, is that they know exactly what that is, because it was the one thing they could never do with their neural nets that classical AI could do between sips of coffee at breakfast [3]. Those guys know exactly what their systems are missing and, to their credit, have made no bones about it.
_________________
[1] e.g. see my profile for a quick summary.
[2] See all of Russell & Norvig, for instance.
[3] Schmidhuber's doctoral thesis was an implementation of genetic algorithms in Prolog, even.
It pertains to the source of the inference power of deductive inference. Do you think all deductive reasoning originated inductively? Like when someone discovers a rule or fact that seemingly has contextual predictive power: obviously that can be confirmed inductively by observation, but did that deductive reflex of the mind coagulate from inductive experiences? Maybe not all derived deductive rules, but the original deductive rules.
But I'm getting at a few things. One of those things is neurological: how do deductive inference constructs manifest in neurons, and is it really, inadvertently, an inductive process that creates deductive neural functions?
The other aspect of the question is, I guess, more philosophical: why does deductive inference work at all? I think clues to a potential answer can be seen in the mechanics of generalization, where generalized antecedents predict (or correlate with) certain generalized consequences consistently. The brain coagulates generalized coinciding concepts by reinforcement, and it recognizes or differentiates included or excluded instances of a generalization by recognition properties that seem to gatekeep identities accordingly. It's hard to explain succinctly what I mean by the latter, but I'm planning to write an academic paper on it.
If they did not actually, would they (and you) necessarily be able to know?
Many people claim the ability to prove a negative, but no one will post their method.
I doubt your mathematician example is equivalent.
Examples that are fresh in my mind that further my point: I've heard Yann LeCun baffled by LLMs' instantiation/emergence of reasoning, along with other AI researchers. Eric Schmidt thinks agentic reasoning is the current frontier and people should be focusing on that. I was listening to the start of an AI/machine-learning interview a week ago where some CS PhD, asked to explain reasoning, could muster nothing better than "you know it when you see it"... not to mention the person responding to the grandparent who gave a cop-out answer (with all respect to him).
I'm going to bet you haven't encountered the right people then. Maybe your social circle is limited to folks like the person who presented a slide about A* to a dumbstruck roomful of Deep Learning researchers at the last NeurIPS?
But no, my take on reasoning is really a somewhat generalized reframing of the definition of reasoning (which you might find in the Stanford Encyclopedia of Philosophy), reframed partially in the axiomatic building blocks of neural network components/terminology. I'm not claiming to have discovered reasoning, just to redefine it in a way that's compatible with and sensible to neural networks (ish).
The only effect smarter models will have is that intelligent people will have to use less of their brain to do their work. As has always been the case, the medium is the message, and climate change is one of the most difficult and worst problems of our time.
If this gets software people to quit en-masse and start working in energy, biology, ecology and preservation? Then it has succeeded.
Slightly surprised to see this view here.
I can think of half a dozen more serious problems off hand (e.g. population aging, institutional scar tissue, dysgenics, nuclear proliferation, pandemic risks, AI itself) along most axes I can think of (raw $ cost, QALYs, even X-risk).
I assume you've done this, otherwise you wouldn't be telling me to? Bold of you to assume my ignorance on this subject. You sound like you've fallen for corporate grifters who care more about short-term profit and gains over long-term sustainability (or you are one of said grifters, in which case why are you wasting your time on HN, shouldn't you be out there grinding?!)
Severe weather events are going to get more common and more devastating over the next couple of decades. They'll come for you and people you care about, just as they come for me and people I care about. It doesn't matter what you think you know about it.
The IPCC summaries are a good read too.
Do you genuinely think severe weather events are going to be even amongst the top ten killers this century? If so, I do strongly advise emailing local uni climate scientist. (What's the worst that can happen? Heck, they might confirm your views!)
(In other circumstances I might go through the whole "what have you observed that has given you this belief?" thing, but in this case there is a simple and reliable check in the form of a 5 minute email)
... actually, I can do so on your behalf... would you like me to? The specific questions I would be asking unless told otherwise would be:
1. Probability of human extinction in the next century due to climate change.
2. Probability of more than 10% of human deaths being due to extreme weather.
3. Places to find good unbiased summaries of the likely effects of climate change.
Any others?
In your comment above you mention:
> e.g. population aging, institutional scar tissue, dysgenics, nuclear proliferation, pandemic risks, AI itself
These are all intertwined with each other and with climate change. People are less likely to have kids if they don't think those kids will have a comfortable future. Nuclear war is more likely if countries are competing for less and less resources as we deplete the planet and need to increase food production. Habitat loss from deforestation leads to animals comingling where they normally wouldn't, leading to increased risk of disease spillover into humans.
You claim that somebody saying "climate change is one of the most difficult and worst problems of our time" is a take you're surprised to see here on HN, but I'm more surprised that you don't list it in what you consider important problems.
The average human is a lot dumber than people on Hacker News and Reddit seem to realize; shit, the people on MTurk are likely smarter than the AVERAGE person.
Being able to perform better than humans in specific constrained problem space is how every automation system has been developed.
While self-driving systems are impressive, they don't drive anywhere close to the skill of the average driver.
This is not offered to the public; they are actively expanding only in cities like LA, Miami or Phoenix right now, where the weather is good throughout the year.
The tech for bad weather is nowhere close to ready for the public. The average human, on the other hand, drives in bad weather every day.
We already let computers control cars because they're better than humans at it when the weather is inclement. It's called ABS.
There is an inherent danger to driving in snow and ice. It is a PR nightmare waiting to happen because there is no way around accidents if the cars are on the road all the time in rust belt snow.
There's always an inherent risk to driving, even in sunny Phoenix, AZ. Winter dangers like black ice further multiply that risk, but humans still manage to drive in winter. Taking a picture/video of a snowed-over road, judging its width, and inventing lanes based on that width while accounting for snowbanks doesn't take an ML algorithm. Lidar can see black ice while human eyes cannot, giving cars equipped with lidar (whether driven by a human or a computer) an advantage over those without it, and Waymo cars currently have lidar.
I'm sure there are new challenges for Waymo to solve before deploying the service in Buffalo, but it's not this unforeseen gotcha parent comment implies.
As far as the possible PR nightmare, you'd never do self-driving cars in the first place if you let that fear control you because, as you pointed out, driving on the roads is inherently dangerous with too many unforeseen complications.
And the brain doesn't use the same network to do verbal reasoning as real time coordination either.
But that work is moving along fine. All of these models and lessons are going to be combined into AGI. It is happening. There isn't really that much in the way.
It's always been the case that the things that are easiest for humans are hardest for computers, and vice versa. Humans are good at general intelligence - tackling semi-novel problems all day long, while computers are good at narrow problems they can be trained on such as chess or math.
The majority of the benchmarks currently used to evaluate these AI models are narrow skills that the models have been trained to handle well. What'll be much more useful will be when they are capable of the generality of "dumb" tasks that a human can do.
As a contrary point, most people think they are smarter than they really are.
There are blind spots, doesn't take away from 'general'.
In other words, it's possible humans can reason better than o3, but cannot articulate that reasoning as well through text - only in our heads, or through some alternative medium.
From a post elsewhere the scores on ARC-AGI-PUB are approx average human 64%, o3 87%. https://news.ycombinator.com/item?id=42474659
Though also elsewhere, o3 seems very expensive to operate. You could probably hire a PhD researcher for cheaper.
How does a giant pile of linear algebra not meet that definition?
Further, this is probably running an algorithm on top of an NN. Some kind of tree search.
I get what you’re saying though. You’re trying to draw a distinction between statistical methods and symbolic methods. Someday we will have an algorithm which uses statistical methods that can match human performance on most cognitive tasks, and it won’t look or act like a brain. In some sense that’s disappointing. We can build supersonic jets without fully understanding how birds fly.
NNs are exactly what "computers" are good for and we've been using since their inception: doing lots of computations quickly.
"Analog neural networks" (brains) work much differently from what are "neural networks" in computing, and we have no understanding of their operation to claim they are or aren't algorithmic. But computing NNs are simply implementations of an algorithm.
Edit: upon further rereading, it seems you equate "neural networks" with brain-like operation. But brain was an inspiration for NNs, they are not an "approximation" of it.
It's the exact same thing as using a binary tree to discover the lowest number in some set of numbers, conceptually: you have a data structure that you evaluate using a particular algorithm. The combination of the algorithm and the construction of the data structure arrive at the desired outcome.
It's true that this is not an "enlightening" algorithm, it doesn't help us understand why or how that is the most likely next character. But this doesn't mean it's not an algorithm.
Is that your point?
If so, I've long learned to accept imprecise language as long as the message can be reasonably extracted from it.
So, steps?
In essence, infinitesimal calculus provides a link between "steps" and the continuous, but those are indeed different things.
At most you can argue that there isn't a useful bounded loss on every possible input, but it turns out that humans don't achieve useful bounded loss on identifying arbitrary sets of pixels as a cat or whatever, either. Most problems NNs are aimed at are qualitative or probabilistic where provable bounds are less useful than Nth-percentile performance on real-world data.
But, to my mind, something of the form "Train a neural network with an architecture generally like [blah], with a training method+data like [bleh], and save the result. Then, when inputs are received, run them through the NN in such-and-such way." would constitute an algorithm.
When a NN is trained, it produces a set of parameters that basically define an algorithm to do inference with: it's a very big one though.
We also call that a NN (the joy of natural language).
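To make that concrete: once training has fixed the parameters, inference is an ordinary, fully specified procedure. A minimal sketch with made-up weights (assuming NumPy, and a toy two-layer network rather than anything realistic):

```python
import numpy as np

# Made-up parameters of the kind training would normally produce.
W1 = np.array([[0.5, -0.2], [0.1, 0.8]])
b1 = np.array([0.0, 0.1])
W2 = np.array([[1.0], [-1.5]])
b2 = np.array([0.2])

def forward(x: np.ndarray) -> np.ndarray:
    # A fixed sequence of steps: multiply, add, nonlinearity, repeat.
    # This is the "algorithm" the trained parameters define.
    h = np.maximum(0.0, x @ W1 + b1)  # ReLU hidden layer
    return h @ W2 + b2                # linear output layer

print(forward(np.array([1.0, 2.0])))  # [-1.35]
```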
When someone is "disinterested enough" to publish though, note the obvious way to launch a new fund or advisor with a good track record: crank out a pile of them, run them one or two years, discard the many losers and publish the one or two top winners. I.E. first you should be suspicious of why it's being published, then of how selected that result is.
- 64.2% for humans vs. 82.8%+ for o3.
...
Private Eval:
- 85%: threshold for winning the prize [1]
Semi-Private Eval:
- 87.5%: o3 (unlimited compute) [2]
- 75.7%: o3 (limited compute) [2]
Public Eval:
- 91.5%: o3 (unlimited compute) [2]
- 82.8%: o3 (limited compute) [2]
- 64.2%: human average (Mechanical Turk) [1] [3]
Public Training:
- 76.2%: human average (Mechanical Turk) [1] [3]
...
References:
[1] https://arcprize.org/guide
Their post has STEM grads at nearly 100%.
So in practice, there's always some kind of quality control. Stricter quality control will improve your results, and the right amount of quality control is subjective. This makes any assessment of human quality meaningless without explanation of how those humans were selected and incentivized. Chollet is careful to provide that, but many posters here are not.
In any case, the ensemble of task-specific, low-compute Kaggle solutions is reportedly also super-Turk, at 81%. I don't think anyone would call that AGI, since it's not general; but if the "(tuned)" in the figure means o3 was tuned specifically for these tasks, that's not obviously general either.
It really calls into question two things.
1. You don't know what you're talking about.
2. You have a perverse incentive to believe this such that you will preach it to others and elevate some job salary range or stock.
Either way, not a good look.
What has been lacking so far in frontier LLMs is the ability to reliably deal with the right level of abstraction for a given problem. Reasoning is useful but often comes out lacking if one cannot reason at the right level of abstraction. (Note that many humans can't either when they deal with unfamiliar domains, although that is not the case with these models.)
ARC has been challenging precisely because solving its problems often requires:
1) using multiple different *kinds* of core knowledge [1], such as symmetry, counting, color, AND
2) using the right level(s) of abstraction
Achieving human-level performance on the ARC benchmark, as well as top human performance in GPQA, Codeforces, AIME, and Frontier Math, suggests the model can potentially solve any problem at the human level if it possesses essential knowledge about it. Yes, this includes out-of-distribution problems that most humans can solve. It might not yet be able to generate highly novel theories, frameworks, or artifacts to the degree that Einstein, Grothendieck, or van Gogh could. But not many humans can either.
[1] https://www.harvardlds.org/wp-content/uploads/2017/01/Spelke...
ADDED:
Thanks to the link to Chollet's posts by lswainemoore below. I've analyzed some easy problems that o3 failed at. They involve spatial intelligence, including connection and movement. This skill is very hard to learn from textual and still image data.
I believe this sort of core knowledge is learnable through movement and interaction data in a simulated world and it will not present a very difficult barrier to cross. (OpenAI purchased a company behind a Minecraft clone a while ago. I've wondered if this is the purpose.)
I guess you can say the same thing for the Turing Test. Simple chat bots beat it ages ago in specific settings, but the bar is much higher now that the average person is familiar with their limitations.
If/once we have an AGI, it will probably take weeks to months to really convince ourselves that it is one.
Also, it depends a great deal on what we define as AGI and whether they need to be a strict superset of typical human intelligence. o3's intelligence is probably superhuman in some aspects but inferior in others. We can find many humans who exhibit such tendencies as well. We'd probably say they think differently but would still call them generally intelligent.
Personally, I think it's fair to call them "very easy". If a person I otherwise thought was intelligent was unable to solve these, I'd be quite surprised.
> I believe this sort of core knowledge is learnable through movement and interaction data in a simulated world and it will not present a very difficult barrier to cross.
> (OpenAI purchased a company behind a Minecraft clone a while ago. I've wondered if this is the purpose.)
Maybe! I suppose time will tell. That said, spatial intelligence (connection/movement included) is the whole game in this evaluation set. I think it's revealing that they can't handle these particular examples, and problematic for claims of AGI.
I'm starting to really see no limits on intelligence in these models.
I had the same take at first, but thinking about it again, I'm not quite sure?
Take the "blue dots make a cross" example (the second one). The inputs only has four blue dots, which makes it very easy to see a pattern even in text data: two of them have the same x coordinate, two of them have the same y (or the same first-tuple-element and second-tuple-element if you want to taboo any spatial concepts).
Then if you look into the output, you can notice that all the input coordinates are also in the output set, just not always with the same color. If you separate them into "input-and-output" and "output-only", you quickly notice that all of the output-only squares are blue and share a coordinate (tuple-element) with the blue inputs. If you split the "input-and-output" set into "same color" and "color changed", you can notice that the changes only go from red to blue, and that the coordinates that changed are clustered, and at least one element of the cluster shares a coordinate with a blue input.
Of course, it's easy to build this chain of reasoning in retrospect, but it doesn't seem like a complete stretch: each step only requires noticing patterns in the data, and it's how a reasonably puzzle-savvy person might solve this if you didn't let them draw the squares on paper. There are a lot of escape games with chains of reasoning much more complex, and random office workers solve them all the time.
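As a hedged illustration of that chain of reasoning, here is roughly the coordinate bookkeeping described above, on a hypothetical toy grid (dicts mapping (row, col) to a color name; the real tasks use integer color codes and different layouts):

    def partition_cells(inp, out):
        # Split output cells into the groups discussed above.
        in_and_out = {c: (inp[c], out[c]) for c in out if c in inp}
        output_only = {c: out[c] for c in out if c not in inp}
        changed = {c for c, (a, b) in in_and_out.items() if a != b}
        return output_only, changed

    def shares_axis(cell, anchors):
        # True if `cell` shares a row or column with any anchor cell.
        return any(cell[0] == a[0] or cell[1] == a[1] for a in anchors)

    # Hypothetical example: four blue dots, two sharing a column, two a row.
    inp = {(1, 3): "blue", (5, 3): "blue", (4, 1): "blue", (4, 6): "blue",
           (2, 3): "red"}
    out = {**inp, (2, 3): "blue", (3, 3): "blue", (4, 3): "blue",
           (4, 2): "blue", (4, 4): "blue", (4, 5): "blue"}

    output_only, changed = partition_cells(inp, out)
    blues = [c for c, v in inp.items() if v == "blue"]
    print(all(shares_axis(c, blues) for c in output_only))  # new cells line up
    print(changed)  # cells that flipped color (here: the red one turned blue)

Everything above is plain set bookkeeping over coordinates; nothing in the chain strictly needs a visual representation.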
The visual aspect makes the patterns jump out at us more, but the fact that o3 couldn't find them at all with thousands of dollars of compute budget still seems meaningful to me.
EDIT: Actually, looking at Twitter discussions[1], o3 did find those patterns, but was stumped by ambiguity in the test input that the examples didn't cover. Its failures on the "cascading rectangles" example[2] look much more interesting.
[1]: https://x.com/bio_bootloader/status/1870339297594786064
Just playing devils' advocate or nitpicking the language a bit...
I’m not even going that far, I’m talking about performance on similar tasks. Something many people have noticed about modern AI is it can go from genius to baby-level performance seemingly at random.
Take self driving cars for example, a reasonably intelligent human of sound mind and body would never accidentally mistake a concrete pillar for a road. Yet that happens with self-driving cars, and seemingly here with ARC-AGI problems which all have a similar flavor.
Also, not knowing something is hardly a criterion; skilled humans focus on their areas of interest above most other knowledge and can be unaware of other subjects.
Fields Medal winners, for example, may not be aware of most pop culture things; that doesn't make them unable to learn them, just not interested.
—-
[1] Most doctors, including surgeons and many respected specialists. Some doctors do need those skills, but they are a specialized few and generally do know how to use email.
A PhD learnt their field. If they learnt that field, reasoning through everything to understand their material, then - given enough time - they are capable of learning email and street smarts.
Which is why a reasoning LLM, should be able to do all of those things.
It's not learnt a subject; it's learnt reasoning.
LLMs aren't really capable of "learning" anything outside their training data. Which I feel is a very basic and fundamental capability of humans.
Every new request thread is a blank slate utilizing whatever context you provide for the specific task, and after the thread is done (or the context limit runs out) it's like it never happened. Sure you can use databases, do web queries, etc., but these are inflexible band-aid solutions, far from what's needed for AGI.
ChatGPT has had for some time the feature of storing memories about its conversations with users. And you can use function calling to make this more generic.
I think drawing the boundary at “model + scaffolding” is more interesting.
True equivalent to human memories would require something like a multimodal trillion token context window.
RAG is just not going to cut it, and if anything will exacerbate problems with hallucinations.
Once Optimus is up and working by the 100k+, the spatial problems will be solved. We just don't have enough spatial awareness data, or a way for the LLM to learn about the physical world.
Your statement is false - things changed a lot between gpt4 and o1 under the hood, but notably not a larger model size. In fact the model size of o1 is smaller than gpt4 by several orders of magnitude! Improvements are being made in other ways.
I believe about 90% of the tasks were estimated by humans to take less than one hour to solve, so we aren't talking about very complex problems, and to boot, the contamination factor is huge: o3 (or any big model) will have in-depth knowledge of the internals of these projects, and often even know about the individual issues themselves (e.g. you can ask what GitHub issue #4145 in project foo was, and there's a decent chance it can tell you exactly what the issue was about!)
For one, I speculate OpenAI is using a very basic agent harness to get the results they've published on SWEBench. I believe there is a fair amount of headroom to improve results above what they published, using the same models.
For two, some of the instances, even in SWEBench-Verified, require a bit of "going above and beyond" to get right. One example is an instance where the user states that a TypeError isn't properly handled. The developer who fixed it handled the TypeError but also handled a ValueError, and the golden test checks for both. I don't know how many instances fall in this category, but I suspect it's more than on a simpler benchmark like MATH.
I think it's obvious that they've cracked the formula for solving well-defined, small-in-scope problems at a superhuman level. That's an amazing thing.
To me, it's less obvious that this implies that they will in short order with just more training data be able to solve ambiguous, large-in-scope problems at even just a skilled human level.
There are far more paths to consider, much more context to use, and in an RL setting, the rewards are much more ambiguously defined.
That said, o3 might still lack some kind of interaction intelligence that’s hard to learn. We’ll see.
We are nowhere close to what Sam Altman calls AGI and transformers are still limited to what uniform-TC0 can do.
As an example the Boolean Formula Value Problem is NC1-complete, thus beyond transformers but trivial to solve with a TM.
As it is now proven that the frame problem is equivalent to the halting problem, even if we can move past uniform-TC0 limits, novelty is still a problem.
I think the advancements are truly extraordinary, but unless you set the bar very low, we aren't close to AGI.
Heck we aren't close to P with commercial models.
The default, nonuniform circuit classes are allowed to have a different circuit per input size; the uniform variants additionally require that the circuit family be generated by a resource-bounded machine. (The gates in TC0 have unbounded fan-in.)
Similar to how a k-tape TM doesn't get 'charged' for the input size.
With Nick's Class (NC), circuit size plays a role similar to traditional compute time, while depth relates to the ability to parallelize operations.
These are different than biological neurons, not better or worse but just different.
Human neurons can use dendritic compartmentalization, use spike timing, can retime spikes etc...
While the perceptron model we use in ML is useful, it is not able to do XOR in one layer, while biological neurons do that without anything even reaching the soma, purely in the dendrites.
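For the XOR point, a minimal sketch in pure Python with hand-picked weights: a single threshold unit cannot separate XOR, but two such units feeding a third can.

    def step(x):
        return 1 if x > 0 else 0

    def unit(inputs, weights, bias):
        return step(sum(w * x for w, x in zip(weights, inputs)) + bias)

    def xor_two_layer(a, b):
        h1 = unit((a, b), (1, 1), -0.5)      # OR-like unit
        h2 = unit((a, b), (-1, -1), 1.5)     # NAND-like unit
        return unit((h1, h2), (1, 1), -1.5)  # AND of the two hidden units

    for a in (0, 1):
        for b in (0, 1):
            print(a, b, xor_two_layer(a, b))  # matches a XOR b

    # By contrast, no single unit() call can reproduce XOR: the four input
    # points are not linearly separable by one threshold.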
Statistical learning models still come down to a choice function, no matter if you call that set shattering or...
With physical computers the time hierarchy does apply and if TIME(g(n)) is given more time than TIME(f(n)), g(n) can solve more problems.
So you can simulate a NTM with exhaustive search with a physical computer.
Physical computers also tend to have NAND and XOR gates, and can have different circuit depths.
When you are in TC0, you only have AND, OR and Threshold (or majority) gates.
Think of instruction level parallelism in a typical CPU, it can return early, vs Itanium EPIC, which had to wait for the longest operation. Predicated execution is also how GPUs work.
They can send a mask and save on load/store ops as an example, but the cost of that parallelism is the constant depth.
It is the parallelism tradeoff that both makes transformers practical as well as limit what they can do.
The IID assumption, and autograd requiring smooth manifolds, play a role too.
The frame problem, which causes hard problems to become unsolvable for computers and people alike, does also.
But the fact that we have polynomial time solutions for the Boolean Formula Value Problem, as mentioned in my post above, is probably a simpler way of realizing physical computers aren't limited to uniform-TC0.
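To make that concrete, here is a minimal sketch of a linear-time evaluator over a hypothetical parenthesized formula format; the only point is that a sequential machine handles this trivially in one pass, whatever constant-depth circuits can or can't do.

    def eval_formula(s):
        # Grammar (hypothetical): formula := '0' | '1' | '~' formula
        #                                  | '(' formula ('&'|'|') formula ')'
        def parse(i):
            if s[i] == '~':                        # negation
                v, i = parse(i + 1)
                return (not v), i
            if s[i] == '(':                        # binary node
                left, i = parse(i + 1)
                op = s[i]
                right, i = parse(i + 1)
                assert s[i] == ')'
                value = (left and right) if op == '&' else (left or right)
                return value, i + 1
            return s[i] == '1', i + 1              # constant leaf

        value, _ = parse(0)
        return value

    print(eval_formula("((1&0)|~0)"))  # True, computed in one left-to-right pass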
I wouldn't describe a computer's usual behavior as having constant depth.
It is fairly typical to talk about problems in P as being feasible (though when the constant factors are too big, this isn't strictly true of course).
Just because for unreasonably large inputs, my computer can't run a particular program and produce the correct answer for that input, due to my computer running out of memory, we don't generally say that my computer is fundamentally incapable of executing that algorithm.
The article notes, "o3 still fails on some very easy tasks". What explains these failures if o3 can solve "any problem" at the human level? Do these failed cases require some essential knowledge that has eluded the massive OpenAI training set?
https://x.com/fchollet/status/1870172872641261979
https://x.com/fchollet/status/1870173137234727219
I would definitely consider them legitimately easy for humans.
Regardless, the critical aspect is valid: AGI would be something like Cortana from Halo.
For example, if they produce millions of examples of the type of problems o3 still struggles on, it would probably do better at similar questions.
Perhaps the private data set is different enough that this isn't a problem, but the ideal situation would be unveiling a truly novel dataset, which it seems like ARC aims to do.
I think that's where most hardware startups will specialize in the coming decades: different industries with different needs.
Every human does this dozens, hundreds or thousands of times ... during childhood.
If you look at the ARC tasks failed by o3, they're really not well suited to humans. They lack the living context humans thrive on, and have relatively simple, analytical outcomes that are readily processed by simple structures. We're unlikely to see AI as "smart" until it can be asked to accomplish useful units of productive professional work at a "seasoned apprentice" level. Right now they're consuming ungodly amounts of power just to pass some irritating, sterile SAT questions. Train a human for a few hours a day over a couple weeks and they'll ace this no problem.
It works the same with humans. If they spend more time on the puzzle they are more likely to solve it.
While beyond current models, that would be the final test of AGI capability.
Though to be clear, this wasn't a one shot thing - it was iirc a few months of back and forth chats with plenty of wrong turns too.
Found the thread: https://x.com/robertghrist/status/1841462507543949581?s=46&t...
From the thread:
> AI assisted in the initial conjectures, some of the proofs, and most of the applications it was truly a collaborative effort
> i went back and forth between outrageous optimism and frustration through this process. i believe that the current models can reason – however you want to interpret that. i also believe that there is a long way to go before we get to true depth of mathematical results.
If you disagree with me, state why instead of opting to downvote me
I think this is a mistake.
Even if very high costs make o3 uneconomic for businesses, it could be an epoch defining development for nation states, assuming that it is true that o3 can reason like an averagely intelligent person.
Consider the following questions that a state actor might ask itself: What is the cost to raise and educate an average person? Correspondingly, what is the cost to build and run a datacenter with a nuclear power plant attached to it? And finally, how many person-equivalent AIs could be run in parallel per datacenter?
There are many state actors, corporations, and even individual people who can afford to ask these questions. There are also many things that they'd like to do but can't because there just aren't enough people available to do them. o3 might change that despite its high cost.
So if it is true that we've now got something like human-equivalent intelligence on demand - and that's a really big if - then we may see its impacts much sooner than we would otherwise intuit, especially in areas where economics takes a back seat to other priorities like national security and state competitiveness.
That said, this is all predicated on o3 or similar actually having achieved human level reasoning. That's yet to be fully proven. We'll see!
A private company, xAI, was able to build a datacenter on a similar scale in less than 6 months, with integrated power supply via large batteries: https://www.tomshardware.com/desktops/servers/first-in-depth...
Datacenter construction is a one-time cost. The intelligence the datacenter (might) provide is ongoing. It’s not an equal one to one trade, and well within reach for many state and non-state actors if it is desired.
It’s potentially going to be a very interesting decade.
Your secrecy comment is really intriguing actually. And morbid lol.
“SO IS IT AGI?
ARC-AGI serves as a critical benchmark for detecting such breakthroughs, highlighting generalization power in a way that saturated or less demanding benchmarks cannot. However, it is important to note that ARC-AGI is not an acid test for AGI – as we've repeated dozens of times this year. It's a research tool designed to focus attention on the most challenging unsolved problems in AI, a role it has fulfilled well over the past five years.
Passing ARC-AGI does not equate to achieving AGI, and, as a matter of fact, I don't think o3 is AGI yet. o3 still fails on some very easy tasks, indicating fundamental differences with human intelligence.
Furthermore, early data points suggest that the upcoming ARC-AGI-2 benchmark will still pose a significant challenge to o3, potentially reducing its score to under 30% even at high compute (while a smart human would still be able to score over 95% with no training). This demonstrates the continued possibility of creating challenging, unsaturated benchmarks without having to rely on expert domain knowledge. You'll know AGI is here when the exercise of creating tasks that are easy for regular humans but hard for AI becomes simply impossible.”
The high compute variant sounds like it cost around *$350,000*, which is kinda wild. Lol, the blog post specifically mentioned how OpenAI asked ARC-AGI not to disclose the exact cost for the high compute version.
Also, one odd thing I noticed is that the graph in their blog post shows the top 2 scores as "tuned" (this was not displayed in the live demo graph). This suggests in those cases that the model was trained to better handle these types of questions, so I do wonder about data / answer contamination in those cases…
Something I missed until I scrolled back to the top and reread the page was this
> OpenAI's new o3 system - trained on the ARC-AGI-1 Public Training set
So yeah, the results were specifically from a version of o3 trained on the public training set
Which on the one hand I think is a completely fair thing to do. It's reasonable that you should teach your AI the rules of the game, so to speak. There really aren't any spoken rules though, just pattern observation. Thus, if you want to teach the AI how to play the game, you must train it.
On the other hand though, I don't think the o1 models nor Claude were trained on the dataset, in which case it isn't a completely fair competition. If I had to guess, you could probably get 60% on o1 if you trained it on the public dataset as well.
Yeah, that makes this result a lot less impressive for me.
"Raising visibility on this note we added to address ARC "tuned" confusion:
> OpenAI shared they trained the o3 we tested on 75% of the Public Training set.
This is the explicit purpose of the training set. It is designed to expose a system to the core knowledge priors needed to beat the much harder eval set.
The idea is each training task shows you an isolated single prior. And the eval set requires you to recombine and abstract from those priors on the fly. Broadly, the eval tasks require utilizing 3-5 priors.
The eval sets are extremely resistant to just "memorizing" the training set. This is why o3 is impressive." https://x.com/mikeknoop/status/1870583471892226343
While that is true with humans taking tests, it's not really true with AIs evaluating on benchmarks.
SWE-bench is a great example. Claude Sonnet can get something like a 50% on verified, whereas I think I might be able to score a 20-25%? So, Claude is a better programmer than me.
Except that isn't really true. Claude can still make a lot of clumsy mistakes. I wouldn't even say these are junior engineer mistakes. I've used it for creative programming tasks and have found one example where it tried to use a library written for d3js for a p5js programming example. The confusion is kind of understandable, but it's also a really dumb mistake.
Some very simple explanations, the models were probably overfitted to a degree on Python given its popularity in AI/ML work, and SWE-bench is all Python. Also, the underlying Github issues are quite old, so they probably contaminated the training data and the models have simply memorized the answers.
Or maybe benchmarks are just bad at measuring intelligence in general.
Regardless, every time a model beats a benchmark I'm annoyed by the fact that I have no clue whatsoever how much this actually translates into real world performance. Did OpenAI/Anthropic/Google actually create something that will automate wide swathes of the software engineering profession? Or did they create the world's most knowledgeable junior engineer?
My understanding is that it works by checking if the proposed solution passes test-cases included in the original (human) PR. This seems to present some problems too, because there are surely ways to write code that passes the tests but would fail human review for one reason or another. It would be interesting to not only see the pass rate but also the rate at which the proposed solutions are preferred to the original ones (preferably evaluated by a human but even an LLM comparing the two solutions would be interesting).
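A rough sketch of that kind of test-based grading, assuming each instance ships a repo checkout, a model-generated patch, and the "golden" tests added in the original human PR (the function and argument names here are hypothetical, not the actual SWE-bench harness):

    import subprocess

    def grade_instance(repo_dir, model_patch, fail_to_pass_tests):
        # Apply the model's proposed patch to a clean checkout.
        subprocess.run(["git", "apply", "-"], input=model_patch.encode(),
                       cwd=repo_dir, check=True)
        # Run only the tests the human fix was required to make pass.
        result = subprocess.run(["python", "-m", "pytest", *fail_to_pass_tests],
                                cwd=repo_dir)
        return result.returncode == 0  # "resolved" if the golden tests pass

Which is exactly the gap being pointed out: "passes the golden tests" and "would pass human review" are not the same bar.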
The CSS acid test? This can be gamed too.
> An acid test is a qualitative chemical or metallurgical assay utilizing acid. Historically, it often involved the use of a robust acid to distinguish gold from base metals. Figuratively, the term represents any definitive test for attributes, such as gauging a person's character or evaluating a product's performance.
Specifically here, they're using the figurative sense of "definitive test".
The key part is that scaling test-time compute will likely be a key to achieving AGI/ASI. Costs will definitely come down as is evidenced by precedents, Moore’s law, o3-mini being cheaper than o1 with improved performance, etc.
https://a16z.com/llmflation-llm-inference-cost/
Look at the log scale slope, especially the orange MMLU > 83 data points.
That said, can you be more specific about which "algorithmic" and "hardware" improvements have driven this cost and hardware requirements down? AFAIK I still need the same hardware to run this very same model.
You aren’t trying to run an old 2023 model as is, you’re trying to match its capabilities. The old models just show what capabilities are possible.
How much VRAM and inference compute is required to run 3.1-70B vs 2-70B?
The argument is that inference cost is dropping significantly each year, but how exactly, if those two models require about the same amount of VRAM and compute, give or take?
One way to drive the cost down is to innovate in inference algorithms such that the HW requirements are loosened up.
In the context of inference optimizations, one such is flash-decode, similar to its training counterpart flash-attention, from the same authors. However, that particular optimization only improves inference runtime by reducing the number of memory accesses needed to compute self-attention. The total amount of VRAM you need just to load the model remains the same, so although it is true that you might get a tad more from the same HW, the initial amount of HW you need stays the same. Flash-decode is also nowhere near the impact of flash-attention: the latter enabled much faster training iteration runtimes, while the former has had quite limited impact, mostly because the scale of inference is so much smaller than training, so the improvements do not always see large gains.
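For what it's worth, a rough, assumption-laden illustration of the VRAM arithmetic both sides are arguing over: memory needed just to hold the weights, ignoring KV cache, activations and runtime overhead.

    def weight_vram_gb(params_billion, bytes_per_param):
        # 1B parameters at 1 byte each is roughly 1 GB of weights.
        return params_billion * bytes_per_param

    for model, params in [("70B", 70), ("8B", 8)]:
        for precision, nbytes in [("fp16", 2), ("int4", 0.5)]:
            print(f"{model} @ {precision}: ~{weight_vram_gb(params, nbytes):.0f} GB")

The same checkpoint at the same precision does need the same VRAM; the claimed cost drops come from quantization and from smaller models matching last year's quality at a given benchmark level, not from an identical model getting cheaper to host.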
> Not to mention the cost/flop and cost/gb for GPUs has dropped.
For training. Not for inference. GPU prices remained about the same, give or take.
We aren’t trying to mindlessly consume the same VRAM as last year and hope costs magically drop. We are noticing that we can get last year’s mid-level performance on this year’s low-end model, leading to cost savings at that perf level. The same thing happens next year, leading to a drop in cost at any given perf level over time.
> For training. Not for inference. GPU prices remained about the same, give or take.
See:
https://epoch.ai/blog/trends-in-gpu-price-performance
We don't care about the absolute price; the question is whether cost per FLOP or cost per GB is decreasing over time with each new GPU.
—-
If it isn’t clear why inference costs at any given performance level will drop given the points above, unfortunately I can’t help you further.
In some parts of the internet you can hardly find real content, only AI spam.
It will get worse the cheaper it gets.
Think of email spam.
That's why everyone's thinking about compute expense. But I guess in terms of the "lifetime expense of a person", even someone who costs $10/hr isn't actually all that cheap, considering what it takes to grow a human into a fully functioning person that's able to just do stuff.
Of course, o3 looks strong on other benchmarks as well, and sometimes "spend a huge amount of compute for one problem" is a great feature to have available if it gets you the answer you needed. So even if there's some amount of "ARC-AGI wasn't quite as robust as we thought", o3 is clearly a very powerful model.
from reading Dennett's philosophy, I'm convinced that that's how human intelligence works - for each task that "only a human could do that", there's a trick that makes it easier than it seems. We are bags of tricks.
We are trick generators, that is what it means to be a general intelligence. Adding another trick in the bag doesn't make you a general intelligence, being able to discover and add new tricks yourself makes you a general intelligence.
Of course there are many tricks you will need special training for, like many of the skills human share with animals, but the ability to construct useful shareable large knowledge bases based on observations is unique to humans and isn't just a "trick".
sharing knowledge isn't a human thing - chimps learn from each other. bees teach each other the direction and distance to a new source of food.
we just happen to push the envelope a lot further and managed to kickstart runaway memetic evolution.
the new tricks don't just pop into our heads even though it seems that way. nobody ever woke up and devised a new trick in a completely new field without spending years learning about that field or something adjacent to it. even the new ideas tend to be an old idea from a different field applied to a new field. tricks stand on the shoulders of giants.
But I don't much agree that it is any meaningful step towards AGI. Maybe it's a nice proof point that AI can solve simple problems presented in intentionally opaque ways.
If humans were given the json as input rather than the images, they’d have a hard time, too.
We shine light in text patterns at humans rather than inject the text directly into the brain as well, that is extremely unfair! Imagine how much better humans would be at text processing if we injected and extracted information from their brains using the neurons instead of eyes and hands.
Perhaps it wasn't a convolutional network after all, but a simple fully-connected feed-forward network taking all pixels as input? Could be viable for a toy example (MNIST).
o3 is just o1 scaled up; the main takeaway from this line of work that people should walk away with is that we now have a proven way to RL our way to superhuman performance on tasks where it's cheap to sample and easy to verify the final output. Programming falls in that category; they focused on known benchmarks, but the same process can be done for normal programs, using parsers, compilers, existing functions and unit tests as verifiers.
Pre-o1 we only really had next-token prediction, which required high-quality human-produced data; with o1 you optimize for success instead of the MLE of the next token. Explained in simpler terms, it means the model can get reward for any implementation of a function that reproduces the expected result, instead of the exact implementation in the training set.
Put another way, it’s just like RLHF but instead of optimizing against learned human preferences, the model is trained to satisfy a verifier.
This should work just as well in VLA models for robotics, self driving and computer agents.
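A minimal sketch of "optimize against a verifier", using unit tests as the verifier and a hypothetical model.sample() call; this is an illustration of the idea, not OpenAI's actual recipe.

    def unit_test_verifier(program_src, tests):
        # Reward 1.0 if the sampled program defines f() and passes every test.
        scope = {}
        try:
            exec(program_src, scope)
            ok = all(scope["f"](x) == y for x, y in tests)
        except Exception:
            ok = False
        return 1.0 if ok else 0.0

    def collect_rl_batch(model, prompt, tests, n_samples=16):
        # Any implementation that reproduces the expected outputs earns reward,
        # not just the exact implementation seen in the training data.
        batch = []
        for _ in range(n_samples):
            candidate = model.sample(prompt)   # hypothetical sampling API
            batch.append((candidate, unit_test_verifier(candidate, tests)))
        return batch  # fed to a policy-gradient / RLHF-style update elsewhere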
I am not really trying to downplay O3. But this would be a simple test as to whether O3 is truly "a system capable of adapting to tasks it has never encountered before" versus novel ARC-AGI tasks it hasn't encountered before.
Taking this a level of abstraction higher, I expect that in the next couple of years we'll see systems like o3 given a runtime budget that they can use for training/fine-tuning smaller models in an ad-hoc manner.
From this it makes sense why the original models did poorly and why iterative chain of thought is required - the challenge is designed to be inherently iterative, such that a zero-shot model, no matter how big, is extremely unlikely to get it right on the first try. Of course, it also requires a broad set of human-like priors about what hypotheses are "simple", based on things like object permanence, directionality and cardinality. But as the author says, these basic world models were already encoded in the GPT-3/4 line by simply training a gigantic model on a gigantic dataset. What was missing was iterative hypothesis generation and testing against contradictory examples. My guess is that o3 does something like this (a rough code sketch follows the list):
1. Prompt the model to produce a simple rule to explain the nth example (randomly chosen)
2. Choose a different example, ask the model to check whether the hypothesis explains this case as well. If yes, keep going. If no, ask the model to revise the hypothesis in the simplest possible way that also explains this example.
3. Keep iterating over examples like this until the hypothesis explains all cases. Occasionally, new revisions will invalidate already solved examples. That’s fine, just keep iterating.
4. Induce randomness in the process (through next-word sampling noise, example ordering, etc) to run this process a large number of times, resulting in say 1,000 hypotheses which all explain all examples. Due to path dependency, anchoring and consistency effects, some of these paths will end in awful hypotheses - super convoluted and involving a large number of arbitrary rules. But some will be simple.
5. Ask the model to select among the valid hypotheses (meaning those that satisfy all examples) and choose the one that it views as the simplest for a human to discover.
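Sketched in code, under the big assumption that llm.propose / llm.check / llm.revise / llm.pick_simplest stand in for prompts to the model (none of this is a known o3 internal):

    import random

    def solve_arc_task(llm, examples, n_runs=1000):
        candidates = []
        for _ in range(n_runs):
            order = random.sample(examples, len(examples))  # vary example order
            hypothesis = llm.propose(order[0])              # step 1
            stable = False
            while not stable:                               # steps 2-3
                stable = True
                for ex in order:
                    if not llm.check(hypothesis, ex):
                        hypothesis = llm.revise(hypothesis, ex)
                        stable = False                      # re-check everything
            candidates.append(hypothesis)                   # step 4
        return llm.pick_simplest(candidates)                # step 5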
Took me less time to figure out the 3 examples than it took to read your post.
I was honestly a bit surprised to see how visual the tasks were. I had thought they were text based. So now I'm quite impressed that o3 can solve this type of task at all.
If they use a model API, then surely OpenAI has access to the private test set questions and can include it in the next round of training?
(I am sure I am missing something.)
If you detect that a benchmark is running then you can just ramp up to max frequency immediately. It’ll show how fast your CPU is, but won’t be representative of the actual performance that users will get from their device.
In practice I assume they just gave them the benchmarks and took it on the honor system they wouldn't cheat, yeah. They can always cook up a new test set for next time, it's only 10% of the benchmark content anyway and the results are pretty close.
There’s a fully private test set too as I understand it, that o3 hasn’t run on yet.
My skeptical impression: it's complete hubris to conflate ARC or any benchmark with truly general intelligence.
I know my skepticism here is identical to moving goalposts. More and more I am shifting my personal understanding of general intelligence as a phenomenon we will only ever be able to identify with the benefit of substantial retrospect.
As it is with any sufficiently complex program, if you could discern the result beforehand, you wouldn't have had to execute the program in the first place.
I'm not trying to be a downer on the 12th day of Christmas. Perhaps because my first instinct is childlike excitement, I'm trying to temper it with a little reason.
All it needs to be is useful. Reading constant comments about how LLMs can't be general intelligence, lack reasoning, etc., to me seems like people witnessing the airplane and complaining that it isn't "real flying" because it isn't a bird flapping its wings (a large portion of the population held that point of view back then).
It doesn't need to be general intelligence for the rapid advancement of LLM capabilities to be the most societal shifting development in the past decades.
Aerospace is still a highly regulated area that requires training and responsibility. If parallels can be drawn here, they don’t look so cool for a regular guy.
On the one hand, yes; on the other, this understates the impact that had.
My uncle moved from the UK to Australia because, I'm told*, he didn't like his mum and travel was so expensive that he assumed they'd never meet again. My first trip abroad… I'm not 100% sure how old I was, but it must have been between age 6 and 10, was my gran (his mum) paying for herself, for both my parents, and for me, to fly to Singapore, then on to various locations in Australia including my uncle, and back via Thailand, on her pension.
That was a gap of around one and a half generations.
* both of them are long-since dead now so I can't ask
They changed their mind after a public outcry including here on HN.
Do they really? I don't think they do.
> Planes can't land in your backyard so we built airports. We didn't abandon planes.
But then what do you do with the all the fantasies and hype about the new technology (like planes that land in your backyard and you fly them to work)?
And it's quite possible and fairly common that the new technology actually ends up being mostly hype, and there's actually no "airports" use case in the wings. I mean, how much did society "bend to the abilities of" NFTs?
And then what if the mature "airports" use case is actually something most people do not want?
Plastics, cars, planes, etc.
One could say that a balanced situation, where vested interests are put back in the box (close to impossible since it would mean fighting trillions of dollars), would mean that, for example, all 3 in the list above are used a lot less than we use them now. And only used where truly appropriate.
To give you an example– I've used it for legal work such as an EB2-NIW visa application. Saved me countless hours. My next visa I'll try to do without a lawyer, using just LLMs. I would never try this without having LLMs at my disposal.
As a hobby– And as someone with a scientific background, I've been able to build an artificial ecosystem simulation in Rust from scratch, without prior programming experience: https://www.youtube.com/@GenecraftSimulator
I recently moved from fish to plants and believe I've developed some new science at the intersection of CS and Evolutionary Biology that I'm looking to publish.
This tool is extremely useful. For now– You do require a human in the loop for coordination.
My guess is that these will be the benchmarks we see within a few years: how well an AI can coordinate multiple other AIs to build, deploy and iterate something that functions in the real world. Basically a manager AI.
Because they'll literally be able to solve every single one shot problem so we won't be able to create benchmarks anymore.
But that's also when these models will be able to build functioning companies in a few hours.
That's marketing language, not scientific or even casual language. So many outstanding claims, without even some basic explanations. Like how did it help you save these hours? Explaining terms? Outlining processes? Going to the post office for you? You don't need to sell me anything, I just want the how.
Yes, I'm using them from time to time for research. But I'm also aware of the topics I research and see through bs. And the best LLMs out there, right now, produce bs within just 3-4 paragraphs, in nicely documented areas.
A recent example is my question on how to run N vpn servers on N ips on the same eth with ip binding (in ip = out ip, instead of using a gw with the lowest metric). I had no idea but I know how networks work and the terminology. It started helping, created a namespace, set up lo, set up two interfaces for inner and outer routing and then made a couple of crucial mistakes that couldn’t be detected or fixed by someone even a little clueless (in routing setup for outgoing traffic). I didn’t even argue and just asked what that does wrt my task, and that started the classic “oh wait, sorry, here’s more bs” loop that never ended.
Eventually I distilled the general idea and found an article that AI very likely learned from, cause it was the same code almost verbatim, but without mistakes.
Does that count as helping? Idk, probably yes. But I know that examples like this show that you not only cannot leave an LLM unsupervised for any non-trivial question, but have to keep a competent role in the loop.
I think the programming community is just blinded by LLMs succeeding in writing kilometers of untalented react/jsx/etc crap that has no complexity or competence in it apart from repeating “do like this” patterns and literally millions of examples, so noise cannot hit through that “protection”. Everything else suffers from LLMs adding inevitable noise into what they learned from a couple of sources. The problem here, as I understand it, is that only specific programmer roles and s{c,p}ammers (ironically) write the same crap again and again millions of times, other info usually exists in only a few important sources and blog posts, and only a few of those are full and have good explanations.
To me it is more like there is someone jumping on a pogo ball while flapping their arms and saying that they are flying whenever they hop off the ground.
Skeptics say that they are not really flying, while adherents say that "with current pogo ball advancements, they will be flying any day now"
I understand that in this forum too many people are invested in putting lipstick on this particular pig.
Every person that believes that LLMs are near sentient or actually do a good job at reasoning is one more person handing over their responsibilities to a zero-accountability highly flawed robot. We've already seen LLMs generate bad legal documents, bad academic papers, and extremely bad code. Similar technology is making bad decisions about who to arrest, who to give loans to, who to hire, who to bomb, and who to refuse heart surgery for. Overconfident humans employing this tech for these purposes have been bamboozled by the lies from OpenAI, Microsoft, Google, et al. It's crucial to call out overstatement and overhype about this tech wherever it crops up.
> All it needs to be is useful.
Computers were already useful.
The only definition we have for "intelligence" is human (or, generally, animal) intelligence. If LLMs aren't that, let's call it something else.
“I’ll know it when I see it” isn’t a compelling argument.
So if above human intelligence does happen, I'd assume we'd know it, quite soon.
It feels compelling to me.
I don't think those that create AI care about that. They just to come out on top before someone else does.
That is a natural reaction to the incessant techbro, AIbro, marketing, and corporate lies that "AI" (or worse AGI) is a real thing, and can be directly compared to real humans.
There are people on this very thread saying it's better at reasoning than real humans (LOL) because it scored higher on some benchmark than humans... Yet this technology still can't reliably determine what number is circled, if two lines intersect, or count the letters in a word. (That said behaviour may have been somewhat fine-tuned out of newer models only reinforces the fact that the technology is inherently not capable of understanding anything.)
I've been doing AI things for about 20+ years and LLMs are wild. We've gone from specialized things being pretty bad at those jobs to general-purpose things that are better at that and everything else. The idea that you could make an API call with "is this sarcasm?" and get a better-than-chance guess is incredible.
I think I count myself among the skeptics nowadays for that reason. And I say this as someone that thinks LLM is an interesting piece of technology, but with somewhat limited use and unclear economics.
If the hype was about "look at this thing that can parse natural language surprisingly well and generate coherent responses", I would be excited too. As someone that had to do natural language processing in the past, that is a damn hard task to solve, and LLMs excel at it.
But that is not the hype is it? We have people beating the drums of how this is just shy of taking the world by storm, and AGI is just around the corner, and it will revolutionize all economy and society and nothing will ever be the same.
So, yeah, it gets tiresome. I wish the hype would die down a little so this could be appreciated for what it is.
Where are you seeing this? I pretty much only read HN and football blogs so maybe I’m out of the loop.
Analyzing whether or not LLMs have intelligence is missing the forest for the trees. This technology is emerging in a capitalist society that is hyper-optimized to adopt useful things at the expense of almost everything else. If the utility/price point gets hit for a problem, it will replace it regardless of whether it is intelligent or not.
If a language model can't solve problems in a programming language then we are just fooling ourselves in less defined domains of "thought".
Software engineering is where the rubber meets the road in terms of intelligence and economics when viewing our society as a complex system. Software engineering salaries are above average exactly because most average people are not going to be software engineers.
From that point of view the progress is not impressive at all. The current models are really not that much better than chatGPT4 in April 2023.
AI art is a better example though. There is zero progress being made now. It is only impressive at the most surface level for someone not involved in art and who can't see how incredibly limited the AI art models are. We have already moved on to video though to make the same half baked, useless models that are only good to make marketing videos for press releases about progress and one off social media posts about how much progress is being made.
For example, a team of humans is extremely reliable, much more reliable than one human, but a team of AIs isn't more reliable than one AI, since an AI is already an ensemble model. That means even if an AI could replace a person, it probably can't replace a team for a long time, meaning you still need the other team members there, meaning the AI didn't really replace a human, it just became a tool for humans to use.
I personally wouldn't be surprised if we start to see benchmarks around this type of cooperation and ability to orchestrate complex systems in the next few years or so.
Most benchmarks really focus on one problem, not on multiple real-time problems while orchestrating 3rd party actors who might or might not be able to succeed at certain tasks.
But I don't think anything is prohibiting these models from not being able to do that.
Not really. Francois (co-creator of the ARC Prize) has this to say:
The v1 version of the benchmark is starting to saturate. There were already signs of this in the Kaggle competition this year: an ensemble of all submissions would score 81%
Early indications are that ARC-AGI-v2 will represent a complete reset of the state-of-the-art, and it will remain extremely difficult for o3. Meanwhile, a smart human or a small panel of average humans would still be able to score >95% ... This shows that it's still feasible to create unsaturated, interesting benchmarks that are easy for humans, yet impossible for AI, without involving specialist knowledge. We will have AGI when creating such evals becomes outright impossible.
For me, the main open question is where the scaling bottlenecks for the techniques behind o3 are going to be. If human-annotated CoT data is a major bottleneck, for instance, capabilities would start to plateau quickly like they did for LLMs (until the next architecture). If the only bottleneck is test-time search, we will see continued scaling in the future.
https://x.com/fchollet/status/1870169764762710376 / https://ghostarchive.org/archive/Sqjbf
I was just thinking about how 3D game engines were perceived in the 90s. Every six months some new engine came out, blew people's minds, was declared photorealistic, and was forgotten a year later. The best of those engines kept improving and are still here, and kinda did change the world in their own way.
Software development seemed rapid and exciting until about Halo or Half Life 2, then it was shallow but shiny press releases for 15 years, and only became so again when OpenAI's InstructGPT was demonstrated.
While I'm really impressed with current AI, and value the best models greatly, and agree that they will change (and have already changed) the world… I can't help but think of the Next Generation front cover, February 1997 when considering how much further we may be from what we want: https://www.giantbomb.com/pc/3045-94/forums/unreal-yes-this-...
The transition seems to map well to the point where engines got sophisticated enough, that highly dedicated high-schoolers couldn't keep up. Until then, people would routinely make hobby game engines (for games they'd then never finish) that were MVPs of what the game industry had a year or three earlier. I.e. close enough to compete on visuals with top photorealistic games of a given year - but more importantly, this was a time where you could do cool nerdy shit to impress your friends and community.
Then Unreal and Unity came out, with a business model that killed the motivation to write your own engine from scratch (except for purely educational purposes), we got more games, more progress, but the excitement was gone.
Maybe it's just a spurious correlation, but it seems to track with:
> and only became so again when OpenAI's InstructGPT was demonstrated.
Which is again, if you exclude training SOTA models - which is still mostly out of reach for anyone but a few entities on the planet - the time where anyone can do something cool that doesn't have a better market alternative yet, and any dedicated high-schooler can make truly impressive and useful work, outpacing commercial and academic work based on pure motivation and focus alone (it's easier when you're not being distracted by bullshit incentives like user growth or making VCs happy or churning out publications, farming citations).
It's, once again, a time of dreams, where anyone with some technical interest and a bit of free time can make the future happen in front of their eyes.
The timescale you are describing for 3D graphics is 4 years from the 1997 cover you posted to the release of Halo which you are saying plateaued excitement because it got advanced enough.
An almost infinitesimally small amount of time in terms of the history of human development, and you are mocking the magazine for being excited about the advancement because it was... 4 years early?
The era was people getting wowed from Wolfenstein (1992) to "about Halo or Half Life 2" (2001 or 2004).
And I'm not saying the flattening of excitement was for any specific reason, just that this was roughly when it stopped getting exciting — it might have been because the engines were good enough for 3D art styles beyond "as realistic as we can make it", but for all I know it was the War On Terror which changed the tone of press releases and how much the news in general cared. Or perhaps it was a culture shift which came with more people getting online and less media being printed on glossy paper and sold in newsagents.
Whatever the cause, it happened around that time.
This was a time where, for 3D graphics, barriers to entry got low (math got figured out, hardware was good enough, knowledge spread), but the commercial market didn't yet capture everything. Hell, a bulk of those excited kids I remember, trying to do a better Unreal Tournament after school instead of homework (and almost succeeding!), went on to create and staff the next generation of commercial gamedev.
(Which is maybe why this period lasted for about as long as it takes for a schoolkid to grow up, graduate, and spend few years in the workforce doing the stuff they were so excited about.)
I was one of those kids, my focus was Marathon 2 even before I saw Unreal. I managed to figure out enough maths from scratch to end up with the basics of ray casting, but not enough at the time to realise the tricks needed to make that real time on a 75 MHz CPU… and then we all got OpenGL and I went through university where they explained the algorithms.
It's a very strange thing I've never understood.
We still barely know how to use computers effectively, and they have already transformed the world. For better or worse.
I've been blessed with grandchildren recently, a little boy that's 2 1/2 and just this past Saturday a granddaughter. Major events notwithstanding, the world will largely resemble today when they are teenagers, but the future is going to look very very very different. I can't even imagine what the capability and pervasiveness of it all will be like in ten years, when they are still just kids. For me, as someone that's invested in their future, I'm interested in all of the educational opportunities (technical, philosophical and self-awareness) but obviously am concerned about the potential for pernicious side effects.
> There is a related “Theorem” about progress in AI: once some mental function is programmed, people soon cease to consider it as an essential ingredient of “real thinking”. The ineluctable core of intelligence is always in that next thing which hasn’t yet been programmed. This “Theorem” was first proposed to me by Larry Tesler, so I call it Tesler’s Theorem: “AI is whatever hasn’t been done yet.”
All that's invalidated each time is the idea that a general solution to that task requires a general solution to all tasks, or that a general solution to that task requires our special sauce. It's the idea that something able to do that task will also be able to do XYZ.
And yet people keep coming up with a new task that people point to saying, 'this is the one! there's no way something could solve this one without also being able to do XYZ!'
I'd love more progress on tasks in the physical world, though. There are only a few paths for countries to deal with a growing ratio of old retired people to young workers:
1) Prioritize the young people at the expense of the old by e.g. cutting old age benefits (not especially likely since older voters have greater numbers and higher participation rates in elections)
2) Prioritize the old people at the expense of the young by raising the demands placed on young people (either directly as labor, e.g. nurses and aides, or indirectly through higher taxation)
3) Rapidly increase the population of young people through high fertility or immigration (the historically favored path, but eventually turns back into case 1 or 2 with an even larger numerical burden of older people)
4) Increase the health span of older people, so that they are more capable of independent self-care (a good idea, but difficult to achieve at scale, since most effective approaches require behavioral changes)
5) Decouple goods and services from labor, so that old people with diminished capabilities can get everything they need without forcing young people to labor for them
I am continually baffled that people here throw this argument out and can't imagine the second-order effects. If white collar work is automated by AGI, all the R&D to solve robotics beyond imagination will happen in a flash. The top AI labs, the people smart enough to make this technology, are all focusing on automating AGI researchers, and from there follows everything, obviously.
We're already seeing escape velocity in world modeling (see Google Veo2 and the latest Genesis LLM-based physics modeling framework).
The hardware for humanoid robots is 95% of the way there, the gap is control logic and intelligence, which is rapidly being closed.
Combine Veo2 world model, Genesis control planning, o3-style reasoning, and you're pretty much there with blue collar work automation.
We're only a few turns (<12 months) away from an existence proof of a humanoid robot that can watch a Youtube video and then replicate the task in a novel environment. May take longer than that to productionize.
It's really hard to think and project forward on an exponential. We've been on an exponential technology curve since the discovery of fire (at least). The 2nd order has kicked up over the last few years.
Not a rational approach to look back at robotics 2000-2022 and project that pace forwards. There's more happening every month than in decades past.
Calibrating to the current hype cycle has been challenging with AI pronouncements.
Our value proposition as humans in a capitalist society is an increasingly fragile thing.
who is going to pay for residential electrical work lol and how much will you make if some guy from MIT is going to compete with you
> while the majority of the population will be unemployable and forever left behind
Productivity improvements increase employment. A superhuman AI is a productivity improvement.
Sometimes: the productivity improvements from the combustion engine didn't increase employment of horses, it displaced them.
But even when productivity improvements do increase employment, it's not always to our advantage: the productivity improvements from Eli Whitney's cotton gin included huge economic growth and subsequent technological improvements… and also "led to increased demands for slave labor in the American South, reversing the economic decline that had occurred in the region during the late 18th century": https://en.wikipedia.org/wiki/Cotton_gin
A superhuman AI that's only superhuman in specific domains? We've been seeing plenty of those, "computer" used to be a profession, and society can re-train but it still hurts the specific individuals who have to be unemployed (or start again as juniors) for the duration of that training.
A superhuman AI that's superhuman in every domain, but close enough to us in resource requirements that comparative advantage is still important and we can still do stuff, relegates us to whatever the AI is least good at.
A superhuman AI that's superhuman in every domain… as soon as someone invents mining, processing, and factory equipment that works on the moon or asteroids, that AI can control that equipment to make more of that equipment, and demand is quickly — O(log(n)) — saturated. I'm moderately confident that in this situation, the comparative advantage argument no longer works.
The idea that productivity improvements increase employment is just fundamentally based on a different paradigm. There is absolutely no reason to think that when a machine exists that can do most things a human can do, as well if not better, for less or equal cost, this will somehow increase human employment. In this scenario, using humans in any stage of the pipeline would be deeply inefficient and a stupid business decision.
It's gone from "well the output is incoherent" to "well it's just spitting out stuff it's already seen online" to "WELL...uhh IT CAN'T CREATE NEW/NOVEL KNOWLEDGE" in the space of 3-4 years.
It's incredible.
We already have AGI.
On the other hand, there is a long, long history of AI achieving X but not being what we would casually refer to as "generally intelligent," then people deciding X isn't really intelligence; only when AI achieves Y will it be intelligence. Then AI achieves Y and...
Once you look at it that way, the approach really doesn't look like intelligence that's able to generalize to novel domains. It doesn't pass the sniff test. It looks a lot more like brute-forcing.
Which is probably why, in order to actually qualify for the leaderboard, they stipulate that you can't use more than $10k of compute. Otherwise, it just sounds like brute-forcing.
Could anyone confirm if this is the only kind of question in the benchmark? If yes, how come there is such a direct connection to "oh this performs better than humans", when LLMs can be quite a bit better than us at understanding and forecasting patterns? I'm just curious, not trying to stir up controversies.
Chollet (one of the creators of the ARC benchmark) has been saying it proves LLMs can't reason. The test questions are supposed to be unique and not in the model's training set. The fact that LLMs struggled with the ARC challenge suggested (to Chollet and others) that models weren't "truly reasoning" but rather just completing based on things they'd seen before - when the models were confronted with things they hadn't seen before, the novel visual patterns, they really struggled.
I am no less excited! This is a huge improvement.
How does this do on SWE Bench?
71.7%
This tries to create patterns that are intentionally not in the data and see if a system can generalize to them, which o3 super impressively does!
But isn’t it interesting to have several benchmarks? Even if it’s not about passing the Turing test, benchmarks serve a purpose—similar to how we measure microprocessors or other devices. Intelligence may be more elusive, but even if we had an oracle delivering the ultimate intelligence benchmark, we'd still argue about its limitations. Perhaps we'd claim it doesn't measure creativity well, and we'd find ourselves revisiting the same debates about different kinds of intelligences.
Humans clearly don't know what intelligence is, unambiguously. There's also no divinely ordained objective dictionary one can point at to reference what true intelligence is. A deep reflection on trying to pattern-associate different human cognitive abilities indicates that human cognitive capabilities aren't that spectacular, really.
There is no special sauce in our brain. And we know roughly how much compute there is in our brain, so we can estimate when we'll hit that with these 'LLMs'.
Language is important in human brain development as well. Kids who grow up deaf end up vastly less intelligent unless they learn sign language. Language allows us to process complex concepts that our brain can learn to solve, without having to be in those complex environments.
So in hindsight, it's easy to see why it took a language model to solve general tasks that other types of deep learning networks couldn't.
I don't really see any limits on these models.
I wonder if LLMs and language don't so much allow us to process these complex environments as preload our brains to get a head start in processing those complex environments once we arrive in them. I think LLMs store compressed relationships of the world, which obviously involves information loss compared to a neural mapping of the world that isn't just language-based. But that compressed relationship, i.e. knowledge, doesn't exactly map backward onto the world without a reverse key. Like artificially learning about real-world stuff in school, abstractly, and then going into the real world: it takes time for that abstraction to snap-fit onto the real world.
Could you further elaborate on what you mean by limits? I'm happy to play contrarian on what I think I interpret you to be saying there.
Also, to your main point: what intelligence is. Yeah, you sort of hit on my thoughts on intelligence. It's a combination of problem-solving abilities in different domains, like an amalgam of cognitive processes that achieve an amalgam of capabilities. While we can label all of that with a single word, that doesn't mean it's all a single process; it seems like a composite. Moreover, I think a big chunk of intelligence (but not all) is just brute-forcing the search for associations and then encoding those via some reflexive search/retrieval. A different part of intelligence, of course, is adaptability and pattern finding.
Maybe it would help to include some human results in the AI ranking.
I think we'd find that humans score lower?
E.g. go back in time and imagine you didn't know there are ways for computers to be really good at performing integration yet as nobody had tried to make them. If someone asked you how to tell if something is intelligent "the ability to easily reason integrations or calculate extremely large multiplications in mathematics" might seem like a great test to make.
Skip forward to the modern era and it's blatantly obvious CASes like Mathematica on a modern computer range between "ridiculously better than the average person" to "impossibly better than the best person" depending on the test. At the same time, it becomes painfully obvious a CAS is wholly unrelated to general intelligence and just because your test might have been solvable by an AGI doesn't mean solving it proves something must have been an AGI.
So you come up with a new test... but you have the same problem as before: it seems like anything non-human completely bombs it and an AGI would do well... but how do you know that whatever solves it was really an AGI, and not just another clearly unrelated system?
Short of something cleverer, what GP is saying is that the goalposts must keep being moved until it's no longer obvious the thing isn't AGI, not that the average human gets some particular, lower score.
.
All that aside, to answer your original question: in the presentation it was said the average human gets 85%, and this was the first model to beat that. It was also said a second version of the test is being worked on. They have papers on their site with clear examples of how the current test measures a lot that is unrelated to whether something is really AGI (a brute-force method was shown to get >50% in 2020), so their aim is to create a new goalpost test and see how things shake out this time.
We should skip to the end and just define a task like "it's AGI if it can predict, with 100% accuracy the average human's next action in any situation". Anything that can do that is as good as AGI even if people manage to find a proxy for the task.
What exactly is AGI to you ? If it's simply a generally intelligent machine then what are you waiting for ? What else is there to be sure of ? There's nothing narrow about these models.
Humans love to believe they're oh so special so much that there will always be debates on whether 'AGI' has arrived. If you are waiting for that then you'll be waiting a very long time, even if a machine arrives that takes us to the next frontier in science.
There is: they can't create new ideas the way humanity can. AGI should be able to replace humanity in terms of thinking, otherwise it isn't general; you would just have a model specialized at reproducing thoughts and patterns humans have already had. It still can't recreate science from scratch the way humanity did, meaning it can't do science properly.
Comparing an AI to a single individual is not how you measure AGI. If a group of humans performs better, then you can't use the AI to replace that group of humans, and thus the AI isn't an AGI, since it couldn't replace the group of humans.
So, for example, if a group of programmers writes more reliable programs than the AI, then you can't replace that group of programmers with the AI, even if you duplicate that AI many times, since the AI isn't capable of reproducing the same level of reliability when run in parallel. An AI run in parallel is still just an AI, and an ensemble model is still just an AI, so the model the AI has to beat is the human ensemble called humanity.
If we lower the bar a bit, it at least has to beat 100,000 humans working together to make a job obsolete. All the tutorials and similar resources are made by other humans as well; remove the job and those would also disappear, and the AI would have to do the work of all of those people too. If it can't, humans will still be needed.
It's possible you will be able to substitute parts of those human ensembles with AI much sooner, but then we just call it a tool. (We also call narrowly specialized humans tools, which is fair.)
In order to write general programs you need to have that skill. Every new code snippet needs to be evaluated by the system: does it make the codebase better or not? The lack of that ability is why you can't just loop an LLM today to replace programmers. It might be possible to automate specific programming tasks, but not general-purpose programming.
Overcoming that hurdle is not something I think an LLM can ever do; you need a totally different kind of architecture, not something trained to mimic but something trained to reason. I don't know how to train something that can reason about noisy, unstructured data. We will probably figure that out at some point, but it probably won't be LLMs as they are today.
As for what AGI is? Well, the lack of being able to describe that brings us full circle in this thread - I'll tell you for sure when I've seen it for the first time and have the power of hindsight to say what was missing. I think these models are the closest we've come but it feels like there is at least 1-2 more "4o->o1" style architecture changes where it's not necessarily about an increase in model fitting and more about a change in how the model comes to an output before we get to what I'd be willing to call AGI.
Who knows though, maybe some of those changes come along and it's closer but still missing some process to reason well enough to be AGI rather than a midway tool.
Best way of stating that I've heard.
The goalposts must keep moving until we understand enough about what is happening.
I usually poo-poo the goal post moving, but this makes sense.
Indistinguishable from goalpost moving like you said, but also no true Scotsman.
I'm curious what would happen in your eyes if we misattributed general intelligence to an AI model? What are the consequences of a false positive and how would they affect your life?
It's really clear to me how intelligence fits into our reality as part of our social ontology. The attributes, and their expression, that each of us uses to ground our concept of the "intelligent" predicate differ wildly.
My personal theory is that we tend to have an exemplar-based dataset of intelligence, and each of us attempts to construct a parsimonious model of intelligence, but like all (mental) models, they can be useful but wrong. These models operate in a space where the trade off is completeness or consistency, and most folks, uncomfortable saying "I don't know" lean toward being complete in their specification rather than consistent. The unfortunate side-effect is that we're able to easily generate test data that highlights our model inconsistency - AI being a case in point.
Rich people will think they can use the AI model instead of paying other people to do certain tasks.
The consequences could range from brilliant to utterly catastrophic, depending on the context and precise way in which this is done. But I'd lean toward the catastrophic.
someone wants a "planning officer" and believes that the LLM has AGI ...
someone wants a "hiring consultant" and believes that the LLM has AGI ...
etc. etc.
Basically, it's got the dumbest and simplest things in it. Stuff like a lock and key, a glass of water and jug, common units of currency, a zipper, etc. It tests if you can do any of those common human tasks. Like pouring a glass of water, picking up coins from a flat surface (I chew off my nails so even an able person like me fails that), zip up a jacket, lock your own door, put on lipstick, etc.
We had hand prosthetics that could play Mozart at 5x speed on a baby grand, but could not pick up a silver dollar or zip a jacket even a little bit. To the patients, the hands were therefore about as useful as a metal hook (a common solution with amputees today, not just pirates!).
Again, a total aside here, but your comment just reminded me of that brown briefcase. Life, it turns out, is a lot more complex than we give it credit for. Even pouring the OJ can be, in rare cases, transcendent.
I think I'd only really save time by having a robot that could unload my dishwasher and put up the clean dishes.
So get me a counter-level dishwasher cabinet and I’ll be happy!
And that's before we get to bits of sticky rice left on bowls, which somehow dishwashers never scrape off clean. YMMV.
2. Start with a cold prewash, preferably with a little powder in there too. This massively helps with stubborn stuff. This one is annoying though because you might have to come back and switch it on after the prewash. A good job for the robot butler.
Foldimate went bankrupt in 2021 [1], and the domain redirect from foldimate.com to a 404 page at miele.com suggests that it was Miele who bought up the remains, not a sketchy company with a ".website" top-level domain.
… so Elon Musk? :D
Humanoid robots are mostly a waste of time. Task-shaped robots are much easier to design, build, and maintain... and are more reliable. Some of the things you mention might needs humanoid versatility (loading the dishwasher), others would be far better served by purpose-built robots (laundry sorting).
Maybe "busbot" or "scullerybot".
It seems most people aren't willing to pay for multiple dishwashers (even multiple small ones) or to set aside enough space, and that places severe constraints on trying to do better.
So, not only is the human form the only solution for many tasks, it's also a much cheaper solution considering the idle time of task-specific robots. You would need only a single humanoid robot for all tasks, instead of buying a different machine for each task. And instead of having to design and build a new machine for each task, you'll need to just download new software for each task.
1000 machines specialized for 1000 tasks are great, but don’t deliver the same value as a single bot that can interchange with people flexibly.
Costly today, but won't be forever.
Getting to LLMs that could talk to us turned out to be a lot easier than making something that could control even a robotic arm without precise programming, let alone a humanoid.
> Rodney Brooks explains that, according to early AI research, intelligence was "best characterized as the things that highly educated male scientists found challenging", such as chess, symbolic integration, proving mathematical theorems and solving complicated word algebra problems. "The things that children of four or five years could do effortlessly, such as visually distinguishing between a coffee cup and a chair, or walking around on two legs, or finding their way from their bedroom to the living room were not thought of as activities requiring intelligence."
Because it's relevant to the point being made, i.e. that these tests reflect the biases and interests of the people who make them. This is true not just for AI tests, but for intelligence tests applied to humans. That Demis Hassabis, a chess player and video game designer, decided to test his machine on video games, Go and chess is probably not an accident.
The more interesting question is why people respond so apprehensively to pointing out a very obvious problem and bias in test design.
Of course. However i believe we can't move past that without being honest about where these biases are coming from. Many things in our world are the result of gender bias, both subtle and overt. However, at least at first glance, this does not appear to be one of them, and statements like the grandparent's quote serve to perpetuate such biases further.
Thank you for virtue signalling, though.
Yes, that was pretty clear in the original comment (?)
He was right. Scientists were focusing on the "science-y" bits and completely missed the elephant in the room: the things a toddler already masters are the monster challenge for AI right now, before we even get into "meaning of life" type stuff.
We learn our language and stereotypes subconsciously from our society, and it is no easy thing to fight against that.
I think a lot about carpentry. From the outside, it's pretty easy: Just make the wood into the right shape and stick it together. But as one progresses, the intricacies become more apparent. Variations in the wood, the direction of the grain, the seasonal variations in thickness, joinery techniques that are durable but also time efficient.
The way this information connects is highly multisensory and multimodal. I now know which species of wood to use for which applications. This knowledge was hard won through many, many mistakes and trials that took place at my home, the hardware store, the lumberyard, on YouTube, from my neighbor Steve, and in books written by experts.
Like in your test
a hand grenade and a pin - don't pull the pin.
Or maybe a mousetrap? but maybe that would be defused?
in the ai test...
or Global Thermonuclear War, the only winning move is...
I must be missing something, how can they be able to play Mozart at 5x speed with their prosthetics but not zip a jacket? They could press keys but not do tasks requiring feedback?
Or did you mean they used to play Mozart at 5x speed before they became amputees?
Picking up a 1mm-thick metal disk from a flat surface requires the user to get the timing, placement, and force exactly right, and I'm not even sure what considerations it needs for surface materials (e.g. slightly squishy fake skin) and/or tip shapes (e.g. fake nails).
Even more so for picking up coins from a flat surface.
For robotics, it's kind of obvious, speed is rarely an issue, so the "5x" part is almost trivial. And you can program the sequence quite easily, so that's also doable. Piano keys are big and obvious and an ergonomically designed interface meant to be relatively easy to press, ergo easy even for a prosthetic. A small coin on a flat surface is far from ergonomic.
The idea of a prosthesis is to help you regain functionality. If the best way of doing that is through automation, then it'd make little sense not to.
My point is -- being able to zip a jacket is all about those subtle actions, and could actually be harder than "just" playing piano fast.
Playing Mozart is much more forgiving in terms of the number of different motions you have to make in different directions and the amount of pressure to apply, and even the black keys are much bigger than large-sized zipper tongues.
(nothing wrong with it! I'm just trying to prune the top subthread)
In that sense, the goalposts haven’t moved in a long time despite claims from AI enthusiasts that people are constantly moving goalposts.
I'd love to know more about this.
The LLM only gets two guesses at the final solutions. The whole chain of thought is about breaking out the context and the levels of abstraction. How many guesses it self-generates and internally validates is just a matter of compute power and time.
My counterpoint to OP here would be that this is exactly how our brain works. In every given scenario, we too are evaluating all possible solutions. Our entire stack is constantly listening and either staying silent or contributing to an action potential (either excitatory or inhibitory), but our brain is always evaluating all potential possibilities at any given moment. We have a society of mind always contributing their opinions, and the ones without much support essentially get shouted down.
That's completely fair game. That's just search.
It's not AGI, obviously, in the sense that you still need some problem framing and initialization to kickstart the reasoning-path simulations.
"Well, yeah, but its kind of expensive" -- this guy
Picks up goalpost, looks for stadium exit
Semi-private eval (100 tasks): 75.7% at $2,012 total (~$20/task), with just 6 samples and 33M tokens processed, at ~1.3 min/task.
The “low-efficiency” setting with 1024 samples scored 87.5% but required 172x more compute.
If we assume compute spent and cost are proportional, then OpenAI might have just spent ~$346,064 for the low-efficiency run on the semi-private eval.
On the public eval they might have spent ~$1,148,444 to achieve 91.5% with the low-efficiency setting. (High-efficiency mode: $6,677.)
OpenAI just spent more money to run an eval on ARC than most people spend on a full training run.
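For what it's worth, here is a quick sketch of that back-of-envelope arithmetic, assuming (as above) that cost scales linearly with the 172x compute multiplier:

    # Scale the reported 6-sample ("high-efficiency") costs by the 172x compute
    # multiplier of the 1024-sample ("low-efficiency") setting.
    MULTIPLIER = 172
    semi_private_6_sample = 2_012   # USD total, 75.7% on the semi-private eval
    public_6_sample = 6_677         # USD total, 82.8% on the public eval
    print(f"semi-private, low-efficiency: ~${semi_private_6_sample * MULTIPLIER:,}")  # ~$346,064
    print(f"public, low-efficiency:       ~${public_6_sample * MULTIPLIER:,}")        # ~$1,148,444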
I double-checked with some FLOP estimates (P100 for 12 hours = the Kaggle limit; they claim ~100-1000x that for o3-low, and 172x on top for o3-high), so roughly on the order of 10^22-10^23 FLOP.
Coming at it another way, using an H100 market price of ~$2 per GPU-hour, $350k buys ~175k GPU-hours, or ~10^24 FLOP in total.
So, a huge margin, but 10^22 - 10^24 FLOP is the band I think we can estimate.
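A minimal sketch of that arithmetic, assuming ~2e13 FLOP/s effective throughput for a P100 and ~1e15 FLOP/s for an H100 (rough peak figures, not utilization-adjusted):

    import math

    p100_flops = 2e13                        # FLOP/s, rough P100 throughput (assumed)
    kaggle_budget = p100_flops * 12 * 3600   # one P100 for 12 hours, ~8.6e17 FLOP

    o3_high_low = kaggle_budget * 100 * 172     # ~1.5e22 FLOP
    o3_high_high = kaggle_budget * 1000 * 172   # ~1.5e23 FLOP

    # Cross-check via cost: ~$350k at ~$2 per H100-hour is ~175k GPU-hours.
    h100_flops = 1e15                                # FLOP/s per H100 (assumed)
    cost_based = (350_000 / 2) * 3600 * h100_flops   # ~6e23 FLOP

    for label, v in [("o3-high, low end", o3_high_low),
                     ("o3-high, high end", o3_high_high),
                     ("cost-based H100 estimate", cost_based)]:
        print(f"{label}: ~10^{math.log10(v):.1f} FLOP")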
These are the scale of numbers that show up in the Chinchilla-optimal paper, haha. Truly GPT-3-scale models.
Does seem to show an absolutely massive market for inference compute…
I think soon we'll be pricing all kinds of tasks by their compute costs. So basically: human = $50/task, AI = $6,000/task, use the human. If the AI beats the human, use the AI? Of course, that's assuming both get 100% scores on the task.
The thing is, given what we've seen from distillation and related techniques, even if it's $6,000/task... that will come down drastically over time through optimization and just plain faster, more efficient processing hardware and software.
That makes something like this competitive in ~3 years
o3 will be interesting if it indeed offers a novel way to handle problem solving, something that is able to learn efficiently from a few novel examples and adapt. That's what intelligence actually is. Maybe this is the case. If, on the other hand, it is a smart way to pair CoT with an evaluation loop (as the author hints is a possibility), then it is probable that, while this _can_ handle a class of problems current LLMs cannot, it is not really that kind of learning, meaning it will not scale to more complex, real-world tasks whose problem space is too large and thus less amenable to such a technique. It is still interesting, because having a good-enough evaluator may be a very important step, but it would mean that we are not yet there.
We will learn soon enough I suppose.
And o1 costs $15/$60 per 1M tokens in/out, so the estimated costs on the graph would match for a single task, not the whole benchmark.
From what I can see, presuming o3 is a progression of o1 and has a good level of accountability bubbling up during 'inference' (i.e. "Thinking about ___"), I'd say it's just using up millions of old-school tokens (the 44 million tokens that are referenced). So not latent thinking per se.
Sonnet 3.5 remains the king of the hill by quite some margin
That said, I think its code style is arguably better, more concise and has better patterns -- Claude needs a fair amount of prompting and oversight to not put out semi-shitty code in terms of structure and architecture.
In my mind: going from Slowest to Fastest, and Best Holistically to Worst, the list is:
1. o1-pro 2. Claude 3.5 3. Gemini 2 Flash
Flash is so fast, that it's tempting to use more, but it really needs to be kept to specific work on strong codebases without complex interactions.
Like I'll have it in a project in Cursor and it will spin up ready-to-use components that use my site style, reference existing components, and follow all existing patterns.
Then on some days it will even forget what language the project is in and start giving me Python code for a React project.
In a way it almost feels like it's become too good at following instructions and simply just takes your direction more literally. It doesn't seem to take the initiative of going the extra mile of filling in the blanks from your lazy input (note: many would see this as a good thing). Claude on the other hand feels more intuitive in discerning intent from a lazy prompt, which I may be prone to offering it at times when I'm simply trying out ideas.
However, if I take the time to write up a well thought out prompt detailing my expectations, I find I much prefer the code o1 creates. It's smarter in its approach, offers clever ideas I wouldn't have thought of, and generally cleaner.
Or put another way, I can give Sonnet a lazy or detailed prompt and get a good result, while o1 will give me an excellent result with a well thought out prompt.
What this boils down to is I find myself using Sonnet while brainstorming ideas, or when I simply don't know how I want to approach a problem. I can pitch it a feature idea the same way a product owner might pitch an idea to an engineer, and then iterate through sensible and intuitive ways of looking at the problem. Once I get a handle on how I'd like to implement a solution, I type up a spec and hand it off to o1 to crank out the code I'd intend to implement.
https://myswamp.substack.com/p/benchmarking-llms-against-com...
For coding, o1 is marvelous at Leetcode questions; I think it is the best teacher I could ever afford to teach me Leetcode. But I don't find myself having many other use cases for o1 that are complex and require a really long reasoning chain.
For example, I used the prompt "As an astronaut in China, would I be able to see the Great Wall?" and, since the training data for all LLMs is full of text dispelling the common myth that the Great Wall is visible from space, LLMs do not notice the slight variation that the astronaut is IN China. This has been a sobering reminder to me as discussion of AGI heats up.
Carefully analyze questions to not overlook subtle details. Take each question "as-is", don't guess what they mean -- interpret them as any reasonable person would.
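As a minimal sketch (assuming the OpenAI Python client and an illustrative model name), you could wire that instruction in as a system prompt and re-run the astronaut question, something like:

    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    system_prompt = (
        "Carefully analyze questions to not overlook subtle details. "
        'Take each question "as-is", don\'t guess what the asker means; '
        "interpret it as any reasonable person would."
    )

    resp = client.chat.completions.create(
        model="gpt-4o",  # illustrative; swap in whichever model you are testing
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": "As an astronaut in China, would I be able to see the great wall?"},
        ],
    )
    print(resp.choices[0].message.content)

No guarantee it fixes the failure mode, but it makes the comparison cheap to run.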
Really want to see the number of training pairs needed to achieve this score. If it only takes a few pairs, say 100, I would say it is amazing!
> ARC-AGI serves as a critical benchmark for detecting such breakthroughs, highlighting generalization power in a way that saturated or less demanding benchmarks cannot. However, it is important to note that ARC-AGI is not an acid test for AGI
It feels so insensitive to do that right before a major holiday, when the likely outcome is a lot of people feeling less secure in their career/job/life.
Thanks again openAI for showing us you don’t give a shit about actual people.
What a weird way to react to this.
https://www.transformernews.ai/p/richard-ngo-openai-resign-s...
>But while the “making AGI” part of the mission seems well on track, it feels like I (and others) have gradually realized how much harder it is to contribute in a robustly positive way to the “succeeding” part of the mission, especially when it comes to preventing existential risks to humanity.
Almost every single one of the people OpenAI had hired to work on AI safety have left the firm with similar messages. Perhaps you should at least consider the thinking of experts?
Yeah maybe black mirror but I'm not sure it's really my thing.
Many of us look forward to what a future with AGI can do to help humanity and hopefully change society for the better, mainly to achieve a post scarcity economy.
>But while the “making AGI” part of the mission seems well on track, it feels like I (and others) have gradually realized how much harder it is to contribute in a robustly positive way to the “succeeding” part of the mission, especially when it comes to preventing existential risks to humanity.
Almost every single one of the people OpenAI had hired to work on AI safety have left the firm with similar messages. Perhaps you should at least consider the thinking of experts? There is a real chance that this ends with significant good. There is also a real chance that this ends with the death of every single human being. That's never been a choice we've had to make before, and it seems like we as a species are unprepared to approach it.
You need to make these expensive things nearly free if you're going to speak of post scarcity.
Notably, the last key AI safety researcher just left OpenAI: https://www.transformernews.ai/p/richard-ngo-openai-resign-s...
>But while the “making AGI” part of the mission seems well on track, it feels like I (and others) have gradually realized how much harder it is to contribute in a robustly positive way to the “succeeding” part of the mission, especially when it comes to preventing existential risks to humanity.
Are you that upset that this guy chose to trust the people that OpenAI hired to talk about AI safety, on the topic of AI safety?
>> Second, you need the ability to recombine these functions into a brand new program when facing a new task – a program that models the task at hand. Program synthesis.
"Program synthesis" is here used in an entirely idiosyncratic manner, to mean "combining programs". Everyone else in CS and AI for the last many decades has used "Program Synthesis" to mean "generating a program that satisfies a specification".
Note that "synthesis" can legitimately be used to mean "combining". In Greek it translates literally to "putting [things] together": "syn" (with/plus) "thesis" (place). But while generating programs by combining parts of other programs is an old-fashioned way to do Program Synthesis in the standard sense, the end result is always desired to be a program. The LLMs used in the article to do what F. Chollet calls "Program Synthesis" generate no code.
Combining programs should be straightforward for DNNs, ordering, mixing, matching concepts by coordinates and arithmetic in learned high-dimensional embedded-space. Inference-time combination is harder since the model is working with tokens and has to keep coherence over a growing CoT with many twists, turns and dead-ends, but with enough passes can still do well.
The logical next step to improvement is test-time training on the growing CoT, using reinforcement-fine-tuning to compress and organize the chain-of-thought into parameter-space--if we can come up with loss functions for "little progress, a lot of progress, no progress". Then more inference-time with a better understanding of the problem, rinse and repeat.
I hypothesize that something similar is going on here. OpenAI has not published (or I have not seen) the number of reasoning tokens it took to solve these; we do know that each task cost thousands of dollars. If "a picture is worth a thousand words", could we make AI systems that reason visually with much better performance?
I wonder if anyone has experimented with having some sort of "visual" scratchpad instead of the "text-based" scratchpad that CoT uses.
I don't understand this mindset. We have all experienced that LLMs can produce words never spoken before. Thus there is recombination of knowledge at play. We might not be satisfied with the depth/complexity of the combination, but there isn't any reason to believe something fundamental is missing. Given more compute and enough recursiveness we should be able to reach any kind of result from the LLM.
The linked article says that LLMs are like a collection of vector programs. It has always been my thinking that computations in vector space are easy to make turing complete if we just have an eigenvector representation figured out.
That was always true for NNs in general, yet it took a very specific structure to get to where we are now. (..with a certain amount of time and resources.)
> thinking that computations in vector space are easy to make turing complete if we just have an eigenvector representation figured out
Sounds interesting, would you elaborate?
* The upper bound of compute/performance gains as we continue to iterate on LLMs. It simply isn't going to be feasible for a lot of engineers and businesses to run/train their own LLMs. This means an inherent reliance on cloud services to bridge the gap (something MS is clearly betting on), and on engineers to build/maintain the integration from these services to whatever business logic their customers are buying.
* Skilled knowledge workers continuing to be in-demand, even factoring in automation and new-grad numbers. Collectively, we've built a better hammer; it still takes someone experienced enough to know where to drive the nail. These tools WILL empower the top N% of engineers to be more productive, which is why it will be more important than ever to know _how_ to build things that drive business value, rather than just how to churn through JIRA tickets or turn a pretty Figma design into React.
In general, with the technology advancing as rapidly as it is, and the trillions of dollars oriented towards replacing knowledge work, I don't see a future in this field. And that's despite me being on a very promising path myself! I'm 25, in the middle of a CS PhD in Germany, with an impressive CV behind me. My head may be the last on the chopping block, but I'd be surprised if it buys me more than a few years once programmer obsolescence truly kicks in.
Indeed, what I think are safe jobs are jobs with fundamental human interaction. Nurses, doctors, kindergarten teachers. I myself have been considering pivoting to becoming a skiing teacher.
Maybe one good thing that comes out of this is breaking my "wunderkind" illusion. I spent my teens writing C++ code instead of going out socializing and making friends. Of course, I still did these things, but I could've been far less of a hermit.
I mirror your sentiment of spending these next few years living life; Real life. My advice: Stop sacrificing the now for the future. See the world, go on hikes with friends, go skiing, attend that bouldering thing your friends have been telling you about. If programming is something you like doing, then by all means keep going and enjoy it. I will likely keep programming too, it's just no longer the only thing I focus on.
Edit: improve flow of last paragraph
Kind of stemming from the mindspace "If they can build X, I can build X!"
I'd explicitly not look up tutorials, just so I'd have the opportunity to solve the mathematics myself. Like building a 3D physics engine. (I did look up collision detection after struggling with it for a month or so; inventing GJK is on another level.)
Feels like I hit the real world just a couple years too late to get situated in a solid position. Years of obsession in attempt to catch up to the wizards, chasing the tech dream. But this, feels like this is it. Just watching the timebomb tick. I'd love to work on what feels like the final technology, but I'm not a freakshow like what these labs are hiring. At least I get to spectate the creation of humanity's greatest invention.
This announcement is just another gut punch, but at this point I should expect its inevitable. A Jason Voorhees AGI, slowly but surely to devour all the talents and skills information workers have to offer.
Apologies for the rambly and depressing post, but this is reality for anyone recently out or still in school.
We are living in a world run by and for the soon-to-be dead, many of whom have dementia, so empathic policy and foresight are out of the question, and we're going to be picking up the incredibly broken scraps of our golden age.
And not to get too political but the mass restructuring of public consciousness and intellectual society due to mass immigration for an inexplicable gdp squeeze and social media is happening at exactly the wrong time to handle these very serious challenges. The speed at which we've undone civil society is breakneck, and it will go even further, and it will get even worse. We've easily gone back 200 years in terms of emotional intelligence in the past 15.
Everything seems so uncertain, and the pace of technological advancement makes long-term planning feel almost impossible. Your plan to move to a slower-paced area and enjoy the outdoors sounds incredibly grounding - it's something I've been considering myself.
AI can be anywhere any time with cloud compute.
But below is reality talk. With Claude 3.5, I already think it is a better programmer than I at micro level tasks, and a better Leetcode programmer than I could ever be.
I think it is like modern car manufacturing: the robots build most of the components, but I can't see how humans could be dismissed from the process of overseeing the output.
o3 has been very impressive in achieving 70+ on SWE-bench, for example, but this also means that even when it is trained on the codebase multiple times, so visibility isn't an issue, it still has a ~30% chance of failing the unit tests.
A fully autonomous system can’t be trusted, the economy of software won’t collapse, but it will be transformed beyond our imagination now.
I will for sure miss the days when writing code, or coder is still a real business.
How time flies
The code part will get smaller and smaller for most folks. Some frameworks or bare-metal people or intense heavy-lifters will still do manual code or pair-programming where half the pair is an agentic AI with super-human knowledge of your org's code base.
But this will be a layer of abstraction for most people who build software. And as someone who hates rote learning, I'm here for it. IMO.
Unfortunately (?) I think the 10, 20, 50(?) years of development experience you might bring to bear on the problems can be superseded by an LLM fine-tuned on Stack Overflow, GitHub, etc. once judgement and needle-in-a-haystack retrieval are truly nailed. Because it can have all the knowledge you have accumulated and soaked into semi-conscious instinct, which you use so well you aren't even aware of it except that it works. It can have that a million times over. Actually. Which is both amazing and terrifying. Currently this isn't obvious because its accuracy/judgement for learning all those life-of-a-dev lessons is almost non-existent. Currently. But it will happen. That is Copilot's future. Its raison d'être.
I would argue what it will never have however, simply by function of the size of training runs is unique functional drive and vision. If you wanted a "Steve Jobs" AI you would have to build it. And if you gave it instructions to make a prompt/framework to build a "Jobs" it would just be an imitation, rather than a new unique in-context version. That is the value a person has- their particular filter, their passion and personal framework. Someone who doesn't have any of those things, they had better be hoping for UBI and charity. Or go live a simple life, outside the rat race.
bows
But unlike the abacus/calculators i don't feel like we're at a point in history where society is getting wiser and more empathetic, and these new abilities are going towards something good.
But supervisors of tasks will remain because we're social, untrusting, and employers will always want someone else to blame for their shortcomings. And humans will stay in the chain at least for marketing and promotion/reputation because we like our japanese craftsman and our amg motors made by one person.
Perhaps what I need is actually a steady stream of food - i.e. buy some land and oxen and solar panels while I can.
For what it's worth that's probably an advantage versus the legions of people who are staring down the barrel of years invested into skills that may lose relevance very rapidly.
With LLMs or without LLMs, the world will keep turning. Humans will still be writing amazing works of literature, creating beautiful art, carrying out scientific experiments and discovering new species.
There are way more data analysts now than when it required paper and pencil.
It's like my life is forfeit to fixing other people's mistakes, because they're so glaring and I feel an obligation. Maybe that's the way the world has always been, but it's a concerning future right now.
So, next step in reasoning is open world reasoning now?
If we're inferring the answers to the block patterns from minimal or no additional training, it's very impressive, but how much time have they had to work on o3 after sharing puzzle data with o1? Seems there's some room for questionable antics!
Why is the ARC challenge difficult but coding problems are easy? The two examples they give for ARC (border width and square filling) are much simpler than pattern awareness I see simple models find in code everyday.
What am I misunderstanding? Is it that one is a visual grid context which is unfamiliar?
We have an enormous amount of high-quality programming samples. From there it's relatively straightforward to bootstrap (similar to the original versions of AlphaGo: start with human games, improve via self-play) using Leetcode or other problems with a "right answer".
In contrast, the arc puzzles are relatively novel (why? Well, this has to do with the relative utility of solving an arc problem and programmer open source culture)
While there are those that are excited, the world is not prepared for the level of distress this could put on the average person without critical changes at a monumental level.
That would bug me, if I were you.
You can't bullshit your way through this particular benchmark. Try it.
And yes, they're wrong. The latest/greatest models "make shit up" perhaps 5-10% as frequently as we were seeing just a couple of years ago. Only someone who has deliberately decided to stop paying attention could possibly argue otherwise.
I have noticed it's great in the hands of marketers and scammers, however. Real good at those "jobs", so I see why the cryptobros have now moved onto hailing LLMs as the next coming of jesus.
I do find, however, that the newer the model the fewer elementary mistakes it makes, and the better it is at figuring out what I really want. The process of getting the right answer or the working function continues to become less frustrating over time, although not always monotonically so.
o1-pro is expensive and slow, for instance, but its performance on tasks that require step-by-step reasoning is just astonishing. As long as things keep moving in that direction I'm not going to complain (much).
/s
"It is ceasing to be a matter of how we think about technics, if only because technics is increasingly thinking about itself. It might still be a few decades before artificial intelligences surpass the horizon of biological ones, but it is utterly superstitious to imagine that the human dominion of terrestrial culture is still marked out in centuries, let alone in some metaphysical perpetuity. The high road to thinking no longer passes through a deepening of human cognition, but rather through a becoming inhuman of cognition, a migration of cognition out into the emerging planetary technosentience reservoir, into 'dehumanized landscapes ... emptied spaces' where human culture will be dissolved. Just as the capitalist urbanization of labour abstracted it in a parallel escalation with technical machines, so will intelligence be transplanted into the purring data zones of new software worlds in order to be abstracted from an increasingly obsolescent anthropoid particularity, and thus to venture beyond modernity. Human brains are to thinking what mediaeval villages were to engineering: antechambers to experimentation, cramped and parochial places to be.
[...]
Life is being phased-out into something new, and if we think this can be stopped we are even more stupid than we seem." [0]
Land is being ostracized for some of his provocations, but it seems pretty clear by now that we are in the Landian Accelerationism timeline. Engaging with his thought is crucial to understanding what is happening with AI, and what is still largely unseen, such as the autonomization of capital.
Sure, there will be growing pains, friction, etc. Who cares? There always is with world-changing tech. Always.
That's right. Who cares about pains of others and why they even should are absolutely words to live by.
What you are likely doing, though, is making many more future humans pay a cost in suffering. Every day we delay longevity escape velocity is another 150k people dead.
I'd rather just... not die. Not unless I want to. Same for my loved ones. That's far more important than "wealth inequality."
Senescence is an adaptation.
"You should die because cities will get crowded" is a less terrible argument but still a bad one. We have room for at least double our population on this planet, couples choosing longevity can be required to have <=1 children until there is room for more, we will eventually colonize other planets, etc.
All this is implying that consciousness will continue to take up a meaningful amount of physical space. Not dying in the long term implies gradual replacement and transfer to a virtual medium at some point.
If you take this as an axiom, it will always be true ;).
“Oh well, I guess I can’t give the opportunities to my kid that I wanted, but at least humanity is growing rapidly!”
Everyone has always worried about this for every major technology throughout history
IMO AGI will dramatically increase comfort levels, lower your chance of dying, death, disease, etc.
People aren't really in an uproar yet, because implementations haven't affected the job market of the masses. Afterwards? Time will tell.
Like in general I totally agree with you, but I also understand why a person would care about their loved ones and themselves first.
Beyond immediate increase in inequality, which I agree could be worth it in the long run if this was the only problem, we're playing a dangerous game.
The smartest and most capable species on the planet that dominates it for exactly this reason, is creating something even smarter and more capable than itself in the hope it'd help make its life easier.
Hmm.
>But while the “making AGI” part of the mission seems well on track, it feels like I (and others) have gradually realized how much harder it is to contribute in a robustly positive way to the “succeeding” part of the mission, especially when it comes to preventing existential risks to humanity.
Almost every single one of the people OpenAI had hired to work on AI safety have left the firm with similar messages. Perhaps you should at least consider the thinking of experts?
You and I will likely not live to see much of anything past AGI.
The people experiencing the growing pains, friction, etc.
For one, I found AI coding to work best in a small team, where there is an understanding of what to build and how to build it, usually in close feedback loop with the designers / users. Throw the usual managerial company corporate nonsense on top and it doesn't really matter if you can instacreate a piece of software, if nobody cares for that piece of software and it's just there to put a checkmark on the Q3 OKR reports.
Furthermore, there is a lot of software to be built out there, for people who can't afford it yet. A custom POS system for the local baker so that they don't have to interact with a computer. A game where squids eat algae for my nephews at Christmas. A custom photo layout program for my dad, who despairs at InDesign. A plant watering system for my friend. A local government information website for older citizens. Not only can these be built at a fraction of the cost they were before, they can be built in a manner where the people using the software are directly involved in creating it. Maybe they can get an 80% hacked version together if they are technically inclined. I can add the proper database backend and deployment infrastructure. Or I can sit with them and iterate on the app as we are talking. It is also almost free to create great documentation; in fact, LLM development is most productive when you turn software engineering best practices up to 11.
Furthermore, I found these tools incredible for actively furthering my own fundamental understanding of computer science and programming. I can now skip the stuff I don't care to learn (is it foobarBla(func, id) or foobar_bla(id, func)) and put the effort where I actually get a long-lived return. I have become really ambitious with the things I can tackle now, learning about all kinds of algorithms and operating system patterns and chemistry and physics etc... I can also create documents to help me with my learning.
Local models are now entering the phase where they are getting to be really useful, definitely > gpt3.5 which I was able to use very productively already at the time.
Writing (creating? manifesting? I don't really have a good word for what I do these days) software that makes me and the real humans around me happy is extremely fulfilling, and has alleviated most of my angst around the technology.
Assuming for a moment that the cost per task has a linear relationship with compute, then it costs a little more than $1 million to get that score on the public eval.
The results are cool, but man, this sounds like such a busted approach.
It feels a lot less like the breakthrough when the solution looks so much like simply brute-forcing.
But you might be right, who cares? Does it really matter how crude the solution is if we can achieve true AGI and bring the cost down by increasing the efficiency of compute?
That’s the thing that’s interesting to me though and I had the same first reaction. It’s a very different problem than brute-forcing chess. It has one chance to come to the correct answer. Running through thousands or millions of options means nothing if the model can’t determine which is correct. And each of these visual problems involve combinations of different interacting concepts. To solve them requires understanding, not mimicry. So no matter how inefficient and “stupid” these models are, they can be said to understand these novel problems. That’s a direct counter to everyone who ever called these a stochastic parrot and said they were a dead-end to AGI that was only searching an in distribution training set.
The compute costs are currently disappointing, but so was the cost of sequencing the first whole human genome. That went from 3 billion to a few hundred bucks from your local doctor.
That is roughly 10^9 times more compute required, or roughly the US military budget per half an hour, to get the intelligence of 1 (!) STEM graduate (not any kind of superhuman intelligence).
Of course, algorithms will get better, but this particular approach feels like wading in a plateau of efficiency improvements, very, very far down the X axis.
You'll know AGI is here when traditional captchas stop being a thing due to their lack of usefulness.
Some folks say we could fix this with universal basic income, where everyone gets enough money to live on, but I'm not optimistic that it'll be an easy transition. Plus, there's this possibility that whoever controls these 'AGI' systems basically controls everything. We definitely need to figure this stuff out before it hits us, because once these changes start happening, they're probably going to happen really fast. It's kind of like we're building this awesome but potentially dangerous new technology without really thinking through how it's going to affect regular people's lives. I feel like we need a parachute before we attempt a skydive. Some people feel pretty safe about their jobs and think they can't be replaced. I don't think that will be the case. Even if AI doesn't take your job, you now have a lot more unemployed people competing for the same job that is safe from AI.
I'll get concerned when it stops sucking so hard. It's like talking to a dumb robot. Which it unsurprisingly is.
Do people refuse to buy from stores that sell goods manufactured with slave labour?
Most people don't care; if AI businesses offer goods/services at lower cost, people will vote with their wallets, not their principles.
Besides, AI researchers failed to make anything like a real Chatbot until recently, yet they've been trying since the Eliza days. I'm willing to put in at least as much effort as them.
It makes sense because tree search can be endlessly optimized. In a sense, LLMs turn the unstructured, open system of general problems into a structured, closed system of possible moves. Which is really cool, IMO.
It's good for games with clear signal of success (Win/Lose for Chess, tests for programming). One of the blocker for AGI is we don't have clear evaluation for most of our tasks and we cannot verify them fast enough.
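That "clear signal" point is what makes verifier-guided sampling work at all. A minimal sketch, with generate_candidate and run_tests as hypothetical stand-ins for a model call and a sandboxed test runner:

    from typing import Callable, Optional

    def best_of_n(generate_candidate: Callable[[str], str],
                  run_tests: Callable[[str], bool],
                  prompt: str,
                  n: int = 64) -> Optional[str]:
        """Sample up to n candidate solutions and return the first one the verifier accepts."""
        for _ in range(n):
            candidate = generate_candidate(prompt)  # one sampled solution, e.g. a patch
            if run_tests(candidate):                # cheap, unambiguous success signal
                return candidate
        return None  # no verified solution within the sampling budget

Without that cheap, trustworthy run_tests step, the loop has nothing to select on, which is exactly the blocker for most non-game, non-coding tasks.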
A bit puzzling to me. Why does it matter ?
In reality it seems to be a bit of both - there is some general intelligence based on having been "trained on the internet", but it seems these super-human math/etc skills are very much from them having focused on training on those.
Francois Chollet mentioned that the test tries to avoid curve fitting (which he states is the main ability of LLMs). However, they specifically restricted the number of examples to do this. It is not beyond the realms of possibility that many examples could have been generated by hand though, and that the curve fitting has been achieved, rather than discrete programming.
Anyway, it’s all supposition. It’s difficult to know how genuine the results is, without knowledge of how it was actually achieved.
> My mental model for LLMs is that they work as a repository of vector programs. When prompted, they will fetch the program that your prompt maps to and "execute" it on the input at hand. LLMs are a way to store and operationalize millions of useful mini-programs via passive exposure to human-generated content.
I found this such an intriguing way of thinking about it.
Not so sure - but we might need to figure out the inference/search/evaluation strategy in order to provide the data we need to distill to the single forward-pass data fitting.
Serious question. I've browsed around, looked for the official release, but it seems to be just hear-say for now, except for the few little bits in the ARC-AGI article.
So some of the reactions seem quite far-fetched. I was quite amazed at first, seeing the benchmarks, but then I actually read the ARC-AGI article and a few other things about how it worked, learned a bit more about the different benchmarks, and realised we have no proper idea yet how o3 works under the hood; the thing isn't even released.
It could be doing the same thing that chess-engines do except in several specific domains. Which would be very cool, but not necessarily "intelligent" or "generally intelligent" in any sense whatsoever! Will that kind of model lead to finding novel mathematical proofs, or actually "reasoning" or "thinking" in any way similar to a human, remains entirely uncertain.
Of course is a chance we will find ourselves in Utopia, but yeah, a chance.
(I know very little about the guts of LLMs or how they're tested, so the distinction between "raw" output and the more deterministic engineering work might be incorrect)
I’m wondering when you strip out all that “extra” non-model pre and post processing, if there’s someway to measure performance of that.
Was it zero-shot at least and Pass@1 ? I guess it was not zero-shot, since it shows examples of other similar problems and their solutions. It also sounds like it was fine-tuned on that specific task.
Look, maybe this shows that it could soon be used to replace some MTurk-style workers, but I don't know that that counts as AGI. To me, AGI needs to be able to solve novel problems, to adapt to all situations without fine-tuning, and to operate at much larger dimensions: don't make it a grid of pixels, make it 4k images at least.
Even if productivity skyrockets, why would anyone assume the dividends would be shared with the "destroy[ed] middle class"?
All indications will be this will end up like the China Shock: "I lost my middle class job, and all I got was the opportunity to buy flimsy pieces of crap from a dollar store." America lacks the ideological foundations for any other result, and the coming economic changes will likely make building those foundations even more difficult if not impossible.
Huh? I'm not sure exactly what you're talking about, but mere "access to the financial system" wouldn't remedy anything, because of inequality, etc.
To survive the shock financially, I think one would have to have at least enough capital to be a capitalist.
Unless something changes, if I was a billionaire I would be ecstatic at the moment. Now even the impossible seems potentially possible if this delivers on its promises (e.g. go to Mars, build a utopia for my inner circle, etc). I no longer need other people to have everything. Previously there was no point in money if I didn't have a place to spend it/people to accept it. Now with real assets I can use AI/machines to do what I want - I no longer need "money" or more accurately other people to live a very wealthy life.
Again this is all else being equal. Lots of other things could change, but with increasing surveillance by use of technology I doubt large revolutions/etc will ever get the chance to get off the ground or have the scale to be effective.
Interesting times.
But I gotta say, we must be saturating just about any zero-shot reasoning benchmark imaginable at this point. And we will still argue about whether this is AGI, in my opinion because these LLMs are forgetful and it's very difficult for an application developer to fix that.
Models will need better ways to remember and learn from doing a task over and over. For example, let's look at code agents: the best we can do, even with o3, is to cram as much of the codebase as we can fit into a context window. And if it doesn't fit, we branch out to multiple models to prune the context window until it does fit. And here's the kicker: the second time you ask it to do something, this all starts over from zero again. With this amount of reasoning power, I'm hoping session-based learning becomes the next frontier for LLM capabilities.
(There are already things like tool use, linear attention, RAG, etc that can help here but currently they come with downsides and I would consider them insufficient.)
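For what it's worth, that "cram as much of the codebase as fits" step is roughly the following (a minimal sketch; it assumes tiktoken for token counting and that files are already ranked by relevance, which is the hard part):

    import tiktoken

    def pack_context(ranked_files: list[tuple[str, str]], budget_tokens: int = 100_000) -> str:
        """Greedily pack (path, text) pairs into one prompt without exceeding the token budget."""
        enc = tiktoken.get_encoding("cl100k_base")
        packed, used = [], 0
        for path, text in ranked_files:
            cost = len(enc.encode(text))
            if used + cost > budget_tokens:
                continue  # a real agent would summarize or prune instead of just skipping
            packed.append(f"# file: {path}\n{text}")
            used += cost
        return "\n\n".join(packed)

And, per the comment above, nothing here persists between requests: the packing starts over from zero every time.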
Now I am wondering what Anthropic will come up with. Exciting times.
The Kaggle SOTA performs 2x as well as o1 high at a fraction of the cost
There's also a point on the figure marked "Kaggle SOTA", around 60%. I can't find any explanation for that, but I guess it's the best individual Kaggle solution.
The Kaggle solutions would probably score higher with more compute, but nobody has any incentive to spend >$1M on approaches that obviously don't generalize. OpenAI did have this incentive to spend tuning and testing o3, since it's possible that will generalize to a practically useful domain (but not yet demonstrated). Even if it ultimately doesn't, they're getting spectacular publicity now from that promise.
I wonder what exactly o3 costs. Does it still spend a terrible amount of time thinking, despite being finetuned to the dataset?
> Of course, such generality comes at a steep cost, and wouldn't quite be economical yet: you could pay a human to solve ARC-AGI tasks for roughly $5 per task (we know, we did that), while consuming mere cents in energy. Meanwhile o3 requires $17-20 per task in the low-compute mode.
If we feel like we've really "hit the ceiling" RE efficiency, then that's a different story, but I don't think anyone believes this at this time.
If an ensemble of low-compute Kaggle solutions already scores 81%, then why is o3's 75.7% considered such a breakthrough?
Please share. I’m compiling a list.
85% is just the (semi-arbitrary) threshold for the winning the prize.
o3 actually beats the human average by a wide margin: 64.2% for humans vs. 82.8%+ for o3.
...
Here's the full breakdown by dataset, since none of the articles make it clear --
Private Eval:
- 85%: threshold for winning the prize [1]
Semi-Private Eval:
- 87.5%: o3 (unlimited compute) [2]
- 75.7%: o3 (limited compute) [2]
Public Eval:
- 91.5%: o3 (unlimited compute) [2]
- 82.8%: o3 (limited compute) [2]
- 64.2%: human average (Mechanical Turk) [1] [3]
Public Training:
- 76.2%: human average (Mechanical Turk) [1] [3]
...
References:
[1] https://arcprize.org/guide
    def letter_count(string, letter):
        if string == "strawberry" and letter == "r":
            return 3
        …
> This is significant, but I am doubtful it will be as meaningful as people expect aside from potentially greater coding tasks. Without a 'world model' that has a contextual understanding of what it is doing, things will remain fundamentally throttled.
I don't think this is AGI; nor is it something to scoff at. It's impressive, but it's also not human-like intelligence. Perhaps human-like intelligence is not the goal, since that would imply we have even a remotely comprehensive understanding of the human mind. I doubt the mind operates as a single unit anyway; a human's first words are "Mama," not "I am a self-conscious freely self-determining being that recognizes my own reasoning ability and autonomy." And the latter would be easily programmable anyway. The goal here might, then, be infeasible: the concept of free will is a kind of technology in and of itself; it has already augmented human cognition. How will these technologies not augment the "mind" such that our own understanding of our consciousness is altered? And why should we try to determine ahead of time what will hold weight for us, why the "human" part of the intelligence will matter in the future? Technology should not be compared to the world it transforms.
"low compute" mode: Uses 6 samples per task, Uses 33M tokens for the semi-private eval set, Costs $17-20 per task, Achieves 75.7% accuracy on semi-private eval
The "high compute" mode: Uses 1024 samples per task (172x more compute), Cost data was withheld at OpenAI's request, Achieves 87.5% accuracy on semi-private eval
Can we just extrapolate $3kish per task on high compute? (wondering if they're withheld because this isn't the case?)
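For what it's worth, a naive linear extrapolation from the published low-compute numbers does land in that ballpark, assuming cost scales with the number of samples (which OpenAI hasn't confirmed):

    # Back-of-envelope only: assumes cost scales linearly with samples per task.
    low_cost_per_task = (17, 20)   # USD, at 6 samples per task
    compute_multiplier = 172       # "172x more compute" for the 1024-sample mode

    high_estimate = tuple(c * compute_multiplier for c in low_cost_per_task)
    print(high_estimate)           # (2924, 3440) -> roughly $2.9k-$3.4k per task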
Doesn't seem like such a massive breakthrough when they are throwing so much compute at it, particularly as this is test-time compute. It just isn't practical at all; you are not getting this level with a ChatGPT subscription, even the new $200-a-month option.
So... Next year this tech will most likely be quite a bit cheaper.
GPT-3 may have massively reduced in cost, but its requirements were nowhere near as extreme as this.
But only OpenAI really knows how the cost would scale for different tasks. I'm just making (poor) speculation
No, we won't. All that will tell us is that the abilities of the humans who have attempted to discern the patterns of similarity among problems difficult for auto-regressive models have once again failed us.
We'd know if we had AGIs in the real world since we have plenty of examples from fiction. What we have instead are tools. Steven Spielberg's androids in the movie AI would be at the boundary between the two. We're not close to being there yet (IMO).
"Our programs compilation (AI) gave 90% of correct answers in test 1. We expect that in test 2 quality of answers will degenerate to below random monkey pushing buttons levels. Now more money is needed to prove we hit blind alley."
Hurray! Put a limited version of that on everybody's phones!
That may be a feature. If AI becomes too cheap, the over-funded AI companies lose value.
(1995 called. It wants its web design back.)
The current US auto industry is an example of that strategy. So is the current iPhone.
> Moreover, ARC-AGI-1 is now saturating – besides o3's new score, the fact is that a large ensemble of low-compute Kaggle solutions can now score 81% on the private eval.
So while these tasks get greatest interest as a benchmark for LLMs and other large general models, it doesn't yet seem obvious those outperform human-designed domain-specific approaches.
I wonder to what extent the large improvement comes from OpenAI training deliberately targeting this class of problem. That result would still be significant (since there's no way to overfit to the private tasks), but would be different from an "accidental" emergent improvement.
Especially in medicine, the amount of data is ridiculously small and noisy. Maybe creating foundational models in mice and rats and fine-tuning them on humans is something that will be tried.
There's a ~3 month delay between o1's launch (Sep 12) and o3's launch (Dec 20). But, it's unclear when o1 and o3 each finished training.
I believe that we should explore pretraining video completion models that explicitly have no text pairings. Why? We can train unsupervised, like they did for the GPT series on the text internet, but on YouTube instead, lol. Labeling or augmenting the frames limits how well the training data scales.
Imagine using the initial frames or audio to prompt the video completion model. For example, use the initial frames to write out a problem on a whiteboard, then watch the output generate the next frames with the solution being worked out.
I fear text pairings with CLIP or OCR constrain a model too much and confuse it.
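As a toy illustration of what that objective looks like: nothing here is a real video model (the "frames" are random arrays and the "model" is a single linear map), it's just to show that next-frame prediction needs no labels, and that "prompting" is conditioning on initial frames and rolling forward:

    import numpy as np

    # Toy next-frame prediction with no text pairings: the only signal is the video itself.
    T, H, W = 16, 8, 8                      # a tiny fake "video": 16 frames of 8x8
    video = np.random.rand(T, H * W)

    rng = np.random.default_rng(0)
    weights = rng.normal(scale=0.01, size=(H * W, H * W))   # stand-in "model"

    lr = 0.01
    for step in range(200):
        past, target = video[:-1], video[1:]          # predict frame t+1 from frame t
        pred = past @ weights
        grad = past.T @ (pred - target) / len(past)   # gradient of mean squared error
        weights -= lr * grad

    # "Prompting": condition on an initial frame and generate a continuation.
    frame = video[0]
    for _ in range(5):
        frame = frame @ weights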
Can machines be more human-like in their pattern recognition? O3 met this need today.
While this is some form of accomplishment, it's nowhere near the scientific and engineering problem solving needed to call something truly artificial (human-like) intelligent.
What’s exciting is that these reasoning models are making significant strides in tackling eng and scientific problem-solving. Solving the ARC challenge seems almost trivial in comparison to that.
https://news.ycombinator.com/item?id=42344336
And that answers my question about fchollet's assurances that LLMs without TTT (Test Time Training) can't beat ARC AGI:
[me] I haven't had the chance to read the papers carefully. Have they done ablation studies? For instance, is the following a guess or is it an empirical result?
[fchollet] >> For instance, if you drop the TTT component you will see that these large models trained on millions of synthetic ARC-AGI tasks drop to <10% accuracy.
Interestingly, Bongard problems do not have a private test set, unlike ARC-AGI. Can that be because they don't need it? Is it possible that Bongard Problems are a true test of (visual) reasoning that requires intelligence to be solved?
Ooooh! Frisson of excitement!
But I guess it's just that nobody remembers them and so nobody has seriously tried to solve them with Big Data stuff.
And not the other way around as some comments here seem to confuse necessary and sufficient conditions.
Here's my AGI test - Can the model make a theory of AGI validation that no human has suggested before, test itself to see if it qualifies, iterate, read all the literature, and suggest modifications to its own network to improve its performance?
That's what a human-level performer would do.
State actors like Russia, US and Israel will probably be fast to adopt this for information control, but I really don’t want to live in a world where the average scammer has access to this tech.
Reality check: local open source models are more than capable of information control, generating propaganda, and scamming you. The cat's been out of the bag for a while now, and increased reasoning ability doesn't dramatically increase the weaponizability of this tech, I think.
I have no idea what to specialize in, what skills I should master, or where I should be spending my time to build a successful career.
Seems like we’re headed toward a world where you automate someone else’s job or be automated yourself.
It's not encouraging from the point of view of studying hard but the evolution of work the past 40 years seems to show that your field probably won't be your field quite exactly in just a few years. Not because your field will have been made irrelevant but because you will have moved on. Most likely that will be fine, you will learn more as you go, hopefully moving from one relevant job to the next very different but still relevant job. Or straight out of school you will work in very multi-disciplinary jobs anyway where it will seem not much of what you studied matters (it will but not in obvious ways.)
Certainly if you were headed into a very specific job which seems obviously automatable right now (as opposed to one where the tools will be useful), don't do THAT. Like, don't train as a typist as the core of your job in the middle of the personal computer revolution, or don't specialize in hand-drawing IC layouts in the middle of the CAD revolution unless you have a very specific plan (court reporting? DRAM?)
The technical act of solving well-defined problems has traditionally been considered the easy part. The role of a technical expert has always been asking the right questions and figuring out the exact problem you want to solve.
As long as AI just solves problems, there is room for experts with the right combination of technical and domain skills. If we ever reach the point where AI takes the initiative and makes human experts obsolete, you will have far bigger problems than career.
One thing that isn’t clear is how much agency AGI will have (or how much we’ll want it to have). We humans have our agency biologically programmed in—go forth and multiply and all that.
But the fact that an AI can theoretically do any task doesn’t mean it’s actually going to do it, or do anything at all for that matter, without some human telling it in detail what to do. The bull case for humans is that many jobs just transition seamlessly to a human driving an AI to accomplish similar goals with a much higher level of productivity.
And worrisome, because school propaganda, for example, shows that "saving the planet" is the only ethical goal for anyone. If AGIs latch on to that, if it becomes their religion, humans are in trouble. But for now, AGIs' self-chosen goals are anyone's guess (with cool ideas in sci-fi).
I argue that CAD was a general solution - which still demanded people who knew what they wanted and what they were doing. You can screw around with excellent tools for a long time if you don't know what you are doing. The tool will give you a solution - to the problem that you mis-stated.
I argue that globalisation was a general solution. And it still demanded people who knew what they were doing to direct their minions in far flung countries.
I argue that the purpose of an education is not to learn a specific programming language (for example). It's to gain some understanding of what's going on (in computing), (in engineering), (in business), (in politics). This understanding is portable and durable.
You can do THAT - gain some understanding - and that is portable. I don't contest that if broader AGI is achieved for cheap soon, the changes won't be larger than that from globalisation. If the AGIs prioritize heading to Mars, let them (See Accelerando) - they are not relevant to you anymore. Or trade between them and the humans. Use your beginning of an understanding of the world (gained through this education) to find something else to do. Same as if you started work 2 years ago and want to switch jobs. Some jobs WILL have disappeared (pool typist). Others will use the AGIs as tools because the AGIs don't care or are too clueless about THAT field. I have no idea which fields will end up with clueless AGIs. There is no lack of cluelessness in the world. Plenty to go around even with AGIs. A self-respecting AGI will have priorities.
It doesn't matter if you are bad at using the tool if the AGI can just effectively use it for you.
From there it's a simple leap to the AGI deciding to eliminate this human distraction (inefficient, etc.)
Yet GPT doesn’t even get past step 1 of doing something unprompted in the first place. I’ll become worried when it does something as simple as deciding to start a small business and actually does the work.
Look at the Hacker News comments on alignment faking and how "fake" of a problem that really is. It's just more reacting to inputs and trying to align them with previous prompts.
also https://mashable.com/article/chatgpt-messaging-users-first-o...
Of course it's also yet another case where the AI takes over the creative part and leaves us with the mundane part...
Yes a new tool is coming out and will be exponentially improving.
Yes the nature of work will be different in 20 years.
But don’t you still need to understand the underlying concepts to make valid connections between the systems you’re using and drive the field (or your company) forward?
Or from another view, don’t we (humanity) need people who are willing to do this? Shouldn’t there be a valid way for them to be successful in that pursuit?
Except the nature of work has ALREADY changed. You don't study for one specific job if you know what's good for you. You study to start building an understanding of a technical field. The grand parent was going for a mix of mechanical engineering and sales (human understanding). If in mechanical engineering, they avoided "learning how to use SolidWorks" and instead went for the general principles of materials and motion systems with a bit of SolidWorks along the way, then they are well on their way with portable, foundation, long term useful stuff they can carry from job to job, and from employer to employer, into self-employment too, from career to next career. The nature of work has already changed in that nobody should study one specific tool anymore and nobody should expect their first employer or even technical field to last more than 2-6 years. It might but probably not.
We do need people who understand how the world works. Tall order. That's for much later and senior in a career. For school purposes we are happy with people who are starting their understanding of how their field works.
Aren't we agreeing?
Most of the blacksmiths in the 19th century drank themselves to death after the industrial revolution. The US culture isn't one of care... Point is, it's reasonable to be sad and afraid of change, and to think carefully about what to specialize in.
That said... we're at the point of diminishing returns in LLM, so I doubt any very technical jobs are being lost soon. [1]
[1] https://techcrunch.com/2024/11/20/ai-scaling-laws-are-showin...
This is hyperbolic and a dramatic oversimplification and does not accurately describe the reality of the transition from blacksmithing to more advanced roles like machining, toolmaking, and working in factories. The 19th century was a time of interchangeable parts (think the North's advantage in the Civil War) and that requires a ton of mechanical expertise and precision.
Many blacksmiths not only made the transition to machining, but there also weren't enough blacksmiths to fill the bevy of new jobs that were available. Education expanded to fill those roles. Traditional blacksmithing didn't vanish either; even specialized roles like farriery and ornamental ironwork expanded.
What evidence are you basing this statement from? Because, the article you are currently in the comment section of certainly doesn't seem to support this view.
On the plus side, LLMs don't bring us closer to that dystopia: if unlimited knowledge(tm) ever becomes just One Prompt Away it won't come from OpenAI.
Lots of people die for reason X then the world moves on without them.
This would mean the final victory of capital over labor. The 0.01% of people who own the machines that put everyone out of work will no longer have use for the rest of humanity, who will most likely be liquidated.
> [deleted]: I've wondered about this for a while-- how can such an employment-centric society transition to that utopia where robots do all the work and people can just sit back?
> appleseed1234: It won't, rich people will own the robots and everyone else will eat shit and die.
https://www.reddit.com/r/TrueReddit/comments/k7rq8/are_jobs_...
There will be a dedicated caste of people to take care of the machines that do 90% of the work, and of "the rich".
Nobody else is needed. District 9, but for people. Imagine the whole world collapsing like Venezuela.
You are no longer needed. The best option is to learn how to survive and grow your own food, but they want to make that illegal too - look at the EU...
We’re a long way from that, if we ever get there, and I say this as someone who pays for ChatGPT plus because, in some scenarios, it does indeed make me more productive, but I don’t see your future anywhere near.
And if machines ever get good enough to do all the things I mentioned plus the ones I didn’t but would fit in the same list, it’s not the ultra rich that wouldn’t need us, it’s the machines that wouldn’t need any of us, including the ultra rich.
Venezuela is not collapsing because of automation.
You can even calculate the average number of people that can be operated on before harm occurs: number needed to harm (NNH). If NNH(AI) > NNH(humans), it becomes impossible to recommend that patients submit to surgery at the hands of human surgeons. It is that simple.
If we discover that AI surgeons harm one in every 1000 patients while human surgeons harm one in every 100 patients, human surgeons are done.
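Treating NNH loosely as 1/harm-rate for each kind of surgeon (the textbook definition uses the absolute risk difference against a comparator), the hypothetical figures above work out as:

    # Hypothetical rates from the comment above, not real data.
    nnh_ai = 1 / (1 / 1000)     # AI harms 1 in 1000 patients  -> NNH = 1000
    nnh_human = 1 / (1 / 100)   # humans harm 1 in 100 patients -> NNH = 100

    print(nnh_ai > nnh_human)   # True: more patients treated per harm, so AI is the safer pick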
And the opposite holds, if the AI surgeon is worse (great for 80%, but sucks at the edge cases for example) then that's it. Build a better one, go through attempts at certification, but now with the burden that no one trusts you.
The assumption, and a common one by the look of this whole thread, that ChatGPT, Sora and the rest represent the beginning of an inevitable march towards AGI seems incredibly baseless to me. It's only really possible to make the claim at all because we know so little about what AGI is that we can project qualities we imagine it would have onto whatever we have now.
It's not going to hold forever though. I'm certain about that. Hopefully it will keep holding until I die. The world is dystopian enough already.
AGI can replace capitalists just as much as laborers.
I guess this could be a facet of whether you see economic advantage as a legal conceit or a difference in productivity/capability.
Does a billionaire stop being wealthy if they hire a money manager and spend the rest of their lives sipping drinks on the beach?
"Legally" will have to mop up now and then, but for now the basics are already in place.
I was responding to this. Yes, an AGI could hire someone to do the stuff - but she needs money, and the ability to hire and sign contracts, for that. And once she can do that, she probably doesn't need to hire someone to do it, since she is already doing it. This is not about capital versus labor or money management. This is about agency, ownership and AGI.
(With legality far far down the list.)
People - and by people I mean government - have tremendous power over capitalists and can force the entire market, granted that the government is still serving its people.
This is my view but with a less positive spin: you are not going to be the only person whose livelihood will be destroyed. It's going to be bad for a lot of people.
So at least you'll have a lot of company.
Even if our civilization transforms into an AI robotic utopia, it’s not going to do so overnight. We’re the ones who get to build the infrastructure that underpins it all.
If AI turns out dependent on human input and feedback, then we will still have jobs. Or maybe - AI automates many jobs, but at the same time expands the operational domain to create new ones. Whenever we have new capabilities we compete on new markets, and a hybrid human+AI might be more competitive than AI alone.
But we've got to temper these singularitarian expectations with reality - it takes years to scale up chip and energy production to achieve significant workforce displacement. It takes even longer to gain social, legal and political traction; people will be slow to adopt in many domains. Some people still avoid using cards for payment, and some still use fax to send documents; we can be pretty stubborn.
How will these people pay for the compute costs if they can't find employment?
I hear you, I’m not that much older but I graduated in 2011. I also studied industrial design. At that time the big wave was the transition to an app based everything and UX design suddenly became the most in demand design skill. Most of my friends switched gears and careers to digital design for the money. I stuck to what I was interested in though which was sustainability and design and ultimately I’m very happy with where I ended up (circular economy) but it was an awkward ~10 years as I explored learning all kinds of tools and ways applying my skills. It also was very tough to find the right full time job because product design (which has come to really mean digital product design) supplanted industrial design roles and made it hard to find something of value that resonated with me.
One of the things that guided me and still does is thinking about what types of problems need to be solved? From my perspective everything should ladder up to that if you want to have an impact. Even if you don’t keep learning and exploring until you find something that lights you up on the inside. We are not only one thing we can all wear many hats.
Saying that, we’re living through a paradigm shift of tremendous magnitude that’s altering our whole world. There will always be change though. My two cents is to focus on what draws your attention and energy and give yourself permission to say no to everything else.
AI is an incredible tool, learn how to use it and try to grow with the times. Good luck and stay creative :) Hope something in there helps, but having a positive mindset is critical. If you’re curious about the circular economy happy to share what I know - I think it’s the future.
Unlike most other benchmarks where LLMs have shown large advances (in law, medicine, etc.), this benchmark isn't directly related to any practically useful task. Rather, the benchmark is notable because it's particularly easy for untrained humans, but particularly hard for LLMs; though that difficulty is perhaps not surprising, since LLMs are trained on mostly text and this is geometric. An ensemble of non-LLM solutions already outperformed the average Mechanical Turk worker. This is a big improvement in the best LLM solution; but this might also be the first time an LLM has been tuned specifically for these tasks, so this might be Goodhart's Law.
It's a significant result, but I don't get the mania. It feels like Altman has expertly transformed general societal anxiety into specific anxiety that one's job will be replaced by an LLM. That transforms into a feeling that LLMs are powerful, which he then transforms into money. That was strongest back in 2023, but had weakened since then; but in this comment section it's back in full force.
For clarity, I don't question that many jobs will be replaced by LLMs. I just don't see a qualitative difference from all the jobs already replaced by computers, steam engines, horse-drawn plows, etc. A medieval peasant brought to the present would probably be just as despondent when he learned that almost all the farming jobs are gone; but we don't miss them.
I'm aware that LLMs can solve problems other than coloring grids, and I'd tend to agree those are likely to be more near-term useful. Those applications (coding, medicine, law, education, etc.) have been endlessly discussed, and I don't think I have much to add.
In my own work I've found some benefits, but nothing commensurate to the public mania. I understand that founders of AI-themed startups (a group that I see includes you) tend to feel much greater optimism. I've never seen any business founded without that optimism and I hope you succeed, not least because the entire global economy might now be depending on that. I do think others might feel differently for reasons other than simple ignorance, though.
In general, performance on benchmarks similar to tests administered to humans may be surprisingly unpredictive of performance on economically useful work. It's not intuitive at all to me that IBM could solve Jeopardy and then find no profitable applications of the technology; but that seems to be what happened.
It very nearly is. I knew a professional, career photographer. He was probably in his late 50s. Just a few years ago, it had become extremely difficult to convince clients that actual, professional photos were warranted. With high-quality iPhone cameras, businesses simply didn't see the value of professional composition, post-processing, etc.
These days, anyone can buy a DSLR with a decent lens, post on Facebook, and be a 'professional' photographer. This has driven prices down and actual professional photographers can't make a living anymore.
And then when I peruse these photographers' websites, I'm reminded how good 'professional' actually is, and I value them. Even in today's incredible cameraphone and AI era.
But I take your point for almost all industries, things are changing fast.
So we'll find out if this model is real or not by 2-3 months. My guess is that it'll turn out to be another flop like O1. They needed to release something big because they are momentum based and their ability to raise funding is contingent on their AGI claims.
We may have progressed from a 99%-accurate chatbot to one that's 99.9%-accurate, and you'd have a hard time telling them apart in normal real world (dumb) applications. A paradigm shift is needed from the current chatbot interface to a long-lived stream of consciousness model (e.g. a brain that constantly reads input and produces thoughts at 10ms refresh rate; remembers events for years and keep the context window from exploding; paired with a cerebellum to drive robot motors, at even higher refresh rates.)
As long as we're stuck at chatbots, LLM's impact on the real world will be very limited, regardless of how intelligent they become.
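A cartoon of that "long-lived stream of consciousness" loop might look like the following, where the brain step, the sensors and the motor interface are all stand-in stubs of mine, and the 10 ms tick and 256-item working context are arbitrary:

    import time

    TICK_SECONDS = 0.01                 # the ~10 ms refresh rate mentioned above

    def read_sensors():                 # stub: camera/audio/text input in a real agent
        return "observation"

    def drive_motors(action):           # stub: the "cerebellum" driving robot motors
        pass

    def brain_step(context, memory):    # stub: one forward pass of the model
        return {"thought": "...", "action": None}

    memory, context = [], []
    for _ in range(100):                        # forever in principle, bounded here
        context.append(read_sensors())
        step = brain_step(context, memory)
        memory.append(step["thought"])          # remembered across "sessions"
        if step["action"] is not None:
            drive_motors(step["action"])
        context = context[-256:]                # keep the working context from exploding
        time.sleep(TICK_SECONDS)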
Now they just have to make it cheap.
Tell me, what has this industry been good at since its birth? Driving down the cost of compute and making things more efficient.
Are you seriously going to assume that won’t happen here?
Like they've been making it all this time? Cheaper and cheaper? Less data, less compute, fewer parameters, but the same, or improved performance? Not what we can observe.
>> Tell me, what has this industry been good at since its birth? Driving down the cost of compute and making things more efficient.
No, actually the cheaper compute gets the more of it they need to use or their progress stalls.
Yes exactly like they’ve been doing this whole time, with the cost of running each model massively dropping sometimes even rapidly after release.
Yes, it costs a lot to train a model. Those costs go up. But once you trained it, it’s done. At that point inference — the actual execution/usage of the model — is the cost you worry about.
Inference cost drops rapidly after a model is released as new optimizations and more efficient compute comes online.
Inference always starts expensive. It comes down.
This is a thing, you should know. It's called the Jevons paradox:
In economics, the Jevons paradox (/ˈdʒɛvənz/; sometimes Jevons effect) occurs when technological progress increases the efficiency with which a resource is used (reducing the amount necessary for any one use), but the falling cost of use induces increases in demand enough that resource use is increased, rather than reduced.[1][2][3][4]
https://en.wikipedia.org/wiki/Jevons_paradox
Better check those pills then.
Oh but, you know, merry chrimbo to you too.
Oh yes indeed-ee-o and I'm referring to training and not inference because the big problem is the cost of training, not inference. The cost of training has increased steeply with every new generation of models because it has to, in order to improve performance. That process has already reached the point where training ever larger models is prohibitively expensive even for companies with the resources of OpenAI. For example, the following is from an article that was posted on HN a couple days ago and is basically all about the overwhelming cost to train GPT-5:
In mid-2023, OpenAI started a training run that doubled as a test for a proposed new design for Orion. But the process was sluggish, signaling that a larger training run would likely take an incredibly long time, which would in turn make it outrageously expensive. And the results of the project, dubbed Arrakis, indicated that creating GPT-5 wouldn’t go as smoothly as hoped.
(...)
Altman has said training GPT-4 cost more than $100 million. Future AI models are expected to push past $1 billion. A failed training run is like a space rocket exploding in the sky shortly after launch.
(...)
By May, OpenAI’s researchers decided they were ready to attempt another large-scale training run for Orion, which they expected to last through November.
Once the training began, researchers discovered a problem in the data: It wasn’t as diversified as they had thought, potentially limiting how much Orion would learn.
The problem hadn’t been visible in smaller-scale efforts and only became apparent after the large training run had already started. OpenAI had spent too much time and money to start over.
HN discussion:
https://news.ycombinator.com/item?id=42485938
"Once you trained it it's done" - no. First, because you need to train new models continuously so that they pick up new information (e.g. the name of the President of the US). Second because companies are trying to compete with each other and to do that they have to train bigger models all the time.
Bigger models means more parameters and more data (assuming there is enough which is a whole other can of worms) more parameters and data means more compute and more compute means more millions, or even billions. Nothing in all this is suggesting that costs are coming down in any way, shape or form, and yep, that's absolutely about training and not inference. You can't do inference before you do training, you need to train continuously, and for that reason you can't ignore the cost of training and consider only the cost of inference. Inference is not the problem.
No they haven't, these results do not generalize, as mentioned in the article:
"Furthermore, early data points suggest that the upcoming ARC-AGI-2 benchmark will still pose a significant challenge to o3, potentially reducing its score to under 30% even at high compute"
Meaning, they haven't solved AGI, and the tasks themselves do not represent programming well; these models do not perform that well on engineering benchmarks.
But what they’ve done is show that progress isn’t slowing down. In fact, it looks like things are accelerating.
So sure, we’ll be splitting hairs for a while about when we reach AGI. But the point is that just yesterday people were still talking about a plateau.
You can also use the full o3 model, consume insane power, and get insane results. Sure, it will probably take longer to drive down those costs.
You’re welcome to bet against them succeeding at that. I won’t be.
They’ve been doing it literally this entire time. O3-mini according to the charts they’ve released is less expensive than o1 but performs better.
Costs have been falling to run these models precipitously.
This type of compute will be cheaper than Claude 3.5 within 2 years.
It's kinda nuts. Give these models tools to navigate and build on the internet and they'll be building companies and selling services.
Significantly better at what? A benchmark? That isn't necessarily progress. Many report preferring gpt-4 to the newer o1 models with hidden text. Hidden text makes the model more reliable, but more reliable is bad if it is reliably wrong at something since then you can't ask it over and over to find what you want.
I don't feel it is significantly smarter; it is more like having the same dumb person spend more time thinking, rather than the model getting smarter.
Also, all that stuff is shady in that it is just numbers from OAI which are not reproducible, on a benchmark sponsored by OAI. If we say OAI could be a bad actor, they had plenty of opportunities to cheat on this.
(See why objective benchmarks exist?)
Or let's talk about the breakthroughs. SVMs would lead us to AGI. Then LSTMs would lead us to AGI. Then Convnets would lead us to AGI. Then DeepRL would lead us to AGI. Now Transformers will lead us to AGI.
Benchmarks fall right and left and we keep being led to AGI but we never get there. It leaves one with such a feeling of angst. Are we ever gonna get to AGI? When's Godot coming?
99% of engineering is distilling through bullshit and nonsense requirements. Whether that is appealing to you is a different story, but ChatGPT will happily design things with dumb constraints that would get you fired if you took them at face value as an engineer.
ChatGPT answering technical challenges is to engineering as a nailgun is to carpentry.
1) Just give up computing entirely, the field I've been dreaming about since childhood. Perhaps if I immiserate myself with a dry regulated engineering field or trade I would perhaps survive to recursive self-improvement, but if anything the length it takes to pivot (I am a Junior in College that has already done probably 3/4th of my CS credits) means I probably couldn't get any foothold until all jobs are irrelevant and I've wasted more money.
2) Hard pivot into automation, AI my entire workflow, figure out how to use the bleeding edge of LLMs. Somehow. Even though I have no drive to learn LLMs and no practical project ideas with LLMs. And then I'd have to deal with the moral burden that I'm inflicting unfathomable hurt on others until recursive self-improvement, and after that it's simply a wildcard on what will happen with the monster I create.
It's like I'm suffocating constantly. The most I can do to "cope" is hold on to my (admittedly weak) faith in Christ, which provides me peace knowing that there is some eternal joy beyond the chaos here. I'm still just as lost as you.
The scenario I fear is a "selectively general" model that can successfully destroy the field I'm in but keep others alive for much longer, but not long enough for me to pivot into them before actually general intelligence.
If you want to work in computing, then make it happen! Use the tools available and make great stuff. Your computing experience will be different from when I graduated from college 25 years ago, but my experience with computers was far different from my Dad's. Things change. Automation changes jobs. So far, it's been pretty good.
It's powerful and world changing but it's also terrible overhyped at the moment.
It's a massive bubble, and things like these "benchmarks" are all part of the hype game. Is the tech cool and useful? For sure, but anyone trying to tell you this benchmark is in any way proof of AGI and will replace everyone is either an idiot or more likely has a vested interest in you believing them. OpenAI's whole marketing shtick is to scare people into thinking their next model is "too dangerous" to be released thus driving up hype, only to release it anyway and for it to fall flat on its face.
Also, if there's any jobs LLMs can replace right now, it's the useless managerial and C-suite, not the people doing the actual work. If these people weren't charlatans they'd be the first ones to go while pushing this on everyone else.
I told him it was at least 5 years, probably 10, though he was sure it would be 2.
I was arguably “right”, 2023-ish is probably going to be the date people put down in the books, but the future isn’t evenly distributed. It’s at least another 5 years, and maybe never, before things are distributed among major metros, especially those with ice. Even then, the AI is somehow more expensive than human solution.
I don’t think it’s in most companies interest to price AI way below the price of meat, so meat will hold out for a long time, maybe long enough for you to retire even
There’s an incredibly massive amount of stuff the world needs. You probably live in a rich country, but I doubt you are lacking for want. There are billionaires who want things that don’t exist yet. And, of course, there are billions of regular folks who want some of the basics.
So long as you can imagine a better world, there will be work for you to do. New tools like AGI will just make it more accessible for you to build your better future.
This has essentially been happening for thousands of years. Any optimization to work of any kind reduces the number of man hours required.
Software of pretty much any form is entirely that. Even early spreadsheet programs would replace a number of jobs at any company.
That is: If you don't believe there will be a future, you give up on trying to make one. That means that any kind of future that takes persistent work becomes unavailable to you.
If you do believe that there will be a future, you keep working. That doesn't guarantee there will be a future. But not working pretty much guarantees that there won't be one, at least not one worth having.
If AI lives up to hype, you could be the excavator driver. Or, the AI will create a ton of upstream and downstream work. There will be no mass unemployment.
Are there no limits to this argument? Is it some absolute universal law that all new creations just create increasing economic opportunities?
Investment in human talent augmented by AI is the future.
Having used AI extensively I don't feel my future is at risk at all, my work is enhanced not replaced.
The computing cost, on the other hand, is a continuous improvement. If (and it's a big if) a computer can do your job, we know the costs will keep getting lower year after year (maybe with diminishing returns, but this AI technology is pretty new so we're still seeing increasing returns)
Everyone needs to know how to either build or sell to be successful. In a world where the ability to the former is rapidly being commoditised, you will still need to sell. And human relationships matter more than ever.
You're in a position to invest substantial amounts of time compared to your seniors. Leverage that opportunity to your advantage.
We all have access to these tools for the most part, so the distinguishing factor is how much time you invest and how much more ambitious you become once you begin to master the tool.
This time it's no different. Many mechanical engineering and sales students in the past never got jobs in those fields either, decades before AI. There were other circumstances and forces at play, and a degree is not a guaranteed career in anything.
Keep going, because what we DO know is that trying won't guarantee results, and we DO know that giving up definitely won't. Roll the dice in your favor.
I want to criticize Art’s comment on the grounds of ageism or something along the lines of “any amount of life outside of programming is wasted”, but regardless of Art’s intention there is important wisdom here. Use your free time wisely when you don’t have many responsibilities. It is a superpower.
As for whether to spend it on AI, eh, that’s up to you to decide.
I'm a greybeard myself.
It'll be some time before there is a robot with enough spatial reasoning to do complicated physical work with no prior examples.
These benchmark accomplishments are awesome and impressive, but you shouldn't operate on the assumption that this will emerge as an engineer because it performs well on benchmarks.
Engineering is a discipline that requires understanding tools, solutions and every project requires tiny innovations. This will make you more valuable, rather than less. Especially if you develop a deep understanding of the discipline and don't overly rely on LLMs to answer your own benchmark questions from your degree.
But the arc of time intersects quite nicely with your skills if you steer it over time.
Predicting it or worrying about it does nothing.
Especially with AI provably getting extremely smart now, surely engineering disciplines should see a boom as people want these things in their homes, cheaper, for various applications.
Either this is the dawn of something bigger than the industrial revolution or you'll have ample career opportunity. Understanding how things work and how people work is a powerful combination.
when the last job has been automated away, millions of AIs globally will do commerce with each other and they will use bitcoin to pay each other.
as long as the human race (including AIs) produces new goods and services, the purchasing power of bitcoin will go up, indefinitely. even more so once we unlock new industries in space (settlements on the Moon and Mars, asteroid mining etc).
The only thing that can make a dent into bitcoin's purchasing power would be all out global war where humanity destroys more than it creates.
The only other alternative is UBI, which is Communism and eternal slavery for the entire human race except the 0.0001% who run the show.
Choose wisely.
Isn’t that the premise behind the CAPTCHA?
That would be intelligent. Everything else is just stupid and more of the same shit.
Of humans. Humans are a problem for the satisfaction of humans. Yet removing humans from this equation does not result in higher human satisfaction; it lessens it.
I find this thought process of "humans are the problem" to be unreasonable. Humans aren't the problem; humans are the requirement.
> Note on "tuned": OpenAI shared they trained the o3 we tested on 75% of the Public Training set. They have not shared more details. We have not yet tested the ARC-untrained model to understand how much of the performance is due to ARC-AGI data.
> Furthermore, early data points suggest that the upcoming ARC-AGI-2 benchmark will still pose a significant challenge to o3, potentially reducing its score to under 30% even at high compute (while a smart human would still be able to score over 95% with no training).
I spend 100% of my work time working on a GenAI project, which is genuinely useful for many users, in a company that everyone has heard about, yet I recognize that LLMs are simply dogshit.
Even the current top models are barely usable, hallucinate constantly, are never reliable and are barely good enough to prototype with while we plan to replace those agents with deterministic solutions.
This will just be an iteration on dogshit, but it's the very tech behind LLMs that's rotten.
TLDR: The cacophony of fools is so loud now. Thank goodness it won't last.
Make it possible -> make it fast -> make it cheap:
the eternal cycle of software.
Make no mistake - we are on the verge of the next era of change.
But his "below them, above them, around them" quote on OpenAI may haunt him in 2025/2026.
OAI or someone else will approach AGI-like capabilities (however nebulous the term), fostering the conditions to contest Microsoft's straitjacket.
Of course, OAI is hemorrhaging cash and may fail to create a sustainable business without GPU credits, but the possibility of OAI escaping Microsoft's grasp grows by the day.
Coupled with research and hardware trends, OAI's product strategy suggests the probability of a sustainable business within 1-3 years is far from certain but also higher than commonly believed.
If OAI becomes a $200b+ independent company, it would be against incredible odds given the intense competition and the Microsoft deal. PG's cannibal quote about Altman feels so apt.
It will be fascinating to see how this unfolds.
Congrats to OAI on yet another fantastic release.
I presume evaluation on the test set is gated (you have to ask ARC to run it).
That's how I understand it.
https://codeforces.com/blog/entry/133094
That means... this benchmark is just saying o3 can write code faster than most humans (in a very time-limited contest, like 2 hours for 6 tasks). Beauty, readability or creativity is not rated. It’s essentially a "how fast can you make the unit tests pass" kind of competition.
Edit: it also tests new knowledge; it has concepts such as trusting a source, verifying it, etc. If I can just gaslight it into unlearning Python, then it's still too dumb.
c6e1b8da is moving rectangular figures by a given vector, 0d87d2a6 is drawing horizontal and/or vertical lines (connecting dots at the edges) and filling figures they touch, b457fec5 is filling gray figures with a given repeating color pattern.
This is pretty straightforward stuff that doesn't require much spatial thinking or keeping multiple things/aspects in memory - visual puzzles from various "IQ" tests are way harder.
This said, now I'm curious how SoTA LLMs would do on something like WAIS-IV.
What took me longer was figuring out how the question was arranged, i.e. left input, right output, 3 examples each
I = E / K
where I is the intelligence of the system, E is the effectiveness of the system, and K is the prior knowledge.
For example, a math problem is given to two students, each solving the problem with the same effectiveness (both get the correct answer in the same amount of time). However, student A happens to have more prior knowledge of math than student B. In this case, the intelligence of B is greater than the intelligence of A, even though they have the same effectiveness. B was able to "figure out" the math, without using any of the "tricks" that A already knew.
Now back to the question of whether or not prior knowledge is required. As K approaches 0, intelligence approaches infinity. But when K=0, intelligence is undefined. Tada! I think that answers the question.
Most LLM benchmarks simply measure effectiveness, not intelligence. I conceptualize LLMs as a person with a photographic memory and a low IQ of 85, who was given 100 billion years to learn everything humans have ever created.
IK = E
low intelligence * vast knowledge = reasonable effectiveness
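With toy numbers of my own (the scale of E and K is arbitrary; only the ratio matters in this framing):

    # Toy illustration of the I = E / K framing above; the numbers are made up.
    def intelligence(effectiveness: float, prior_knowledge: float) -> float:
        if prior_knowledge == 0:
            raise ValueError("undefined at K = 0")   # as noted above
        return effectiveness / prior_knowledge

    student_a = intelligence(effectiveness=1.0, prior_knowledge=4.0)  # knew more tricks
    student_b = intelligence(effectiveness=1.0, prior_knowledge=1.0)  # figured it out

    print(student_a, student_b)   # 0.25 vs 1.0 -> B counts as more intelligent here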
What feedback, and what program, are you referring to?
This o3 doesn't need to run Python. It itself executes programs written in tokens inside its own context window, which is wildly inefficient but gives better results and is potentially more general.
This is hilarious.
This o3 thing might be a bit different because it's just chain of thought llm that can do many other things as well.
It's not uncommon for people to have a handful of wrong ideas before they stumble upon a correct solution either.
But if we take this into consideration, would it mean that a 1st-world engineer is by definition less intelligent than a 3rd-world one?
I think the (completely reasonable) knee-jerk reaction is a defensive one, but I can imagine an escapee from an authoritarian regime working side by side with an engineer groomed in expensive, air-conditioned lecture rooms. In this imaginary scenario the escapee, even if slower and less efficient at the problem at hand, would have to be more intelligent generally.
Yes, resource consumption is important. But a car that guzzles a lot of gas doesn't drive slower; it just covers less distance per unit of petrol consumed.
It's good to know whether your system has a high or low 'bang for buck' metric, but that doesn't directly affect how much bang you get.
All else equal, a student who gets 100% on a problem set in 10 minutes is more intelligent than one with the same score after 120 minutes. Likewise an LLM that can respond in 2 seconds is more impressive than one which responds in 30 seconds.
According to my mathematical model, the faster student would have higher effectiveness, not necessarily higher intelligence. Resource consumption and speed are practical technological concerns, but they're irrelevant in a theoretical conceptualization of intelligence.
So a human with a better response time, also tends to give you more intelligent answers, even when time is not a factor.
For a computer, you can arbitrarily slow them down (or speed them up), and still get the same answer.
Imagine you take an extraordinarily smart person, and put them on a fast spaceship that causes time dilation.
Does that mean that they are stupider while in transit, and they regain their intelligence when it slows down?
If we are required to break the seal on the black-box and investigate the exactly how the agent is operating in order to judge its "intelligence"... Doesn't that kinda ruin the up-thread stuff about judging with equations?
They will still complete the task in 70 local minutes, even if that's eighty minutes to an outside observer.
$$ I = \frac{\partial E}{\partial K} \simeq \frac{\delta E}{\delta K} $$
In order to estimate $I$ you have to consider that efficiency and knowledge are task related, so you could take some weighted mean $\sum_T C(E,K,T) \cdot I(E,K,T)$ where $T$ is the task category. I am thinking of $C(E,K,T)$ as something similar to thermal capacity or electrical resistance, the equivalent concept when applied to a task. An intelligent agent in a medium of low resistance should fly while a dumb one would still crawl.
Why?
> derivative of efficiency
Where did your efficiency variable come from?
Find the best questions to ask. Find the best hypothesis to suggest.
In comparison, you can easily know everything there is to know about physics or chemistry, and that's sufficient to solve interesting puzzles. In math every puzzle has its own vast lore you need to know before you can have any chance at tackling it.
What I would like to have in the future is Stack Overflow answerers accessible in real time via IRC. They have real answers NOW. They are even pedantic about their stuff!
I of course mean we're using these LLMs for a lot of tasks that they're inappropriate for, and a clever manually coded algorithm could do better and much more efficiently.
Sure, but how long would it take to implement this algorithm, and would that be worth it for one-off cases?
Just today I asked Claude to create a jq query that looks for objects with a certain value for one field, but which lack a certain other field. I could have spent a long time trying to make sense of jq's man page, but instead I spent 30 seconds writing a short description of what I'm looking for in natural language, and the AI returned the correct jq invocation within seconds.
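For anyone curious what such an invocation looks like, a filter along these lines does the job; the field names ("type", "email") are hypothetical, since the original data wasn't shared, and the snippet assumes jq is on your PATH:

    import json
    import subprocess

    # Keep objects whose "type" is "user" but which lack an "email" key (hypothetical fields).
    jq_filter = '[.[] | select(.type == "user" and (has("email") | not))]'

    data = [
        {"type": "user", "name": "a", "email": "a@example.com"},
        {"type": "user", "name": "b"},        # matches: right value, missing field
        {"type": "group", "name": "c"},
    ]

    result = subprocess.run(
        ["jq", jq_filter],
        input=json.dumps(data),
        capture_output=True,
        text=True,
        check=True,
    )
    print(result.stdout)   # only {"type": "user", "name": "b"} survives the filter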
How do you quantify that?
-- Poem by Cybernetic Bai Juyi, "The Philosopher [of Caching]"
Writing a Python script, because it can't do math or any form of more complex reasoning, is not what I would call "its own algorithm". It's at most an application of existing ones/calling APIs.
The superset of the LLM knowledge pool is human knowledge. They can't go beyond the boundaries of their training set.
I'll not go into how humans have other processes which can alter their and collective human knowledge, but the rabbit hole starts with "emotions, opposable thumbs, language, communication and other senses".
TFA says they just did. That's what the ARC-AGI benchmark was supposed to test.
How so? I'd imagine a robot connected to the data center embodying its mind, connected via low-latency links, would have to walk pretty far to get into trouble when it comes to interacting with the environment.
The speed of light is roughly six orders of magnitude faster than the speed of signal propagation in biological neurons, after all.
Recent research from NVIDIA suggests such an efficiency gain is quite possible in the physical realm as well. They trained a tiny model to control the full body of a robot via simulations.
---
"We trained a 1.5M-parameter neural network to control the body of a humanoid robot. It takes a lot of subconscious processing for us humans to walk, maintain balance, and maneuver our arms and legs into desired positions. We capture this “subconsciousness” in HOVER, a single model that learns how to coordinate the motors of a humanoid robot to support locomotion and manipulation."
...
"HOVER supports any humanoid that can be simulated in Isaac. Bring your own robot, and watch it come to life!"
More here: https://x.com/DrJimFan/status/1851643431803830551
---
This demonstrates that with proper training, small models can perform at a high level in both cognitive and physical domains.
Hmm .. my intuition is that humans' capabilities are gained during early childhood (walking, running, speaking .. etc) ... what are examples of capabilities pretrained by evolution, and how does this work?
A more high-level example: seasickness is an evolutionarily pre-learned thing; your body thinks it's poisoned and automatically wants to empty your stomach.
Maybe evolution could be better thought of as neural architecture search combined with some pretraining. Evidence suggests we are prebuilt with "core knowledge" by the time we're born [1].
See: Summary of cool research gained from clever & benign experiments with babies here:
[1] Core knowledge. Elizabeth S. Spelke and Katherine D. Kinzler. https://www.harvardlds.org/wp-content/uploads/2017/01/Spelke...
Learning to walk doesn't seem to be particularly easy, having observed the process with my own children. No easier than riding a bike or skating, for which our brains are probably not 'predisposed'.
Young children learn to bike or skate at an older age after they have acquired basic physical skills.
Check out the reference to Core Knowledge above. There are things young infants know or are predisposed to know from birth.
That seems like a decent example of pretraining through evolution.
And reading and music co-evolved to be relatively easy for humans to do.
(See how computers have a much easier time reading barcodes and QR codes, with much less general processing power than it takes them to decipher human hand-writing. But good luck trying to teach humans to read QR codes fluently.)
What makes you think so? Humans came up with biking and skating, because they were easy enough for us to master with the hardware we had.
Chimpanzees score pretty high on many tests of intelligence, especially short term working memory. But they can't really learn language: they lack the specialised hardware more than the general intelligence.
But there are plenty of non-learned control/movement/sensing in utero that are "pretrained".
They are more nature than nurture, but they aren't 'in-born'.
Just like human aren't (usually) born with teeth, but they don't 'learn' to have teeth or pubic hair, either.
This is a great milestone, but OpenAI will not be successful charging 10x the cost of a human to perform a task.
Obviously the drop in cost for capability in the last 2 years is big, but I'd wager it's closer to 10x than 100x.
True, but they might be successful charging 20x for 2x the skill of a human.
If you can just unleash AI on any of your problems, without having to commit to anything long term, it might still be useful, even if they charged more than for equivalent human labour.
(Though I suspect AI labour will generally trend to be cheaper than humans over time for anything AIs can do at all.)
If it can be spun up with Terraform, I bet you they could.
Right now when I ask an LLM… I have to sit there and verify everything. It may have done some helpful reasoning for me but the whole point of me asking someone else (or something else) was to do nothing at all…
I’m not sure you can reliably fulfill the first scenario without achieving AGI. Maybe you can, but we are not at that point yet so we don’t know yet.
The difference, to me, is that humans seem to be good at canceling each other's mistakes when put in a proper environment.
It’s very scary to ask a friend to drop off a letter if the last scenario is even 1% within the realm of possibility.
Even the Dunning-Kruger effect is, ironically, widely misunderstood by people who are unreasonably confident about their knowledge.
The latter in particular is how I model the mistakes LLMs make, what with them having read most things.
Effectively, they found nothing real but a statistical artifact.
Finding reliable honest humans is a problem governments have struggled with for over a hundred years. If you have cracked this problem at scale you really need to write it up! There are a lot of people who would be extremely interested in a solution here.
Yes, though you are downplaying the problem a lot. It's not just governments, and it's way longer than 100 years.
Btw, a solution that might work for you or me, presumably relatively obscure people, might not work for anyone famous, nor a company nor a government.
But if it's not enough, then maybe it comes as a second-order effect (e.g. reasoning machines having to bootstrap an AGI, so that you can then have a Waymo taxi driver who is also a Fields medalist)
Broadly speaking you can think that the mental reduces to the physical (physicalism), that the physical reduces to the mental (idealism), both reduce to some other third thing (neutral monism) or that neither reduces to the other (dualism). There are many arguments for dualism but I’ve never heard a philosopher appeal to “magic spirits” in order to do so.
Here’s an overview: https://plato.stanford.edu/entries/dualism/
(In fact, the very idea of "computable functions" was invented to narrow down the space of "all things" to something much smaller, tighter and manageable. And now we've come full circle and apparently everything in the universe is a computable function? Well, if all you have is a hammer, I guess everything must necessarily look like a nail.)
So yeah, the o3 result is impressive, but if the difference between o3 and the previous state of the art is just more compute spent on a much longer CoT/evaluation loop, I am not so impressed. Reminder that these problems are solved by humans in seconds; ARC-AGI is supposed to be easy.
It is also entirely possible to learn a skill without prior experience. That's how it (whatever the skill) was first done.
This is the way I think about it.
I = E / K
where I is the intelligence of the system, E is the effectiveness of the system, and K is the prior knowledge.
For example, a math problem is given to two students, each solving the problem with the same effectiveness (both get the correct answer in the same amount of time). However, student A happens to have more prior knowledge of math than student B. In this case, the intelligence of B is greater than the intelligence of A, even though they have the same effectiveness. B was able to "figure out" the math, without using any of the "tricks" that A already knew.
Now back to your question of whether or not prior knowledge is required. As K approaches 0, intelligence approaches infinity. But when K=0, intelligence is undefined. Tada! I think that answers your question.
Most LLM benchmarks simply measure effectiveness, not intelligence. I conceptualize LLMs as a person with a photographic memory and a low IQ of 85, who was given 100 billion years to learn everything humans have ever created.
I * K = E
low intelligence * vast knowledge = reasonable effectiveness
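As a toy sketch of this framing (all numbers made up, purely illustrative):

    # Toy illustration of I = E / K; every number here is arbitrary.
    def intelligence(effectiveness, prior_knowledge):
        """I = E / K; undefined when K == 0, so refuse to compute it."""
        if prior_knowledge == 0:
            raise ValueError("K = 0: intelligence is undefined in this framing")
        return effectiveness / prior_knowledge

    # Students A and B solve the same problem equally well (same E),
    # but A brings more prior math knowledge (larger K), so B scores higher on I.
    E = 1.0
    print("student A:", intelligence(E, prior_knowledge=4.0))   # 0.25
    print("student B:", intelligence(E, prior_knowledge=1.0))   # 1.0

    # The LLM caricature above, rearranged as E = I * K:
    # low intelligence * vast knowledge = reasonable effectiveness.
    print("LLM-ish E:", 0.2 * 50.0)                             # 10.0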
In your calculations, in relation to humans, how do you view the 500k - 700k years of learned behaviors and acquired responses passed to offspring?
Joking aside, better than ever before at any cost is an achievement, it just doesn't exactly scream "breakthrough" to me.
What do you mean by this? I'm assuming you're not speaking about simple absolute differences in value - there have been top players rated over 100 points higher than the average of the rest of the top ten.
o1 is the best code generation model according to Livebench.
So how is this not a breakthrough? It's a genuine movement of the frontier.
I'm a little disappointed by all the upvotes I got for being flat wrong. I guess as long as you're trashing AI you can get away with anything.
Really I was just trying to nitpick the chart parameters.
The problem is that RAM stopped scaling a long time ago now. We're down to the size where a single capacitor's charge is held by a mere 40,000 or so electrons and all we've been doing is making skinnier, longer cells of that size because we can't find reliable ways to boost even weaker signals, but this is a dead end because as the math shows, if the volume is consistent and you are reducing X and Y dimensions, that Z dimension starts to get crazy big really fast. The chemistry issues of burning a hole a little at a time while keeping wall thickness somewhat similar all the way down is a very hard problem.
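Rough back-of-envelope version of that "Z gets crazy big" point, with arbitrary made-up numbers rather than real process data: if the capacitor's volume (and hence its stored charge) has to stay roughly constant while X and Y shrink, the required depth goes as 1/(X*Y):

    # Hold capacitor volume constant, shrink the lateral dimensions each
    # generation, and watch the required depth (and aspect ratio) explode.
    volume = 1.0      # arbitrary units, stands in for "enough electrons to sense"
    x = y = 1.0       # lateral dimensions, arbitrary units
    shrink = 0.8      # hypothetical ~20% lateral shrink per generation

    for generation in range(5):
        z = volume / (x * y)   # depth needed to keep the same volume
        print(f"gen {generation}: x = y = {x:.2f}, z = {z:.1f}, aspect ratio z/x = {z / x:.1f}")
        x *= shrink
        y *= shrink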
Another problem is that Moore's law hit a wall when Dennard Scaling failed. When you look at SRAM (it's generally the smallest and most reliable stuff we can make), you see that most recent shrinks can hardly be called shrinks.
Unless we do something very different like compute in storage or have some radical breakthrough in a new technology, I don't know that we will ever get a 2T parameter model inside a phone (I'd love for someone in 10 years to show up and say how wrong I was).
But the data centres running the training for models like this are bringing up new methane power plants at a fast rate at a time when we need to be reducing reliance on O&G.
But let's assume that the efficiency gains outpace the resource consumption with the help of all the subsidies being thrown in, and we achieve AGI.
What’s the benefit? Do we get more fresh water?
Regardless, once we have AGI (and it can scale), I don't think O&G reliance (/ climate change) is going to be something that we need to concern ourselves with.
Like it or not we already know what we need to do to avert the worst of the climate disasters to come.
OTOH if these data centers are sufficiently decentralized and run for public benefit, maybe there’s a chance we use them to solve collective action problems.
I’ll let those smarter than me debate the merits of AGI, but if it can’t learn and self-improve it isn’t “general” intelligence.
This is a very smart computer, accomplishing a very niche set of problems. Cool? Yes. AGI? No.
So what is your benchmark?
I keep seeing this thrown around, but did anyone actually like go out and do this? I feel like I could distinguish between an AI (even the latest models) and a person after a text-only back and forth conversation.
That's exactly what this particular benchmark requires.
Right now AI systems are built top to bottom to learn in development and be deployed as a static asset. This isn't because online learning isn't doable; it's because there isn't a great use case given current limitations. Either the algorithms are too slow or the computers are too slow, take your pick.
Chain of thought is basically a more constrained version of in-situ learning, only the knowledge has a lifetime bound to the task. Propagating the information back into the model would be too resource-hungry, and too unpredictable to productize. Honestly, taking the result of chain of thought and feeding that back into training offline is probably where a lot of the progress on these kinds of tasks is coming from.
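A hedged sketch of that loop, with toy stand-ins for the model, the judge and the training step (none of this is anyone's actual pipeline): sample chains of thought, keep only the ones a narrow verifier can confirm, and feed those back in as offline training data.

    import random

    def generate_chain_of_thought(model, problem):
        """Sample one reasoning trace plus a final answer from the model."""
        answer = model(problem)
        return {"problem": problem, "trace": f"(reasoning about {problem})", "answer": answer}

    def is_correct(example):
        """A narrow, human-programmed judge: only possible when answers are checkable."""
        return example["answer"] == sum(example["problem"])

    def fine_tune(model, examples):
        """Placeholder for the offline training step on the verified traces."""
        print(f"fine-tuning on {len(examples)} verified chains of thought")
        return model

    # A toy 'model' that reasons its way to the right sum ~30% of the time.
    toy_model = lambda problem: sum(problem) if random.random() < 0.3 else -1

    problems = [(1, 2), (3, 4), (5, 6), (7, 8)]
    kept = [ex
            for p in problems
            for ex in (generate_chain_of_thought(toy_model, p) for _ in range(8))
            if is_correct(ex)]
    toy_model = fine_tune(toy_model, kept)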
Trial and error requires a judge to determine if there was an error. To do trial and error for general tasks you need a general judge, and it needs to be good in order to get intelligent results. All examples of successful AI you see have human judges or human programmed narrow judges. Chess AI training is an example where we have a human programmed judge, but for most tasks not even humans can code up a good judge.
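To make the "human-programmed narrow judge" point concrete, here's how trivial such a judge is when the task itself defines success (tic-tac-toe instead of chess, to keep it tiny; purely illustrative):

    # A human-programmed narrow judge: trivial when the task defines success.
    WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
                 (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
                 (0, 4, 8), (2, 4, 6)]              # diagonals

    def judge(board):
        """board: 9 chars of 'X', 'O' or '.'; +1 if X won, -1 if O won, 0 otherwise."""
        for a, b, c in WIN_LINES:
            if board[a] != "." and board[a] == board[b] == board[c]:
                return 1.0 if board[a] == "X" else -1.0
        return 0.0

    print(judge("XXX.OO..."))  # 1.0 -> a completed row is easy to score mechanically
    # There is no comparably short judge for most open-ended, real-world tasks.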
If the success criteria are inherently subjective, like music or art, you can use human reactions as the criteria, while also using reasoning to infer principles about what is or isn’t received well. That’s what humans do.
You just said what there is to judge in one specific scenario. A general AI has to decide to judge itself that way, which is not objective; it is extremely hard to decide what to judge yourself by in a generic situation.
For example, let's say you are in a basketball court, with a basketball. What is a good outcome? Is it shooting the ball into the hoop? But maybe it isn't your turn and someone else is shooting now; then shooting at the hoop is a bad outcome. How do you make the AI recognize that, instead of mindlessly trying to make the ball go in the hoop without considering the context?
Not to say it isn't difficult, but I don't think humans are doing anything particularly magical when they learn to play basketball, something I did myself when I was a kid. You learn each skill from a coach's demonstration, practice them all a lot (practice = trial and error) and develop an intuition (reasoning) about what to do in various situations.
The part to get excited about is that there's plenty of headroom left to gain in performance. They called o1 a preview, and it was, a preview for QwQ and similar models. We get the demo from OAI and then get the real thing for free next year.
Probably less disruption than will happen in 1st world countries.
> No one will have chance to be rich anymore
It's strange to reach this conclusion from "look, a massive new productivity increase".
I read “no one will have a chance to be rich anymore” as a statement about economic mobility. Despite steep declines in mobility over the last 50 years, it was still theoretically possible for a poor child (say bottom 20% wealth) to climb several quintiles. Our industry (SWE) was one of the best examples. Of course there have been practical barriers (poor kids go to worse schools, and it’s hard to get into college if you can’t read) but the path was there.
If robots replace a lot of people, that path narrows. If AGI replaces all people, the path no longer exists.
car : horse :: AGI : humans
Do you work at one of the frontier labs?
As for the wealth disparity between rich and poor countries, it’s hard to know how politics will handle this one, but it’s unlikely that poor countries won’t also be drastically richer as the cost of basic living drops to basically zero. Imagine the cost of food, energy, etc in an ASI world. Today’s luxuries will surely be considered human rights necessities in the near future.
Those entities are the worlds governments regardless how things play out. People just worry they will be hostile or indifferent to humans, since that would be bad news for humans. Pet, cattle or pest, our future will be as one of those.
What law would effectively reduce risk from AGI? The EU passed a law that is entirely about reducing AI risk and people in the technology world almost universally considered it a bad law. Why would other countries do better? How could they do better?
Besides regulating the technology, they could try to protect people and society from the effects of the technology. UBI, for example, could be an attempt to protect people from the effects of mass unemployment, as I understood it.
Actually, I'm afraid even more fundamental shifts are necessary.
So much for a plateau lol.
It’s been really interesting to watch all the internet pundits’ takes on the plateau… as if the two years since the release of GPT3.5 is somehow enough data for an armchair ponce to predict the performance characteristics of an entirely novel technology that no one understands.
This is so insane that I can't help but be skeptical. I know the FrontierMath answer key is private, but they have to send the questions to OpenAI in order to score the models. And a significant jump on this benchmark sure would increase a company's valuation...
Happy to be wrong on this.
OpenAI and Epoch AI are both startups with every incentive to peddle this narrative, when no one else can independently verify it.
These new reasoning models are taking things in a new direction basically by adding search (inference time compute) on top of the basic LLM. So, the capabilities of the models are still improving, but the new variable is how deep of a search you want to do (how much compute to throw at it at inference time). Do you want your chess engine to do a 10 ply search or 20 ply? What kind of real world business problems will benefit from this?
They found a way to make test time compute a lot more effective and that is an advance but the idea is not new, the architecture is not new.
And the vast majority of people convinced LLMs plateaued did so regardless of test time compute.
A plain LLM does not use variable compute - it is a fixed number of transformer layers, a fixed amount of compute for every token generated.
What's different is the method for _sampling_ from that model where it seems they have encouraged the underlying LLM to perform a variable length chain of thought "conversation" with itself as has been done with o1. In addition, they _repeat_ these chains of thought in parallel using a tree of some sort to search and rank the outputs. This apparently scales performance on benchmarks as you scale both length of the chain of thought and the number of chains of thought.
That said, I probably did downplay the achievement. It may not be a "new" idea to do something like this, but finding an effective method for reflection that doesn't just lock you into circular thinking and is applicable beyond well-defined problem spaces is genuinely tough, and a breakthrough.
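A minimal sketch of that sampling scheme under toy assumptions (the "model" below is fake, and the ranking is a simple self-consistency majority vote rather than whatever OpenAI actually does): sample several chains of thought in parallel, then keep the answer they converge on. More chains means more inference-time compute per question and better odds.

    import random
    from collections import Counter

    def sample_chain_of_thought(question):
        """Pretend to 'think' for a variable number of steps, then answer."""
        thinking_steps = random.randint(1, 6)
        got_it_right = random.random() < 0.35 + 0.05 * thinking_steps
        return "42" if got_it_right else str(random.randint(0, 99))

    def answer_with_search(question, n_chains):
        """Spend more inference-time compute by sampling more parallel chains."""
        answers = [sample_chain_of_thought(question) for _ in range(n_chains)]
        return Counter(answers).most_common(1)[0][0]

    for n in (1, 8, 64):
        print(f"{n:>2} chains ->", answer_with_search("What is 6 * 7?", n_chains=n))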
That's the most plausible definition of AGI I've read so far.
Instrumental reason FTW
> Passing ARC-AGI does not equate achieving AGI, and, as a matter of fact, I don't think o3 is AGI yet. o3 still fails on some very easy tasks, indicating fundamental differences with human intelligence.
The most bizarre thing is that programmers are literally writing code to replace themselves because once this AI started, it was a race to the bottom and nobody wants to be last.
I remember attending a tech fair decades ago, and at one stand they were vending some database products. When I mentioned that I was studying computer science with a focus on software engineering, they sneered that coding will be much less important in the future since powerful databases will minimize the need for a lot of data wrangling in applications with algorithms.
What actually happened is that the demand for programmers increased, and software ate the world. I suspect something similar will happen with the current AI hype.
This has literally already arrived. Average Joes are writing software using LLMs right now.
If AI truly overtakes knowledge work there’s not much we could reasonably do to prepare for it.
If AI never gets there though, then you saved yourself the trouble of stressing about it. So sure, relax, it’s just the second coming of GeoCities.
The really bad situation is if my entire skill set is made obsolete while the rest of the world keeps going for a decade or two. Or maybe longer, who knows.
I realize I'm coming across quite selfish, but it's just a feeling.
Will it?
It's already hard to get people to use computers as they are right now, where you only need to click on things and no longer have to enter commands. That's because most people don't like to engage in formal reasoning. Even with some of the most intuitive computer-assisted tasks (drawing and 3D modeling), there's so much theory to learn that few people bother.
Programming has always been easy to learn, and tools to automate coding have existed for decades now. But how many people you know have had the urge to learn enough to automate their tasks?
Look at video editing bays after the advent of Final Cut. There was a significant drop in the specialization required of the professional field, even while content volume went up dramatically.
Video demand exploded, and the ratio of professional editors collapsed.
3 decades ago you needed a big team to create the type of video games that one person can probably make on their own today in their spare time with modern tools.
But now modern tools have been used to make even more complicated games that require more massive teams than ever and huge amounts of money. One person has no hope of replicating that now, but maybe in the future with AI they can. And then the AAA games will be even more advanced.
It will be similar with other software.
If there are jobs paying $150K just to code (someone else tells you what to code, and you just code it up), then please share!
Generalist junior and senior engineers will need to think of a different career path in less than 5 years as more layoffs will reduce the software engineering workforce.
It looks like it may be the way things are if progress in the o1, o3, oN models and other LLMs continues on.
But they won't. AI will enable building even more complex software, which counterintuitively will result in needing even more human jobs to deal with the added complexity.
Think about how despite an increasing amount of free open source libraries over time enabling some powerful stuff easily, developer jobs have only increased, not decreased.
- Faster product development on their side as they eat their own dogfood
- Devs are the biggest market in the transition period for this tech. Gives you some revenue from direct and indirect subscriptions that the general population does not need/require.
- Fear in leftover coders is great for marketing
- Tech workers are paid well, which to VCs, CEOs, etc. makes it obvious where the value of this tech comes from. Not from new use cases/apps which would be greatly beneficial to society, but from effectively making people redundant and saving costs. New use cases/new markets are risky; not paying people is something any MBA/accounting type can understand.
I've heard some people say "it's like they are targeting SWEs". I say: yes, they probably are. I wouldn't be surprised if it takes SWE jobs, but otherwise most people will see it as a novelty (barely affects their life) for quite some time.
What if software demand is largely saturated? It seems the big tech companies have struggled to come up with the next big tech product category, despite lots of talent and capital.
Compare the early web vs the complicated JavaScript laden single page application web we have now. You need way more people now. AI will make it even worse.
Consider that in the AI driven future, there will be no more frameworks like React. Who is going to bother writing one? Instead every company will just have their own little custom framework built by an AI that works only for their company. Joining a new company means you bring generalist skills and learn how their software works from the ground up and when you leave to another company that knowledge is instantly useless.
Sounds exciting.
But there are also plenty of unexplored categories that we still can't access because the technology for them is insufficient. Household robots with AGI, for instance, may require instructions for specific services sold as "apps" that have to be designed and developed by companies.
These models are tools to help engineers, not replacements. Models cannot, on their own, build novel new things no matter how much the hype suggests otherwise. What they can do is remove a hell of a lot of accidental complexity.
But maybe models + managers/non technical people can?
This I think will happen with programmers. Rote programming will slowly die out, while super-high-end work will go dramatically up in price.
Also unsure what you mean by...'how golfing works'. This is the economics of it, not the game
o3 can do much much more. There is nothing narrow about SOTA LLMs. They are already General. It doesn't matter what ARC Maintainers have said. There is no common definition of General that LLMs fail to meet. It's not a binary thing.
By the time a single machine covers every little test humanity can devise, what comes out of that is not 'AGI' as the words themselves mean but a General Super Intelligence.
If you want to play games about how to define AGI go ahead. People have been claiming for years that we've already reached AGI and with every improvement they have to bizarrely claim anew that now we've really achieved AGI. But after a few months people realize it still doesn't do what you would expect of an AGI and so you chase some new benchmark ("just one more eval").
The fact is that there really hasn't been the type of world-altering impact that people generally associate with AGI and no reason to expect one.
Basically nobody today thinks beating a single benchmark and nothing else will make you a General Intelligence. As you've already pointed out, even the maintainers of ARC-AGI do not think this.
>If you want to play games about how to define AGI go ahead.
I'm not playing any games. ENIAC cannot do 99% of the things people use computers to do today and yet barely anybody will tell you it wasn't the first general purpose computer.
On the contrary, it is people who seem to think "General" is a moniker for everything under the sun (and then some) that are playing games with definitions.
>People have been claiming for years that we've already reached AGI and with every improvement they have to bizarrely claim anew that now we've really achieved AGI.
Who are these people? Do you have any examples at all? Genuine question.
>But after a few months people realize it still doesn't do what you would expect of an AGI and so you chase some new benchmark ("just one more eval").
What do you expect from 'AGI'? Everybody seems to have different expectations, much of them rooted in science fiction rather than reality, so this is a moot point. What exactly is world-altering to you? Genuinely, do you even have anything other than "I'll know it when I see it"?
If you introduce technology most people adopt, is that world-altering, or are you waiting for Skynet?
People's comments, including in this very thread, seem to suggest otherwise (c.f. comments about "goal post moving"). Are you saying that a widespread belief wasn't that a chess playing computer would require AGI? Or that Go was at some point the new test for AGI? Or the Turing test?
> I'm not playing any games... "General" is a moniker for everything under the sun that are playing games with definitions.
People have a colloquial understanding of AGI whose consequence is a significant change to daily life, not the tortured technical definition that you are using. Again your definition isn't something anyone cares about (except maybe in the legal contract between OpenAI and Microsoft).
> Who are these people? Do you have any examples at all? Genuine question.
How about you? I get the impression that you think AGI was achieved some time ago. It's a bit difficult to simultaneously argue both that we achieved AGI in GPT-N and also that GPT-(N+X) is now the real breakthrough AGI while claiming that your definition of AGI is useful.
> What do you expect from 'AGI'?
I think everyone's definition of AGI includes, as a component, significant changes to the world, which probably would be something like rapid GDP growth or unemployment (though you could have either of those without AGI). The fact that you have to argue about what the word "general" technically means is proof that we don't have AGI in a sense that anyone cares about.
But you don't see this kind of discussion about the narrow models/techniques that made strides on this benchmark, do you?
>People have a colloquial understanding of AGI whose consequence is a significant change to daily life, not the tortured technical definition that you are using
And ChatGPT has represented a significant change to the daily lives of many. It's the fastest-adopted software product in history. In just 2 years, it's one of the top ten most visited sites on the planet. A lot of people have had the work they do change significantly since its release. This is why I ask: what is world-altering?
>How about you? I get the impression that you think AGI was achieved some time ago.
Sure
>It's a bit difficult to simultaneously argue both that we achieved AGI in GPT-N and also that GPT-(N+X) is now the real breakthrough AGI
I have never claimed GPT-(N+X) is the "new breakthrough AGI". As far as I'm concerned, we hit AGI some time ago and are making strides in competence and/or enabling even more capabilities.
You can recognize ENIAC as a general purpose computer and also recognize the breakthroughs in computing since then. They're not mutually exclusive.
And personally, I'm more impressed with o3's Frontier Math score than ARC.
>I think everyone's definition of AGI includes, as a component, significant changes to the world
Sure
>which probably would be something like rapid GDP growth or unemployment
What people imagine as "significant change" is definitely not in any broad agreement.
Even in science fiction, the existence of general intelligences more competent than today's LLMs is not necessarily a precursor to massive unemployment or GDP growth.
And for a lot of people, the clincher stopping them from calling a machine AGI is not even any of these things. For some, that it is "sentient" or "cannot lie" is far more important than any spike of unemployment.
I don't understand what you are getting at.
Ultimately there is no axiomatic definition of the term AGI. I don't think the colloquial understanding of the word is what you think it is: if you had described to people, pre-ChatGPT, today's ChatGPT behavior, including all the limitations and failings and the fact that there was no change in GDP, unemployment, etc., and asked if that was AGI, I seriously doubt they would say yes.
More importantly I don't think anyone would say their life was much different from a few years ago and separately would say under AGI it would be.
But the point that started all this discussion is the fact that these "evals" are not good proxies for AGI and no one is moving goal-posts even if they realize this fact only after the tests have been beaten. You can foolishly define AGI as beating ARC but the moment ARC is beaten you realize that you don't care about that definition at all. That doesn't change if you make a 10 or 100 benchmark suite.
If such discussions are only had when LLMs make strides on the benchmark, then it's not just about beating the benchmark but also about what kind of system is beating it.
>You can foolishly define AGI as beating ARC but the moment ARC is beaten you realize that you don't care about that definition at all.
If you change your definition of AGI the moment a test is beaten then yes, you are simply goalpost moving.
If you care about other impacts like "unemployment" and "GDP rising" but don't give any time or opportunity to see if the model is capable of producing them, then you don't really care about those and are just mindlessly shifting goalposts.
How does such a person know o3 won't cause mass unemployment? The model hasn't even been released yet.
I still don't understand the point you are making. Nobody is arguing that discrete program search is AGI (and the same counter-arguments would apply if they did).
> If you change your definition of AGI the moment a test is beaten then yes, you are simply goalpost moving.
I don't think anyone changes their definition, they just erroneously assume that any system that succeeds on the test must do so only because it has general intelligence (that was the argument for chess playing for example). When it turns out that you can pass the test with much narrower capabilities they recognize that it was a bad test (unfortunately they often replace the bad test with another bad test and repeat the error).
> If you care about other impacts like "unemployment" and "GDP rising" but don't give any time or opportunity to see if the model is capable of producing them, then you don't really care about those and are just mindlessly shifting goalposts.
We are talking about what models are doing now (is AGI here now) not what some imaginary research breakthroughs might accomplish. O3 is not going to materially change GDP or unemployment. (If you are confident otherwise please say how much you are willing to wager on it).
You can be as confident as you want, but until the model has been given the chance to have (or not have) the effect you think it won't, it's just an assertion that may or may not be entirely wrong.
If you say "this model passed this benchmark I thought would indicate AGI but didn't do this or that so I won't acknowledge it" then I can understand that. I may not agree on what the holdups are but I understand that.
If however you're "this model passed this benchmark I thought would indicate AGI but I don't think it's going to be able to do this or that so it's not AGI" then I'm sorry but that's just nonsense.
My thoughts or bets are irrelevant here.
A few days ago I saw someone seriously comparing a site with nearly 4B visits a month in under 2 years to Bitcoin and VR. People are so up in their bubbles and so assured in their way of thinking they can't see what's right in front of them, nevermind predict future usefulness. I'm just not interested in engaging "I think It won't" arguments when I can just wait and see.
I'm not saying you are one of such people. I just have no interest in such arguments.
My bet? There's no way I would make a bet like that without playing with the model first. Why would I? Why would you?
I explicitly said which one I was: I said today we don't have the large-impact societal changes that people have conventionally associated with the term AGI. I also explicitly talked about how I don't believe o3 will change this, and your comments seem to suggest neither do you (you seem to prefer to emphasize that it isn't literally impossible that o3 will make these transformative changes).
> If however you're "this model passed this benchmark I thought would indicate AGI but I don't think it's going to be able to do this or that so it's not AGI" then I'm sorry but that's just nonsense.
The entire point of the original chess example was to show that it is in fact the correct reaction to repudiate incorrect beliefs about naive litmus tests of AGI-ness. If we did what you are arguing, then we should accept AGI as having occurred after chess was beaten, because a lot of people believed that was the litmus test? Or that we should praise people who stuck to their original beliefs after they were proven wrong instead of correcting them? That's why I said it was silly at the outset.
> My thoughts or bets are irrelevant here
No, they show you don't actually believe we have society-transformative AGI today (or will when o3 is released) but get upset when someone points that out.
> I'm just not interested in engaging "I think It won't" arguments when I can just wait and see.
A lot of life is about taking decisions based on predictions about the future, including consequential decisions about societal investment, personal career choices, etc. For many things there isn't a "wait and see" approach; you are making implicit or explicit decisions even by maintaining the status quo. People who make bad or unsubstantiated arguments are creating a toxic environment in which those decisions are made, leading to personal and public harm. The most important example of this is the decision to dramatically increase energy usage to accommodate AI models despite impending climate catastrophe, on the blind faith that AI will somehow fix it all (which is far from the "wait and see" approach that you are supposedly advocating, by the way; this is an active decision).
> My bet? There's no way I would make a bet like that without playing with the model first. Why would I? Why would you?
You can have beliefs based on limited information. People do this all the time. And if you actually revealed that belief, it would demonstrate that you don't actually currently believe o3 is likely to be world-transformative.
Cool... but I don't want to in this matter.
I think the models we have today are already transformative. I don't know if o3 is capable of causing sci-fi mass unemployment (for white collar work) and wouldn't have anything other than essentially a wild guess till it is released. I don't want to make a wild guess. Having beliefs on limited information is often necessary but it isn't some virtue and in my opinion should be avoided when unnecessary. It is definitely not necessary to make a wild guess about model capabilities that will be released next month.
>The entire point of the original chess example was to show that it is in fact the correct reaction to repudiate incorrect beliefs about naive litmus tests of AGI-ness. If we did what you are arguing, then we should accept AGI as having occurred after chess was beaten, because a lot of people believed that was the litmus test?
Like i said, if you have some other caveats that weren't beaten then that's fine. But it's hard to take seriously when you don't.
This model was trained to pass this test, it was trained heavily on the example questions, so it was a narrow technique.
We even have proof that it isn't AGI, since it scores horribly on ARC-AGI 2. It overfitted for this test.
You are allowed to train on the train set. That's the entire point of the test.
>We even have proof that it isn't AGI, since it scores horribly on ARC-AGI 2. It overfitted for this test.
ARC 2 does not even exist yet. All we have are "early signs", not that that would be proof of anything. Whether I believe the models are generally intelligent or not doesn't depend on ARC.
Right, but by training on those test cases you are creating a narrow model. The whole point of training questions is to create narrow models, like all the models we did before.
You are not narrow for undergoing training and it's honestly kind of ridiculous to think so. Not even the ARC maintainers believe so.
Humans didn't need to see the training set to pass this, the AI needing it means it is narrower than the humans, at least on these kind of tasks.
The system might be more general than previous models, but still not as general as humans, and the G in AGI typically means being as general as humans. We are moving towards more general models, but still not at the level where we call them AGI.
Do you have any evidence to support that? It would be fascinating if the field is primarily advancing due to a unique constellation of traits contributed by individuals who, in the past, may not have collaborated so effectively.
Yes
It has thoroughly replaced journalists and artists, and is on its way to replacing both junior and senior engineers. The ultimate intention of "AGI" is that it is going to replace tens of millions of jobs. That is it, and you know it.
It will only accelerate, and we need to stop pretending and coping. Instead, let's discuss solutions for those lost jobs.
So what is the replacement for these lost jobs? (It is not UBI or "better jobs" without defining them.)
Did it, really? Or did it just provide automation for routine no-thinking-necessary text-writing tasks, while still being ultimately bound by the level of the human operator's intelligence? I strongly suspect it's the latter. If it has actually replaced journalists, it must be at junk outlets, where readers' intelligence is negligible and anything goes.
Just yesterday I used o1 and Claude 3.5 to debug a Linux kernel issue (ultimately, a bad DSDT table causing the TPM2 driver to be unable to reserve a memory region for the command response buffer; the solution was to use memmap to remove the NVS flag from the relevant regions) and confirmed once again that LLMs still don't reason at all - they just spew out plausible-looking chains of words. The models were good listeners and mostly-helpful code generators (when they didn't make the silliest mistakes), but they showed no traces of understanding and paid no attention to nuances (e.g. the LLM used `IS_ERR` to check the `__request_resource` result, despite me giving it the full source code for that function, where there's even a comment that makes it obvious it returns a pointer or NULL, not an error code - a misguided-attention kind of mistake).
So, in my opinion, LLMs (as currently available to broad public, like myself) are useful for automating away some routine stuff, but their usefulness is bounded by the operator's knowledge and intelligence. And that means that the actual jobs (if they require thinking and not just writing words) are safe.
When asked about what I do at work, I used to joke that I just press buttons on my keyboard in fancy patterns. Ultimately, LLMs seem to suggest that it's not what I really do.
>Trades people only work after having something to do. If you don't have sufficient demand for builders, electricians, plumbers, etc... No one can afford to become one. Nevermind the fact that not everyone should be any of those things. Economics fails when the loop fails to close.
Many replaceable
> Police officers
Many replaceable (desk officers)
Ford didn’t support a 40 hour work week out of the kindness of his heart. He wanted his workers to have time off for buying things (like his cars).
I wonder if our AGI industrialist overlords will do something similar for revenue sharing or UBI.
I don't think so. I agree the push for AGI will kill the modern consumer product economy, but I think it's quite possible for the economy to evolve into a new form (that will probably be terrible for most humans) that keep pushes "work replacement."
Imagine, an AGI billionare buying up land, mines, and power plants as the consumer economy dies, then shifting those resources away from the consumer economy into self-aggrandizing pet projects (e.g. ziggurats, penthouses on Mars, space yachts, life extension, and stuff like that). He might still employ a small community of servants, AGI researchers, and other specialists; but all the rest of the population will be irrelevant to him.
And individual autarky probably isn't necessary; consumption will be redirected towards the massive pet projects I mentioned, with vestigial markets for power, minerals, etc.
In reality, if there really is mass unemployment, AI driven automation will make consumables so cheap that anyone will be able to buy it.
This isn't possible if you want to pay sales taxes - those are what keep transactions being done in the official currency. Of course in a world of 99% unemployment presumably we don't care about this.
But yes, this world of 99% unemployment isn't possible, eg because as soon as you have two people and they trade things, they're employed again.
Ultimately, it all comes down to raw materials and similar resources, and all those will be claimed by people with lots of real money. Your "invented ... other money" will be useless to buy that fundamental stuff. At best, it will be useful for trading scrap and other junk among the unemployed.
> In reality, if there really is mass unemployment, AI driven automation will make consumables so cheap that anyone will be able to buy it.
No. Why would the people who own that automation want to waste their resources producing consumer goods for people with nothing to give them in return?
Uh, this picture doesn’t make sense. Why would anyone value this randomly invented money?
Because they can use it to pay for goods?
Your notion is that almost everyone is going to be out of a job and thus have nothing. Okay, so I'm one of those people and I need this house built. But I'm not making any money because of AI or whatever. Maybe someone else needs someone to drive their aging relative around and they're a good builder.
If 1. neither of those people have jobs or income because of AI 2. AI isn't provisioning services for basically free,
then it makes sense for them to do an exchange of labor - even with AI (if that AI is not providing services to everyone). The original reason for having money and exchanging it still exists.
The fact that unemployment was 25% during the Great Depression would seem to suggest that, at a minimum, a 25% unemployment rate is possible during a disruptive event.
Unless nobody wanted either of those things done during the depression that's clearly not a very good mental model.
Yes if we recreate society some form of money would likely emerge.
Where are you getting gas/house materials from? No hand waves please. Show all work.
That’s the true litmus test. Everything else? It’s just fine-tuning weights, playing around the edges. Until it starts cutting through the fat and reshaping how organizations really operate, all of this is just more of the same.
Generally with AI, I think the top of society stands to gain a lot more than the middle/bottom of it, for a whole host of reasons. If you think anything different, the framework you use to reach your conclusion is probably wrong, at least IMO.
I don't like saying this, but there is a reason why the "AI bros", VCs, big tech CEOs, etc. are all very, very excited about this while many employees (some commenting here) are filled with dread/fear. The sales people, the managers, the MBAs, etc. stand to gain a lot from this. Fear also serves as the best marketing tool; it makes people talk and spread OpenAI's news more than anything else. It's a reason why targeting coding jobs/any jobs is so effective. I want to be wrong, of course.
Hopefully my skepticism will end up being unwarranted, but how confident are we that the queries are not routed to human workers behind the API? This sounds crazy but is plausible for the fake-it-till-you-make-it crowd.
Also, given the prohibitive compute costs per task, typical users won't be using this model, so the scheme could go on for quite some time before the public knows the truth.
They could also come out in a month and say o3 was so smart it'd endanger the civilization, so we deleted the code and saved humanity!
The author also suggested this is a new architecture that uses existing methods, like the Monte Carlo tree search that DeepMind is investigating (they use this method for AlphaZero).
I don't see the point of colluding for this sort of fraud, as these methods like tree search and pruning already exist. And other labs could genuinely produce these results
Possibly some other form of "make it seem more impressive than it is," but not that one.
In the medium term the plan could be to achieve AGI, and then AGI would figure out how to actually write o3. (Probably after AGI figures out the business model though: https://www.reddit.com/r/MachineLearning/s/OV4S2hGgW8)