The biggest bottleneck for this over the past two years, imo, wasn't the models but the engineering and infra around them, and the willingness of companies to work with OpenAI directly. Now that they've grown and have a decent userbase, companies are much more willing to pay for or involve themselves in these efforts.
This has eventual implications beyond user-heavy internet use (once we see more things built on the SDK): we're gonna see a fork in web traffic between human-centric workflows through chat, and an SEO-filled, chat/agent-optimized web that caters only to agents. (crossposted)
Buying plane tickets for example. It’s not even that I don’t trust the AI or that I’m afraid it might make a mistake. I just inherently want to feel like I’m in control of these processes.
It’s the same reason I’m more afraid of flying than driving despite flying being a way safer mode of travel. When I’m flying I don’t feel like I’m in control.
I think the average person will happily pay the same price to OpenAI (being micro-brainwashed by the AI to buy things they don't need, i.e. ads). I feel confident OpenAI will be able to charge even more for ads than Google since OpenAI will be able to influence people even more strongly, and hide the ads even better.
There is a sizeable chunk of people who (perhaps foolishly) trust ChatGPT despite knowing it can produce errors. They use it because it does the "research" for them, and does so quickly. This gives them a type of tech-agility that they themselves do not possess. So on balance they may be more tech-empowered by using a flawed AI chatbot than they are by manually reading news, websites and blogs.
There is also an issue of trust. A novice reading the top 5 search results has no real idea if the information being presented is biased, error free, or even factual. Google's work to blend paid and organic placement also presents the flaw of dollars over quality. ChatGPT on the other hand presents a known level of trust to them.
A similar scenario plays out with the way novices are more trusting of apps that appear on a curated store rather than seeking out software via web searches.
I think that users on HN take for granted that they have outsized experience and skill in developing trust in the tech landscape, and have a mental list of news, websites and software providers that they deem trustworthy. This can lead to not understanding the motivation for relying on an AI chatbot, or compartmentalising people who use those services as some kind of simpleton.
Booking an emergency flight the last time I had a family issue was a mind-fucking experience. I had to go through 10 screens trying to sell me stuff, constantly hiding the skip button in different places. Maybe HN will say that I "shouldn't have had a family emergency in the first place," but reality is reality.
And honestly it's not just booking websites, it's anything tech that they do. For example, the last check-in kiosk I used also had an incredibly convoluted path for the case where someone else booked my luggage but it was a different size.
And sooner or later these websites will implement new dark patterns to confuse the LLMs...
It could even work against the dynamic pricing algorithms airlines use to maximize revenue: if I have a tireless assistant exploring every possible combination to find the cheapest ticket, it’ll probably do a much better job than I ever could.
The problems come when vulnerable users are targeted using dark patterns. How AI dark patterns will evolve is very uncertain [1] however I suspect they will be extremely subtle and very effective.
What's the worst that can happen if someone vulnerable is persuaded to buy a flight by an AI. I don't know, maybe depression and bad credit after the chatbot's promises weren't met. If they're persuaded to buy a weapon, that's a different matter.
At least current advertising is somewhat public, although that's increasingly less true as ads get more targeted.
This is new territory where ads will be so extremely private it will be only known by the user (maybe they won't even notice) and someone reading the subpoenaed chat logs after a user does something terrible. Those chat logs will likely be inconclusive anyway.
[1] https://venturebeat.com/ai/darkness-rising-the-hidden-danger...
We used to get that through the services of a travel agency. Maybe we will soon have that luxury again?
Right now I can't imagine an AI (esp. chat) being more convenient for me than Skyscanner or Google Hotels, but maybe I'm missing the imagination.
If all you want is the cheapest flight on a specific day, Skyscanner is really great. But what if you need to book a bus at the other end of your flight? Skyscanner is not going to help you with that, but ChatGPT might! It could search up different bus providers in your destination and cross-reference them against the available flights.
How much you trust ChatGPT to actually do this well is up to you. But I suspect a lot of people will trust it, and I would probably be willing to use it for low-stakes tasks at least.
I think if you know exactly what you want, text input to an AI might be faster: "book me on the flight tomorrow at 1pm from x to y on airline xyz." I could imagine that being faster, but it would still require verification by me before actually paying. I wonder if the AI is really quicker at this, given the added latency compared to me visiting airline xyz and doing the search manually (even taking perceived loading time into consideration), since time feels shorter when you are actively doing something.
And ChatGPT will answer whatever it wants
I see my mistake now. I evaluate based on how it could be useful for me. As a heavy computer user, familiar with shortcuts and user interfaces, interacting with a UI works very well for me.
But for a lot of users text will be more natural and easier. I might get the flight I want most easily with Skyscanner, but other users might not, and will reach a better result with text.
It's the same way I prefer documentation over YouTube tutorials, but it's different for people at different stages.
If (when) companies want their things to be present in ChatGPT replies, they need to provide an AI-compatible way to get it. Just shoving a full-ass web page at it is inefficient and error-prone.
They have to either build a version of their site that's AI-accessible or provide an API (or MCP) for it to access the data.
Now that the API is built and the cost is paid, we can use it for non-AI uses.
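To make the point above concrete, here is a minimal sketch of what "an AI-accessible version" might look like: an MCP-style tool definition backed by a plain function that returns compact structured data instead of a rendered page. Everything here is hypothetical (the `search_flights` name, the schema fields, and the toy inventory are all made up for illustration):

```python
import json

# Hypothetical tool definition a service could register with an agent runtime.
# The names and schema fields are illustrative, not any real airline's API.
TOOL_SPEC = {
    "name": "search_flights",
    "description": "Search flights by origin, destination, and date.",
    "input_schema": {
        "type": "object",
        "properties": {
            "origin": {"type": "string"},
            "destination": {"type": "string"},
            "date": {"type": "string", "format": "date"},
        },
        "required": ["origin", "destination", "date"],
    },
}

# Toy in-memory "inventory" standing in for the airline's backend.
FLIGHTS = [
    {"origin": "AMS", "destination": "LIS", "date": "2025-03-01", "price": 89},
    {"origin": "AMS", "destination": "LIS", "date": "2025-03-02", "price": 120},
]

def search_flights(origin: str, destination: str, date: str) -> str:
    """Return matching flights as compact JSON: far cheaper for a model
    to consume than a full HTML results page."""
    hits = [f for f in FLIGHTS
            if f["origin"] == origin and f["destination"] == destination
            and f["date"] == date]
    return json.dumps(hits)

# Once this endpoint exists, non-AI clients (a CLI, a partner site, a
# lightweight widget) can call the same function.
print(search_flights("AMS", "LIS", "2025-03-01"))
```

The side effect the comment describes falls out naturally: the structured endpoint, built to serve the agent, is reusable by everything else.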
Currently GPT gets you better answers than Google so people are gonna be going there first.
This experience is 10x better than online alternatives. AI agents can replicate this at marginal cost.
I understand an argument can be made that Google is doing something similar, but at least you can still search and end up on an actual site, rather than just playing telephone via ChatGPT. This concept is horrifying for so many reasons.
Even in that dire circumstance, I hope the web versions keep up and are maintained, instead of being slowly deprecated, as happened to a lot of mobile-native versions of applications.
Going back to first principles, we need to recall that the internet is for the dissemination of cat pictures, and at the end of the day every technical and organizational change must be analyzed through the lens of its impact on the effective throughput of these pictures.
I suspect our future is going to be a lot more frustrating, both from AI screwups and from the atrophied skills of humans.
A decade ago, I used one of the hotel aggregator sites to reserve rooms for a vacation, and when I called the hotel to double-check something on my way to the airport, I found out that I didn't actually have a reservation and my room was already occupied. They couldn't do anything about it, as it was the 3rd-party aggregator's mistake.
Just getting the aggregator to admit that no, even though their system said I had a reservation, the hotel confirmed it didn't exist took over an hour. I had to go through several layers of customer service, and I suspect different call centers, until someone called the hotel themselves and issued a refund.
It was miserable and stressful to do from the airport, I would have lost my mind if I had to deal with chatbots for what was already a terrible experience with an automated purchase.
I just can't let anything AI make decisions that have consequences, like spending money, buying anything, planning vacations, flights etc. It's so bad now (I've just tried) that I'm not sure if it will ever gain my trust.
ChatGPT has become one of the most-browsed websites, and they want to capitalize on it even if only 2% of people actually trust the new integrations.
When we launched our mobile banking platform, one of the PMs there swore up and down that we should be piloting banking by text message. He was fabulously wrong at the time, yet in the end got a lot of things right.
There are a lot of applications that could fit in a text box, provided that you're not doing the work yourself but delegating it.
So perhaps chatbots are an excellent method for building out a prototype in a new field while you collect usage statistics to build a more refined UX - but it is bizarre that so many businesses seem to be discarding battle tested UXes for chatbots.
Thing is, for those who paid attention to the last chatbot hype cycle, we already knew this. Look at how Google Assistant was portrayed back in 2016. People thought you'd be buying Starbucks via the chat. Turns out the Starbucks app has a better UX.
The only reason for the voice interface is to facilitate the production of a TV show. By having the characters speak their requests aloud to the computer as voice commands, the show bypasses all the issues of building visual effects for computer screens and making those visuals easy to interpret for the audience, regardless of their computing background. However, whenever the show wants to demonstrate a character with a high level of computer mastery, the demonstration is almost always via the touchscreen (this is most often seen with Data), not the voice interface.
TNG had issues like this figured out years ago, yet people continue to fall into the same trap because they repeatedly fail to learn the lessons the show had to teach.
Maybe this is how we all get our own offices again and the open floor plan dies.
"...and that is why we need the resources. Newline, end document. Hey, guys, I just got done with my 60 page report, and need-"
"SELECT ALL, DELETE, SAVE DOCUMENT, FLUSH UNDO, PURGE VERSION HISTORY, CLOSE WINDOW."
Here's hoping this at least gets us back to cubes.
>changes bass to +4 because the unit doesn't do half increments
“No volume up to 35, do not touch the EQ”
>adjusts volume to 4 because the unit doesn’t do half increments
> I reach over, grab my remote, and do it myself
We have a grandparent who really depends on their Alexa, and let me tell you, repeatedly going "hey Alexa, volume down. Hey Alexa, volume down. Hey Alexa, volume down" gets really old, lol. We just walk over and start using the touch interface.
This general concept (embedding third parties as widgets in a larger product) has been tried many times before. Google themselves have done this - by my count - at least three separate times (Search, Maps, and Assistant).
None have been successful in large part because the third party being integrated benefits only marginally from such an integration. The amount of additional traffic these integrations drive generally isn't seen as being worth the loss of UX control and the intermediation in the customer relationship.
A dedicated UX is better, and a separate app or website feels like exactly the right separation.
Booking flights => browser => skyscanner => destination typing => evaluation options with ai suggestions on top and UX to fine-tune if I have out of the ordinary wishes (don’t want to get up so early)
I can’t imagine a human or an AI being better than this specialized UX.
Not an agent, but I've seen people choose doctors based on asking ChatGPT for criteria, and they did make those appointments. Saved them countless web interfaces to dig through.
ChatGPT saved me so much money by searching for discount coupons on courses.
It even offered free-entrance passwords for events I didn't know had such a thing (I asked it where the event was, and it also told me the free-entrance password it found on some obscure site).
I've seen doctors use ChatGPT to generate medical letters -- ChatGPT used some Python code for medical letters, and the doctors loved the result.
I've used ChatGPT to trim an energy bill to 10 pages because my current provider generated a 12-page bill in an attempt to prevent me from switching (they knew the other provider did not accept bills of more than 10 pages).
Combined with how incredibly good Codex is, and how easily ChatGPT can create throwaway one-time apps, there's no way the whole agent interface doesn't eat a huge chunk of the traditional UX software we're used to.
At least in my domains, the "battle-tested" UX is a direct replication of underlying data structures and database tables.
What chat gives you access to is a non-structured input that a clever coder can then sufficiently structure to create a vector database query.
Natural language turns out to be a far more flexible and nuanced interface than walls of checkboxes.
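The "structure the free text into a vector query" idea above can be sketched in a few lines. This is a toy: the bag-of-words "embedding" and the tiny catalog are stand-ins, where a real system would use a learned embedding model and a proper vector store.

```python
from collections import Counter
import math

# Toy "embedding": bag-of-words counts. Enough to illustrate turning a
# free-text sentence into a vector similarity query.
def embed(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# A few made-up catalog entries, as an alternative to faceted checkbox search.
CATALOG = [
    "black fabric couch low profile 180 cm wide",
    "white oak dining table seats six",
    "black metal bookshelf narrow 60 cm wide",
]

def query(free_text: str, top_k: int = 1) -> list[str]:
    q = embed(free_text)
    ranked = sorted(CATALOG, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:top_k]

# The user never touches a filter widget; the sentence *is* the query.
print(query("a low black couch about 180 cm wide"))
# → ['black fabric couch low profile 180 cm wide']
```

The point is not the retrieval quality but the interface: the unstructured sentence carries constraints (color, height, width) that would otherwise be scattered across checkboxes and dropdowns.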
Have you ever worked in a corporation? Do you really think that Windows 8 UI was the fruit of years of careful design? What about Workday?
> but it is bizarre that so many businesses seem to be discarding battle tested UXes for chatbots
Not really. If the chatbot is smart enough, then the chatbot is the more natural interface. I've seen people who prefer to say "hey Siri, set an alarm for 10 AM" rather than use the UI. Which makes sense, because language is the thing people have literally evolved specialized organs for. If anything, language is the "battle-tested UX", and the other stuff is a temporary fad.
Of course the problem is that most chatbots aren't smart. But this is a purely technical problem that can be solved within foreseeable future.
It's quicker that way. Other things, such as zooming in on an image, are quicker with a GUI. Blade Runner makes clear how poor a voice UI is for this compared to a GUI.
Imagine going to a shop and browsing all the aisles vs talking to the store employee. Chatbot is like the latter, but for a webshop.
Not to mention that most webshops have their categories completely disorganized, making "search by constraints" impossible.
Also, the chatbot is just not going to have enough context, at least not in its current state. Why those measurements? Because that's how much room you have; you measured. Why black? Because your couch is black too (bad choice), and you're trying to do a theme.
That's kind of a lot to explain.
I don't think it's necessary to resort to evolutionary-biology explanations for that.
When I use voice to set my alarm, it's usually because my phone isn't in my hand. Maybe it's across the room from me. And speaking to it is more efficient than walking over to it, picking it up, and navigating to the alarm-setting UI. A voice command is a more streamlined UI for that specific task than a GUI is.
I don't think that example says much about chatbots, really, because the value is mostly the hands-free aspect, not the speak-it-in-English aspect.
Most of the practical day-to-day tasks on the Androids I've used are 5-10 taps away from the lock screen, and get far fewer dirty looks from those around me.
If I use the touchscreen I have to:
1 unlock the phone - easy, but takes an active swipe
2 go to the clock app - I might not have been on the home screen, maybe a swipe or two to get there
3 set the timer to what I want - and here it COMPLETELY falls down, since it probably is showing how long the last timer I set was, and if that's not what I want, I have to fiddle with it.
If I do it with my voice I don't even have to look away from what I'm currently doing. AND I can say "90 seconds" or "10 minutes" or "3 hours" or even (at least on an iPhone) "set a timer for 3PM" and it will set it to what I say without me having to select numbers on a touchscreen.
And 95% of the time there's nobody around who's gonna give me a dirty look for it.
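The duration phrases above ("90 seconds", "10 minutes", "3 hours") are exactly the kind of input that parses trivially, which is part of why the voice path wins here. A toy sketch (a real assistant's grammar is obviously far richer):

```python
import re

# Seconds per unit for the simple phrases discussed above.
UNITS = {"second": 1, "minute": 60, "hour": 3600}

def parse_duration(phrase: str) -> int:
    """Return total seconds for phrases like 'set a timer for 10 minutes'."""
    total = 0
    for amount, unit in re.findall(r"(\d+)\s*(second|minute|hour)s?", phrase.lower()):
        total += int(amount) * UNITS[unit]
    if total == 0:
        raise ValueError(f"no duration found in: {phrase!r}")
    return total

print(parse_duration("set a timer for 90 seconds"))  # 90
print(parse_duration("10 minutes"))                  # 600
print(parse_duration("3 hours"))                     # 10800
```

Compare that to the touchscreen flow above: no unlocking, no navigating, no fiddling with whatever the last timer was set to.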
I don’t know anyone who uses Siri except people with really bad eyes.
I do that all the time with Siri for setting alarms and timers. Certain things have extremely simple speech interfaces. And we've already found a ton of them over the last decade+. If it was useful to use speech for ordering an uber, it would've been worth it for me to learn the specific syntax Alexa wanted.
Do I want to talk to a chatbot to get a detailed table of potential flight and hotel options? Hell no. It doesn't matter how smart it is, I want to see them on a map and be able to hover, click into them, etc. Speech would be slow and awful for that.
Ah yes, it's just a small detail. Don't worry about it.
-diehard CLI user
And if the apps are trusting ChatGPT to send them users based on those sorts of queries, it's only a matter of time before ChatGPT brings the functionality first-party and cuts out the apps - any app that believes chat is the universal interface of the future and exposes its functionality as a ChatGPT app is signing its own death warrant.
It's just like Google and websites, but much more insidious. If they can get your data, they'll subsume your function (and revenue stream).
This is exactly the same playbook as has already been played multiple times in the past(and currently playing) by existing companies.
These companies initially laid out red carpets for such builders, but once they had enough apps themselves, they started to tighten the screws, then gradually shifted to complete 100% control and extortion in the name of "security" or some other made-up excuse.
No more walled gardens. If something like this has to come (and I truly believe it is helpful), it should be built on the open web and open protocols, not controlled by a single for-profit company (ironic, since OpenAI is technically a non-profit).
I'm not sure that claim is justified. The primary agentic use case today is code generation, and the target demographic is used to IDEs/code editors.
While that's probably a good chunk of total token usage, it's not representative of the average user's needs or desires. I strongly doubt that the chat interface would have become so ubiquitous if it didn't have merit.
Even for more general agentic use, a chat interface allows the user the convenience of typing or dictating messages. And it's trivially bundled with audio-to-audio or video-to-video, the former already being common.
I expect that even in the future, if/when richer modalities become standard (and the models can produce video in real-time), most people will be consuming their outputs as text. It's simply more convenient for most use-cases.
One way I like to think about it, as an EE working in the energy-modeling realm: consider the geometry of an oscilloscope.
Electromagnetism gets carved up into equations that recreate it.
Geometric generators create the bulk structure and allow changing min/max parameters to achieve the desired result.
Consider a hardware system that boots and offers little more than Blender- and Photoshop-like parameter UI widgets to manipulate whatever segment of the geometry isn't quite right.
Currently we rely on an OS paradigm that is basically a virtual machine for noodling strings. The future will be a vector virtual machine that lets users noodle coordinates.
It's way less resource-intensive to think of it all as syncing a memory matrix to a display matrix, jettisoning all the syntactic sugar of the string-munging OSes of history.
I could see chat apps becoming dominant in Slack-oriented workplaces. But, like, chatting with an AI to play a song is objectively worse than using Spotify. Dynamically-created music sounds nice until one considers the social context in which non-filler music is heard.
There's a whole bizarre subculture in computing that fails to recognize what it is about computers that people actually find valuable.
Everyone wants the next device category. They covet it. Every other company tries to will it into existence.
Getting an AI to play "that song that goes hmm hmmm hmmm hmmm ... uh, it was in some commercials when I was a kid" tho
Absolutely. The point is this is a specialised and occasional use case. You don't want to have to go through a chatbot every time you want to play a particular song just because you might sometimes hum at it.
The closest we've come to a widely-adopted AR interface is AirPods. Critically, however, they work by mimicking how someone would speak to a real human next to them.
Also their playlists are made by real people (mostly...), so they don't completely suck ass.
Also, following the Beatport top 100 tech house playlist, and hearing how many tracks aren't actually tech house makes me wonder about who makes that particular playlist.
That's how I feel about a lot of AI stuff.
Like... It's neat. It's a fun novelty. It makes a good party trick. It's the software equivalent of a knick knack.
Like 90% of the pixel AI features. There's some good ones in there, sure, but most of them you play around with for a day and then forget exist.
This isn't me making a cute little website in my free time. This is thousands of developers, supercomputers out the wazoo, and a huge chunk of the western economy.
Like, a snowglobe is cute. They don't do much, but they're cute. I'd buy one for ten dollars.
I would not buy a snowglobe for 10 million dollars.
Other app-like interfaces like NotebookLM can be useful, for me one or two real uses a week.
Then there is engineering small open models into larger systems to do structured data extraction, etc.
I am skeptical about the current utility of agentic systems, MCP, etc. - even though I like to experiment.
Someone else said that at least they didn't go on and on about AGI today - a nice thing. FOMO-chasing ASI and AGI will drive us bankrupt, and produce some useful results along the way.
I’m building a tool that helps you solve any type of questionnaire (https://requestf.com) and I just can’t imagine how I could leverage Apps.
It would be awesome to get the distribution, but it has to also make sense from the UX perspective.
Out of curiosity, why iff?
e.g. Coursera can send back a video player
I remember reading some not-Neuromancer book by William Gibson where one of his near-future predictions was print magazines but with custom printed articles curated to fit your interests. Which is cool! In a world where print magazines were still dominant, you could see it as a forward iteration from the magazine status quo, potentially predictive of a future to come. But what happened in reality was a wholesale leapfrogging of magazines.
So I think you sometimes get leapfrogging rather than iteration, which I suspect is in play as a possibility with AI driven apps. I don't think apps will ever literally be replaced but I think there's a real chance they get displaced by AI everything-interfaces. I think the mitigating factor is not some foundational limit to AI's usefulness but enshittification, which I don't think used to consume good services so voraciously in the 00s or 2010s as it does today. Something tells me we might look back at the current chat based interfaces as the good old days.
We are at a moment where we're trying to figure out how to design good interfaces, but very soon after that the moment of "okay, now let's start selling with them" will come and that's really what we're going to be left with.
In that regard, things like adblockers, which nowadays can mitigate some of the defects you describe, are probably going to be much more difficult to implement in a chat-app interface. What are we going to do when we ask an agent for something and it responds with an ad rather than the relevant information we're seeking? It seems to me it's going to be even harder for the user to stay in control.
But I think it's going to be like Kagi, you'll pay for a subscription to a good-enough one, but the main companies will try to make their proprietary ones too feature rich and too convenient so that you'll have no choice but to use their enshittified version. What we have now might be a golden age that we will miss having.
But, for better or worse, I do think what's coming may be a paradigm where they are effectively one big omniscient super-app.
I imagine the Star Trek vision is pretty accurate. You occasionally talk to the computer when it makes sense, but more often than not you’re still interacting with a GUI of some kind.
I’m not very bullish on people wanting to live in the ChatGPT UI, specifically, but the concept of dynamic apps embedded into a chat-experience I think is a reasonable direction.
I’m mostly curious about if and when we get an open standard for this, similar to MCP.
What users want, which various entities religiously avoid providing to us, is a fair price comparison and discovery mechanism for essentially everything. A huge part of the value of LLMs to date is in bypassing much of the obfuscation that exists to perpetuate this, and that's completely counteracted by much of what they're demonstrating here.
The former is like a Waymo, the latter is like my car suddenly and autonomously deciding that now is a good time to turn into a Dollar Tree to get a COVID vaccine when I'm on my way to drop my kid off at a playdate.
The problem with this approach is precisely that these apps/widgets have hard-coded input and output schema. They can work quite well when the user asks something within the widget's capabilities, but the brittleness of this approach starts showing quickly in real-world use. What if you want to use more advanced filters with Zillow? Or perhaps cross-reference with StreetEasy? If those features aren't supported by the widget's hard-coded schema, you're out of luck as a user.
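The brittleness described above can be sketched in a few lines. The field names here (`location`, `max_price`, `beds`, `near_dog_park`) are entirely made up for illustration: the point is that any request outside the hard-coded schema is simply rejected, no matter how reasonable.

```python
# Hypothetical widget schema: field names are invented for illustration.
WIDGET_SCHEMA = {"location": str, "max_price": int, "beds": int}

def validate(request: dict) -> tuple[bool, str]:
    """Reject any request field the widget's hard-coded schema doesn't know."""
    for key, value in request.items():
        expected = WIDGET_SCHEMA.get(key)
        if expected is None:
            return False, f"unsupported filter: {key}"
        if not isinstance(value, expected):
            return False, f"bad type for {key}"
    return True, "ok"

# In-schema request: fine.
print(validate({"location": "Brooklyn", "max_price": 3000}))
# (True, 'ok')

# The user asks for something the schema never anticipated: the widget
# has no way to express it, even if the model understood the request.
print(validate({"location": "Brooklyn", "near_dog_park": True}))
# (False, 'unsupported filter: near_dog_park')
```

The model may perfectly understand "near a dog park", but the fixed schema is the bottleneck: the capability ceiling is set at widget-design time, not at query time.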
What I think is much more exciting is the ability to create generative UI answers completely on the fly. We'll have more to say on this soon at Phind (I'm the founder).
That said, I used it a lot more a year ago. Lately I’ve been using regular LLMs since they’ve gotten better at searching.
For a concrete example, think of a search result listing that can be broken down into a single result or a matrix to compare results, as well as a filter section. So you could ask for different facets of your current context, to iterate over a search session and interact with the results. Dunno, I’m still researching.
Have you written somewhere about your experience with Phind in this area?
Now that models have gotten much more capable, I'd suggest to give the executing model as much freedom with setting (and even determining) the schema as possible.
Chat paired with pre-built and on-demand widgets addresses this limitation.
For example, in the keynote demo, they showed how the chat interface lets you perform advanced filtering that pulls together information from multiple sources, like filtering only Zillow houses near a dog park.
I think most software will follow this trend and become generated on-demand over the next decade.
The only place I can see this working is if the LLM is generating a rich UI on the fly. Otherwise, you're arguing that a text-based UX is going to beat flashy, colourful things.
Conversational user interfaces are opaque; they lack affordances. https://en.wikipedia.org/wiki/Affordance
I immediately knew the last generation of voice assistants was dead garbage when there was no way to know what they could do; they just expected you to try 100 things until one worked randomly.
> Affordances represent the possibilities in the world for how an agent (a person, animal, or machine) can interact with something. Some affordances are perceivable, others are invisible. Signifiers are signals. Some signifiers are signs, labels, and drawings placed in the world, such as the signs labeled “push,” “pull,” or “exit” on doors, or arrows and diagrams indicating what is to be acted upon or in which direction to gesture, or other instructions. Some signifiers are simply the perceived affordances, such as the handle of a door or the physical structure of a switch. Note that some perceived affordances may not be real: they may look like doors or places to push, or an impediment to entry, when in fact they are not.
With Norman's definition, if a conversational interface can perform an action, it affords that action. The fact that you don't know that it affords that action means there's a lack of a signifier.
As you say, this is a matter of definition, I'm just commenting on Norman's specific definition from the book.
Personally, I hope that's not the future.
For a large number of tasks that cleanly generalize into a stream of tokens, command line or chat is probably superior. We'll get some affordances like tab auto-completion to help remember the names of certain bots or MCP endpoints that can be brought in as needed...
But for anything that involves discovery, graphical interaction feels more intuitive and we'll probably get bespoke interfaces relevant to that particular task at hand with some sort of partially hidden layers to abstract away the token stream?
I was really hoping Apple would make some innovations on the UX side, but they certainly haven’t yet.
They want to be the platform where you say what you want and OAI does it for you. It's gonna connect to your inbox, calendar, payment methods, and you'll just ask it to do something and it will, using those apps.
This means OAI won't need ads. Just rev share.
If OpenAI thinks there’s sweet, sweet revenue in email and calendar apps, just waiting to be shared, their investors are in for a big surprise.
Ads are definitely there. Just hidden deep in the black box that generates the useful tips :)
One could be, for example: people asking online which tools they should use to build something and constantly being recommended Next.js.
Another could be: how much of the code used to train the LLM was written in Next.js.
Generally, the answer is probably something along the lines of "next.js is kind of the most popular choice at the time of training".
In my (non-lawyer) understanding, each message potentially containing sponsored content (which would be every message, if the bias is encoded in the LLM itself) would need to be marked as an ad individually.
That would make for an odd user interface.
You may have started seeing this when LLMs seem to promote things based entirely on marketing claims and not on real-world functionality.
More or less, SEO spam V2.
They obviously want both. In fact they are already building an ad team.
They have money they have to burn, so it makes sense to throw all the scalable business models in history (app store, algo feed, etc.) at the wall and see what sticks.
OpenAI’s moat will only come from the products they build on top. Theoretically their products will be better because they’ll be more vertically integrated with the underlying models. It’s not unlike Apple’s playbook with regard to hardware and software integration.
A lot of the fundamental issues with MCP are still present: MCP is pretty single-player, users must "pull" content from the service, and the model of "enabling connections" is fairly unintuitive compared to "opening an app."
Ideally apps would have a dedicated entry point, be able to push content to users, and have some persistence in the UI. And really the primary interface should be HTML, not chat.
As such I think this current iteration will turn out a lot like GPTs.
Once a service can actively involve you and/or your LLM in ongoing interaction, MCP servers start to get real sticky. We can safely assume the install/auth process will also get much less technical as pressure to deliver services to non-technical users increases.
Is there any progress on that front? That would unlock a lot of applications that aren't feasible at the moment.
Edit: Sampling is a piece of the puzzle https://modelcontextprotocol.io/specification/2025-03-26/cli...
I also see a lot of discussion on Github around agent to agent (a2a) capabilities. So it's a big use case, and seems obvious to the people involved with MCP.
In 2024, iOS App Store generated $1.3T in revenue, 85% of which went to developers.
I'm genuinely surprised these companies went with usage-based versus royalty pricing.
Edit: yes, I understand it's correct, but it still sounds like an insane amount
That 1T figure is real, but it includes things like if you buy a refrigerator using the Amazon iOS app.
It is now evident why Flash was murdered.
I had a lot of hopes after the Adobe buyout that Flash would morph into something based around ActionScript (ES4) and SVG. That didn't happen. MS's Silverlight/XAML was close, but I wasn't going to even consider it without several cross-platform version releases.
I was as well. It wasn't as bad as people describe it. It was an amazing platform, HTML5 just recently caught up.
In retrospect, Adobe should have open-sourced it.
>MS's Silverlight/XAML was close
Hahahahahha, yeah sure! That tells me everything I need to know.
As for Silverlight, I mean the technology itself was closer to where I wanted to see Flash go. I'm not sure why you're laughing at that.
edit: as for not being as bad as people describe it... you could literally read any file on the filesystem... that's a pretty bad "sandbox" ... It was fixed later, but there were different holes along the way, multiple times.
This is a stupid conspiracy given Apple decided not to support Flash on iPhone since before Jobs came around on third-party apps. (The iPhone was launched with a vision of Apple-only native apps and HTML5 web apps. The latter's performance forced Cupertino's hand into launching the App Store. Then they saw the golden goose.)
HTML5 was new and not widely supported, the web was WAY more fragmented back then, to put things in perspective, Internet Explorer still had the largest market share, by far. The only thing that could provide the user with a rich interactive experience was Flash, it was also ubiquitous.
Flash was the biggest threat to Apple's App Store; this wasn't a conspiracy, it was evident back then but I can see why it is not evident to you in 2025. Jobs open letter was just a formal declaration of war.
Yes. It was a bad bet on the open web by Apple. But it was the one they took when they decided not to support Flash with the original iPhone's launch.
> Flash was the biggest threat to Apple's App Store
Flash was not supported since before there was an App Store. Since before Apple deigned to tolerate third-party native apps.
You can argue that following the App Store's launch, Apple's choice not to start supporting Flash was influenced by pecuniary interests. But it's ahistoric to suggest the reason for the original decision was based on interests Cupertino had ruled out at the time.
Connecting these apps will, at times, require authentication. Where it does not require payment, it's a fantastic distribution channel.
Why would I use a chat to do what could be done quicker with a simple and intuitive button/input UX (e.g. Booking or Zillow search/filter)? Chat also has really poor discoverability of what I can actually do with it.
Another commenter suggested a hotel search function:
> Find me hotels in Capetown that have a pool by the beach. Should cost between 200 and 800 dollars a night
ChatGPT can already do this. Similarly, their own pizza lookup example seems like it would exist or nearly exist with current functionality. I can't think of a single non-trivial app that could be built on this platform - and if there are any, I can't think of any that would be useful or not in immediate danger of being swallowed by advances to ChatGPT.
I built this 18 months ago at an OTA platform. We parse the query and identify which terms are locations, which are hotel features, which are room amenities etc. Then we apply those filters (we have thousands of attributes that can be filtered on, but cannot display all of them in the UI) and display the hotel search results in the regular UI. The input query is also through the normal search box.
This does not need and should not be done in a chatbot UX. All the implementation is on the backend and the right display is the already existing UI. This is semantic search and it comes as a standard capability in ElasticSearch, Supabase etc. Though we built our own version.
e.g. if the user asks "Find hotels in Capetown [...] that have availability for this christmas or new year": if your backend, or the response format that you're forcing the LLM to give, doesn't have the ability to do an OR on the date range, you can't give results that the user wants, so the LLM tries to do as best it can, and the user ends up getting only hotels which are available for both Christmas and new year (thus missing some that have availability for one or the other), or the LLM does some other unwanted thing. For us, users would even ask "June or August", and then got July included because that was the closest thing the backend / UI could do.
So this approach is actually less flexible than a chat interface, where the LLM can figure out "Ah, I need to do two separate hotel search MCP calls, and then merge the results to not show the same hotel twice".
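The merge step described above can be sketched in plain Python. This is an illustrative sketch, not any OTA's actual code; the result shape (`id`, `name`, `price` fields) and the two per-date-range result lists are assumptions:

```python
def merge_date_ranges(results_a, results_b):
    """Merge two hotel search results (e.g. one search per date range in
    an 'OR' query), deduplicating by hotel id and keeping the cheapest
    quoted price for hotels that appear in both."""
    merged = {}
    for hotel in results_a + results_b:
        hid = hotel["id"]
        if hid not in merged or hotel["price"] < merged[hid]["price"]:
            merged[hid] = hotel
    return list(merged.values())

# Example: availability for Christmas OR New Year, as two backend calls
christmas = [{"id": 1, "name": "Seaside", "price": 450},
             {"id": 2, "name": "Harbour", "price": 300}]
new_year  = [{"id": 2, "name": "Harbour", "price": 280},
             {"id": 3, "name": "Vineyard", "price": 600}]

hotels = merge_date_ranges(christmas, new_year)
```

The point is that the merge logic lives with the LLM orchestrating two tool calls, rather than requiring the backend's filter schema to support arbitrary boolean combinations.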
They also have a new GUI for visual programming of agents, with boxes and arrows.
It's going to be a hybrid of all these. Obviously the more explicit work done for interoperability, the easier it is, but the gaps can be bridged with the common sense of the AI at the expense of more time and compute. It's like, a self driving car can detect red lights and speed limit signs via cameras but if there are structured signals in smart infrastructure, then it's simpler and better.
But it's always interesting to see this dance between unstructured and structured. Apparently any time one gets big, the other is needed. When there's tons of structured code, we want AI common sense to cut through it, because even when it's structured, it's messy and too complicated. So we generate the code. Now that we have natural language code generators, we want to impose structure onto how they work, which we express in markup languages, then small scripts, then large scripts that are too complex and have too much boilerplate, so we need AI to generate it from natural language, etc, etc.
I tried buying a special kind of lamp this weekend, all LLMs and google sucked at this. The conversation did not help in finding more fine grained results.
Custom GPTs (and Gemini gems) didn't really work because they didn't have any utility outside the chat window. They were really just bundled prompt workflows that relied on the inherent abilities of the model. But now with MCP, agent-based apps are way more useful.
I believe there's a fundamentally different shift going on here: in the endgame that OpenAI, Anthropic et al. are racing toward, there will be little need for developers for the kinds of consumer-facing apps that OpenAI appears to be targeting.
OpenAI hinted at this idea at the end of their Codex demo: the future will be built from software built on demand, tailored to each user's specific needs.
Even if one doesn't believe that AI will completely automate software development, it's not unreasonable to think that we can build deterministic tooling to wrap LLMs and provide functionality that's good enough for a wide range of consumer experiences. And when pumping out code and architecting software becomes easy to automate with little additional marginal cost, some of the only moats other companies have are user trust (e.g. knowing that Coursera's content is at least made by real humans grounded in reality), the ability to coordinate markets and transform capital (e.g. dealing with three-sided marketplaces on DoorDash), switching costs, or ability to handle regulatory burdens.
The cynic in me says that today's announcements are really just a stopgap measure to: - Further increase the utility of ChatGPT for users, turning it into the de facto way of accessing the internet for younger users à la how Facebook was (is?) in developing countries - Pave the way for by commoditizing OpenAI's complements (traditional SaaS apps) as ChatGPT becomes more capable as a platform with first-party experiences - Increase the value of the company to acquire more clout with enterprises and other business deals
But cynicism aside, this is pretty cool. I think there's a solid foundation here for the kind of intent-based, action-oriented computing that I think will benefit non-technical people immensely.
Can’t say I'm unhappy to see the authoritarian duopoly of the existing app stores challenged.
One question that comes to mind is how will multiple providers of similar products and services be recommended/discovered? Perhaps they wont be recommended, but just listed instead as currently done by search engines. Is AISO our future - AI Search Optimization?
The docs mention returning resources, and the example is returning a rust file as a resource, which is nonsensical.
This seems similar to MCP UI in result but it's not clear how it works internally.
More: https://github.com/openai/openai-apps-sdk-examples?tab=readm...
In the current implementation, it makes an iframe (or webview on native) that loads a sandboxed environment, which then gets another iframe with your HTML injected. Your HTML can include remote resources whitelisted via a meta field.
Convenience-wise this model is probably more viable, and things will get centralized into the AI apps. The nested utilities will be walled gardens on steroids. Using custom software and general computing (in the manner of the now-discontinued sideloading on Android) will move even further out of reach for the average person.
This time will be different?
I personally prefer well curated information.
The LLM will do the curation.
I hope their GUI integration will be eventually superseded by native UI integration. I remember such well thought out concepts dating back to 2018 (https://uxdesign.cc/redesigning-siri-and-adding-multitasking...).
Ideally, users should be able to describe a task, and the AI would figure out which tools to use, wire them together, and show the result as an editable workflow or inline canvas the user can tweak. Frameworks like LlamaIndex’s Workflow or LangGraph already let you define these directed graphs manually in Python where each node can do something specific, branch, or loop. But the AI should be able to generate those DAGs on the fly, since it’s just code underneath.
And given that LLMs are already quite good at generating UI code and following a design system (see v0.app), there’s not much reason to hardcode screens at all. The model can just create and adapt them as needed.
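The "generate the DAG on the fly" idea can be sketched without any framework: the workflow is just data (node names plus dependencies), so a model can emit it and a user can edit it before execution. Everything here is hypothetical — the node functions and the task are invented for illustration:

```python
from graphlib import TopologicalSorter

# Hypothetical nodes a model might wire together for a task like
# "summarize my unread mail and draft replies". Each node takes a
# context dict and returns an updated one.
def fetch_mail(ctx):    return {**ctx, "mail": ["msg1", "msg2"]}
def summarize(ctx):     return {**ctx, "summary": f"{len(ctx['mail'])} msgs"}
def draft_replies(ctx): return {**ctx, "drafts": [m + ": reply" for m in ctx["mail"]]}

# The DAG itself: node -> set of dependencies. Because this is plain
# data, a model can generate it and a UI can render it as an editable
# boxes-and-arrows workflow.
graph = {
    "fetch_mail": set(),
    "summarize": {"fetch_mail"},
    "draft_replies": {"fetch_mail"},
}
nodes = {"fetch_mail": fetch_mail,
         "summarize": summarize,
         "draft_replies": draft_replies}

def run(graph, nodes, ctx=None):
    """Execute nodes in a dependency-respecting order."""
    ctx = ctx or {}
    for name in TopologicalSorter(graph).static_order():
        ctx = nodes[name](ctx)
    return ctx

result = run(graph, nodes)
```

Frameworks like LangGraph add branching, loops, and state management on top, but the core representation is no more than this.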
Really hope Google doesn’t follow OpenAI down this path.
(Also read the documentation, they specifically mention that you can tell it to create new flow paths)
"Find me hotels in Capetown that have a pool by the beach. Should cost between 200 and 800 dollars a night"
However, it might be useful for people who do want to use that instead.
I don't see how this is a significant upgrade over the many existing hotel-finder tools. At best it slightly augments them as a first pass, but I would still rather look at an actual map of options than trust a stream of generated, ad-augmented text.
The UI 'cards' will naturally keep growing, and soon you end up back with a full app within ChatGPT, or ChatGPT just becomes an app launcher.
The only advantage I can see is if ChatGPT can use data from other apps/ chats in your searches e.g. find me hotels in NYC for my upcoming trip (and it already knows the types of hotels you like, your budget and your dates)
Instead of the user wasting time, ChatGPT can come up with the recommendations.
Of course ads will be there, and that's good. A bad outcome would be if they took a bunch of traffic from Google and then gave businesses no way to promote their products.
That would lead to companies closing, layoffs, and economic decline.
Lots of folks (myself included) are reporting it doesn't: https://github.com/openai/openai-apps-sdk-examples/issues/1
While Apps do sound and look like the future, I feel like we're headed down the same road as the App and Google Play stores with this. Sooner or later OpenAI is going to use this to take a cut $$ of the payments going through the system. Which they most likely need and deserve, but still any time you close off part of the web it makes the web less open and free.
Sure, this helps app partners access their large user base and grows their functionality too - but the end game has to be lock-in with a 30% tax right?
To me it seems like a strategic shift from pure AI research and the AGI snake oil toward other, supposedly tangible, stuff.
In short, the AI revolution is mostly over, and we seem to be back in the realm of software.
so, best of luck to OAI. we'll see how this plays out
Per the docs: 'Every app comes from a verified developer who stands behind their work and provides responsive support'
That's thinly veiled corporate speak for, Fortune 500 or GTFO
Sure, but deploying a website or app doesn't mean anyone's going to use it, does it?
I could make an iOS app, I could make a website, I could make a ChatGPT app... if no one uses it, it doesn't matter how big the userbase of iOS, the internet, or ChatGPT is...
It has the potential to bridge the gap between pure conversation and the functionality of a full website.
I can block ads on a search engine. I cannot prevent an LLM from having hidden biases about what the best brand of vodka or car is.
But as with everything, as new technologies emerge, you can devise legal loopholes that don't totally apply to you and probably need regulation before it's decided that "yeah, actually, that does apply to me".
For example, React and TypeScript were hard to set up initially. I deferred learning them for years until the tooling improved and they were clearly here to stay. Likewise, I'm glad I didn't dive into tech like LangChain and CoffeeScript, which came and went.
You can see the hype cycle's timeline in HN's Algolia search: https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...
The big hype wave has finished now (we still have the "how dare you criticise our technology bros" roaming around though), the tooling is maturing now. It's almost time for me to actually get my feet wet with it :)
I'd much rather see a thriving ecosystem full of competition and innovation than a more stagnant alternative.
On a more serious note, it remains to be seen if this even sticks / is widely embraced.
Of course, part of it was due to the fact that the out-of-the-box models became so competent that there was no need for a customized model, especially when customization boiled down to barely more than some kind of custom system prompt and hidden instructions. I get the impression that's the same reason their fine-tuning services never took off either, since it was easier to just load necessary information into the context window of a standard instance.
Edit: In all fairness, this was before most tool use, connectors or MCP. I am at least open to the idea that these might allow for a reasonable value add, but I'm still skeptical.
> I get the impression that's the same reason their fine-tuning services never took off either
Also, very few workloads that you'd want to use AI for are prime cases for fine-tuning. We had some cases where we used fine-tuning because the work was repetitive enough that FT provided benefits in terms of speed and accuracy, but it was a very limited set of workloads.
Can you share any more info on this? I'm curious about what the use case was and how it improved speed (of inference?) and accuracy.
Disclaimer: this was in the 3.5 Turbo "era" so models like `nano` now might be cheap enough, good enough, fast enough to do this even without FT.
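For context, chat-model fine-tuning in that era consumed JSONL files of example conversations. A repetitive classification-style workload, which is the kind of case the comment describes as benefiting from FT, might be prepared roughly like this (the ticket texts and labels are invented; the `messages` record shape follows the publicly documented chat fine-tuning format):

```python
import json

# Each training example is one short chat: a fixed instruction, an
# input, and the exact terse output we want the tuned model to emit.
examples = [
    ("Refund request for order #123", "refund"),
    ("Where is my package?", "shipping"),
]

with open("train.jsonl", "w") as f:
    for text, label in examples:
        record = {"messages": [
            {"role": "system", "content": "Classify the support ticket."},
            {"role": "user", "content": text},
            {"role": "assistant", "content": label},
        ]}
        f.write(json.dumps(record) + "\n")
```

The speed win comes from the tuned model emitting a one-word answer without needing the long instructions and few-shot examples in every prompt.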
It feels like OpenAI's mission has changed from "We want to do AGI" to
"it'll be easier to do AGI with a lot of money, so let's make a lot of money first" to
"we have a shot at becoming bigger than Google and stealing their revenue. Let's do that and maybe do AGI if that ever works out"
So far, it seems that if you give an LLM a few tools to create projects and other entities, it is very good at using them. The user gets the option of a chat-driven UI for our app, with not that much work for limited features.
Currently building internal MCP servers to make that easy. But I can imagine having a public one in the future.
Now, I realize that the best argument for MCP vs function calls in my case, is that I want to allow external products/agents/chatbots to interface with my app. MCP is that standard. I will implement very carefully, but that's what I need to do.
“CEO” Fidji Simo must really need something to do.
Maybe I’m cynical about all of this, but it feels like a whole lot of marketing spin for an MCP standard.
I'mma call it now just for the fun of it: This will go the way of their "GPT" store.
There are plenty of brokers that will add immense value to ChatGPT for free and if users go there looking for something, it's only a matter of time.
Right now, I only like using the chat interface to answer questions I can't quite form into searches, but I also don't go directly to a chat bot to book dinner reservations. However, if I'm using the service to riff on ideas for a romantic thing to do with my partner, and it somehow leads me to resturant reservations, I do think I would engage with it and come back to ChatGPT in the future for novel interactions like that.
MCP standardizes how LLM clients connect to external tools—defining wire formats, authentication flows, and metadata schemas. This means apps you build aren't inherently ChatGPT-specific; they're MCP servers that could work with any MCP-compatible client. The protocol is transport-agnostic and self-describing, with official Python and TypeScript SDKs already available.
That said, the "build our platform" criticism isn't entirely off base. While the protocol is open, practical adoption still depends heavily on ChatGPT's distribution and whether other LLM providers actually implement MCP clients. The real test will be whether this becomes a genuine cross-platform standard or just another way to contribute to OpenAI's ecosystem.
The technical primitives (tool discovery, structured content return, embedded UI resources) are solid and address real integration problems. Whether it succeeds likely depends more on ecosystem dynamics than technical merit.
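Concretely, MCP's wire format is JSON-RPC 2.0, which is why servers aren't tied to any one client. A client discovering and then invoking a tool exchanges messages shaped roughly like this (the tool name and arguments are invented for illustration; see the MCP specification for the full schemas):

```python
import json

# Client -> server: discover what tools the server exposes
list_tools = {"jsonrpc": "2.0", "id": 1, "method": "tools/list"}

# Client -> server: invoke one of the discovered tools
call_tool = {
    "jsonrpc": "2.0",
    "id": 2,
    "method": "tools/call",
    "params": {
        "name": "search_hotels",          # hypothetical example tool
        "arguments": {"city": "Cape Town", "max_price": 800},
    },
}

wire = json.dumps(call_tool)
```

Because the requests are self-describing, any MCP-compatible client, not just ChatGPT, can drive the same server.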