The releases themselves may turn out to be interesting, of course, and then there may be something substantive to have a thread about. The best approach would be to submit the most interesting release once it shows up.
The "launch week" pattern isn't great for HN, because we end up with a bunch of follow-ups that we have to downweight [3], and there's no guarantee that the largest thread(s) will be about the most interesting element(s) in the sequence. But startups do it anyway so we'll adapt.
[1] https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...
[2] https://hn.algolia.com/?dateRange=all&page=0&prefix=true&sor...
[3] https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...
Most interested to see their inference stack, hope that’s one of the 5. I think most people are running R1 on a single H200 node but Deepseek had much lower RAM per GPU for their inference and so had some cluster based MoE deployment.
But yeah, would be interesting to see how they optimized for that.
H20 is the next iteration of the "sanctions" that I believe also limited the "cores" but left the on-chip memory intact, or slightly higher (from the new generation).
> We believe DeepSeek has access to around 10,000 of these H800s and about 10,000 H100s. Furthermore they have orders for many more H20’s, with Nvidia having produced over 1 million of the China specific GPU in the last 9 months.
that's as dumb as saying coca cola has access to all offices of Berkshire Hathaway.
likewise, all comments praising deepseek history are also misleading as the company has barely existed for a year.
everything is opaque marketing being repeated. just drop the off topic bla bla bla and focus on the facts and code in front of you.
thanks for coming to my ted talk.
If you wouldn't mind reviewing https://news.ycombinator.com/newsguidelines.html and taking the intended spirit of the site more to heart, we'd be grateful. The basic idea is to make your substantive points thoughtfully, regardless of how wrong anyone else is or you feel they are.
Claiming they have access to 5x this amount is not such a bold claim?
What claims from the semianalysis article do you think are false? And based on what evidence?
> love your joke!
> Get a real life please.
> Too bad that you didn't learn this back in school.
Read the site guidelines, as you’re repeatedly resorting to personal attacks in your comments on discussions around DeepSeek or China.
They claim that a small Chinese hedge fund could acquire $1bln in GPUs, with no state support, including many sanctioned chips, then trained a model optimized for a far smaller server compute size, and that they have a source at this very small fund who is willing to admit to export violations. A 40bln param active model is exactly the size you would expect from a server of the size they claim.
What’s more likely - that semianalysis made it up like they have a bunch of other things, or that all the above is true?
For a few months H800 wasn't sanctioned and that's when they bought them.
You aren't paying per request/GPU access, CCP is.
It's telling that "South Korea has accused Chinese AI startup DeepSeek of sharing user data with the owner of TikTok in China." - source: >https://www.bbc.com/news/articles/c4gex0x87g4o
Bytedance, which has had a CCP government official on their board for years: >https://www.reuters.com/technology/bytedance-says-china-unit...
Deepseek's claims that they used old unsanctioned GPUs are probably totally fabricated as well (side point: giving Singapore F-35s was probably a mistake): >https://www.tomshardware.com/tech-industry/deepseek-gpu-smug....
I mean it's not like an entity that bypasses sanctions would ever be open about it, as doing so would immediately result in more sanctions and the closing of loopholes. What does the CCP have to gain? What did it have to gain by stealing hundreds of billions of western IP in the past? 4 things: Power, prestige, riches, and the means to keep their power. This has been going on since at least 2004 (see Nortel case: https://globalnews.ca/news/7275588/inside-the-chinese-milita...)
The US winning the AI race was a clear threat to those 4 things. Hurting investor sentiment by a) distilling a model which cost billions to develop, and b) spreading propaganda and muddying the waters about costs, GPUs, etc., helps them to narrow the gap. Making it open source was not done out of the goodness of their hearts, but out of self interest - another attempt to deflect from their actions (further muddying the waters) and divide the public against taking any further punitive action against the state (given the connection re: SK claims, TikTok algorithms were probably on overdrive spreading their bs).
Probably counts as announcement of announcement? Let’s wait for the actual repo drops before discussing them, especially because there are no details about what will be open sourced other than
> These are humble building blocks of our online service: documented, deployed and battle-tested in production.
But on the other hand, compare this announcement in a README.md file in a GitHub repo with the slideware approach of the EU: https://openeurollm.eu/
If I had to bet on someone providing some value, unfortunately I wouldn't bet on Europe.
I'm saying this as a European, deeply convinced that Europe is a good place to live. I've also worked for a couple of EU funded research projects, so I have some background experience on the outcome of these projects.
If they are OK to let the EU project fail, they need to consider what the world will be. Europe has never been composed of dwarfs, but that's what every single EU country has become in the past 50 years.
Without US influence, away from big players, with less and less performant economies and industries, without a plan, with a difficult neighbour to address... it's going to be extremely difficult for Europe.
How do CERN, ESA and Airbus fit into this worldview? They are unquestionably giants in their respective fields, from my POV.
I'm fully cognisant of SpaceX vs Ariane in reusability, but that is cancelled out by outcomes/culture at Boeing vs Airbus. Broadly speaking, Europe is either number 1 or number 2 (behind the US) in engineering and hard sciences; there's no reason to give up because they fall to #4 or 5 in some fields like software, or "AI" specifically, especially since I'm convinced that the irrational exuberance in the US will come to an end when (not if) investors start demanding the elusive ROI on AI investments. When the music stops, a lot of the "advanced" AI companies that look amazing now will be insolvent, but Euro projects will still be funded.
"Because every line shared becomes collective momentum that accelerates the journey. Daily unlocks begin soon. No ivory towers - just pure garage-energy and community-driven innovation" is a great phrase.
Pragmatic as China is, they may actually prefer the long-term value of being open research leaders over short-term profit. They are not as bound to immediate and constant growth as we are, and their horizons do not change so dramatically every 4 years, to say the least.
Don't many respected developers care deeply about their research being open source? I'm no expert but I've read many an article (maybe I'll buy your bridge, too) that suggests willingness to open source research holds at least some weight in some researchers choosing their company. It strikes me as at least possible some of that is earnest, sure even deepseek isn't open source open source, no training set etc. but it feels like they deserve the benefit of the doubt.
All that said I'm still a student, a master of none, so cannot speak first hand to any of this. Just offering another point of view
LLMs have been a more legitimate version of "blockchain", back when most CIO magazines ran filler essays along the lines of "What's your blockchain strategy?".
AI bubble will burst and will burst hard. By end of 2026 at max.
> chatgpt recently crossed 400M WAU, we feel very fortunate to serve 5% of the world every week, 2M+ business users now use chatgpt at work, and reasoning model API use is up 5x since o3 mini launch
When cost approaches zero, use cases increase exponentially.
I think this is a very, very naive assumption.
The founder is a quant with involvements in domestic investments and market design and pricing for decades - in China.
As seen with the case of Jack Ma, after you cross a certain level, there is no such thing as "not involved with politics" in China.
Liang knows exactly what he's doing.
> During 2021, Liang started buying thousands of Nvidia GPUs for his AI side project while running High-Flyer. Some industry insiders viewed it as the eccentric actions of a billionaire looking for a new hobby. One of Liang's business partners said they initially did not take Liang seriously and described their first meeting as seeing a very nerdy guy with a terrible hairstyle who could not articulate his vision. Liang simply said he wanted to build something and it will be a game changer which his business partners thought was only possible from giants such as ByteDance and Alibaba Group.
> During that month in an interview with 36Kr, Liang stated that High-Flyer had acquired 10,000 Nvidia A100 GPUs before the US government imposed AI chip restrictions on China.
> On 20 January 2025, Liang was invited to the Symposium with Experts, Entrepreneurs and Representatives from the Fields of Education, Science, Culture, Health and Sports (专家、企业家和教科文卫体等领域代表座谈会) hosted by Premier Li Qiang in Beijing. Liang, being considered as an industry expert, was asked to provide opinions and suggestions on a draft for comments of the annual 2024 government work report.
> On 17 February 2025, Liang along with the heads of other Chinese technology companies attended a symposium hosted by President Xi Jinping at the Great Hall of the People in Beijing.
Whether he intended to or not initially, what happens with DeepSeek is now out of this man's hand and will be 100% influenced by politics.
The chip bans and dual use nature of the technology have catapulted Liang to the first row of CCP tech strategists' attention, for sure.
The moat is the products that can be built. The moat is always the product - because a differentiated product can't be a commodity. And an LLM is not a product.
Google and MSFT and Meta have already "won" because they have profitable products they can build LLMs onto. Every other company seems to be burning cash to build a product, and only ChatGPT is getting the brand recognition to realistically compete.
Building an LLM is like building a database. Sure a good one unlocks new uses, but consumers aren't buying something for the database. Meanwhile enterprise customers will shop around and drive the price of a commodity down while open source alternatives grow from in-house uses to destroy moats.
Even hardware isn't a true moat. Only Google has strong vertical integration with their TPUs, and that gives them a lead. BUT Microsoft, AWS, Meta and a whole bunch of startups are building out custom silicon which will surely put pressure on them and Nvidia to keep innovating and earning that price edge.
For products that still need a UI you could claim that LLM operators take over, so that's still a tax you pay to the incumbents as you interact with a product. It's sort of like we take the money which was paid to SQL operators and engineers and instead pay it to the hyperscalers.
Users of LLMs don’t quite have an equivalent employee to a DBA, but neither do most customers of AWS DynamoDB or RDS or whatever.
Many use cases of LLMs won’t be chat bots like ChatGPT. They’ll be tools for automated summarization, classification, etc. They’ll be automated assistance and basic tool calling, etc. They’ll perform OCR and document analysis. Automated translations etc.
Good enough + open (and free) is a very appealing proposition.
What does that mean?
*disclaimer; i am an expert of nothing
that's why they can open source their model and be fine because running this shit is actually hard, let alone maintaining SLA for millions of users??
Where we're going, we don't need moats.
The idea is: if we reach true AGI first, we are going to own ALL THE MONEY!
Which erroneously assumes that models can't be siphoned off/recreated, as deepseek proved possible and even reasonably doable. Which in turn fundamentally shows that both openai and anthropic very likely have basically no moat.
I can almost smell another AI winter arriving, once all those valuations meet reality.
I think businesses that rely on new AI models are very different.
Personally I don’t think even a true open source release would erase the downsides of the model incorporating CCP propaganda and censorship. I would prefer control of megacorps to control of an untrustworthy dictatorship.
> In economics, the Jevons paradox occurs when technological advancements make a resource more efficient to use (thereby reducing the amount needed for a single application); however, as the cost of using the resource drops, if the price is highly elastic, this results in overall demand increases causing total resource consumption to rise.
[1]:https://aiproem.substack.com/p/ai-at-the-speed-of-light-tenc...
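To put rough numbers on the Jevons definition quoted above, here is a minimal sketch. It assumes a constant-elasticity demand curve, which is a simplification chosen for illustration and not something the quote specifies; the names `resource_use`, `efficiency_gain` and `elasticity` are hypothetical. Under that assumption, if efficiency improves by a factor g, the cost per unit of service falls by g, demand for the service grows by g^e, and the resource consumed changes by g^(e-1). Total consumption therefore rises only when the elasticity e is above 1.

```python
def resource_use(efficiency_gain: float, elasticity: float, baseline: float = 1.0) -> float:
    """Total resource consumed after an efficiency improvement.

    Assumes constant-elasticity demand (an illustrative assumption):
      - cost per unit of service falls by `efficiency_gain`
      - demand for the service rises by efficiency_gain ** elasticity
      - resource needed per unit of service falls by `efficiency_gain`
    so resource use scales by efficiency_gain ** (elasticity - 1).
    """
    if efficiency_gain <= 0:
        raise ValueError("efficiency_gain must be positive")
    return baseline * efficiency_gain ** (elasticity - 1.0)


if __name__ == "__main__":
    # Compare inelastic, unit-elastic and elastic demand for a 10x efficiency gain.
    for e in (0.5, 1.0, 1.5):
        print(f"elasticity={e}: resource use x{resource_use(10.0, e):.2f}")
```

Whether demand for inference is actually that elastic is the open question; the sketch only shows that the paradox depends on it.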
Truly admirable on their part and a great paradigm for others. The reasons for this don't really matter to me, but I can't help but wonder if somehow they were obliged or otherwise indebted to follow this route.
My not-so-innocent guess is that they are looking to crowd-source their online platform (the front-end essentially) in order to reduce costs. Still acceptable though, as they made the model open weight and partially reproducible.
Each and every contribution to open source community will be helpful. Thanks DeepSeek!
Amodei's / Hassabis' comments in particular came off as so arrogant and annoying.
Exactly which part of their writings comes off as arrogant to you? The only point in Amodei's article[0] that could remotely be interpreted as arrogant is this:
> All of this is to say that DeepSeek-V3 is not a unique breakthrough or something that fundamentally changes the economics of LLM’s; it’s an expected point on an ongoing cost reduction curve. What’s different this time is that the company that was first to demonstrate the expected cost reductions was Chinese.
Maybe I'm different, but it really does sound like a reasonable judgement to me.
[0]: https://darioamodei.com/on-deepseek-and-export-controls#deep...
I mean strategically this could be the first use of open source in this way.
"OpenAI threatens to revoke o1 access for asking it about its chain of thought"
https://news.ycombinator.com/item?id=41534474
Not only did DeepSeek open source their model, they also showed the user the chain-of-thought right up front, which everyone else rushed to emulate when they saw how much users liked it.
irony
Mistral has been holding the line on that topic remarkably well.
i can almost hear sam altman and dario amodei cry every time deepseek does something amazing.
Unlike the other counterpart which believes that "AGI" means: "raising billions of dollars to achieve $100BN of profits to their investors". (Which is complete nonsense).
While not totally "open source" by the strictest definition, it is at least better than having no model released with no mention of the architecture on the system card or paper and just vague comments about the 'performance'.
Ladies and gentlemen, this is closer to being a better "Open AI". Unlike the other alleged $157BN "non-profit" scam.
I think you know which one really is beneficial to humanity and is the real "Open AI".
But I think it’d be a mistake to think that this is necessarily beneficial for humanity just because the weights are open. It’s maybe great to commoditize models, but their displacement in jobs, original thought and work, facilitation of disinformation and population psychological warfare doesn’t change… if anything it’s accelerated and harder to temper the bad elements.
So it doesn't matter when there are multiple players competing to destroy each other in this race to zero.
> But I think it’d be a mistake to think that this is necessarily beneficial for humanity just because the weights are open. It’s maybe great to commoditize models, but their displacement in jobs, original thought and work, facilitation of disinformation and population psychological warfare doesn’t change… if anything it’s accelerated and harder to temper the bad elements.
It is unrealistic to close it up and hope that no-one catches up and releases a better AI model for free since the cat's already out of the bag and the progress of these AI models cannot be delayed, stopped or gate-kept for long.
By that time, someone will release a more powerful AI model for free.
I also think that the companies that are doing that have a different idea on how to make money. Facebook's competitive edge lies in all the people using their social media, and for the Chinese, I think their edge lies in manufacturing physical products, so they try to commodify the software component.
Which is in stark contrast to the US, who have a world-beating software and silicon industry, but are merely competent in other areas, so it makes sense for them to want to avoid that.
Why enter the market now when AI is already commoditized? DeepSeek is making US investors regret investing so much to get a tiny lead over them, but they're also making future large investments much harder to justify when you can rely on existing open-sourced models.
Foundation Models aren't defensible. It'll force VCs to allocate on other stuff (the new buzz is "the application layer")
The giant players are more than happy to keep their models open if no one even tries to compete.
"None are more hopelessly enslaved than those who falsely believe they are free." ~~ Goethe
Who knows what any of them might do in the future? For now I'm cheering for Deepseek, Meta and anyone publishing open models, as I strongly believe that the potential "danger" of AI in the hands of everyone is far outstripped by the concrete dangers of AI dictated by a select small group of corps/gov symbionts.
If we ignore that, we will let PR teams play us every time they claim altruism while serving themselves. It doesn’t mean Deepseek can’t also have good motives, but we must be clear that undercutting OpenAI while simultaneously building community goodwill is a smart move on their part to shift the market in their favor.
I wish it was easier to learn about media literacy
Can't fool me twice. Not yet, wait a couple of years.
Google also made a lot of money from other businesses that aren't AI models, until they started selling AI models, just as DeepSeek now does.
The reality is that DeepSeek is a full company, that was funded as a spin-off from the original business (a hedge fund that used its large GPU stockpile to pick stocks via ML). The company DeepSeek is owned by the hedge fund CEO not the hedge fund. It exists as a business aiming to make money, not as a pet project for another business.
You don't need to be known by the general public to take advantage of tax schemes involving "donating" money
Friedrich Engels: The Condition of the Working-class in England
The focus of Engels' criticism when he made these statements was on *capitalist production relations*, where capitalists control the means of production and obtain profits by exploiting the surplus labor time of workers. This is precisely what DeepSeek and open-source initiatives are challenging. They are turning the means of production from the private property of capitalists into public property.
I hope you did not intentionally misquote this passage.
Regardless of free software, capitalists control the means of production and obtain profits by exploiting the surplus labor time of workers.
Free software may make it more obvious though, at least for some.
Basically they pocketed all profits with other people footing all the risks.
Of course they want money, lots of money; tons of money is required for hiring engineers and paying for hardware. However, your claim that DeepSeek exists to make money is just a wild guess backed by nothing else.
DeepSeek CEO Liang Wenfeng himself is an engineer, and he is a co-author/developer of the DeepSeek model; he helped but is not listed as a core contributor. Obviously spending your CEO hours that way is not a smart strategy for maximizing your $ return. His interview a few months ago actually gave answers to all this: he is seeking AGI. That is the motivation, that is why DeepSeek exists.
No business exists not to make money because that is a charity. It’s not a charity, because a charity is not a business, and DeepSeek is a business. I don’t care to quibble about how interested they are in being a lucrative business, but simply that they’re a business.
My point wasn’t to question their motives about profit vs AGI (why would these be mutually exclusive btw), but to challenge the notion that it’s some side project from a random business. It’s a company with dedicated resources and staff.
For a smaller player, open-sourcing might be a strategic move. It would likely go unnoticed if a small Chinese company released a model "almost as good as" ones from the top US players. But releasing it as open source is a game-changer.
However, open source isn't just for small players. Microsoft develops Visual Studio Code and Meta develops PyTorch - to name a few examples out of hundreds. In these cases, it's also PR - they can afford it, and it doesn't compete with their core business.
There's a story about someone asking the Dalai Lama whether all altruism is actually a form of egoism, since we do good things to feel better. He responded that if that's the case, we need more of this type of egoism. (I can't find the exact source, but it aligns with his quote "Being wisely selfish means taking a broader view and recognizing that our own long-term individual interest lies in the welfare of everyone.")
So yes, I want to see more of this kind of PR.
Some companies will play on opensource, some will play on pricing, some on quality.
Almost all of the open source companies which do good eventually start an enterprise / paid division as well.
I just wish this smear campaign against them stops sometime soon.
The Chinese government only supports companies that are in line with industrial policies and are facing difficulties that require assistance. This is because such companies struggle to obtain financing from the society. The aim is to support the entire industry, not a specific company. If a company holds a leading position, it does not need to receive any "resources" from the government; it can acquire sufficient resources from the society.
China 10y bond yield is at <2%, this is a very low financing cost.
So, like, for example, AI companies who are very upfront about not being able to get their hands on as many chips as they'd like?
government is not good at smuggling chips without getting attention, better try to contact some dealers in singapore or malaysia
LLM's are not that different than programming languages. Imagine Guido van Rossum charging $200 so you can use Python...
https://news.ycombinator.com/newsguidelines.html
https://hn.algolia.com/?sort=byDate&dateRange=all&type=comme...
I've followed him closely since ~2016 so I can say this with some conviction. He's exactly the same guy he was back then. He even talks of the exact same things with the same excitement. Sure, "American boots on MARS!" instead of just "boots on Mars" like he did after the inauguration, but it's quite clear he has seen the US falling apart as an existential risk for the more lofty goals especially SpaceX has for Humanity. https://www.youtube.com/watch?v=wubITdJ_MCw
It's sad that you fell for it then. Read Phillip Long's post on him, not someone who follows him but someone who has worked with him for years. It should be eye opening about the kind of man he is.
There will be no Mars terraforming, his goal is being the world's first trillionaire. The emperor has no clothes, the companies run despite him not because of him, and the cult of personality only appeals to people who somehow still fall for it.
Here's a take by people who have had actual direct contact with him. https://www.reddit.com/r/SpaceXLounge/comments/k1e0ta/eviden...
The arguments against his capability to lead cross-field technical operations should be disproven by his successes that he has proven several times in sequence. The argument of him being a fraud is basically hinging on him rolling d20 several times in a row, and only acceptable to those not knowing his personality and attributing his actions to malice (through self-projection of the viewer). Philip's arguments tell as much.
He's made plenty of enemies while at it! Wouldn't really expect anything else, being as disruptive as raw autism in fixing the species might be. They'll fade.
He is actively helping take health care from poor people. He is firing thousands of people with families, mortgages and medical bills without cause. He is closing our national parks. All so he can personally have a tax cut.
His ex-wife is frantically posting for him to help with the healthcare of their own son in his replies. He can't even manage his family; I don't think he has the betterment of humanity on his mind.
Musk does a lot of things at a very high level publicly so I think it's an even easier task. I'm sure you'll disagree but I believe it's this false narrative and who's creating it that you should be doubting.
Many people don't have a problem with a lot of what Musk has done. He's not perfect and does make mistakes which he openly admits like any sane rational person should. I do believe his good intent is there and he generally tries to right wrongs.
I'm watching closely what he does and sometimes I have my doubts. If I ever see him actually cross a line I'll change my mind. For now, most of the narrative has been pretty typical fake news and timeless partisan disagreement on methods of governance.
The vision that sees this as bad is obviously tainted by corruption, and so is not worthy of care, especially as the people leaving their jobs will have a damn good golden parachute.
At least try to argue on the same level.
get a grip. Research how financially mad / "irresponsible" that was.
Do you really think the people he's getting rid of are material to the mission of the agencies themselves, given even those missions are as relevant as they were when they were founded?
X sure is doing well with 20% of the crew regardless of the doomsayers screaming how it would crash at the time xD
Even 10+ year old Teslas are a good investment btw, especially if you're going for the 100% environmental angle. I recommend researching the endurance of their batteries.
Calling AfD neo-nazis is not very informed when their beliefs are something that Germans ardently opposed to their bad history would have held country-wide a short time ago.
Any argument of "Democracy being interfered with" helplessly just sounds like loser talk. Like, if someone sells an idea and people vote for it, only an antidemocratic mindset would be so against that. Sorry. "Not again" all you want. Anyone could push that. Recommend looking at forces against freedom of speech and their relation to bad history instead :)
Imagine no more human interactions just a permanent flood of meaningless thoughtless word salad.
I think the Chinese are perfect to introduce such a product, very much in line with what they usually produce.
Get ready for web3.o