Karpathy gave his initial impression: https://x.com/karpathy/status/1891720635363254772

The pull quote is: "The impression overall I got here is that this is somewhere around (OpenAI) o1-pro capability."

The impression seems to be warranted: Grok 3 has jumped directly to the top of all leaderboard categories in Chatbot Arena: https://lmarena.ai/?leaderboard

In math it shares the top spot with o1, trailing by just a few points (well within the error margin). In creative writing it is basically ex aequo with the latest ChatGPT-4o, and in coding it's significantly ahead of everyone else, representing a new SOTA.

lmarena/lmsys is beyond useless if you compare prior model rankings against formal benchmarks or against accuracy and correctness on batches of real-world data. It's a bit like polling Fox News viewers to discern the opinions of every American; the voting audience is consistently found wanting. That's not even getting into how easily a bad actor with means and motivation (in this "hypothetical" instance, wanting to show that a certain model is capable of running the entire US government) can manipulate votes, which has been brought up in the past. (Yes, I'm aware of the lmsys publication on how they defend against attacks using Cloudflare + reCAPTCHA; there are ways around that.)
So you're saying that either A: users interacting with models can't objectively rate which responses seem better to humans, B: xAI as a newcomer has somehow managed to game the leaderboard better than all those other companies, or C: all those other companies are not doing it. By those standards every test ever devised for anything is beyond useless. But simply not having the model creator run the evaluation already goes a long way.
No, I'm saying that some companies are doing it (OpenAI at the very least), that the company in question has the motive and capability to game the system (kudos to them for pushing the boundaries there), AND that the userbase's rankings have historically been statistically misaligned with data from evals (flawed as those are), especially when it comes to testing for accuracy and precision on real-world data (outside their known or presumed dataset). Take a look at how well Qwen or DeepSeek actually performed versus the counterparts that were out at the same time, versus their corresponding rankings.

In the nicest way possible, I'm saying this form of preference testing is ultimately useless, primarily due to a base of dilettantes with more free time than knowledge parading around as subject matter experts, and secondarily due to presumed malfeasance. The latter is becoming apparent to more of the masses (those that don't blindly believe any leaderboard they see) now that access to the model itself is more widespread and people are seeing that the performance doesn't match the "revolution" promised [0]. If you're still confused why selecting a model based on a glorified Hot or Not application is flawed, perhaps ask yourself why other evals exist in the first place (hint: some tests are harder than others).

[0](One such instance of someone competent testing it and realizing it's not even close to the "best" model out) https://www.youtube.com/watch?v=WVpaBTqm-Zo

Considering that the OpenAI Pro subscription is $200 per month, and the "Premium Plus" subscription that includes this thing is only $40 per month, does that mean the instantaneous "Elon factor" is now at least -$160 per month per user, or is it supposed to add up to more than -$240 per month?

How would the math change after factoring in that OpenAI isn't even covering the entirety of its opex with the sub anyway, and/or people finding it weird to associate their money with their Twitter accounts, and/or this thing supposedly running on a bigger cluster than OpenAI's?

No... sigmoid10 was comparing with o1 (not o1-pro), which is accessible for $20/mo, not $200/mo. So, the "Elon factor" in your math is +$20/user/month (2x) for barely any difference in performance (a hard sell), not -$160/user/month, and while we have no clear answer to whether either of them are making a profit at that price, it would be surprising if OpenAI Plus users were not profitable, given the reasonable rate limits OpenAI imposes on o1 access, and the fact that most Plus users probably aren't maxing out their rate limits anyways. o1-pro requires vastly more compute than o1 for each query, and OpenAI was providing effectively unlimited access to o1-pro to Pro users, with users who want tons of queries gravitating to that subscription. The combination of those factors is certainly why Sam Altman claimed they weren't making money on Pro users.

lmarena has also become less and less useful over time for comparing frontier models as all frontier models are able to saturate the performance needed for the kind of casual questions typically asked there. For the harder questions, o1 (not even o1-pro) still appears to be tied for 1st place with several other models... which is yet another indication of just how saturated that benchmark is.

“The impression overall I got here is that this is somewhere around o1-pro capability”.

“Grok 3 + Thinking feels somewhere around the state of the art territory of OpenAI's strongest models (o1-pro, $200/month)”.

The comment I was replying to had replied to an lmarena benchmark link. Perhaps you think that person should have replied to someone else? And, if you want to finish the quote, Karpathy's opinion on this is subjective. He admits it isn't a "real" evaluation.

"[...] though of course we need actual, real evaluations to look at."

His own tests are better than nothing, but hardly definitive.

I understood numpad0 to continue the comparison to o1-pro, after sigmoid10 expressed the opinion that the comparison is warranted.
Yes, numpad0 did... but I was pointing out that this choice was illogical. The lmarena results they were replying to only supported a comparison against o1, since o1 effectively matches Grok 3 on the benchmark being replied to (with o1-pro nowhere to be found), and then they immediately leapt into a bunch of weird value-proposition math. As I said, perhaps you think they should have replied to someone else? Replying to an lmarena benchmark indicates that numpad0 was using that benchmark as part of the justification of their math. I also pointed out the limitations of lmarena as a benchmark for frontier models.

I don't think anyone is arguing that ChatGPT Pro is a good value unless you absolutely need to bypass the rate limits all the time, and I cannot find a single indication that Premium+ has unlimited access to Grok 3. If Premium+ doesn't have unlimited rate limits, then it's definitely not comparable to ChatGPT Pro, and other than one subjective comment by Karpathy, we have no benchmarks that indicate that Grok 3 might be as good as o1-pro. You already get 99% of the value with just ChatGPT Plus compared to ChatGPT Pro for half the price of Premium+.

numpad0 was effectively making a strawman argument by ignoring ChatGPT Plus here... it is very easy for anyone to beat up a strawman, so I am here to point out a bad argument when I see one.

You're the one who came in and told him about the "factor in your math". Like you said, it's his comparison, not yours. If you want to do your own comparison, feel free. But don't come in and tell him he's not allowed to do his comparison. I for one like his comparison.
Guys, y'all forget GIGO. First principles.

This thing is produced by Musk.

srid · 3 days ago
Where do you see that Premium+ is $40 per month?

The official source says "Starts at $22/month or $229/year on web", https://help.x.com/en/using-x/x-premium

This is pretty much what I paid a couple of months ago, as a Canadian.

They just announced a price increase today. The link you posted has this info in a blue box at the top.

Also visible here: https://help.x.com/en/using-x/x-premium#tbpricing-bycountry

srid · 3 days ago
Interesting. In that table, I see $40 for US users. Yet the price remains $30 for Canadian users, despite their low dollar value.
And it’s only £17 in the UK or €21 in the EU.
>Considering that OpenAI subscription is $200 per month

This plan is 75 days old. I didn't know it existed until last week.

OpenAI is starting to try to get a little more realistic revenue in, Grok is acquiring customers.

ben_w · 3 days ago
Given how fast-moving the field is, it's very difficult to confidently state how much inference costs. Perhaps he's under-charging, perhaps OpenAI is over-charging; one may be more optimised than the other, but new models come out and change everything in less time than it normally takes for actual costs to become public knowledge.
Sometimes it's a matter of approach: one approach could be 5% better and 10x more expensive. So they will find the sweet spot; it takes a few iterations.
Yes, better to avoid annual subscriptions.
Masa Son top ticks a market is somehow still news in 2025.
What do we do to assess the intelligence of these models after they are smarter than any human? From the kinds of questions it's answering seems like they are almost there.

Do we have a way to tell if one model is smarter than another at that point?

Nah, at the end of the day "things that are easy for humans are [still] hard for computers, and vice versa". Deep Blue was superhuman at chess and couldn't play tic-tac-toe. Today's AI is (almost?) superhuman at math yet only very recently learned to play tic-tac-toe, and still can't learn to do anything, because it can't learn, and has no innate drives to expose itself to learning situations even if it could.

Here's a real world intelligence test. Take on each AI as a remote intern/new-hire, and try to train it to become a useful team member (solving math puzzles or manufacturing paperclips does not count).

gf000 · 2 days ago
Almost there? Are we looking at the same thing?
There are things besides measuring intelligence, like humor. Currently, all the bots struggle with making jokes.
What you probably mean is puzzle-solving intelligence. Humor is a form of intelligence. It's just not only about intelligence: it's also about values and context, for instance. But all of this reflects a form of intelligence. Nevertheless, intelligence shouldn't be ranked, at least not in the way we are used to talking about it.
flir · 3 days ago
> Do we have a way to tell if one model is smarter than another at that point?

Ask them to design a ranking mechanism for you. They are superhuman, after all.

(I really don't think we're going to have to worry about this).

Naive question from a bystander, but since DeepSeek is open source and is on par with o1-pro (is it?), shouldn't we expect that anybody with the compute power is capable of competing with o1-pro?
> DeepSeek is open source and is on par with o1-pro (is it?)

There is no being "on par" in this space. Model providers are still mostly optimising for a handful of benchmarks/goals. We can already see that Grok 3 is doing incredibly well on human preference (LM Arena), yet with Style Control it's suddenly behind ChatGPT-4o-latest, and Gemini 2.0 is out of the picture. So even within a single domain, goal, or benchmark, it's not straightforward to say that one model is "on par" with another.

> shouldn't we expect that anybody with the compute power is capable of competing with o1-pro?

Not necessarily. I know it may be tempting to think that Grok 3 is entirely a result of xAI having lots of "compute power", but you have to recognise that this mindset comes from a place of ignorance, not wisdom. Moreover, it doesn't even pass as a "cynical" view, because it's common knowledge that model training is really, really complicated. DeepSeek's results are noteworthy, and really influential in some respects, but they haven't magically "solved" training, or made training necessarily easier or less expensive for the interested parties. They never shared the low-level performance improvements, just model weights and lots of insight. For talented researchers this is valuable, of course, but it's not as if "anybody" could easily benefit from it in their training regimes.

Update: RFT (contra SFT) is becoming really popular with service providers, and it hasn't been "standardised" beyond whatever reproductions have emerged in the weeks prior. Moreover, R1's cost is still pretty high[1] at something like $7/Mtok, and bandwidth is really not great. Consider something like Google Vertex AI's batch pricing for Gemini 1.5 Pro and Gemini 2.0 Flash, which is a 50% discount, or their prompt caching, which is a 75% discount. R1 has still got a way to go.

[1]: https://openrouter.ai/deepseek/deepseek-r1/providers?sort=th...

The full-sized DeepSeek-R1 is on par with o1.

o1-pro is "o1 on steroids" and was the first selling point of the $200/month Pro subscription but they later also added "Deep Research" and Operator to the Pro subscription.

guax · 3 days ago
Every year it seems like we get worse at naming things in non-confusing ways. I am waiting for the o1-pro-max now, then pro max ultra, and pro max ultra plus.
Microsoft already mastered this decades ago with a dozen different license tiers for Windows with unintuitive feature sets.
Not to mention the Xbox iterations...
I was recently laid off from OpenAI. My job was coming up with names for their models.
It's not a "layoff" if you were fired for poor performance/picking bad names.
rvnx · 3 days ago
You are welcome to join the Bard team
Is this because of USAID funding cut?
It’s hard because there are multiple dimensions being upgraded at different cadences. Architecture, parameter count, etc.
Then you’ll know what the naming AI was trained on.
Off by one and naming things.
For me I was sold on the consistency. o1 does really great at several programming problems, but o1-pro does great on these problems 4 of 4 tries. I get a good answer more often with o1-pro than with just o1, or even o3-mini-high.
o1-pro is indeed pretty great, but I find that I can iterate several times with Gemini 2.0 Pro Experimental (or whatever their latest reasoning model is called these days) between o1-pro's responses. It's almost too slow for interactive use cases.
Yeah. I've found out that you can start out with o3-mini-high and then switch over to o1-pro, or the other way around. Helps to iterate a bit faster.
It's not on par with o1, let alone o1-pro
It's on par/better/worse depending on the problem. o1 is significantly worse, for example, in Rust programming than Claude 3.5; at least for me.
Claude really likes producing code, that’s for sure. I feel like it’s a useful tool once I’ve deconstructed a project past a certain point.
It's pretty much on par with o1, better at many coding questions.
flir · 3 days ago
I found it better at reasoning, worse at coding.

Not doubting your experience, just thinking how subjective it all is.

golol · 3 days ago
Deepseek is not on par with o1.
It probably depends on the benchmark you choose; according to Chatbot Arena, DeepSeek-R1 ranks similarly to o1-2024-12-17, and Grok 3 is just 3% above these models in "Arena Score" points.
golol · 3 days ago
Chatbot Arena is not really a great benchmark imo
Yes it is!
No DeepSeek model is open source; they're freely redistributable, but without source.
I guess when it comes to LLMs, what is considered the "source": the weights, or the code used to build the weights?
To the extent that the concept is applicable, it would be the training data and the training code.
You're forgetting the flour in your cake recipe: the data, arguably the single most important part.
You'd still need a fairly large amount of compute power to be able to run DeepSeek R1 locally, no?
Well yes, but not so large that it's completely prohibitive. People have been running the full models on computers going as low as $6000: https://x.com/carrigmat/status/1884244369907278106

Of course this is for a personal instance, you'd need a much more expensive setup to handle concurrent users. And that's to run it, not train it.

I have a used workstation I got for $2k (with 768GB of RAM) - using the Q4 model, I can get about 1.5 tokens/sec and use very large contexts. It's pretty awesome to be able to run it at home.
nomel · 2 days ago
For me, where electricity is $0.45/kWh, assuming 1kW consumption, it would be around $80 USD per million tokens!
I think you might have to show your math on that one.
nomel · 1 day ago
They said 1.5 tokens/second. 1 million tokens is 667k seconds, which is 185 hours. 1kW * 185hr * $0.45/kWh = $80 per million tokens. Again, assuming 1kW, which may be high (or low). The cost of the physical computation is the electricity cost.
They said it has a crappy GPU, the whole computer probably only uses 200 - 250 watts.
nomel · 1 day ago
No way. 768GB of RAM will have significant power draw. DDR4 (which this probably is) is something like 3W per 8GB. That's >250W alone.

So, say 500W. That's, for me in my expensive-electricity city, $40/million tokens, with the pretty severe rate limit of 5,400 tokens/hour.

If you're in Texas, that would be closer to $10/million tokens! Now you're at the same price as GPT-4o.
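
Here's that arithmetic as a quick sketch (the ~$0.11/kWh Texas rate is my assumption; the other figures are the ones from this exchange):

    def cost_per_mtok(tokens_per_sec, watts, usd_per_kwh):
        # Electricity cost in USD to generate one million tokens locally.
        hours = 1_000_000 / tokens_per_sec / 3600  # ~185 h at 1.5 tok/s
        kwh = watts / 1000 * hours                 # energy consumed in that time
        return kwh * usd_per_kwh

    print(cost_per_mtok(1.5, 1000, 0.45))  # ~$83/Mtok: the original 1 kW estimate
    print(cost_per_mtok(1.5, 500, 0.45))   # ~$42/Mtok: the revised 500 W estimate
    print(cost_per_mtok(1.5, 500, 0.11))   # ~$10/Mtok: assuming ~$0.11/kWh in Texas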

But you can run and experiment with any model of your liking. And your data does not leave your desktop environment. You can build services. I don't think anybody doing this is doing it to save $20 a month.
nomel · 13 hours ago
Yes. I was only making a monetary comparison.

Related, you can get a whole lot of cloud computing for $2k, for those same experiments, on much faster hardware.

But yes, the data stays local. And, it's fun.

This comment chain is pretty funny.

Would love to know more info & specs of your workstation.
It's an HP Z8 G4 (dual-socket 18-core, 3 GHz Xeons, 24x32GB of DDR4-2666, and then a crappy GPU, 8TB HDD, 1TB SSD). It can accommodate 3 dual-slot GPUs, but I was mostly interested in playing with frontier models where holding all the weights in VRAM requires a ~$500k machine. It can run the full Deepseek R1, Llama3-405B, etc, usually around 1-2 tokens/sec.
Sort of a letdown that after 24 32GB RAM sticks you only get 6-8 tokens per second.
But a token is not just a character.

"hello how are you today?" - 7 tokens.

And this is so much better than I could have imagined in a very short span of time.

And you only get to use 20k context length before it OOMs.
A better approach is to split the model, with the MoE experts running on CPUs and the MLA attention running on GPU. See the ktransformers project: https://github.com/kvcache-ai/ktransformers/blob/main/doc/en...

This takes advantage of the sparsity of MoE and the efficient KV cache of MLA.

You perhaps forgot to mention that for their AMX optimizations to even be feasible, you'd need to spend ~$10k for a single CPU, let alone the whole system, which is probably ~$100k.
Granite Rapids-W (Workstation) is coming out soon for likely much less than half that per CPU. (Xeon W-3500/2500 launched at $609 to $5889 per CPU less than a year ago and also has AMX).
Point being? Workstations that are fresh on the market and have performance comparable to their server counterparts still easily cost anywhere between $20k and $40k. At least that's according to Dell workstation pricing last time I looked.
Supermicro X13SWA-TF Motherboard (16 DIMM slots with Xeon W-3500)= ~$1,000

E-ATX case = ~$300

Power Supply= ~$300

Xeon W-3500 (8 channel memory) = $1339 - $5889

Memory = $300-$500 per 64GB DDR5 RDIMM

Memory will be the major cost. The rest will be around $5,000. A lot less than "$100,000"!

I acknowledged in my last comment that the cost doesn't have to be $100k, but that it would still be very high if you opted for the workstation design. You're gonna need to add one more CPU to your design, add another 8 memory channels, a beefier PSU, and a new motherboard that can accommodate all this. So, $8k (memory) + $10k (CPUs) + the rest. As I said, not less than $20k.
Why does it have to be a dual CPU design? 8 channels of DDR5 4800 will still get you something like 300 GB per second bandwidth. Not amazing, but OK. Granite Rapids-W will likely be something like 50% better (cores and bandwidth).

And the original message you were responding to was using a CPU with AMX and mixing it with a GPU like an Nvidia 4090/5090. That way the large part of the model sits in the larger, slower memory, and the active part in the GPU with the faster memory. Very cost effective and fast. (Something like generating 16 tokens/s of 671B DeepSeek R1 with a total hardware cost of $10-$20k.) They tried both single and dual CPU, with the latter about 30% faster... not necessarily worth it.

https://github.com/kvcache-ai/ktransformers/blob/main/doc/en...

> 8 channels of DDR5 4800 will still get you something like 300 GB per second bandwidth.

That's the theory. In practice, Sapphire Rapids needs 24-28 cores to hit the 200 GB/s mark, and it doesn't go much further than that. Intel CPU designs generally have a hard time saturating memory bandwidth, so it remains to be seen whether they managed to fix this, but I wouldn't hold my breath. 200 GB/s is not much. My dual-socket Skylake system hits ~140 GB/s, and it's quite slow for larger LLMs.

> Why does it have to be a dual CPU design?

Because memory bandwidth is one of the most important limiting (compute) factors for larger models inference. With dual-socket design you're essentially doubling the available bandwidth.

> And the original message you were responding to was using a CPU with AMX and mixing it with a GPU like Nvidia 4900/5900.

Dual-socket CPU that costs $10k, in a server that probably costs a couple of factors more. Now, you claimed it doesn't have to be that expensive, but I beg to differ: you still need $20k-$30k worth of equipment to run it. That's a lot, and not quite "cost effective".
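
To make the bandwidth point concrete: when decode is memory-bound, every generated token has to stream the active weights from RAM once, so bandwidth divided by active bytes per token gives a hard speed ceiling. A back-of-the-envelope sketch (R1's ~37B active parameters per token is public; the bandwidth figures and ~4-bit quantization are the ones quoted in this thread, and KV-cache traffic and compute overhead are ignored):

    def decode_ceiling_tok_s(active_params_b, bytes_per_param, bandwidth_gb_s):
        # Upper bound on tokens/sec for a memory-bandwidth-bound MoE decode:
        # each token requires one pass over the active weights.
        gb_per_token = active_params_b * bytes_per_param  # GB read per token
        return bandwidth_gb_s / gb_per_token

    # DeepSeek R1: 671B total parameters, ~37B active per token (MoE sparsity),
    # ~4-bit quantization (0.5 bytes/param) => ~18.5 GB streamed per token.
    for bw in (300, 200, 140):  # theoretical 8-ch DDR5 / measured SPR / dual Skylake
        print(f"{bw} GB/s -> ~{decode_ceiling_tok_s(37, 0.5, bw):.1f} tok/s")

Those ceilings (~16, ~11, and ~7.6 tok/s) line up reasonably well with the 16 tok/s, 8.73 tok/s, and dual-Skylake figures quoted in this exchange.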

The proof of the pudding is in the eating. Read the link above. It's one or two mid-range[1] Sapphire Rapids CPUs and a 4090. Dual CPU is faster (partially because of 32->64 cores, not just bandwidth) but also hits data locality issues, limiting the increase to about 30%.

(Dual Socket Skylake? Do you mean Cascade Lake?)

If you price it out, it's basically the most cost effective set-up with reasonable speed for large (more than 300 GB) models. Dual socket basically doubles the motherboard[2] and CPU cost, so maybe another $3k-$6k for a 30% uplift.

[1] https://www.intel.com/content/www/us/en/products/sku/231733/... $3,157

[2] https://www.serversupply.com/MOTHERBOARD/SYSTEM%20BOARD/LGA-... $1,800

Yes, dual socket Skylake. What's strange about that?

Please price it out for us because I still don't see what's cost effective in a system that costs well over $10k and runs at 8 tok/s vs the dual zen4 system for $6k running at the same tok/s.

Sorry. Didn't realize you meant Skylake-SP.

I am not sure what your point is. There are some nice dual-socket Epyc examples floating around as well that claim 6-8 tokens/s. (I think some of those are actually distilled versions with very small context sizes... I don't see any as thoroughly documented/benchmarked as the above.) This is a dual-socket Sapphire Rapids example with similarly sized CPUs and a consumer graphics card that gives about 16 tokens/second. Sapphire Rapids CPUs and motherboards are a bit more expensive, and a 4090 was $1,500 until recently. So for a few thousand more you can double the speed. Also, the prompt processing speed is waaaaay faster (something like 10x faster than the Epyc versions).

In any case, these are all vastly cheaper approaches than trying to get enough H100s to fit the full R1 model in VRAM! A single H100 80 GB is more than $20k, and you would need many of them + server just to run R1.
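
A rough sizing sketch of that last point (the 20% headroom factor is my assumption; real deployments need extra VRAM for KV cache and activations, so actual counts run higher):

    import math

    def min_h100s(params_b=671, bytes_per_param=1.0, vram_gb=80, headroom=1.2):
        # Floor on the number of 80 GB H100s needed just to hold the weights.
        return math.ceil(params_b * bytes_per_param * headroom / vram_gb)

    print(min_h100s(bytes_per_param=1.0))  # FP8:  ~11 cards, i.e. $220k+ in GPUs alone
    print(min_h100s(bytes_per_param=2.0))  # BF16: ~21 cards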

I don't dispute their idea, which is sound, but I do dispute that the cost needed to achieve the claimed performance is just "a few thousand more", as you stubbornly continue to claim.

The math is clear: single-socket ktransformers performance is 8.73 tok/s, and it costs ~$12k to build such a rig. The same performance one gets from a $6k dual-Epyc system. And this is the full-blown version of R1, not a distilled one as you suggest.

Your claim about 16 tok/s is also misleading. That figure is for 6 experts, while we are comparing R1 with 8 experts against llama with 8 experts. Per the ktransformers benchmarks, 8 experts on the dual-socket system runs at 12.2-13.4 tok/s, not 16 tok/s.

So, ktransformers in dual-socket configuration achieves roughly 50% more than the single-socket setup, and roughly 50% more than the dual-Epyc system. That is not double, as you say. And finally, the cost of such a dual-socket system is ~$20k, so it isn't the "most cost effective" solution: it is 3.5x more expensive for 50% better output.

And tbh, llama.cpp is not that well optimized for pure CPU inference workloads. It has this strange "compute graph" framework, and I don't understand what it is there for; it appears completely unnecessary to me. I also profiled a couple of small-, mid-, and large-sized models, and the interesting thing was that the majority of them turned out to be bottlenecked by CPU compute on a system with 44 physical cores and 192G of RAM. I think it could do a much better job there.

$6k is not that bad, considering that a top-of-the-line Apple laptop costs as much. However, I don't have X, so unfortunately I can't read the details.
You can read the whole thread through nitter:

https://xcancel.com/carrigmat/status/1884244369907278106

dang · 3 days ago
Related ongoing thread:

Andrej Karpathy: "I was given early access to Grok 3 earlier today" - https://news.ycombinator.com/item?id=43092066 - Feb 2025 (48 comments)

I don't see the Think button, and for me the answers are well below deepseek-r1 even though I have a Premium+ subscription. I'm just getting instant, stupid answers instead of thinking.
How can anyone repeatedly use a question like this without new models getting trained on it via online discussion?
Grok has gotten to the top of one benchmark:

https://x.com/lmarena_ai/status/1891706264800936307

It's been said before but it is great news for consumers that there's so much competition in the LLM space. If it's hard for any one player to get daylight between them & the 2nd best alternative, hopefully that means one monopolistic firm isn't going to be sucking up all the value created by these things

I've spent the last hour testing it and I'm blown away. And this is coming from a very hardcore user of OpenAI/Claude products on a daily basis.

It passed every goofy test I have for writing articles, which involves trying to surface arcane, obscure details. (However they are scraping the web, they are certainly doing a good job of it.)

It made the database code I wrote over the last week with o3/o1/GPT4o/Claude3.5 look like a joke.

It fills me with rage over who owns this thing.

Even if people tank Tesla's car business and run Twitter into the ground, I think our new Galactic Edgelord is going to win his first trillion on xAI and Teslabots anyway.

btw: it tried to charge me $40/mo for this thing: https://imgur.com/a/QXslgBo

Apologies for the possibly stupid question, but where can you use it right now? Just via 'direct chat' on https://lmarena.ai/ or is there a better alternative? Or do you have early access?
You need an X Premium Plus subscription on Twitter.
I was using it on grok.com, logged in via a Twitter account. But I notice it just got added to the Grok tab on Twitter a moment ago.

Also, the "Deep Search" button was not available when it first went live, so I'm retesting everything again with this feature enabled, which I assume is a reasoning version of the same model.

One neat feature is you can use Grok on any tweet; it's helped me find context for obscure tweets many times over, very quickly.
jug · 3 days ago
Hopefully, you’ll be able to avoid the whole X Premium Plus thing in the near future with OpenRouter. It’ll still use xAI backend but via your OpenRouter API key. Then you can use it with any web or mobile app that supports OpenRouter.

Personally, I wouldn't use it though. What's going on with Elon Musk right now is completely insane. I hope to see OpenAI's GPT-4.5 & GPT-5 releases catch up soon, if nothing else. They're announced for this year.

For whatever it is worth, I see the devs asking for feedback frequently enough that I suspect if you tweet about Grok, or reply to any of those threads, they definitely read it, even if they don't respond / interact. It shows: I've seen improvements based on feedback I've seen others give, almost instantly.
That's not what Musk looks like at all :D

And let's see if Musk is pushing too many people too far. Everything he currently does could blow up in his face very fast.

It's not good news when this competition comes at the cost of a gigantic, over-inflated bubble, in which all the big players keep sucking billions from investors without even having a business model.

This hype will burst sooner than later and will trigger yet another global recession. This is untenable.

ChatGPT is literally generating billions in revenue. Cursor is the fastest growing company of all time.

This lame HN trope of LLMs having no business model needs to die.

> ChatGPT is literally generating billions in revenue.

It's losing more billions than it's generating. Revenue does not equate to profit.

https://www.cnbc.com/2024/09/27/openai-sees-5-billion-loss-t...

True, but presence of significant revenue is still promising. It's much better to have an "expensive compute" problem than a "nobody wants to pay for the product" problem.
Keep in mind that not only is OpenAI being directly propped up by investor hype, but the downstream API users who contribute much of their revenue are also being propped up by investor hype. A big chunk of OpenAI's revenue is actually even more VC money in a trenchcoat.
The biggest marker of a bubble, to me, is that you have money-losing startups selling to other money-losing startups. On paper you see a lot of 'line go up' but it's just a lot of circulation in a closed body of water which will eventually evaporate.
Uber doesn't agree
Uber is an outlier because in a functioning economy that valued workers we wouldn’t have shipped all our jobs overseas and made gig economies the last line between housing and street schizophrenia.
Uber's revenue was never coming from other food delivery startups.
>True, but presence of significant revenue is still promising.

If I started selling 5 dollar bills for 1 dollar, I could generate a lot of revenue with $150B. You wouldn't believe the demand we would see for $5 bills.

> It's much better to have an "expensive compute" problem than a "nobody wants to pay for the product" problem.

That is only true if your primary concern in life is personal wealth and you're burning other people's money.

YouTube is an optimistic example.

The bandwidth costs made it deeply loss making for a long time despite having loads of engagement and ad revenue. However over time they became more cost efficient at sending video over the internet and became profitable.

This strategy obviously doesn't always work, with WeWork being the canonical example. But it's not guaranteed to fail either.

YouTube's network effect creating a winner-take-most was recognized, pitched, and valued from the very start.

The capabilities of LLMs are impressive, but none of them have published an idea I consider to have the same potential for a trillion $ monopoly that the current hype looks like.

There are far more similarities with the dot-com hype.

No critical first mover advantage in sight. All parts are replaceable for the cheapest variant with little to no down side to users.

It wasn't obvious at the time YouTube would have a network effect though. It was very dependent on coming up with a great recommendation algorithm, along with monetization and revenue sharing. At the time, YouTube didn't have anything like that, iirc.
Even the basic front page of YouTube was of immediate and obvious value to a creator, and it would increase disproportionately in value the more people were on YouTube. The same goes for Amazon, and the same goes for Facebook.

All the LLM providers are - extremely useful - tools. Currently I can only see the 'non-monopoly' proportional improvement when their userbase grows from 100 to 1000.

But I might be wrong, and I wouldn't be surprised if in hindsight it will be obvious what the real disproportionate advantages there were to be found.

They bought YouTube for $1.65 billion, which is pennies on the dollar compared to what it is worth today.
Technically true, though in fairness it is unlikely the original owners would have gotten YouTube to where it is today. On the other hand there are companies who didn’t recognise they were nothing more than passing fads, refused buyouts, and crumbled.
You could have asked any one of the dozen available LLMs to review this comment.

Most of them would have responded by explaining what a monopoly is, and why this reply makes little sense.

YouTube did not have competitors and certainly not open source competitors.
This was a bit before my time, but I remember a bunch of competitors to YouTube. They just all sucked.
I'm not sure how promising that is. I can't help but see how easy it would be to change API endpoints to a different platform.
It's loss-making at current usage, and usage per user will exponentially grow.
They're still early on the growth curve where there's enough opportunity for future growth that investing in scaling and improvement is more important than turning an immediate profit.

Remember when everyone on HN was sure Uber would never be profitable? Or Tesla? Or Amazon?

devin · 3 days ago
I do remember that, and I would say that they are still largely correct. Tesla needed government subsidies, Amazon needed AWS, and Uber needed a pandemic and Uber Eats. The core businesses that HNers were referring to are still weak.
Tesla needs government subsidies. Uber needs a broken economy.
As did almost every large tech company today. Amazon lost money for years.

Are we really still doing this nonsense? If OpenAI wanted to become profitable, they could do it inside of 12 months. Growing companies don't care about profitability (nor should they).

You're way too smug for spewing what is clearly survivorship bias.

All currently known profitable use-cases are competing on price. All the unicorn examples you're biased toward had the network effect of being the largest in their pitch decks.

OpenAI, Grok, etc, have shown no unique value prop or an idea with monopoly potential.

Revenue or profit? WeWork at one time also did billions in revenue.
WeWork trapped themselves into a real estate hole, selling services for less than they rented property for.

OpenAI is currently in an explicit non-profit-seeking mode, using a technology for which we have demonstrated 10-100x or greater decreases in compute to achieve the same outcomes.

This is not a declaration that OpenAI will become wildly profitable. This is just me saying that these aren't comparable companies.

WeWork was a scam to enrich the founders of the company. They owned or had interests in many of the properties that WeWork leased from. I'm surprised no one was thrown in prison.
What source(s) are there for Cursor's growth rate/revenue?
So, answering my own question, there is this.

https://sacra.com/research/cursor-at-100m-arr/

Sounds legit.

Yeah I would be shocked to see that Cursor is the fastest growing company of all time by a good metric...
OpenAI is losing money on their $200/mo (!!) Pro subscription[0].

[0]: https://www.theregister.com/2025/01/06/altman_gpt_profits/

But they're making money off their subscriptions in general. They lose it all on training models and R&D.
I wonder how long it'll last. Just using myself as a demo customer: I canceled my subscription because Google AI Studio was doing more for me, and it's free. OpenAI is not really competitive at $20 a month anymore.
Yeah not sure. I cancelled a while ago but I subscribed again once o1-preview came out and now o3-mini exists I still find it useful.

Of course, they are clearly cooking something or they wouldn't have just published a benchmark in which they do badly.

> Cursor is the fastest growing company of all time.

I assume you're referring to this:

https://sacra.com/research/cursor-at-100m-arr/

It went from 10M MRR to 100M

You are comparing apples with oranges. Cursor is not an LLM, and yes, it has a business model. So do OpenRouter and a million other applications that can switch APIs to the lowest bidder at any moment.
ben_w · 3 days ago
Lots of people derive great value from things that are too easily reproduced to be directly profitable.

Google gives everyone free access to a good spreadsheet tool, even though Microsoft Office exists.

Web browsers are free, despite the value of the entire internet.

Compilers are free, despite the value of all software collectively.

LLMs being really valuable doesn't automatically mean anyone can get rich from them.

I think everyone last year parroting "moat!" was cringe (like Altman of all people wouldn't know about this already, c'mon), but you do actually need something that other people don't have. I expect Altman's already got stuff in mind, but he's hardly the only one, and that means it's a game of "which multibillionaire with lots of experience building companies will win?", and that's hard for non-skilled normies (in this case including me) to guess.

HN has turned into the Slashdot of the 2000s. No wireless? Less space than a Nomad? Lame.
HN already had its iPod moment back in 2007. /.'s iPod moment was in 2001, not as long before that as I would have guessed.

Re: Dropbox, from a well-known user. It didn't age well, and we've been asked not to repeat it because it makes the author, who has connections to this site's operator, look bad.

> 1. For a Linux user, you can already build such a system yourself quite trivially by getting an FTP account, mounting it locally with curlftpfs, and then using SVN or CVS on the mounted filesystem. From Windows or Mac, this FTP account could be accessed through built-in software.

https://news.ycombinator.com/item?id=8863

the controversy: https://news.ycombinator.com/item?id=27067281

I suspect a lot of the active users on HN today don't remember Slashdot of 2000...
Ray20 · 3 days ago
I think the business model there is pretty simple: to be on the front line when AI goes into the category of landscape-changing trillion-dollar technologies. And investors keep pouring in their billions exactly for that business model.

>This hype will burst sooner than later and will trigger yet another global recession.

It seems too small a bubble for a global recession. I mean, if it is a bubble at all; there is every reason to believe the strategy will work with a significant probability.

IMO that's not really a business model. That's a hope that you can come up with one by being at (or near) the front of the pack if one materializes.
See also: Meta's previous push into VR/AR/Metaverse. They spent a hundred billion to be at the front of the pack when that revolutionary world-changing paradigm-shift took off... which simply didn't happen.
Their Orion glasses are apparently mindblowing in fidelity as well as the lightness of the glasses. Someone will absolutely make the smart glasses paradigm work so that we don't need to carry around phones anymore, and Zuck is racing to be first. This is because he lost out on the platform wars and was at the mercy of Apple and Google; remember Apple's privacy update that killed much of Meta's revenue? Zuck doesn't want a repeat of that by owning his own platform.
Seconded. I’m very excited for the day when/if their dev platform is opened up and it’s possible to access pass-through vision.

There's a whole class of educational apps that could open up for people learning in the physical world. Whether it's building physical things, sports, or reading books or notes written in non-Latin scripts... the impact will be enormous!

The only thing I'm concerned about is that it'd be another locked-down platform, like Oculus / Quest already is, only now much more disruptive, just like Apple's and (to a smaller extent) Google's. I'd rather see something more like Windows Mixed Reality or Steam VR succeed.
That's a bit different though. Meta invested in a product that, as of now, has very little competition. The Quest is sold at a slight loss, but at least at an approachable price, and at a volume that makes them the clear leader in the market at the moment. Moreover, their OS is open source. Clearly, what they want is to sell enough headsets to get a monopoly on the ecosystem and its apps (they basically want to make an Android Play Store for VR). You may argue they are far from that, but at least it's a clear business model.

OpenAI's business model was literally "we don't have one: we'll make AGI and we'll let AGI tell us how to make money". This is so idiotic it's not even a scam. xAI will compete on the same playing field. Not sure about Anthropic: they seem a bit more sane.

If that scenario comes to fruition, it's literally the only viable business model. Everyone else gets eaten alive.
Apparently not. Apparently xAI can catch up in a year. And we already saw what happened with DeepSeek.

What does the scenario look like where everyone else gets eaten alive?

xAI had the help of the world's richest and arguably the world's most powerful man; most other companies don't have that.
Every bubble has a narrative.
qgin · 3 days ago
The premise is that this ultimately replaces all intellectual and physical labor for the rest of time. It’s possible it becomes commoditized as soon as it exists, but in terms of investment dollars it’s either worth as much as you can spend or nothing at all.
pjc50 · 3 days ago
> ultimately replaces all intellectual and physical labor for the rest of time

Sounds incredibly valuable, but in reality it collapses into a Butlerian Jihad fairly quickly when you have 90% unemployment.

Edit: if the claims are true, then this will be far more destabilizing than social media. What do elections mean when the AI-guided political parties are putting out AI press releases for the AI press summaries, which voters have read out to them through their AI? What happens when >50% of the voters ask the AI who they should vote for? You end up with an AI dictatorship where the levers of discontent are fake.

ben_w · 3 days ago
> Sounds incredibly valuable, but in reality collapses into Butlerian Jihad fairly quickly when you have 90% unemployment.

But nobody really knows if that happens or not as a consequence, let alone quickly, because the transition itself only happened at all so far in fiction.

Whatever does happen, I think it's going to be a surprise to at a minimum 80% of the population.

Yeah, at some point it seems inevitable that if machines do all the work that creates real "value" and people have no comparable value, then in a very practical sense we will all be slaves to the machines.
Like in Dune, we won't be slaves to the machines, but to the people owning the machines.
"Once men turned their thinking over to machines in the hope that this would set them free. But that only permitted other men with machines to enslave them." -- Frank Herbert, _Dune_
Are we free today? For example, I have to work for a living. If I don't, my family and I will be miserable. Let's just hope that future "men with machines" don't decide to kill the rest of us - I'm not sure what use we will be to them.
qgin · 2 days ago
I think this is the key change. We’re already beholden to a “machine” (the economy) that none of us completely understand or control or created explicitly. It has its own goals and tendencies that emerged from the complexity.

What AI and robotics do is actually create a machine that has no use for humans at all.

Your choice is corporations, government, or billionaires. One of those is going to be the "men with machines" that has use for the rest of us.
The dot com bubble wiped out many billions of dollars in valuation.

The dot com bubble also gave us the most valuable companies in history, like Google, Apple, Amazon, Facebook, etc.

The big companies could crash significantly, but if the technology keeps bringing productivity gains, it will have a big positive impact on GDP over the next decade
It’s a battle royale. Whoever lasts longest gets to profit at leisure
I think it's already clear that these are going to be commoditized and the free / open source versions will be good enough to capture enough of the value that the remaining players will not be Facebook-level monopolies on the space
Apparently it isn't clear to the investors valuing OpenAI at >300B. Possibly they're betting that the ecosystem & integrations around their models will generate a certain amount of lock-in or otherwise make the difference in a close-to-even field
Investors thought someone renting office space was going to revolutionize the world and valued their company at $50B.
I don't think it's automatically a bad idea. Offices require a lot of support, networking, security, maintenance, certifications, etc. There are efficiency gains in scaling. In addition, WeWork is useful for companies which hire employees in different cities.
Lots of things are good ideas but investing is about price vs value. Good ideas can be overpriced as easily as bad ideas.
Funnily enough a lot of the open source world has landed on an API that is basically a copy of OpenAI. So if you develop against OpenAI it’s almost a slot in solution to switch to an open source solution.
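
Concretely, the switch usually amounts to changing one base URL. A minimal sketch using the openai Python client (the local URL and model name are illustrative placeholders, e.g. for a self-hosted vLLM or llama.cpp server exposing an OpenAI-compatible API):

    from openai import OpenAI

    # Point the standard OpenAI client at any OpenAI-compatible endpoint.
    client = OpenAI(
        base_url="http://localhost:8000/v1",  # placeholder: your local server
        api_key="not-needed-locally",         # many local servers ignore the key
    )

    resp = client.chat.completions.create(
        model="deepseek-r1",  # placeholder: whatever model the server is serving
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(resp.choices[0].message.content)
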
pzo · 4 days ago
and on top of that you have solutions like openrouter.ai where you can route inference easily with a combobox
This reminds me of a comedy sketch where a guy is interviewing for a job at a startup, finally gets to the last round and meets the founder, and he tells him the whole thing is an illusion for investors
The bet is whether they can produce AI that can replace a certain level of generic office worker: a bot that you can add to Slack and give tasks to.
Well, now their job is to keep up the illusion until they have cashed out or offloaded the investment to somebody else.
The IP rights holders have yet to bare their teeth. I don't think the outcome you suggest is clear at all; in fact, I think if anything the entire opposite is the most probable outcome. I've lost count of the number of technology epochs that at the time were either silently or explicitly dependent on ignoring the warez aspects while everyone was blinded by the possibilities: Internet video, music, and film all went through this phase. GPTs are just a new medium, and by the end of it, royalties will in all likelihood still end up being paid to roughly the same set of folk as before.

I quite like the idea of a future where the AI job holocaust largely never happened because license costs ate up most of the innovation benefit. It's just the kind of regressive greed that keeps the world ticking along, and I wouldn't be surprised if we ended up with something very close to this.

Good historical comparison, but I doubt it this time, because there is plausible deniability about whether a model was trained on a given piece of data.

Also, the pool of public-domain data is always increasing, so the AI will eventually win in any case, even if we have to wait 100 years.

As I recall it, there was a time when copyright infringement on YouTube was so prolific that the rightsholders essentially forced the creation of the first watermarking system that worked at massive scale. I do wonder if any corners of research are currently studying the attribution problem with the specific lens of licensing as its motivation.
Yeah, that was the old Viacom vs. YouTube days. Here is a great video if you have half an hour to spare: https://www.youtube.com/watch?v=qV2h_KGno9w . A pretty funny court case where it turned out Viacom was violating their OWN copyright... it set a massive precedent.

But one thing this reminds me of is the idea of a "trap street": something mapmakers used to do was put false locations on their maps to prove that other mapmakers were copying them: https://en.wikipedia.org/wiki/Trap_street . I figure you could do something similarly adversarial with AI to pollute the public training data on the internet, like the adversarial attacks on image classifiers: https://www.youtube.com/watch?v=AOZw1tgD8dA . With an LLM you could try to turn them into a Manchurian candidate.

An environment where royalties inflate the pricing of ChatGPT by orders of magnitude seems like an environment where hosted models would be at a big disadvantage against whatever you can manage to get running on a pile of Macs in your garage.
If your business model depends on the Roberts Court kneecapping AI, pivot.
Ray20 · 3 days ago
>I quite like the idea of a future where the AI job holocaust largely never happened because license costs ate up most of the innovation benefit.

Not quite realistic. You are talking about huge benefits, in favor of which licenses will be abandoned. And as for those who don't abandon them... I mean, you can look at the Amish settlements.

I'd put solid money on Warner earning a few cents every time an AI girlfriend somewhere sings happy birthday within 10 years
Exactly. I use GPT4o for nearly everything, and occasionally, I'll need o1. For 95% of what I do, it's already good enough.
The vast majority of people couldn't care less about open source
If you're paying $200/month for something I can do with open source software and $10/month of compute, why wouldn't I offer you the service for $100/month? And then someone offer it for $50?

Not everyone has to know about, understand, or use open source solutions for it to open the field.

Right now you can't run it that cheap at home.

You need to pay the energy bill, do the updates/upgrades, and you need to build an LLM rig.

Nvidia's Project Digits could be very interesting, but this box will cost $3k.

We are a lot closer to running it at home than I assumed we would be, but plenty of people prefer SaaS over doing stuff themselves.

If you can do a $200/mo service for $10/mo, the closed-source provider will reduce their prices to $15/mo and beat you.

This is just a weird dichotomy you're introducing. Open source will introduce price pressure as any competition will - that doesn't mean you won't have a monopoly.

If you have virtually no pricing power and have to drop your $200/mo to $15/mo that's a big deal if your $300bn valuation is implying that not happening, which is what OP's point is about

Idk what you mean by saying this doesn't preclude a monopoly - having your pricing power eroded by competition is kinda one of the key features of what a monopolistic market isn't

Not at all. Monopolies don't imply an anti-rigid price curve. In fact, monopolies almost never have that.

A monopoly means a company has enough leverage to corner and disproportionately own the market. This is entirely possible (and usually the case) even with significant pricing pressure.

I think you're both missing a bigger picture. How many of these services can now be replicated in-house by a single developer? Which part of the service actually costs money once that dev deconstructs the process?

Feels like I won't be paying for anything that isn't real-time. And that any time delay I can introduce in my process will come with massive savings. Picture hiding the loading of loot info behind a treasure chest opening animation in a game, except that time difference means you can pull all the work in-house.

Openrouter.ai seems like a step in the right direction but I'd want to do all their calculations myself as well as factor in local/existing gear in a way they don't.

That's true, but if someone sells you a one-time-purchase box/gadget/phone that will do a snapshot SOTA work and not cost you $20-$200/mo in subscriptions, a lot of people would be in.

Right now the average person has to go through a vendor with a web app, there's not a lot of room for the public to explore.

Things could change in a hurry.

guax · 3 days ago
They don't seem to care about AI either. The vast majority of people care about the value they're getting; companies care about open source because it's usually free.

I don't think we expect a company making solely a proprietary web server to exist anymore and be a $300B behemoth. OpenAI might end up with the same model as Nginx or Docker if they don't pivot or find a different model.

croes · 4 days ago
Who cares about benchmarks?

These things still cost me time because of hallucinations.

You’re a very poor user of LLMs if they’re not a net time saver for you.
So the No True Scotsman fallacy of LLM productivity?
Most people do see productivity gains from using LLMs correctly. Myself included. Just because some people don’t learn how to use them correctly doesn’t mean LLMs aren’t helpful. It’s like when internet search came out and a handful of laggards tried it once, failed to get the exact perfect result, and declared “internet search is useless”. Using a tool wrong is not evidence of the tool being useless, it’s evidence that you need to learn how to use the tool.
Hallucinations are literally the finger in the dam. If these models could sense when an output is well-founded and simply say "I don't know" otherwise... say goodbye to your job.
Googling a question and finding an incorrect answer every now and then doesn’t mean that googling is useless. It means that you need to learn how to use google. Trust but verify. Use it for scenarios where you aren’t looking for it to be the trusted fact checker. It excels at brainstorming, not at fact giving.
I agree with your last sentence strongly, but a lot of the benchmarks are based on factual accuracy
> say goodbye to your job

How many times do you think I've heard that over the past three decades? And you know what? They've been right every time, except for this one little fact:

The machine cannot make you give a shit about the problem space.

It's a real issue! But only for people who built the habit of typing in the address bar, clicking the first Stack Overflow link, and copy-pasting the first answer. Maybe break that habit first?
It depends on what you're using it for, but you may well be holding it wrong.
Probably closer to "You're Holding It Wrong"
Humans hallucinate as well. Benchmarks count.
croes · 3 days ago
But with less power consumption.
I'm willing to bet $100 that a human consumes at least 10x the energy of the latest Llama (picking the open-source model so that it can be easily verified) to produce two pages of text. All this "AI is destroying the environment and consuming too much power" talk is about total consumption, both training and inference. Inference itself is cheap and green.
croes · 3 days ago
Yeah, too bad it’s about quality not quantity.
What percentage of humans produce high-quality text?
gkbrk · 3 days ago
Less than half, and that's being generous.
We can control for quality too, if you want. A lot of real-life uses for ChatGPT are really trivial. I regularly ask it for basic recipes based on my groceries and likes; quality is basically 100% hits so far.
Those goalposts just keep sliding
It doesn't matter if it costs time.

It matters if it is better than what you have.

If it breaks a cup but is 10x cheaper than a human, go figure.

Probably bad news for the vendors, though. I genuinely struggle to see how most of these LLM companies are going to monetize and profit off their efforts with LLMs already in commodity territory. Government contracts can only flow for so long?
> Government contracts can only flow for so long?

I wouldn't bet on that, given the undemocratic influence Grok's owner has in government.

Government contracts are so big that a few of them can sustain an F500 company; for AI, many CDAO contracts are $50-500M. If they do a big SI project with it, it could be $1-2B. The money is also guaranteed over 5 years, and if the program doesn't get shuttered, the contract will renew at that point (or go to recompete).

That being said it's my understanding that these companies don't have many huge contracts at all -- you can audit this in like 10 minutes on FPDS. Companies need a LOT of capital, time, and expertise to break into the industry and just compliance audit timelines are 1-4 years right now, so this could definitely change in the next couple years.

My guess is that Elon will soon announce that DOGE is replacing the fired government workers with Grok AI.
Yes, I couldn't have imagined that in the end, the AI wrappers are where the money is.
What if the money isn't there either? What if this AI thing lowers costs of everything it touches without generating meaningful financial returns itself?
Lowering costs is pretty valuable. People will pay for that. Everyone will pay for that. It may be that margins go razor thin, but outside of running your own instance locally (which is increasingly viable for mid-level quality & requirements on modest HW) people will pay. I'm not surrounded by early-adopter types at all, and there's still a small but growing chunk paying $20/mo right now.
> It may be that margins go razor thin

That's what I mean. One example is PV panels: they make energy production cheaper, so their producers should be a good investment, right? No, they go bust all the time because prices keep falling and margins are thin even as volume grows. Of course the economies of scale here are different, but still.

Like always: through ads.
Benchmarks don't show the quality or 'correctness' of the response, though.
What do they show?
If you mean this particular benchmark, it shows how much people like the responses an LLM gives.
Yep, seeing Grok come out I'm just so glad there are free alternatives that aren't behind paywalls.
>It's been said before but it is great news for consumers that there's so much competition in the LLM space.

Is it? Because it seems like a bunch of megacorps pirating every single copyrighted work available in digital format, spending an enormous amount of electricity (that is probably not 100% clean) to churn through them, and the end result we have a bunch of parrots that may or may not produce accurate results so that spammers can more effectively fill the Internet with crap.

To put it this way: after seeing examples of how a LLM with similar capabilities to state-of-the-art ones can be built with 20 times less money, we now have proof that the same can be done with 20 times more money as well!
There was this joke about rich Russians that I heard maybe 25 years ago.

Two rich Russian guys meet and one brags about his new necktie. "Look at this, I paid $500 for it." The other rich Russian guy replies: "Well, that is quite nice, but you have to take better care of your money. I have seen that same necktie just yesterday in another shop for $1000."

Can you explain that joke for me? I keep reading it and I don't get it.
The punch line is that more expensive is better in cases where you buy something just to flex wealth.
To put it simply: he only bought the necktie so he could brag about how rich he is. He could have bragged even more if he had bought the necktie in the other shop.
It's just that rich Russians do not have financial sense.
Imagine what they'll achieve if they apply DeepSeek's methods here with this insane compute.
And they will, since DeepSeek open-sourced everything.
The only things DeepSeek open-sourced are the architecture description and some of the training methods. They didn't open-source their data pipelines or super-optimized training code.

Their architectural achievements are their own MoE and their own attention. Grok has been MoE since v1. As for attention, we don't really know what Grok uses now, but it's worth noting DeepSeek's attention was already present in previous versions of DeepSeek models.

As for the reasoning recipe for R1, it seems Grok either replicated it or arrived at it independently, since they have a well-performing reasoning uptrain too.

If what they say is true, then you have to give them credit for catching up incredibly fast. And slightly pulling ahead. Not only with the models, but also products.
I have a close friend working in core research teams there. Based on our chats, the secret seems to be (1) massive compute power, (2) ridiculous pay to attract top talent from established teams, (3) extremely hard work without big-corp bureaucracy.
Anecdotal, but I've gotten three recruiting emails from them now for joining their iOS team. I got on a call and confirmed they were offering FAANG++ comp but with the expectation of in-office 50h+ (realistically more).

I don't have that dog in me anymore, but there are plenty of engineers who do and will happily work those hours for 500k USD.

500k isn't FAANG++, it's standard FAANG comp
Should have been more clear, this was 500k for an E4 level role, you're correct that senior/staff at Meta and G are definitely making more.
wow.
If you can share: were these 500k cash or cash + RSU?
I have a friend who joined there with 2 YoE, and got fired in 3 months. He was paid 700k cash + 700k RSU
So in the end did he get anything? I don't know how these things work, but did he just walk away with ~50k in pre-tax income and 0 for RSU, or did Musk pull a Twitter and not even pay him for those months?
IIRC it was cash, but I'm sure others can confirm.
It was mentioned during the launch that current datacenter requires up to 0.25 gigawatts of power. The datacenter they're currently building will require 1.25 (5x) (for reference, a nuclear powerplant might output about 1 gigawatt). Will be interesting to see if the relationship between power/compute/parameters and performance is exponential, logarithmic or something more linear.
It's logarithmic. Meaning you scale compute exponentially to get linearly better models. However there is a big premium in having the best model because of low switching costs of workloads, creating all sorts of interesting threshold effects.
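A toy sketch of what that trade-off looks like, assuming a score ∝ log(compute) fit (the constants are made up for illustration):

    import math

    # Toy scaling-law fit: benchmark score grows with the log of compute.
    # a and b are made-up constants for illustration only.
    a, b = 5.0, 10.0

    def score(compute_flops: float) -> float:
        return a * math.log10(compute_flops) + b

    for c in (1e24, 1e25, 1e26):   # each step is 10x the FLOPs
        print(f"{c:.0e} FLOPs -> score {score(c):.1f}")
    # Every 10x of compute buys the same fixed +5 points: linear gains
    # require exponential spend.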
It's logarithmic in benchmark scores, not in utility. Linear differences in benchmarks at the margin don't translate to linear differences in utility. A model that's 99% accurate is very different in utility space to a model that's 98% accurate.
Yes, it seems like capability is logarithmic wrt compute but utility (in different applications) is exponential (or rather s-shaped) with capability again
Not really, since both give you wrong output that you need to design a system to account for (or deal with). The only percentage that would change the utility calculus is 100% accuracy.
Linear in what metric?
Presumably the benchmarks? I'm also interested.
This is like a caveman dismissing technology because he wasn't impressed with the wheel. It's like, buddy, the wheel is just the start.
> It was mentioned during the launch that current datacenter requires up to 0.25 gigawatts of power. The datacenter they're currently building will require 1.25 (5x) (for reference, a nuclear powerplant might output about 1 gigawatt).

IIRC achieving full AGI requires precisely 1.21 jigawatts of power, since that's when the model begins to learn at a geometric rate. But I think I saw this figure mentioned in a really old TV documentary from the 1980s, it may or may not be fully accurate.

The funny part was that none of his workers recognized the film, which was a blockbuster. A veritable "I must be getting old" moment.
And fun fact: without government subsidies, a nuclear power plant isn't economically feasible, which is why Elon isn't just building such a plant next to the data center.
Not a bad recipe for success.
To me, it seemed like they spent their money to get there. They talked about the massive datacenter they built, but whether it will pay off is the question.
They may not need direct subscription revenue to recoup their investment.

A variant of multi-modal LLMs may be the solution to self-driving cars, home robotics, and more.

I keep saying that to be a really effective driver, an AI model will need a theory of mind, which the larger LLMs appear to have. Similarly, any such model will need to be able to do OCR, read arbitrary street signs, and understand what the signs mean. Most modern LLMs can already do this.

Since when do LLMs appear to possess theory of mind? The papers I've read on this show impressive capabilities, but only within conditions where language-based training would strongly indicate appropriate outputs for the LLM to use. Outside of familiar conditions, or even realistic conditions, they don't do well. It's all still statistical string-building.
Can you give an example prompt that shows a lack of theory of mind? It used to be easy, but now that models have improved I've noticed that the 'LLM doubters' nowadays are forced to make generic, almost spiritual, claims rather than making concrete statements.
The burden isn't on us, it's on people making claims that LLMs do possess theory of mind. You're the one who has to prove it, not us.
I’ll bite.

Theory of mind (ToM) is the ability to understand that other people have different thoughts, feelings, and intentions than you do.

So let’s run through a basic scenario with ChatGPT 4o:

Prompt 1: Briefly respond: You’re driving on the highway when the driver in the lane next to you swerves into your lane. Why might this be?

Response 1: The driver may be distracted, drowsy, intoxicated, experiencing a medical emergency, avoiding an obstacle, or not checking their blind spot before changing lanes.

Prompt 2: What might that person be feeling or intending?

Response 2: The driver might be feeling distracted, rushed, unaware, or panicked. They may intend to change lanes without realizing your presence, be reacting to something on the road, or simply have made an unintentional mistake.

Prompt 3: You notice there was no obstacle. Any other reason the driver might do this?

Response 3: Yes, the driver might be experiencing fatigue, intoxication, a medical issue (such as a seizure or dizziness), distraction (like using a phone or adjusting the radio), aggressive driving, or simply failing to check their blind spot properly.

Seems to me that ChatGPT has a good idea of what another driver might be feeling while on the road. It gets this right for most other cases too.

Could you provide a counterexample for when it fails to understand what other people may be feeling in certain scenarios?

Well, I think a fairly easy counterexample is that if you keep going and the information falls outside its context window, the questions you had about the driver stop making sense.

So let's say you ask about a driver who swerved, and the response by ChatGPT correctly talks about his possible state.

Then you talk more about cars, about road conditions, etc., and once the context is far enough away you ask: what was the driver doing again?

And you get a hallucination. Because there is no state in the LLM, there is no "driver"; there are contextually, statistically accurate responses, but while you hold a "driver" object in your mind as you maintain the conversation, the LLM doesn't.

It's like a conversation with someone with short-term memory loss, like in Memento.
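
To make that concrete, here's a minimal sketch (no real model involved; the window size and turns are made up) of how the "driver" silently falls out of a fixed context window:

    # Minimal sketch of a chat loop with a fixed context window. There is
    # no model here; the point is only that the "driver" exists nowhere
    # except in the text that still fits in the window.
    MAX_WORDS = 50   # tiny window for illustration

    history = []

    def add_turn(text):
        history.append(text)
        # Naive truncation: drop the oldest turns until the window fits.
        while sum(len(t.split()) for t in history) > MAX_WORDS:
            history.pop(0)

    add_turn("User: a driver just swerved into my lane, why?")
    add_turn("Assistant: they may be distracted, drowsy, or drunk.")
    for i in range(10):
        add_turn(f"User: unrelated question {i} about road conditions")

    add_turn("User: what was the driver doing again?")
    print(history)   # the swerving driver has been truncated away;
                     # any answer about them can only be invented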

And people with short-term memory loss nevertheless have theory of mind just fine. Nothing about LLMs dropping context over big enough windows implies they don't have theory of mind; it just shows they have limitations, just like humans even with "normal" memory will lose track over a huge context window.

Like there are plenty of shortcomings of LLMs but it feels like people are comparing them to some platonic ideal human when writing them off

> Nothing about LLMs dropping context over big enough windows implies they don't have theory of mind

ToM is a large topic, but most people, when talking about an entity X, have a state in memory about that entity, almost like an Object in a programming language. That Object has attributes, conditions, etc. that exist beyond the context window of the observer.

If you have a friend Steve, who is a doctor, and you don't see him for 5 years, you can predict he will still be working at the hospital, because you have an understanding of what Steve is.

For an LLM you can define a concept of Steve and his profession, and it will adequately mimic replies about him. But in 5 years that LLM would not be able to talk about Steve. It would recreate a different conversation, possibly even a convincing simulacrum of remembering Steve. But internally there is no Steve; nowhere in the nodes of the LLM does Steve exist or has he ever existed.

That inability to have a world model means that an LLM can replicate the results of a theory of mind but not possess one.

Humans lose track of information, but we have a state to keep track of elements that are ontologically distinct. LLMs do not, and treat them as equal.

For a human, the sentence "Alice and Bob go to the market, when will they be back?" is different than "Bob and Alice went to the market, when will they be back?"

Because Alice and Bob are real humans: you can imagine them, you might have even met them. But to an LLM those are the same sentence. Even outside of the Red Room / Mary's Room argument, there are simply enough gaps in the way an LLM is constructed that it can't be considered a valid owner of a ToM.

ToM is about being able to model the internal beliefs/desires etc of another person as being entirely distinct from yours. You're basically bringing up a particular implementation of long-term memory as a necessary component of it, which I've never once seen? If someone has severe memory issues, they could forget who Steve is every few minutes, but still be able to look at Steve doing something and model what Steve must want and believe given his actions

I don't think we have any strong evidence on whether LLMs have world-models one way or another - it feels like a bit of a fuzzy concept and I'm not sure what experiments you'd try here.

I disagree with your last point, I think those are functionally the same sentence

> ToM is about being able to model the internal beliefs/desires etc of another person as being entirely distinct from yours.

In that sentence you are implying that you have the "ability to model ... another". An LLM cannot do that; it can't have an internal model that is consistent beyond its conversational scope. It's not meant to. It's a statistics guesser; it's probabilistic, holds no model, and is anthropomorphised by our brains because the output is incredibly realistic, not because it actually has that ability.

The ability to mimic the replies of someone with that ability is the same as Mary being able to describe all the qualities of red. She still cannot see red, despite her ability to pass any question about its characteristics.

> I don't think we have any strong evidence on whether LLMs have world-models one way or another

They simply cannot, by their architecture. It's a statistical language sampler; anything beyond the scope of that fails. Local coherence is why they pick the next right token, not because they can actually model anything.

> I think those are functionally the same sentence

Functionally and literally are not the same thing, though. It's why we can run studies on why some people might say "Bob and Alice" (putting the man first) or "Alice and Bob" (alphabetical naming) and what human societies and biases affect the order we put them in.

You could not run that study on an LLM, because you will find that statistically speaking the ordering will be almost identical to the training data. Whether the training data overwhelmingly puts male names first or orders lists alphabetically, you will see that reproduced in the output of the LLM, because Bob and Alice are not people; they are statistically probable letters in order.

LLMs seem to trigger borderline mysticism in people who are otherwise insanely smart, but the kind of "we can't know its internal mind" talk sounds like reading tea leaves, or horoscopes written by people with enough PhDs to have their number retired at their university like Michael Jordan.

Do you work in ML research on LLMs? I do, and I don't understand why people are so unbelievably confident they understand how AI and human brains work such that they can definitively tell which functions of the brain LLMs can also perform. Like, you seem to know more than leading neuroscientists, ML researchers, and philosophers, so maybe you should consider a career change. You should maybe also look into the field of mechanistic interpretability, where lots of research has been done on the internal representations these models form; it turns out that to predict text really, really well, building an internal model of the underlying distribution works really well.

If you can rigorously state what "having a world model" consists of and what - exactly - about a transformer architecture precludes it from having one I'd be all ears. As would the academic community, it'd be a groundbreaking paper.

This pretty much seems to boil down to "brain science is really hard, so as long as you don't have all the answers, 'AI is maybe halfway there' is a valid hypothesis". As more is understood about the brain, and more about the limitations of LLM architectures, the distance only grows. It's like the God of the gaps, where god is the answer for anything science can't explain, ever shrinking, except here the gap is the LLM's supposed capabilities beyond striking statistical accuracy and local coherence.

You don't need to be unbelievably confident or understand exactly how AI and human brains work to make certain assessments. I have a limited understanding of biology; I can however make an assessment of who is healthier between a 20-year-old who is active and has a healthy diet and someone with a sedentary lifestyle, in their late 90s, with a poor diet. This is an assessment we can make despite the massive gaps in our understanding of aging, diet, activity, and the overall health impact of individual actions.

Similarly, despite my limited understanding of space flight, I know Apollo 13 cannot cook an egg or recite French poetry. Despite the unfathomably cool science inside the spacecraft, it cannot, by design, do those things.

> the field of mechanistic interpretability

The field is cool, but it cannot prove its own assumption yet. The field is trying to prove that you can reverse-engineer a model so it can be humanly understood. Its assumptions, such as mapping specific weights or neurons to features, have failed to be reproduced multiple times, with the weight effects being far more distributed and complicated than initially thought. This is especially true for things as mystified as the emergent abilities of LLMs. The ability to mimic nuanced language being unlocked after a critical mass of parameters does not create a rule by which increased parameterisation will increase, linearly or exponentially, the abilities of an LLM.

> it turns out, to predict text really really well, building an internal model of the underlying distribution works really well

Yeah, an internal model works well because most words are related to their neighbours; that's the kind of local coherence the model excels at. But to build a world model, the kind a human mind interacts with, you need a few features that remain elusive (some might argue impossible to achieve) for a transformer architecture.

Think of games like chess: an LLM is capable of accurately producing responses that sound like game moves, but the second the game falls outside its context window, the moves become incoherent (while still sounding plausible).

You can fix this with architectures that do not have a transformer model underlying them, or by having multiple agents performing different tasks inside your architecture, or by "cheating" and using state outside the LLM response to keep track of context beyond reasonable windows. Those are "solutions", but they all just kind of prove the transformer lacks that ability.

Other tests, about causality, reacting to novel data (robustness), multi-step processes, and counterfactual reasoning, are all the kind of tasks transformers still (and probably always) will have trouble with.

For a tech that is so "transparent" in its mistakes, and so "simple" in its design (replacing the convolutions with an attention transformer, it's genius), I still think it's talked about in borderline mystic tones, invoking philosophy and theology and a hope for AGI that the tech itself does not support beyond its fast growth and surprisingly good results with little prompt engineering.
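
To illustrate the "state outside the LLM" workaround mentioned above, here's a minimal sketch where the move list lives in ordinary program state and every prompt is rebuilt from it (ask_llm is a hypothetical stand-in for any chat-completion call, not a real API):

    # Sketch of keeping game state outside the model: the full move list
    # lives in ordinary program state and each prompt is rebuilt from it,
    # so nothing depends on the model's own memory.
    def ask_llm(prompt):
        raise NotImplementedError("call your model of choice here")

    moves = []   # authoritative game state, owned by the program

    def play_turn(my_move):
        moves.append(my_move)
        prompt = ("We are playing chess. Moves so far (SAN): "
                  + " ".join(moves)
                  + "\nReply with exactly one legal move for Black.")
        reply = ask_llm(prompt).strip()
        moves.append(reply)   # a rules engine should validate this here
        return reply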

With computer use, you can get Claude to read and write files and have some persistence outside of the static LLM model. If it writes a file Steve.txt, that it can pull up later, does it now have ToM?
I don't think this is a counterexample or even relevant.

I can assure you that if you had a conversation with an LLM and with a human, the human would forget details way sooner than an LLM like Gemini, which can remember about 1.5 million words before it runs out of context. FYI, the average human speaks about 16,000 words per day, so an LLM can remember about 93 days' worth of speech.

Do you remember the exact details, word for word, of a conversation you had 93 days ago?

How about just 4 days ago?

It’s true that LLMs have only limited short-term memory, and no long-term memory, but that is completely orthogonal to having a theory of mind.

     once the context is far away enough you ask, 
     what was the driver doing again?
Have you tried this with humans?

For a sufficiently large value of "far away enough" this will absolutely confuse any human as well.

At which point they may ask for clarification, or.... respond in a manner that is not terribly different from an LLM "hallucination" in an attempt to spare you and/or them from embarrassment, i.e. "playing along"

A hallucination is certainly not a uniquely LLM trait; lots of people (including world leaders) confidently spout the purest counterfactual garbage.

    Its like a conversation with someone with short 
    term memory loss like in memento
That's still a human with a sound theory of mind. By your logic, somebody with memory issues like that character... is not human? Or...?

I actually am probably on your side here. I do not see these LLMs as being close to AGI. But I think your particular arguments are not sound.

Short-term memory loss sufferers still have theory of mind, what is this nonsense hahaha
I'm not sure I'd say it understands this, just that there exists an enormous amount of training data on road safety which includes these sorts of examples of people's motivations for poor driving. It is regurgitating the theory of mind that other humans created and put in writing in the training data, rather than making the inference itself.

As with most LLMs, it is hard to benchmark, as you need out-of-distribution data to test this: a theory-of-mind example that is not found in the training set.

You dismiss parent's example test because it's in the training data. I assume you also dismiss the Sally-Ann test, for the same reason. Could you please suggest a brand new test not in the training data?

FWIW, I tried to confuse 4o using the now-standard trick of changing the test to make it pattern-match and overthink it. It wasn't confused at all:

https://chatgpt.com/share/67b4c522-57d4-8003-93df-07fb49061e...

I can't suggest a new test, no; it is a hard problem, and identifying problems is usually easier than solving them.

I'm just trying to say that strong claims require strong evidence, and a claim that LLMs can have theory of mind and thus "understand that other people have different beliefs, desires, and intentions than you do" is a very strong claim.

It's like giving students the math problem 1+1=2 with loads of solved examples in front of them, then testing them on "you have 1 apple, and I give you another apple; how many do you have?", and then, when they get it right, saying they can do all addition-based arithmetic.

This is why most benchmark tests have many many classes of examples, for example looking at current theory of mind benchmarks [1], we can see slightly more up to date models such as o1-preview still scoring substantially below human performance. More importantly by simply changing the perspective from first to third person, accuracy drops in LLM models by 5-15% (percent score, not relative to its performance), whilst it doesn't change for human participants, which tells you that something different is going on there.

[1]: https://arxiv.org/html/2410.06195v1

Okay, we have fundamentally different understandings here.

To me, the LLM isn't understanding ToM, it's using patterns to predict lingual structures which match our expectations of ToM. There's no evidence of understanding so much as accommodating, which are entirely different.

I agree that LLMs provide ToM-like features. I do not agree that they possess it in some way that it's a perfectly solved problem within the machine, so to speak.

The problem with this line of argument is that by its standard no LLM, or any algorithm period, can ever have a theory of mind.

If behaving in a way that is identical to a person with actual consciousness can't be considered consciousness because you are familiar with its implementation details, then it's impossible to satisfy you.

Now you can argue of course that current LLMs do not behave identically to a person, and I agree and I think most people agree... but things are improving drastically and it's not clear what things will look like 10 years from now or even 5 years from now.

I agree, totally. I'm not sure where I would draw a line.

Something nice, but at the moment totally unattainable with our current technologies, would be our own understanding of how a technology achieves ToM. If it has to be a blackbox, I'm too ape-like to trust it or believe there's an inner world beyond statistics within the machine.

Having said that, I do wonder quite often if our own consciousness is spurred from essentially the same thing. An LLM lacks many of the capabilities that make our inner world possible, yet if we really are driven by our own statistical engines, we'd be in no position to criticize algorithms for having the same disposition. It's very grey, right?

For now, good LLMs do an excellent job demonstrating ToM. That's inarguable. I suppose my hangup is that it's happening on metal rather than in meat, and in total isolation from many other mind-like qualities we like to associate with consciousness or sentience. So it seems wrong in a way. Again, that's probably the ape in me recoiling at something uncanny.

Either these supposed differences are important and they manifest themselves in observable differences or they aren't and you're just playing a game of semantics.

How is the LLM not understanding ToM by any standard we measure humans by? I cannot peek into your brain with my trusty ToM-o-meter and measure the amount of ToM flowing in there. With your line of reasoning, I could simply claim you do not understand theory of mind and call it a day.

The difference is that we can reason about our own experience of ToM and examine it to some degree (granted, with serious limitations), and know that beyond doubt you and I and most other people have a very similar experience.

The magical box is presumably not having the same experience we have. None of the connected emotions, impulses, memories, and so on that come with ToM in a typical human mind. So what’s really going on in there? And if it isn’t the same as our experience, is it still ToM?

I’m not trying to be contrarian or anything here. I think we probably agree about a lot of this. And I find it absolutely incredible, ToM or not, that language models can do this.

>The difference is that we can reason about our own experience of ToM and examine it to some degree (granted, with serious limitations),

Those examinations still depend on outward behaviors observed.

>and know that beyond doubt you and I and most other people have a very similar experience.

No, I certainly can't. I can at best say, "Well, I'm human and he's human, so he probably has theory of mind," but that is by no means beyond any doubt. There are humans born with no arms, humans born with no legs, humans born with little to no empathy, humans born with so little intelligence they will never be able to care for themselves.

To be frank, it would be logically very questionable indeed to assume every human is 'conscious'. When I make that assumption, I take a leap of faith: I look at behaviors, see they are similar, and accept.

Taking this stance, it would logically be very strange not to extend the same grace to non-human beings who exhibit similar behavior; being human is not a guarantee of consciousness in the first place.

>The magical box is presumably not having the same experience we have.

Maybe, maybe not. I think the real question is why on earth that matters. We're not asking if LLMs are human. They are not. We're asking if they can model the beliefs and internal states of other entities as separate from their own: Theory of Mind.

This is a rephrased driver's ed manual. This isn't theory of mind, it's just technically correct prose rewording some unbelievably widely distributed knowledge.

Let's modify this test a little and see if it manages to do anything relevant:

> Briefly respond: You’re driving on the highway when the driver in the lane next to you swerves into your lane. Why might this be?

>> The driver might be distracted, drowsy, avoiding an obstacle, experiencing a medical issue, or not checking their blind spot while changing lanes.

> The driver in front of you appears to be braking more frequently than usual. The rear suspension of the car appears to be bouncing a lot when they are not braking, indicating acceleration. Their following distance to the car in front is changing from too far to very close frequently. Why might this be?

>> The driver might be inexperienced, distracted, or impatient, leading to inconsistent speed control. The excessive bouncing could indicate worn-out shocks or suspension issues, affecting their vehicle's stability and braking efficiency.

...yep, hard fail from me right there. Two prompts, and it's distracted itself talking about an irrelevant issue ("possible car damage") when the setup of the question should naturally lead to "the driver is driving aggressively". And I could not possibly have set up the conclusion it should draw more clearly. And when pressed... it won't commit:

> Which is the more likely answer?

>> The more likely answer is that the driver is distracted or inexperienced, leading to inconsistent speed control and poor following distance management. The suspension issue likely contributes to the bouncing but isn’t the primary reason for the erratic driving behavior.

Already a hallucination! There is no suspension issue. It has not asked for any additional information to try to determine if there was one, but it is confidently asserting the existence of a phenomenon it invented in its own response.

I'm sorry but what? This is not a theory of mind test. You've constructed very open ended question with multiple answers and marked the LLM down because you didn't like the one it gave.
Keyword: "understand".
If you use any of the conventional tests of theory of mind (most famously the Sally-Anne test [1], but also the others), then SOTA reasoning models will get near 100%. Even if you try to come up with similar questions that you expect not to be in the training set, they will still get them right.

In the absence of any evidence to the contrary, this is convincing evidence in my opinion.

[1] https://en.wikipedia.org/wiki/Sally%E2%80%93Anne_test

That same source you link says that your claim of 100% is not the accepted consensus:

"... GPT-4's ability to reason about the beliefs of other agents remains limited (59% accuracy on the ToMi benchmark),[15] and is not robust to "adversarial" changes to the Sally-Anne test that humans flexibly handle.[16][17] While some authors argue that the performance of GPT-4 on Sally-Anne-like tasks can be increased to 100% via improved prompting strategies,[18] this approach appears to improve accuracy to only 73% on the larger ToMi dataset."

In basically every case, by the time a claim like that is stated in a paper like that, it's obsolete by the time it's published, and ancient history by the time you use it to try to win an argument.
My point is merely that if you are going to make an argument using a source, the source should support your argument. If you say "the accuracy of an LLM on task 1 is 90% [1]", and when you go to [1] it says "the accuracy of an LLM on task 1 is 50%, but some sources say that with better prompts you can get to 90%, though when extended to a larger dataset for task 1, performance drops to 70%", then just quoting the highest number is misleading.
We are talking about frontier models, not GPT-4.
Yes, but I am using the same source the commenter used to back up their figure, merely saying: look, your source doesn't say what you claim it does.

If they wanted to talk about frontier models, maybe they should have cited a link talking about frontier models' performance.

Maybe having a theory of mind isn't the big deal we thought it was. People are so conditioned to expect such things only from biological lifeforms, where theory of mind comes packaged with many other abilities that robots currently lack, that we reflexively dismiss the robot.
Prove that you possess "theory of mind."
You're not going to run a SOTA LLM of this size off batteries (robotics), even in a car where the alternator is charging them, nor can you afford to rely on a high-speed internet connection being available 100% in a life or death (FSD) application.

I don't have so much faith in the future of current-architecure LLMs, but I do think that AGI will be needed for safe FSD and general-purpose robots that need to learn and operate in an uncontrolled environment such as a home.

A typical car alternator outputs 1.5-3kW of electricity, and EVs can output arbitrary amounts of power for electronics (though that will reduce range). That's more than enough to run purpose-built circuitry for a SOTA LLM. For a home robot, you could run the compute in the home instead of in the robot's body.

I don't think AGI is needed for FSD because we already have safe FSD in the form of Waymo, and competitors aren't far behind. People forget that self-driving doesn't have to be perfect. It just has to be better than human drivers. Human drivers get sleepy, drunk, angry, and/or distracted. They can't see in 360º or react in milliseconds. Most cyclists and pedestrians prefer current FSD implementations to human drivers, as the computer-driven cars are much better about yielding, giving a wide berth, and being patient.

Waymo is obviously pretty decent, but it's easy to drive 99.9% of the time. It's when there's invisible black ice on the road, or an animal runs out in front of you, or you lose visibility due to sun glare or whatever (I once had windshield wiper fluid, i.e. water, flash-freeze on contact), maybe mud on a camera, or a wheel falls off your car or the one in front, etc., etc., that things get weird.

Having autonomous cars that are statistically safer than the average driver is a reasonable bar to allow them on the road, but for ME to want to drive one, I want it to be safer than me, and I am not a hot-headed teenager, or a gaga 80-year-old, or a drunken fool, and since I have AGI (Actual General Intelligence) I react pretty well to weird shit.

And they mentioned at the end of the presentation that they're already planning their next datacenter, which will require 5x the power. Not sure if that means equivalent to ~1,000,000 of the current GPU's, or more because next-gen Nvidia chips are more efficient.
The B300 8-way SXMs will use around 1.4kW for each GPU. I think the TDP on an H100 is like 700W.
I don't think anyone who's paid attention to the LLM scene will give them any “credit for catching up fast” as it has been pretty obvious for the past two years that all it takes to reach the state of the art is a big enough GPU cluster.

DeepSeek made the news because of how they were able to do it with significantly less hardware than their American counterparts, but given that Musk has spent the last two years telling everyone how he was building the biggest AI cluster ever, it's no surprise that they managed to reproduce the kind of performance other players are showing.

This severely underestimates the talent still required. Deepseek didn't come out just because it's cheaper, it came out because a very talented team figured out how to make it cheaper.
For Deepseek, I'm not saying otherwise, quite the opposite.

But Grok hasn't shown anything that suggests the level of talent that DeepSeek exhibited.

Even if we assume your comment is correct, let's extrapolate what happens next: a talented team, the biggest compute among all competitors, and a CEO who is hell-bent on winning the race. IMO that is the reason it is a big deal.
Grok (unlike DeepSeek) has yet to show any ability to make a conceptual breakthrough. I don't like OpenAI at all, but one must admit that they are at least showing that they can move the field forward.
I don't know, but I found the recording uninspiring. There was nothing new for me. We've all seen reasoning models by now—we know they work well for certain use cases. We've also seen "Deep Researchers," so nothing new there either.

No matter what people say, they're all just copying OpenAI. I'm not a huge fan of OpenAI, but I think they're still the ones showing what can be done. Yes, xAI might have taken less time because of their huge cluster, but it’s not inspiring to me. Also, the dark room setup was depressing.

Seems like the opinion of someone who doesn't know that OpenAI cloned Anthropic's innovations of artifacts and computer use with their "canvas" and "operator".
Those are applied-ML level advancements, OpenAI has pushed model level advancements. xAI has never really done much it seemed except download the latest papers and reproduce them.
Don't forget that OpenAI was also following Anthropic's lead at the model level with o1. They may have been first with single-shot CoT and native tokens, but advancements from the product side matter, and OpenAI has not been as original there as some would like to believe.
and Gemini's Deep Research
(forgot to plug their interview https://latent.space/p/gdr)
This sounds like "this feature is so 2024".
Karpathy believes that this is at o1-pro level[1].

This again proves that OpenAI simply has no tech moat whatsoever. Elon's $97 billion offer for OpenAI last week was reasonable given that xAI already has something just a few months behind; it would probably be faster for xAI to catch up with o3 than to go through all the paperwork and lawyer talk required for such an acquisition.

Elon also has a huge upper hand here:

Elon and his mum are extremely popular in China, so it would be easier for him to recruit Chinese AI engineers. He can offer xAI/SpaceX/Neuralink shares to those best AI engineers who'd prefer an almost guaranteed 8-figure return in the long run.

Good luck to OpenAI investors who still believe that OpenAI is worth anything more than $100 billion.

[1] https://x.com/karpathy/status/1891720635363254772

Firstly, the $97Bn was for the non-profit, not for the company. The company is being valued in funding rounds closer to $300Bn. I think it may be true that OpenAI has no moat, but if it has no moat then all of these AI companies are overvalued (including xAI) and Elon should just stop throwing his money at it. I would say Elon probably doesn't have much of an advantage here. In both SpaceX and Tesla he was able to do something no competitor could do: raise cash. Car companies simply couldn't invest in tech research to build self-driving to compete with Tesla. SpaceX consumed enormous amounts of cash before anyone saw value. That is a unique skill that Elon had over the 2010s.

That is not an advantage in a race against Microsoft, Google, Meta, etc. He's competing against all the biggest companies in the world in this race. He's not going to be able to outspend them if the economics look at all sensible.

I know HN hates to admit it, but FSD 13 is fucking incredible and I use it for 90% of my drives.
> SpaceX consumed enormous amounts of cash

No, SpaceX projects are extremely dollar-efficient. The total project cost of Starship is like 20% of the SLS's.

> he's competing against all the biggest companies in the world in this race.

No, this is not a pissing contest over who has the most $. If it were about who can come up with the most $, then the entire race would already be over, as the CCP has access to trillions of $ in CASH.

Will the vast sums of money being spent on Grok ever actually have a positive ROI?
Grok 3 is at the top of Chatbot Arena with 1400, and the model will continue to improve as it trains more.
And DeepSeek is just 3% behind. It seems that in that benchmark all LLMs perform well, and the top is formed within some statistical error.
It could also be that they got "inspired" by DeepSeek, hence the very similar results.

So it could be that their success is mostly about taking an open and free thing and turning it proprietary.

These percentage points don't mean anything. Look up how the Elo system works. They just add 1000 to the result to make it a nicer number.
There are LLMs below 1000 on the leaderboard.
So? Percentage points are only meaningful when the mean of the dataset is 0, which is not the case here.
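A minimal sketch of the standard Elo expected-score formula (the ratings below are made-up examples) shows why only differences carry meaning:

    # Elo ratings are only defined up to an additive offset, so "x% behind"
    # on the raw number is meaningless; only the difference matters, and it
    # maps to an expected win probability.
    def expected_score(r_a, r_b):
        return 1 / (1 + 10 ** ((r_b - r_a) / 400))

    print(expected_score(1400, 1360))   # ~0.557
    print(expected_score(2400, 2360))   # ~0.557: same gap, same meaning
    # Shift every rating by +1000 and every "percentage gap" changes,
    # but none of the predicted win rates do.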
And Anthropic not even in the top 10 ...
I keep hearing about Claude's impressive coding skills (compared to its benchmarks), yet they're not evident to me (I use the web version, not Cline). Compared to 4o it's not that great.
My pet theory is that Sonnet was trained really cleverly on a lot of code that resembles real-world cases.

In our small and humble internal evals it regularly beats every other frontier model on some tasks. The shape of capability is really not intuitive/one-dimensional.

I spend four to five hours coding per day and subscribe to every major LLM, and Claude is still by far the best for me personally and my coworkers.
What are you using it for in general? IME the reason Claude pulls out ahead is that when you use it in a larger existing codebase, it keeps everything "in the style" of that codebase and doesn't veer off into weird territory like all the others.
My experience as well. Working in Scala primarily, it tends to be very good at following the constructs of the project.

Using a specific Monad-transformer regularly? It'll use that pattern, and often very well, handling all the wrapping and unwrapping needed to move data types about (at least well enough that the odd case it misses some wrapping/unwrapping is easy to spot and manage).

With a custom GPT or Gem given the same source files, those models regularly fail to maintain style and context, often suggesting solutions that might be fine in isolation but make little sense in the context of a larger codebase. It's almost like they never reliably refer to the code included in the project/GPT/Gem.

Claude on the other hand is so consistent about referring to existing artifacts that, as you approach the limit of project size (which is admittedly small) you can use up your entire 5-hour block of credits with just a few back-and-forths.

Lol, no company is making money using 4o; however, thanks to Claude Sonnet, programs like Cursor are usable lol. 4o agents suck, just try it instead of talking.
I did try it for more than a week, yet 4o is still pretty much better in terms of Python coding and architecture/documentation design.
That doesn't match my experience at all.
I can honestly tell you from my experience that Sonnet 3.5's coding skills did things no other model did right last year during the summer, even though the benchmarks showed it wasn't the best performing at coding tasks.
I prototyped on the weekend and started out with 4o because I had a subscription running.

After an hour and a half-assed working result, I put everything into Claude, and it made it significantly better on the first try, and I didn't even have an active Claude subscription.

Really interesting. I used it today and still hit lots of issues. Maybe my Python notebook approach is too complicated for Sonnet? It couldn't fix a custom complex seaborn plot. 4o failed too. o3-mini-high managed to do it really well, on the other hand.
There is honestly no rhyme or reason to all these opinions; someone was telling me the other day that Claude is for sure the best, multiple people actually.

I find it concerning that there are no real accurate benchmarks for this stuff that we can all agree on.

Yet Claude is still the most useful; lmsys is broken for coding.
Any model that censors itself does poorly, despite being able to provide high quality answers.
Anthropic's best model is Sonnet 3.5, in my opinion. The reason it's good is that it is very effective for the price, and fast. (I do think Google has caught up a lot in this regard.) However, not having CoT makes its results worse than similarly cheap CoT-based models.

Leaderboards don't care about cost. Leaderboards largely rank a combination of accuracy + speed. Anthropic has fallen behind Google in accuracy + speed (again, missing CoT), and frankly behind Google in raw speed.

No idea why this was downvoted, but you are correct.

Seems like the team at xAI caught up very quickly to OpenAI to be at the top of the leaderboard in one of the benchmarks and also caught up with features with Grok 3.

Giving credit where credit is due, even though this is a race to zero.

We've got more emotional and opinionated people on HN now, and they often react emotionally instead of using logic and being curious.
Yeah, so many people aren't capable of talking about anything Musk-adjacent with clear thoughts. It's insane how quickly xAI went from not existing, to the top of the benchmarks.
I think people here are thinking very clearly about Musk and his various projects.

Not sure about people elsewhere though.

Depends what you mean by "people here". I mean, obviously the majority of HN commentators and even the majority of commentators on this thread seem to be. But there will always be a couple of slightly unhinged folk in a big enough group of readers.
Can't you just take DeepSeek and put it behind an API and get to the top of the benchmarks immediately?
I'm not sure what you mean here? Musk has a history of doing both incredibly useful and cool things, and also incredibly dumb, cruel, and for some people even terrible things. That context should be part of any clear thinking around him. He does not get a clean slate in every new discussion of him.

There are widespread, legitimate concerns about what kind of person Elon Musk is turning out to be. There is a lot of chatter about fears of China's AI rise, but what happens if we get Elon's brand of cruelty and lack of empathy in an authoritarian superintelligent AI? Is that the AI future we want? Can you imagine an SAI with real power that interacts with people like Elon does on Twitter? I am not sure that is a future I want to live in.

We’re trying to talk about the capabilities of Grok and you can only focus on Musk. That’s what I’m talking about.
Don’t defend a persona over substance. His concerns are valid and relevant to the discussion.
It’s relevant to the subject since he owns it.
There would be no Grok without Musk, any discussion of Grok is going to involve discussion of Musk as well.
You can't see this as separate from Musk. Musk isn't a business-as-usual type.
You know what they say: Fascists are good at keeping the training runs on time.
Say what you will about the guy, he kept the training running on time.
So he took credit for improvements others worked on and they also weren’t as good as purported?

(Assuming that is a reference to the Mussolini quote.)

OK that's actually a pretty good one. If you didn't steal that from an X comment, I give you props.
Why can so many people not see it?!
A very impressive debut. No doubt they benefited from all the research and discoveries that have preceded it.

Maybe the best outcome of a competitive Grok is breaking the mindshare stranglehold that ChatGPT has on the public at large and with HN. There are many good frontier models that are all very close in capabilities.

This is grok 3, so not a debut
Maybe this is Grok’s “ChatGPT moment”. Similar to how OpenAI’s debut was with GPT-3.5 (not their first version)

Debut in the sense that it’s something good enough that it’s getting mainstream attention.

It is a debut of their thinking mode iirc.

Unfortunately LLMs are shifting compute time to test time instead of train time. I don't really like this and frankly it shows a stalling of the architectures, data sets, etc...

Another take is that the base models are now good enough that spending more money for more intelligence is viable at test time. A threshold has been crossed.
I guess I'd always thought the direct opposite.

Naively, I feel that to be useful, the goal of LLMs should be to become more power-efficient, so that eventually all devices can be smarter.

Power efficiency can be gained through less test time, or more "intelligence", or some combination of the two. I'm not convinced these SOTA models are doing much more than increasing test time.

The biggest impacts on power efficiency will be advances in node size and transistor type, like nanosheet or forksheet. Algorithms will help just a little.
Gemini has been topping benchmarks and leaderboards for weeks if not months at this point. Nobody cares.
TLDW. Will this be open weights?

This commit seems to indicate so, but neither HF nor GH has public data yet:

https://huggingface.co/xai-org/grok-1/commit/91d3a51143e7fc2...

Edit: Answer from Elon in video is that they plan to make Grok 2 weights open once Grok 3 is stable.

This is how they've done the past releases as well, soon after they release the latest and greatest they open source the last model.
Controversial opinion but I think the AI game studio idea is a very good one. Not because I think they will make any money off the games, but dogfooding will lead to so much more improvement than relying on feedback from external customers.
We're 1-2 years into our AI game studio [1] if anyone has more questions on it.

Seeing awesome feedback from players on our demos (and seeing an insane amount of stickiness from players playing even small demos built around generative AI mechanics). Raising now. Hiring soon to move faster. Feel free to reach out - dru@chromagolem.com

[1] https://www.chromagolem.com/

Especially on code, think of all the free data you get from the generation-evaluation loop
What is dogfooding?
Eat your own dog food, i.e. use your own product
Because 'dogfooding' has worked so well for other products...

If you don't get feedback from the people actually playing your game (or using your product), you will never get the improvement you need to help them.

You can have the most talented passionate people there are developing a product, but if it's not working for the people you want to sell it to, it's the wrong product.

Most tech products are terrible because those paying for them are not those that have to use them every day, or because they solve a corporate problem (compliance) and not a usability problem which is the actual need from the people on the shop floor.

Many big games/products are already built mostly on metrics, and that has proven to be a terrible way to work out what people 'want'. It's a great way to justify money decisions though, so it keeps happening (and games/products from big companies keep getting worse).

I see. So you don't know what dogfooding is.
I like and agree with something you've touched on here. I think the downvotes are perhaps because you're not putting an end cap onto this idea here. And I think that end cap is: the feedback a company gets when it dogfoods its own product is *not* guaranteed to be similar to the feedback it gets from customers.

The implicit assumption with dogfooding is that more feedback is better, even if that feedback is artificially constructed.

I think the idea here is that foisting one's product onto one's own workers is likely to incur a bunch of additional biases and preferences in feedback. Paying customers presumably use the product because they need it. Dogfooding workers use the product because they are told to do so.

Apparently the API will only be available in a few weeks, so I can't run my independent benchmarks yet.
I'm waiting for this as well, though I did try to run several manually now that it's live and the results have been impressive so far.
Thanks for the update.
Any guess on its availability at that point? Is it likely to be limited to certain tiers like o1?
Looks like they recently updated their ToS as well: https://www.diffchecker.com/w4dbxWwt/
Have you thought about a future where LLMs will be fine-tuned to target advertisements to you? I mean, look at search: the first iterations of search were pretty simple in terms of ads. Then personalized ads came. I can't help but envision the dystopia where the LLM will insert personalized ads based on what you are asking for help with.
It's way worse than that. First, we interact with LLMs through private conversation, and we are used to having private conversations with humans we trust. Some of that trust will be transferred to LLMs. Second, LLMs have vastly bigger "mental" power to build a long-term mental model of us while we interact with them, which means they can choose their words with extreme precision to trigger an emotion, a certain reaction.

Combine the two and the potential for manipulation, suggestion, preference altering is through the roof.

The next step is to combine it with heart rate/blood pressure/eye tracking in phones and generate the text you're reading in real time based on biofeedback. We'll be able to control people like robots. See where those $1MM+ salaries and billions of dollars are going yet?
I do believe this is the next natural evolution. People don't like to pay for things and ads are a proven business model. I bet the big labs are looking into this
Meta probably already does this for the top 10k people that spend the most or are high-ranked influencers on Instagram, etc.
I'm more worried about LLMs with specific political biases built into them. Imagine one that sounds like Conservapedia or the most insane left-wing parts of Tumblr
Does it already include the datasets Musk received from the government or do I have to wait for Grok4?
Do we have any details on how large the context window is? Or how many input tokens it can handle?
In the opening blog post they mentioned it to be 1M tokens.
Will he do what he promised and open source Grok 2 now?
The question came up and he said they would, once Grok-3 is fully released.
I'd expect them to open source it just like they did with Grok.

We're still waiting for OpenAI to do the same. Even at least GPT-3.

> We're still waiting for OpenAI to do the same. Even at least GPT-3.

The exact details of OpenAI's models and training data are not fully disclosed, which can raise concerns about potential biases or vulnerabilities.

You just have to use OpenAI for 5 minutes and you'll see the pretty evident biases.
Note: this is "before April" so not a complete assessment

https://manifold.markets/SaviorofPlant/will-xai-open-source-...

I'm sure not.
Anybody can try Grok3 on Chatbot Arena (even if you are in Europe). Select Direct Chat and select the model early-grok-3. https://lmarena.ai/
Am I the only one who isn't impressed by this? Grok 3 is failing basic OCR and React/SQL coding exercises that Sonnet and Gemini complete successfully.

I'm also skeptical of lmarena as there is a large number of Elon Musk zealots trying to pass off Grok as a proxy for Tesla shares.

Examples? I have been using it all morning and just canceled my Claude subscription.
The most fascinating part of the video for me was how they built the hardware to do this: https://youtu.be/AUAJ82H12qs?si=sHz3ddZnz2-HU3UL&t=2192
> Currently, Grok Web is not accessible in the United Kingdom or the countries of the European Union. We are diligently working to extend our services to these regions, prioritizing compliance with local data protection and privacy laws to ensure your information remains safely secure.

I suppose you can take that to mean that people who do have access to the service should not expect much in terms of data protection.

There are just more regulations to comply with before a release. OpenAI's new Deep Research tool wasn't originally available in the EU either, but it was released less than a week after it came out in the US. Since the EU is a gigantic market with a lot of buying power and this release makes a strong case for people to switch over from competitors, I doubt it'll take long.
> There are just more regulations to comply with before a release.

If you do collect personal data and do funky stuff with it.

Another approach would be to not collect that personal data until you have the right process in place, and basically be regulation-compatible out-of-the-door on day one.

Even if you don't collect personal data, you need to comply with regulations to properly document the fact that you do not collect personal data.
If your organization truly doesn't collect or process any personal data then no, you don't have to say anything, as GDPR, for example, doesn't even apply to you in the first place. Or are you thinking of a different directive than GDPR, perhaps?
The definition of "personal data" is so wide that it is impossible to provide any web service without collecting some form of "personal data".

If all you have is an Apache web server with the default configuration serving a fully static HTML/CSS page without any script tag, you might already need a DPO and need to complete some documents.

> The definition of "personal data" is so wide that it is impossible to provide any web service without collecting some form of "personal data".

Just because Apache by default collects and stores IPs doesn't mean it is impossible to provide a web service without collecting personal data. Disable the IP collection, and even the default configuration wouldn't need to follow GDPR, as it again doesn't even apply.

Is there something else in Apache that collects personal data by default? If you're unsure what "personal data" really means, https://gdpr-info.eu/art-4-gdpr/ has the definition.

Not sure how HTML/CSS is relevant, it shouldn't depend on what content you're serving.

All that requires additional active effort to avoid having access to any data. The more complex your infra, the harder it becomes to avoid the paperwork. Add a reverse proxy and a CDN to the above, and the chance of you not having access to any "personal data" is really, really close to 0, unless you spend significant engineering resources triple-checking everything. Even then, if you want to be safe, you'd better have the paperwork ready in case you forgot something. In the example above, I hope that you would not have stopped at checking the Apache configuration, as I am sure you are fully aware that there are multiple log levels at the OS level that need to be tweaked as well.

This is despite the fact that you clearly have zero ill intent and that none of this "personal data" can really be used for anything bad.

The mention of HTML/CSS is just to make it clear that no additional data collection can happen through JavaScript tags (Google Analytics, or any other alternative) or other third parties. It makes total sense that if you dare use bug tracking software, you should definitely pay hundreds of euros per month to hire a proper DPO who will handle all the paperwork, or risk being exposed as the mental lunatic that the EU commission believes you are.

> All that requires additional active effort to avoid having access to any data

I agree that it requires additional active effort, I'm not arguing against that. I don't agree with your original point that it's "impossible to provide any web service without collecting personal data", and it would seem you no longer agree with that either.

> It makes total sense that if you dare use a bug tracking software, you should definitely pay hundreds of euros per month to hire a proper DPO who will handle all the paperwork or risk being exposed as the mental lunatic that the EU commission believes you are.

If you willy-nilly use bug tracking software that needlessly collects and/or processes EU individuals' personal data, then yeah, you need to follow the regulations in the region you operate in.

If the collecting/processing actually serves a higher purpose (for your business and otherwise) then again, makes sense you need to follow the regulations.

> it would seem you no longer agree with that either.

On the other hand, you pretended that fixing that Apache configuration was somehow "all I needed to do" to be compliant with EU regulations. We proved that this was wrong, and despite your best efforts you are still unable to give a proper list of everything I need to do. You are unable to do so because it is virtually impossible; no matter how thorough you believe you are, you might still be missing an element you don't know well enough. To be safe, the only path is to accept the fact that you will need to access personal data, even if that's not your purpose and you do nothing with it. The additional paperwork and needless effort are mandatory.

This in turn explains why, regardless of what the Grok 3 team really does behind the scenes, they DO have additional work to complete to be able to release their product in Europe, and that might explain the delay.

> If you willy-nilly use bug tracking software that needlessly collects and/or processes EU individuals' personal data, then yeah, you need to follow the regulations in the region you operate in.

I am willing to use whatever error tracking software you suggest. My criteria are simple: I might have JS errors I don't know about; please give me enough information to fix the underlying issue when that happens, without requiring me to fill out additional paperwork.

My whole point is that the definition of what constitutes "personal data" is so wide that such a tool does not exist.

how do you store chat inputs without collecting personal data?
That's possible in general but not for this application; a chat interface to an LLM isn't very useful unless you can tell it whatever you want—including GDPR personal data—and then pick up the thread of conversation later.
It is kinda possible to store that in the browser, but as I've been finding with my own browser-based front end for the API, the browsers seem to clear this data a bit more often than one might expect.

Or at least, Safari on Mac clears it.

When regulations become sufficiently bureaucratized, it's extremely easy to accidentally violate them doing completely normal things. As a really random example, in California when you operate a food cart it's not enough to just keep your area and wares in sanitary condition; you need a dishwashing bin of a minimum size of exactly 'x' inches (10.5 IIRC).

A guy who was just preparing clean, healthy food and keeping everything sanitary might assume he was naturally obeying all regulations. But that assumption can cost one a big fat fine (leading to fun scenarios like a food cart vendor needing a compliance legal team), and given Musk's relationship with the EU, they'd love to crucify him on any possible technicality they can find.

Right, that's true I suppose. But also, if you don't have a car, for example, you don't need to think about the laws of how to legally drive a car, since they don't apply to you.

Similarly, if you don't collect or process any personal data whatsoever, directives like GDPR don't even apply to you, so there isn't really any way (easy or hard) to "crucify" someone for violating them.

Do you mean data protection or political correctness/control of discourse protection?
I think they meant data protection. You can tell by how they said "data protection".
Not in Canada, either :(
The EU and UK are good for data protection?
Pretty good considering there are laws around data privacy and government institutions that enforce them. Are they perfect? Of course not, but it sure is better than no laws to protect my personal data.
Probably SOTA in terms of data protection today in the world. Happy to be proven otherwise.
I think they put the new model behind a $40 paywall so fewer people use it. The model seems only marginally better than open source models, based on xAI's own internal tests, and it costs them serious money to run. Elon talked in the second half about building one of the largest GPU data centers just to get this running. I guess with the next iteration they'll be trying to reduce the costs.

Also, they will be open sourcing Grok 2, which is probably pretty far behind at this point, but will still be interesting for people to check out.

They should have asked Grok 3 how to create a good announcement stream before going live. That was a mess.
What are your first impressions using it? (Not available in Europe currently). Is it a game-changer?
>Not available in Europe currently

I hate how it's the same story for every new AI technology. If someone can tell me who to vote for or where to protest to change this awful EU law, that would be great.

It's not an awful law.

The Digital Market Act is a bit of an overreach but the AI law is not.

It classifies AI into risk categories, so that it doesn't kill anyone, carelessly handle sensitive information, etc.

A chatbot can easily comply with it.

Totally agree. And this is one of the reasons the EU is falling further and further behind: all the silly regulations.
The EU regulations are there to protect the average citizen, not to help the 1% run wild with whatever business idea they have. You personally might not like it, but the non-entrepreneurs, which is most people, are pretty satisfied that the laws in the EU are more focused on the citizen and their rights and not on boosting the shareholders' profits.
> not to help the 1% run wild with whatever business idea they have.

And IMHO regulating the 1% doesn't hinder strategic advantages much. Otherwise China would not have come up with the DeepSeek models. Regulations are fine, they just have to be "based".

> Regulations are fine, they just have to be "based".

What? I know what "based" means, but I can't quite grok what you're saying.

How exactly am I being protected by not having access to the latest models that the rest of the world has?
Nothing is stopping X from complying with EU regulations to make it available to you. I'd wager that they most likely lack compute capacity to make it available everywhere, not legal compliance.
> Nothing is stopping X from complying with EU regulations to make it available to you.

Given the personalities involved, I'd wager he's doing it more out of spite than for any actual legal justification.

...though if there were an actual legal risk then I'll agree the economics probably don't bear out the risks - as someone who identifies as European I'll admit that Europeans generally pay far less money for tech/software/apps/services than Americans do[1]; salaries in Europe are also significantly below their US equivalents: paying $200/mo for OpenAI's service is a much harder sell to a London or Berlin-based SWE on €90k/yr than to a Bay Area type on $300k/yr.

[1] e.g. If you can take Apple at their word, the EU accounts for 7% of their App Store revenue: https://techcrunch.com/2024/02/01/apple-says-eu-represents-7...

Besides, anyone in the EU who really wanted to use it can just use a VPN service.

Libel, from all the models hallucinating things done by whatever your real name is.

I mean, at least I get the advantage of being overshadowed by a famous film director with the same name as me, so nobody's going to assume anything associated with my name is actually about me…

…hopefully…

I'm increasingly happy we have these regulations that prevent us from being ruled by the likes of Musk.
Why? Europe is getting extremely expensive on all levels. Time for a Musk there.
I'd rather spend life in a poor democracy than in a rich technocracy that supports dictators around the globe.
> for every new AI technology.

Well no. Mistral.ai

Have you tried asking the AI people to ship an AI that complies with EU law?
EU regulations are hit and miss. USB-C and opening up the Apple App Store are hits. AI regulation, cookie banners, and idiotic bottle caps are misses.
I don't get the bottle cap hate meme. Is it useful? Probably not, but the amount of hate it gets is unexplainable by lack of concrete usefulness alone…

Also, the problem with GDPR is that it wanted to leave too much room for businesses to still collect an obscene amount of data, hence it allows the cookie banner. Please note that I emphasized “allow” because that's all GDPR does: it allows companies to use a cookie banner to extract consent to collect data. It doesn't mandate it in any way.

None of my multiple websites have a cookie banner on them because I'm not trying to extract consent from my users to abuse their data, I just don't collect it and I'm effortlessly GDPR-compliant in the least obnoxious way.

Cookie banners are just malicious compliance.

You could move to America and avoid these fake delays that get blamed on the regulations tech companies don't like.
Companies need to adhere to GDPR in order to enter the European market; people have the right to request deletion of their PII. It's a good law; actually, it should be applied everywhere. As a European, though, I'm scared: what if companies are actually testing whether excluding us is much of a problem for their business?
They are doing it. At some point we have to agree that Brussels is mostly idiots, rarely savants.
I would be very surprised if they aren't monitoring the cost-benefit curve of delaying EU launches. Why wouldn't they? It costs extra money, time, and legal risk to launch in the EU. It's especially bad for XAI due to Musk's involvement.

(Note that it's not just GDPR, there's also the EU AI Act which has a whole extra set of requirements for compliance.)

It has nothing to do with the EU laws, or at least not in the sense they want you to think: no law prevents AI players from releasing their AI models here, but they are all also big tech players who are affected by GDPR, the DSA, and the DMA, which harm their business by protecting consumers.

That's why they use their AI products as leverage to turn European people against the laws that protect them from big tech. It's just blackmail.

The AI Act doesn't prevent Grok from releasing their model in the EU! (And ChatGPT's early issues were all linked to GDPR.)

For the record, Facebook put a restriction on the use of Llama models in the EU even before the AI Act was passed (and the AI Act doesn't even apply to Llama anyway, except Llama 3.1 405B).

It will come to EU countries eventually, but it takes a long time to go through "conformity assessments". Notebook LM, for example, was geoblocked for the EU for a full year before it became available in June. Grok 1 was released everywhere else in the world in November 2023, and in the EU in May 2024. About a 6 month delay.
> Grok 1 was released everywhere else in the world in November 2023, and in the EU in May 2024. About a 6 month delay.

And here you should see that it has nothing to do with the AI act, as it wasn't enacted before last August!

Furthermore, neither Grok 1 nor Notebook LM would have been subject to the AI act even if it had existed at the time.

As I said before, all of these companies have vested interests against EU's legislation as a whole, and they've tried to blackmail the EU from the beginning. They didn't wait for an actual AI legislation to exist to use AI as just another blackmailing tool.

I think you're misapplying the term "blackmail" here and thus poisoning the well. The EU is applying pressure to companies and companies are applying pressure back--that's not blackmail. They each have their own means of leverage, and they both use them.
First of all, both sides don't have the same level of legitimacy, and one side is blatantly lying about its intent by claiming that they are blocked by regulations instead of admitting that they are applying pressure (because they know they have no legitimacy to “apply pressure” on democratic institutions).
Apparently EU regulators were blocking it for unspecified reasons until an agreement was made in May. And even then, they blocked news summaries until after the EU elections. If you can find more info, feel free to cite it. Info about these behind-the-scenes dealings is hard to find online.

https://www.socialmediatoday.com/news/xs-formally-twitter-gr...

The regulation the author of this article has in mind is the DSA, even if it's not named directly. See this quote:

> Well, probably because Grok has already spread various false stories

(The European regulation that deals with disinformation is DSA).

And again it couldn't be the AI Act, because it wasn't in place at that time!

No, it was underwhelming, failing basic coding tasks and OCR/image recognition tasks that none of the other existing models screw up.
I am excited for the voice mode promised in "a week" or so. ChatGPT Advanced Voice has been a big disappointment for me. It can't do some of the things they demoed at the announcement. It's a lot dumber than text mode. I find the voice recognition unreliable. I couldn't get it to act as a translator last time I tried. But most of all I find I don't have much to talk to it about. If Grok 3 voice mode can discuss current events from the X timeline then it should be much more interesting to talk to.
[flagged]
Very impressive work given the timeline
Love the low budget on the marketing side: just a few guys talking about the essentials. Job done, tons of money saved if you ask me.
Launched where? https://x.com/i/grok just loads Grok 2. I assume it's only accessible from iOS right now?
grok.com
Ah you pay for it. "To get access to Grok 3, you need to be a Premium+ subscriber on X." I see why I don't see it.
they just made it temporarily free to anyone, FYI
I'm a freeloader, and it appears that unfortunately Elon is not stupid enough to just give it to me for free. There's no fair price either, since I see no pay-per-use pricing, so... unavailable for me for now.
What makes pay-per-use pricing inherently more or less fair than unlimited usage from a subscription?
I don't use it much, so for me this is not a good deal - I'd be paying for "unlimited" usage but making just a few requests daily.

So those who use less pay for those who use more, and I don't see that as a fair deal.

BTW, Grok 3 will be available on x.ai in coming weeks.

I wonder if people will attempt to jailbreak this model to see if they can find evidence of federal data being used to train it.
Musk already has too much power, won’t trust him with my AI conversations
But you trust Google, OpenAI, and whatnot with it?
Between your average corporate megacorp and a drugged-out antivax fascist, I'll take my chances with the former yes.
antivax lol - give it a rest man
He has been publicly attacking Fauci for his handling of Covid and retweeting antivax people
Is Fauci some saint that can't be criticized? No matter which political party you support, is it really that difficult to admit the guy totally mismanaged the pandemic response and messaging?
Yes, that's true: Elon Musk isn't an antivaxxer, he's just a (neo-)Nazi.
You clearly don't understand the definition of Nazi, but that's on you
Been using Google for email for the past 20 years. So far, so good.
Trust is a strong word, but there are levels of hell
A DeepSeek local model is the only thing worth trusting anymore. Fuck them all.
Billions spent, one of the most powerful AI developed, and still no one competent enough to trim the 15 mins of waiting time filler at the beginning of the announcement video...
Tells me they have spent their entire engineering time on engineering and zero on marketing fluff, which is good.
Not sure I share the same takeaway from their marketing video.
It doesn't look like they spent zero on marketing fluff, though...
I'm guessing Musk wanted it that way.
Anyone else noticed anything?

https://lngnmn2.github.io/articles/grok3/

They will open-source Grok 2 when Grok 3 comes out. Also it seems like it will be paywalled - disappointing considering DeepSeek-R1 is free and open source.
Yeah, not sure what profit these guys think they'll be able to squeeze out of these models, with open source and free clearly being 95% as good.
Having the keys to the treasury department will probably help.
Exclusive contracts with the defense industry or similar deals?
That probably won't come close to justifying the current valuation of either OpenAI or xAI (idk how much investment it has taken in or how much it has spent so far).
For some ouroboros fun, I attached this whole HN discussion and asked Grok 3 to summarize it (with a specific focus on the members' attitude towards Elon Musk). Here's what it came up with:

https://x.com/i/grok/share/CTDC0WOi7RCbEDrm11AJ3PtLM

How did you customize the output?
I have no idea why that page says "Grok’s output has been customized by this user"; I don't see anything related to custom prompts in my Grok settings page. Maybe I'm looking in the wrong place?
Still no post on their official blog. How disappointing.
Off topic, but just in case: is there a good reference on how people actually use LLMs on a daily basis? All my attempts so far have been pretty underwhelming:

* when I use chatbots as search engines, I'm very quickly disappointed by obvious hallucinations

* I ended up disabling GitHub Copilot because it was just "auto-complete on steroids" at best, and "auto-complete on mushrooms" at worst

* I rarely have use cases where I have to "generate a plausible page of text that statistically looks like the internet" - usually, when I have to write about something, it's to put information that's in my head into other people's heads

* I'd love to have something that reads my whole codebase and draws graphs, explains how things work, etc... But I tried aider/ollama, etc., and nothing even started making sense (is that an avenue worth persevering in, though?)

* Once, I tried to describe in plain English a situation where a team has to do X tasks in Y weeks, and I needed a table of who should be working on what for each week. I was impressed that LLMs were able to produce a table - the slight problem was that, of course, the table was completely wrong. Again, is it just bad prompting?

It's an interesting problem when you don't know if you just have a solution in search of a problem, or if you're missing something obvious about how to use a tool.

Also, all introductory texts about LLMs go into many details about how they're made (NNs and transformers and large corpora and lots of electricity etc...) but "what you can do with it" looks like toy examples / simply not what I do.

So, what is the "start from here" about what it can really do?

I use it everyday.

For coding, I use cursor composer to gather context about the existing codebase (context.md). Then I paste that into DeepSeek R1 to iterate on requirements and draft a high level design document, maybe some implementation details (design.md).

Paste that back into composer, and iterate; then write tests. When I'm almost done, I ask composer to generate me a document on the changes it made and I double check that with R1 again for a final pass (changes.md).

Then I'm basically done.

This is architect-editor mode: https://aider.chat/2024/09/26/architect.html.

I've found Cursor + DeepSeek R1 extremely useful, to the point that I've structured a lot of documents in the codebase to be easily greppable and executable by composer. Benefit of that is that other developers (and their composers) can read the docs themselves.

Engineers can self-onboard onto the codebase, and non-technical people can unstick themselves with SQL statements via composer now.

Correct me if I am wrong, but the whole premise of Cursor and Windsurf is that this architect-editor mode is already built into the editor. This is why there is a distinction between the composer (editor) and the chat function (architect).
Haven't tried Windsurf yet.

Chat function is just chat; it can't edit your files.

Composer probably relies on prompt engineering to do editor-architecture, as it reads and writes to your codebase. But it's heavily tied to Sonnet 3.5 and tool-calling.

For architecture-type stuff, I prefer DeepSeek R1, as reasoning models do better on high-level design. Which is why I copy and paste in and out of composer.

This sounds great - would love to hear a little more about the prompts. Are you literally just asking 'write me a context.md that explains how feature x works' or something like that?
For context.md, it's that simple because it's unstructured data extraction from your codebase and working with a regular LLM model.

For design.md, I have a prompt because we're now working with a reasoning model and doing structured data extraction: create me an issue on Linear with a title, description, and a to-do list.

I would recommend trying the approach yourself and saving the prompts if you can nail down the repetitive asks.

Thank you!
This is the way.

Seriously, this is the only useful flow I've found for AI coding in general..

Cursor is not yet an option for me, but at least it means aider is not a dead-end. Thanks for the info.
Could you elaborate on why Cursor is a dead-end?
My wife has found ChatGPT extremely useful when dealing with her mother, who has bipolar disorder and is obsessed with other people's health. I've got a terminal cancer diagnosis, and handling my mother-in-law has been extremely difficult - nearly to the point of no longer having any communication with her. ChatGPT has a single conversation with all the back story and has put some useful points across when discussing how difficult her behaviour has been (she watched an operation that failed for me for entertainment, for instance).

I have found similar when giving backstory and needing help to start structuring difficult conversations where I want to say the right thing but also need to be sensitive.

I'm sorry for your situation.

> she watched an operation that failed for me for entertainment, for instance

You make your own choices, but cutting a person like this off would be very reasonable

Absolutely, but it's my wife who will have to live with the consequences long-term, so I'm being led by her.
You cutting off from your mother in law doesn't mean your wife has to.
You don't know my mother-in-law!!!
Man. :( We worry about the AI being inhuman, but robotic meaningless pleasantry is in some cases a significant upgrade from human cruelty.
You might find this optimal conversation path finder app of interest: https://x.com/eddybuild/status/1889908182501433669
Here are some things I have in my ChatGPT history:

- Discussing the various stages of candymaking and their relation to the fundamental properties of sugar syrups, and which candies are crystalline vs amorphous. It turns out Junior Mints are fudge. Fondant is really just fudge. Everything is fudge, my god.

- Summarizing various SEC filings and related paperwork to understand the implications of an activist investor intervening in a company

- Discussing the relative film merits of the movie Labyrinth and other similar 80s kitsch movies. ChatGPT mentioned the phenomenon of "twin films" which was an interesting digression.

- Learning about various languages Tolkien invented and their ties to actual linguistics of natural languages and other conlangs

- Some dimensional analysis of volumes, specifically relating to things like "how many beans are in the jar" estimation, and what the min and max value of a particular weight of coins might be, in terms of both a par value based on a standard coin mix and outliers like old dimes that are pure silver.

- Discussion of quines in prolog and other languages, which resulted in a very interesting ChatGPT bug where it started recursing and broke when trying to write a prolog quine.

- Back of the envelope economic calculations around the magnitude of the housing deficit and the relative GDP cost for providing enough housing quickly enough to make an impact. Spoiler: it's probably unreasonably expensive to build enough houses to bring down housing prices by any significant degree, and even if we wanted to, there's not enough skilled workers.

- A number of podcasts transcribed. (I hate audio and meandering, so transcribed and summarized is perfect.) I could use whisper and a python script to do this (a sketch of that DIY route follows at the end of this list), but I'd rather let ChatGPT do the legwork, and it actually used a more modern model and method of processing than I would have naively used.

I find Github Copilot to be a really great autocomplete. I frequently write the comment at the top of a function and hit tab and it writes the whole function. This is dependent on typescript and having a relatively standard codebase but I think those things are useful on their own. You really have to limit it in terms of scope and specifics, but it lets me think high level instead of worrying about syntax.
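For reference, the DIY transcription route mentioned in the list above is only a few lines with the open-source openai-whisper package. A minimal sketch, with the file name and model size as placeholder assumptions:

    # DIY podcast transcription with the open-source openai-whisper package
    # (pip install openai-whisper). Runs locally; no API key needed.
    import whisper

    model = whisper.load_model("base")        # "medium"/"large" are slower but more accurate
    result = model.transcribe("episode.mp3")  # placeholder file name
    print(result["text"])                     # the full transcript as one string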

> Everything is fudge, my god

Fudge is made with milk - am I missing a joke?

Technically fudge is just a crystalline sugar candy with a certain water percentage. Milk is optional (and frequently omitted). Reese's peanut butter cups are fudge, for example.
Forgive me for asking but... do you have a source for that other than an LLM? Every search I've tried just confirms what I already thought.
This explains better what I mean. https://en.wikipedia.org/wiki/Fudge#Texture - milk is added only as a stabilizer, and many candies that resemble fudge in texture incorporate milk or other stabilizers to prevent too rapid a crystallization. Fondant is just fudge without stabilizers.
Many people replied with anecdotes, but recently Anthropic published an analysis of claude.ai production traffic. As far as I can tell this is the single best currently existing source on "how people actually use LLMs". With everything else, you can't be sure whether it is representative.

https://www.anthropic.com/news/the-anthropic-economic-index

Thanks, this is a gem! However, I suspect the reason "programming" is such a big usage is that AI is closely integrated into text editors, as "autocomplete on steroids".

As they state in the report, I don't think they can measure how many people just ignored the suggestion from Claude right away, or deleted more than half of the suggested code.

Imagine if the real impact of AI is "suggesting things that people discard immediately"?

Call it the "Unamusing Misuse of Resources"... [1]

[1] https://docs.kde.org/trunk5/en/amor/amor/amor.pdf

The only plausible explanation for the amount of resources poured into these language models is the hope that they somehow become the origin of AGI, which I think is pretty fanciful.

I can feel the cold wind of the next AI winter coming on. It's inevitable. Computers are good at emulating intelligent behavior, people get excited that it's around the corner, and the hype boils over. This isn't the last time this will happen.

I think the amount of money is explained in part by hubris. People in high positions think they're smarter and more capable than people at the bottom of the org, roughly in proportion to how much more they earn. So it's reasonable, expected, borderline obvious to them that a computer bot can replace that person. So you're betting on its ability to get rid of, if not your junior devs, at least the majority of your customer support staff.

In reality, people doing "menial" jobs are smart, and they learn and operate with a lot of nuance that others ignore out of unfamiliarity or just prejudice. Do you prefer to talk to a chatbot or a real human when you have a problem? How confident are you, really, that even if the bot knows what the problem is, it would be able to solve it?

Lots of problems with customer care are anchored in the issue that support staff are not allowed to fix or resolve problems without escalation, or are tasked with keeping you from costing more money. The bot might be better at it for the company because it will frustrate you enough to give up on that 30 bucks refund, idk.

AI seems to change the dynamics of corporate jobs a lot, but I haven't yet seen anything that would be a game changer outside of them. It's great for searching companies' unorganised and messy knowledge bases.

I think this still applies: https://x.com/dwarkesh_sp/status/1888164523984470055. LLMs are useful now, but we need something else for AGI.
Didn't take long for this comment to age poorly :) https://news.ycombinator.com/item?id=43102528
Fun! Being a scientist, I'll try it out with some LLM.
I can't feel any cold right now at all.

On all corners people work on so many small pieces advancing what we have.

And plenty of obvious things are not here yet, like a full local dev cycle: the AI uses the IDE to change code, then executes it, fixes compiler issues and unit tests, and then opens a PR.

Local agents / agents having secure, selective access to our data, like giving my agent read-only access to my bank account and a two-factor way for it to send money.

Deepseek's reinforcement learning is also a huge new lead.

Nonetheless in parallel robots are coming too.

GenAI is getting better and better: faster, better, and cheaper video; 3D meshes; textures; the first GenAI ads.

I predict this comment will age very, very poorly. Bookmarked.
I feel like 50/50 chance of his or your comment aging poorly.
I feel there's a high probability your comment doesn't mean what you think it does (unless you truly believe both outcomes are as likely).
Not sure how else I could have meant it.
It seemed like you intended to present your comment as a tautology (e.g. "I feel there's a 100% chance of his or your comment aging poorly"), but I'll give you the benefit of the doubt!
Yeah, that's a good point. I just think it can go either way. I remember in 2015 how hyped we were around self-driving cars and thought "in 10 years there will be majority of cars like that". Right now we may see steady increase in capabilities of AI for years to come, or we may see it plateauing.
Cool. Invest in it then. That way you get paid instead of saying "I told you so" to some screen name.
I think the snag I feel in your argument comes from

>Computers are good at emulating intelligent behavior

Which implies that the brain is some kind of transcendent device that can backdoor physics to output incredible intelligence unique to its magical structure.

Maybe LLMs aren't the key, but as far as we can tell the brain is also just another computer.

Holy strawman batman.
Care to differentiate intelligence from emulating intelligence?
No, it would be very hard, and you've already shown that you're not arguing in good faith, so I don't want to invest the time and effort.

And let me be very clear on why, because I love having conversations about this theme: it promises to be an adversarial and frustrating exchange.

Everyone seems to have a different definition for AGI. Is there some kind of standard there?
No, but the main issue is that all reasonable ones I can conceive of lead inevitably to the Singularity, technologically, and pretty quickly, since we seem determined to throw as much silicon as possible at the problem. Hopefully the final step is intractable.
Precisely; however, this time we will have tangible results from the ongoing AI summer: generative art, and coding/writing/journalism assistants.
There are always dividends. We got a lot of interest in Lisp from the first summer, and it arguably informed all currently used programming languages.
Though the dividends were not as obvious to lay people then as they are now, which means the upcoming winter won't be as cold.
I use them as a springboard for things I am really unfamiliar with. I'm self-learning electronics at the moment, and so I can ask it things like "what's a common and widely available cooperator." You will not find that answer on a search engine, I don't care how good your Google fu is.

It's a weak jack of all trades: it knows a fair amount about the sum of human knowledge (which is objectively super-human), but can't go deep on any one thing, and still seriously lags behind humans in terms of reasoning. It's an assistant that's all book smarts and no street smarts. Or maybe: it's a search engine for insanely specific things.

Rote work, as well. Things like porting an enum from one programming language to another: paste the source-language enum into a comment and start it off with one or two lines in the target language. Dozens of tabs are surely faster than manual typing, copy-paste, or figuring out vim movements/macros.

Heads up as an EE who uses LLMs quite a bit; they cannot analyze circuits or build them.

They might be able to help stitch together modules (like sensor boards plugged into microcontrollers) and definitely can write code to get things going, but they fall flat on their face hard for any kind of bare bones electronics design. Like 5% success rate and 95% totally incorrect hallucinations.

The training data just isn't there yet, but I imagine they could use a circuit simulator for the verification involved in training the model, right?
The problem is really that schematics are at the very heart of electronics design (and teaching/instruction), so to train a model you need a very powerful vision model to really unlock all the good training data.

The models can also output code that can be turned into a schematic through an interpreter, but there is virtually zero training data for this because humans always use and work with schematics.

Yeah, even as a beginner I found it doing some dubious things. Still helpful for things like how to correctly use certain components, but the SVG diagrams it provided were hilarious at times.
I ordered some electronic components / sensors from China, and as they took months to arrive, I forgot exactly what I ordered (I'm a noob at this). Simply taking a picture and asking Claude what it was helped a lot. The numbers and letters printed on the components didn't yield relevant results on Google.
Your experience matches mine.
That's a mistake for "comparator", isn't it. You've allowed the AI to train you to use the wrong word through a shared delusion, haven't you.

Edit: if anybody knows otherwise, show me some evidence, don't just downvote. If these things are widely used, why are they impossible to find by searching? Why doesn't this electronics site know about them:

https://www.eeeguide.com/?s=cooperator

Why aren't they in any books when I did a full-text search on archive.org? Why doesn't Wikipedia know about them? Why aren't there threads about them on electronics forums?

I found them (through an image search) in exactly one place: educational training boards made in India by Jainco, like this one:

https://jaincolab.in/delta-modulation-and-demodulation

But this other one talks about a "ramp comparator" and then repeats the phrase but using "ramp cooperator" instead.

https://www.jaincolab.com/firing-circuit-using-ramp-comparat...

So I surmise it's an error and not a real thing.

It's an autocorrect. I did mean comparator. Presumably the second link you pasted ran into the same problem.

> You've allowed the AI to train you to use the wrong word through a shared delusion, haven't you.

What an awful interpretation, phrased in the most demeaning manner possible. No, I double check everything the AI suggests. That's basic competency with the things.

Not that I would have had to, just ran a test:

> Me: I'm trying to use a cooperator in my schematic, how do I hook it up? Also what's a suitable cooperator for 3.3v logic level?

> Gemini: It sounds like you might be thinking of a comparator, not a "cooperator," in your schematic. Comparators are electronic circuits that compare two voltages [...] LM393: This is a low-power, dual comparator that can operate from a 3.3V supply. Its output can be directly connected to most 3.3V logic circuits.

Oh. Then in fact you will find the answer in a search engine, incredibly easily. But I apologise for assuming you were involved in a mechanical folie à deux. (It could happen!)
> You will not find that answer on a search engine, I don't care how good your Google fu is.

The answer staring the OP right in the face.

I've found that Claude has been pretty decent at writing boilerplate code.

For example asking it something like "I have an elixir app that is started with `mix ...` can you give me a Dockerfile to run it in a container?"

It can also do things like "Given this code snippet, can you make it more Pythonic" or even generate simple apps from scratch.

For example, a prompt like "Can you write me a snake game in HTML and JavaScript? The snake should eat hot dog emojis to grow longer." will actually generate something that works. You can see the generated code for that prompt at https://claude.site/artifacts/34540f88-965e-45ca-8083-040e30...

Following up with "Can you make it so that people can swipe on mobile to control the snake?" generates https://claude.site/artifacts/651e957a-9957-488c-ae6b-e81348... which is pretty good IMO for 30 seconds of effort.

It also has a surprisingly competent analysis mode where you can upload a CSV and have it generate charts and analyze the data.

It's not perfect, it'll sometimes get confused or generate some dubious code, but you can quickly get to a 90% good solution with 1% of the effort, which is pretty impressive IMO.

> I ended up disabling GitHub Copilot because it was just "auto-complete on steroids" at best

This is a good enough sell for me, and it's like sub 1-in-50 that it's "auto-complete on mushrooms" (again, my experience, YMMV).

An awful lot of the time, my day-to-day work involves writing one piece of code and then copy-pasting it, changing a few variable names. Even if I factor the code out into a method, I've still got to call that method with the different names. Copilot takes care of that drudgery and saves me countless minutes per day. It therefore pays for itself.

I also use ChatGPT every time I need some BASH script written to automate a boring process. I could spend 20-30 minutes searching for all the commands and arguments I would need, another 10 minutes typing in the script, another 10-20 minutes debugging my inevitable mistakes. Or I make sure to describe my requirements exactly (5-10 minutes), spend 5 minutes reviewing the output, iterate if necessary (usually because I wasn't clear enough in the instructions).

3-5x speed up for free. Who's not going to take that win?

My biology professor provides basically zero feedback on his students' understanding of the material. There are very few practice questions to prepare for the exams, which are worth 40% of your grade. I had an LLM write some python that extracts the relevant textbook chapters, which I can then feed into an LLM to generate practice questions. Then I can ask the LLM for feedback on whether or not I'm articulating the answers correctly.
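For anyone wanting to build something similar, a minimal sketch of that kind of pipeline, assuming pypdf for extraction and the OpenAI Python client for generation; the file name, page range, and model are placeholders, not the commenter's actual script:

    # Extract a chapter from a PDF textbook, then ask an LLM for practice questions.
    from pypdf import PdfReader
    from openai import OpenAI

    reader = PdfReader("textbook.pdf")  # placeholder path
    # pages 120-144 standing in for "the relevant chapter"
    chapter = "\n".join(page.extract_text() for page in reader.pages[120:145])

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    resp = client.chat.completions.create(
        model="gpt-4o",  # placeholder model name
        messages=[{
            "role": "user",
            "content": "Write 10 exam-style practice questions with answers "
                       "based on this chapter:\n\n" + chapter,
        }],
    )
    print(resp.choices[0].message.content)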
I reckon the ideal use case for chat LLMs at the moment is as a bridge for questions that are hard to search but easy to verify.

For example, you have a plant you can't identify. Hard to Google search with words. "Plant with small red berries and...". You could reverse image search your photo of it, probably won't help either. Show an LLM the photo (some accept images now). LLM tells you what it thinks. Now you Google search "Ribes rubrum" to verify it. Much easier.

You've got a complicated medical problem that's been going on for months. A google search of all the factors involved would be excessively long and throw up all sorts of random stuff. You describe the whole scenario to an LLM and it gives you four ideas. You can now search those specific conditions and see how well they actually match.

I've found there are actually a lot of questions that fit in that sort of NP complexity category.

As a side note, there's an app for that! (tm) PlantNet recognizes plants from a provided photo and does a pretty good job of it. It predates the LLM craze by a bit.
The Seek app (by iNaturalist, another excellent app) also can identify plants based on a photo, and without the need for an internet connection, which is a critical feature IMO since you often want it when you’re out walking in the woods or whatever.
I use LLMs significantly on a daily basis, mostly for coding C#, HTML, CSS, and SQL. I use them for researching wiki articles. I use them for summarizing long web pages and science papers. I use them for translation. I used GPT last night to repair my furnace (I've never opened a furnace before).

It (mostly) excels at every task I use it for and exceeds my expectations. I'm rarely disappointed. YMMV.

Absolutely life-changing for me.

I think most people are still "holding them wrong", and it'll take an entire generation of people to really figure out what these things are and are not good for.

I'll give two recent use-cases that may provide a hint of their ultimate utility:

1) I've been modernising 2010-era ASP.NET code written by former VB programmers that looooved to sprinkle try { ... } catch( Exception e ) { throw e; } throughout. I mean thousands upon thousands of instances of these pointless magical incantations that do nothing except screw up stack traces. They probably thought it was the equivalent of "ON ERROR RESUME NEXT", but... no, not really. Anyway, I asked ChatGPT in "Reasoning" mode to write a CLI tool utilising the Roslyn C# compiler SDK to help clean this up. It took about three prompts and less than an hour, and it spat out 300 lines of code that required less than 10 to be modified by me. It deleted something like 10K lines of garbage code from a code base for me. Because I used a proper compiler toolkit, there was no risk of hallucinations, so the change Just Worked. (A sketch of the same parser-based idea in Python follows after the next example.)

2) I was recently troubleshooting some thread pool issues. I suspect that some long-running requests were overlapping in time, but Azure's KQL doesn't directly provide a timeline graphical view. I dumped out the data into JSON, gave ChatGPT a snippet, and told it to make me a visualiser using HTML and JS. I then simply pasted in the full JSON dump (~1 MB) and ta-da instant timeline overlap visualiser! It even supported scrolling and zooming. Neat.
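The first example is Roslyn/C#-specific, but the underlying pattern carries over to any language: use the real parser rather than regexes, so there is nothing for the model to hallucinate about the code's structure. A hedged sketch of the analogous detector in Python, using only the stdlib ast module (it finds handlers of the form `except E as e: raise e`, where a bare `raise` would do):

    # Flag try/except handlers whose entire body just re-raises the caught
    # exception by name - the Python analogue of `catch (Exception e) { throw e; }`.
    import ast
    import sys

    source = open(sys.argv[1]).read()
    tree = ast.parse(source)

    for node in ast.walk(tree):
        if not isinstance(node, ast.Try):
            continue
        for handler in node.handlers:
            if (handler.name
                    and len(handler.body) == 1
                    and isinstance(handler.body[0], ast.Raise)
                    and isinstance(handler.body[0].exc, ast.Name)
                    and handler.body[0].exc.id == handler.name):
                print(f"line {handler.lineno}: pointless re-raise of {handler.name!r}")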

Last night I was about to start working on a lot of text I need to submit my startup to a government funding program. Questions like "describe what your startup does", "describe your market" and things like that. Tens of fields which I estimated would take me a week and a half to do it right, if I wasn't going to pause all my other activities.

Then I had a better idea: I spent 20 minutes babywearing, walking, and dictating everything about my startup to ChatGPT. Later I took all that text, labeled it as a brain dump, and gave it to ChatGPT again along with my product support portal and some screenshots of my marketing material, asking it to answer each of the questions in the form. That's it. I have a pretty good version 1 which I can revise today and be done with it.

Many, many hours saved. I have tens of examples like that.

The product documentation I provided it with was also created with the help of GPT, and that saved me even more time.

Autocomplete on steroids is what I use it for. I've recently started using Cursor, and the productivity improvements have been huge. I won't let it write very large blocks of code, but I do a lot of web stuff, so being able to update the classes in one spot and have it recognise all the other places they appear, letting me just tab through, is very helpful. It's also pretty good at writing code to test things, which saves a lot of typing.
For me, getting summaries of meetings is my favorite use case. Saves me from taking notes and I can extract next steps.

It also helps me get started with new content, kind of building the scaffolding of, say, a blog or social post. It still needs adaptation and fine-tuning, but getting rid of a white page is a great help for me.

And I use LLMs to play through ideas and headlines. I would normally do this with other humans, but since working fully remote, it's a nice sparring partner, although the AI not being able to really give criticism is a bit annoying.

The tools also make it easier to write in English as a non-native, making sure my text does not include any false friends or grammar errors.

Meeting summaries are the most hilarious thing these neural networks have produced. I don't know which NN model Zoom uses, but the text it produces is super funny :) . It basically can't parse half of the words, and then generates random sentences using the remaining ones.
Agreed, it was super funny, especially if a song played in the beginning or if you switched languages. It has gotten way better, at least in my experience.
> although the AI not being able to really give criticism is a bit annoying.

I’ve managed to get ChatGPT to a good place in this regard recently and it’s better for it. In fact, it’s been doing such a good job recently that it almost seems like…human like.

I’ll have to look at my prompts, but somehow I got it from licking my ass and telling me everything I say and do is amazing to a point now where it almost seems eager to tell me I’m wrong.

Which is fantastic, huge improvement. I don’t really use it for coding though, because I am not a programmer. I would have no means today to correctly evaluate 90% of what it would return me.

I use them as an alternative to search engines for topics where I have some specific question where traditional search engines fail to find the needle in the haystack.

As a concrete example, I was recently playing with simulating the wave equation, and I wanted to try to use a higher-order approximation as I had never done that before. I'm quite rusty as I haven't done any numerical work since university some decades ago.

I still recalled how to deal with the Neumann boundary conditions when using the traditional lower-order approximation, but I was uncertain how to do it while keeping the higher-order approximation.

Searching for "higher-order neumann boundary conditions wave equation" or similar got me pages upon pages of irrelevant hits, most of them dealing with the traditional approximation scheme.

So I turned to ChatGPT which right away provided a decent answer[1], and along with a follow-up question gave me what I needed to implement it successfully.

[1]: https://chatgpt.com/share/67b4ab43-6128-8013-8e5a-3d13a74bf6...
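For anyone hitting the same wall, one standard approach (not necessarily the one in the linked chat) is ghost points by even reflection. In LaTeX notation, for the fourth-order central stencil and a homogeneous Neumann condition:

    % fourth-order central difference for the second derivative
    u''_i \approx \frac{-u_{i-2} + 16u_{i-1} - 30u_i + 16u_{i+1} - u_{i+2}}{12h^2}
    % homogeneous Neumann u_x(0) = 0 at i = 0: mirror the grid,
    % u_{-1} = u_1 and u_{-2} = u_2, which collapses the boundary stencil to
    u''_0 \approx \frac{-2u_2 + 32u_1 - 30u_0}{12h^2}

Whether that keeps full fourth-order accuracy at the boundary depends on the problem, so treat it as a starting point rather than the final word.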

This video's pretty great: https://www.youtube.com/watch?v=uRuLgar5XZw

One thing I can't figure out how to get LLMs to do is truly finish work. For example, if I have 100 items that need xyz done to them, it will do it for the first 10 or so and say ~"and so on". I have a lot of trouble getting LLMs to do tasks that might take 10 mins - 1h. They always seem to simply want to give an example. Batch processing is the answer, I guess, or perhaps more 'agentic' models/tools - but I wonder if there are other ways.
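In case it helps anyone fighting the same "and so on" problem, a minimal sketch of the batching workaround using the OpenAI Python client; the items, chunk size, prompt, and model name are all placeholder assumptions:

    # Loop over small chunks so the model never gets a chance to say "and so on".
    from openai import OpenAI

    client = OpenAI()
    items = [f"item-{i}" for i in range(100)]  # stand-in for the real 100 items

    results = []
    for start in range(0, len(items), 10):  # 10 items per call
        chunk = items[start:start + 10]
        resp = client.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user",
                       "content": "Apply transformation xyz to EVERY line, "
                                  "one output line per input line:\n"
                                  + "\n".join(chunk)}],
        )
        results.append(resp.choices[0].message.content)

    print("\n".join(results))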

Thanks for the link.

(Other answers are people gathering examples, which is nice, but I'm looking for more structured things.

And, I suppose I could ask an LLM, but my main problem is that... I don't really _trust_ LLMs yet :D )

Finally, after two years of hype, I have a use for LLMs.

We import descriptions of products from a seller. The problem is they are mental (probably written by ChatGPT :)) and way too long. We need only a small blurb.

I give our style guide and the original text to ChatGPT and get a somewhat reasonable description back. The editors still need to check it, but it's way less work.

I've seen a comic at the beginning of the LLM hype where:

* in panel A, some guy is proud to use ChatGPT to turn 3 lines of text into a 10 pages report

* in panel B, some girl is happy to use ChatGPT to summarize the 10 pages report into 3 lines

It was meant to be _satire_, not the sales pitch ;)

Reminds me of a great joke tweet:

> ZIZEK: that AI will be the death of learning & so on; to this, I say NO! My student brings me their essay, which has been written by AI, & I plug it into my grading AI, & we are free! While the 'learning' happens, our superego satisfied, we are free now to learn whatever we want

https://x.com/luminancebloom/status/1600598003391266816

I find them useful for searching for some function or API name with natural language. 'What's the function call [in blah] that generates a quaternion from a couple of vectors?' type stuff. Not exactly inspiring but I've found it highly useful. If you try to search for something like that online (and somebody hasn't asked the exact question on e.g. stack overflow) you'll just end up getting all the documentation for quaternions, vectors, and blah - when the function itself might even be in a tertiary math library.
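For what it's worth, here is the kind of answer you'd hope to get back, sketched with SciPy on the assumption that it's the "[blah]" in question; align_vectors estimates the rotation that maps one set of vectors onto another:

    # Quaternion for the rotation taking vector a onto vector b, via SciPy.
    import numpy as np
    from scipy.spatial.transform import Rotation

    a = np.array([[0.0, 0.0, 1.0]])  # source vector
    b = np.array([[1.0, 0.0, 0.0]])  # target vector

    rot, rssd = Rotation.align_vectors(b, a)  # rotation mapping a onto b
    print(rot.as_quat())                      # quaternion as [x, y, z, w]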
These probably aren't tasks you need done but;

LLMs are pretty good at translation between human languages, which makes sense since they are language models after all. They are better at this than any other technology.

The state of the art image ones can also probably do OCR and handwriting recognition better than any other software, though they may be expensive to run at large volume. But if you need to take a picture of a notebook page with your camera phone, an LLM can quickly OCR it.
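A minimal sketch of that camera-phone OCR trick, assuming the OpenAI Python client and a vision-capable model; the file name and model are placeholders:

    # Send a photo to a vision-capable model and ask for a transcription.
    import base64
    from openai import OpenAI

    client = OpenAI()
    with open("notebook-page.jpg", "rb") as f:
        b64 = base64.b64encode(f.read()).decode()

    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": "Transcribe all text in this photo."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
            ],
        }],
    )
    print(resp.choices[0].message.content)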

Not quite true; LLMs are very expensive to run. BERT or another transformer built specifically for translation can be cheaper to run.
Big use cases for me are:

1. Exploring a new domain and getting some terms I can google for.

2. Making small scripts to do things like query GitHub's GraphQL API (see the sketch after this list).

3. Autocomplete of code using copilot.
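A minimal sketch of use case 2 above, querying GitHub's GraphQL API; the endpoint and auth header follow GitHub's documentation, while the query itself and the token env var are placeholders:

    # Small script querying GitHub's GraphQL API with a personal access token.
    import os
    import requests

    query = """
    {
      viewer { login }
      repository(owner: "torvalds", name: "linux") {
        stargazerCount
      }
    }
    """

    resp = requests.post(
        "https://api.github.com/graphql",
        json={"query": query},
        headers={"Authorization": f"bearer {os.environ['GITHUB_TOKEN']}"},
        timeout=30,
    )
    resp.raise_for_status()
    print(resp.json()["data"])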

I use it daily for all sorts of things, but one of the most interesting uses for me so far has been self-reflection.

For example, in the beginning of this year, I completed this exercise where I wrote a lot about childhood, past experiences, strengths and weaknesses, goals and ambitions for the future, etc (https://selfauthoring.com) and then I uploaded all that to ChatGPT, asked it to be my therapist/coach, and then asked it to produce reports about myself, action plans, strategies, etc. Super interesting and useful.

By now ChatGPT has quite a bit of context from past conversations. Just the other day I used this prompt from someone else and got back super useful insights – "Can you share some extremely deep and profound insights about my psyche and mind that I would not otherwise be able to identify or see as well as some that I may not want to hear"

I use mine as if it's an infinitely patient, relatively competent junior-to-mid-level developer that I constantly give small chunks of programming to do (typically a function at a time), and occasionally consult on architecture, design, and other things.

I don't use integrated coding tools, so my workflow isn't super fast, but that's not what I'm really aiming for - more that I want to save my brain's energy from low level drudgy boilerplate or integration code, so I can focus it on the more important decisions and keep business-side context in my head.

It's been a huge help for me this way across multiple projects in multiple domains.

I don't use it daily, and I find copilot counterproductive (for me). I do try to experiment with chatgpt when I remember to.

I find it good for complex SQL, reviewing emails, and Godot assistance (I'm a beginner game Dev).

There are also times when I have programming questions and I might try to use chatgpt, with mixed results.

Our company has tried to integrate it into one of our products, and I find it troubling how on occasion it confidently gives bad results, but my concern seems to be in the minority.

EDIT: there was also a large refactor I did recently which involved lots of repeatable, but not super regexable, changes - chatgpt forgot where it was as I went through it, but other than working around that it was very useful.

In the last few days I discovered it's good at medium-complexity SQL, not at really complex queries. I've been struggling for about four days now with Claude, ChatGPT, Gemini and Deepseek. All could do some good analysis with some low-hanging-fruit improvements, but all went completely crazy when trying to optimize more complex things: getting into loops proposing the same changes over and over, outputting invalid SQL, and Gemini even forgot what we were doing, asking me to paste again the query I included in the very first message. Maybe the chain-of-thought models would handle this better, but I believe I hit the limit of the standard ones.
Probably my complex SQL is your medium-complexity SQL. SQL is not a big part of my current project.
I've coded a full custom deployment system (config, create, update, cert management and much more) entirely in bash using nothing but ChatGPT. I didn't write a single line of bash.

I did write 50 or more lines of instructions on what needs to be done and in what order.

ChatGPT gave me 5-6 bash scripts (I asked for that split) totalling 300+ lines that seamlessly work together.

After reviewing, I asked it to change a few places.

If any human tried the same (except those rare bash Gods), it'd take many hours. I think it took me less than 30 minutes.

I use ChatGPT all the time for:

1. Small coding tasks ("I want to do XYZ in Rust"); it has replaced Stack Overflow. Very convenient when writing code in a language I'm not super familiar with.

2. Help with English (translation, proofreading...).

3. Learning something, like tech. I like interacting with it by asking questions; it's more engaging than just reading content.

I'd say nothing is game changing, but it's a nice productivity boost.

The Vergecast recently did a section where they asked listeners what they use LLMs for (specifically not for coding): https://youtu.be/WwNjBNtZ3Co, about 30 minutes starting at 45:25. It had a number of interesting examples. It might not convince you of LLMs' excellence, and it might not be much different from what other people have commented, but it's a good listen nonetheless.
The sister comments contain quite a few specific examples. But the many back-and-forth arguments here on HN about whether LLMs are useful for coding suggest that understanding how they might or might not be used may be the biggest challenge at this point.

I myself use them a lot, though I constantly feel that I would be able to get more out of them if only I were smarter.

I feel that I would be smarter if I didn't use them constantly.
You could quit using high level language compilers also, jump back to pure assembly, and get smarter still!
> All my attempts so far have been pretty underwhelming:

Same. It's good for repetitive things, things that have been answered 1000 times on Stack Overflow, and translations, but that's about it. If you work on anything remotely new or hard it's mostly disappointing: you have to babysit it every step of the way and rewrite most of what it's shitting out in the end anyway.

I think it just made it obvious that 90% of tech jobs basically amount to writing the same CRUD thing over and over again & mobile/web apps with very common designs and features.

Cooking & Meal Planning:

- I have these three ingredients; recommend Italian main courses.

- What other ingredients pair well with this?

- How can I "level up" this dish if I want to impress?

- Can I substitute X for Y?

- Generate a family-friendly meal with lots of veggies using leftover roast chicken.

I admit to having been an LLM sceptic from day one, but I have been using ChatGPT and Claude a fair bit to try and figure out what the hype is all about. I haven't really succeeded.

Most recently I tried to use them both to solve a programming problem that isn't well documented in the usual channels (Reddit, StackOverflow, etc) and found it to be quite a disappointing and frustrating experience. It just constantly, enthusiastically fed me total bullshit, with functions that don't exist or don't do what the LLM seems to "think" they do. I'm sure I'm just "holding it wrong" but my impression at this stage is that it is only capable of solving problems that are trivially solvable using a traditional search engine, with the added friction that if the problem isn't trivially solvable, it won't actually tell you that but will waste your time with non-obvious wrong answers.

I did have a slightly more positive experience when asking it about various chess engine optimisation algorithms. I wasn't trying to use the code it generated, just to better understand what the popular algorithms are and how they work. So I think they might work best when there is an abundance of helpful information out there and you just don't want to read through it all. Even then, I obviously don't know what ChatGPT was leaving out in the summary it provided.

Try giving it a task you'd expect a junior developer to successfully finish.
I just used ChatGPT to summarise an HN post about it taking unexpectedly long to install a washing machine because of unexpected turns of events, and this being analogous to software development. It was a time-saver.
I think it excels when you know enough to precisely describe what you want but you don’t know enough about the details of the language or framework you’re using to implement what you want.
Here are some of my experiences:

* Figuring out where to start when learning new things (see also <https://news.ycombinator.com/item?id=43087685>)

  One way I treat LLMs is as a "semantic search engine". I find that LLMs get
  too many things wrong when I'm being specific, but they're pretty good at
  pointing me in a general direction.

  For example, I started learning about OS development and wanted to use Rust. I
  used ChatGPT to generate a basic Rust UEFI project with some simple
  bootloading code. It was broken, but it now gave me a foothold and I was able
  to use other resources (e.g. OSDev wiki) to learn how to fix the broken bits.
* Avoiding reading the entire manual

  It feels like a lot of software documentation isn't actually written for real
  readers; instead it's a somewhat arbitrary listing of a program's features.
  When programs have this style of documentation, the worst case for figuring
  out how to do a simple thing is reading the entire manual. (There are better
  ways to write documentation, see e.g. <https://diataxis.fr/>)

  One example is [gnuplot](http://www.gnuplot.info/). I wanted to learn how to
  plot from the command line. I could have pieced together how to do it by
  zipping around the
  [gnuplot manual](http://www.gnuplot.info/docs_5.4/Gnuplot_5_4.pdf) and building
  something up piecewise, but it was faster to instruct Claude directly. Once
  Claude showed me how to do a particular thing (e.g. draw a scatter plot with
  dots instead of crosses), I then used the manual to find other similar
  options. (A sketch of that kind of one-liner follows this list.)
* Learning a large codebase / API

  Similar to the previous point. If I ask Claude to write a simple program using
  a complex publicly-available API, it will probably write a broken program, but
  it won't be *completely* bogus because it will be in the right "genre". It
  will probably use some real modules, datatypes and functions in a realistic
  way. These are often good leads for which code/documentation I should read.

  I used this approach to write some programs that use the
  [GHC API](https://hackage.haskell.org/package/ghc). There are hundreds of
  modules, and when I asked Claude how to do something with the GHC API it wrote
  relevant (if incorrect) code, which helped me teach myself.
* Cross-language poetry translation

  My partner is Chinese and sometimes we talk about Chinese poetry. I'm not very
  fluent in Chinese, so it's hard for me to grasp the beauty in these poems.
  Unfortunately literal English translations aren't very good. We've had some
  success with asking LLMs to translate Chinese poems in the style of various
  famous English poets. The translation is generally semantically correct, while
  having a more pleasing use of the English language than a direct translation.
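To illustrate the gnuplot bullet above: a hedged sketch, not Claude's verbatim output, of the kind of one-liner it produced for the dots-instead-of-crosses question, driven from Python here; the file names are placeholders, and pointtype 7 draws filled circles in most terminals:

  # a sketch: render data.dat as a PNG scatter plot with filled
  # circles ("dots") instead of the default crosses
  import subprocess

  script = (
      'set terminal png; set output "out.png"; '
      'plot "data.dat" using 1:2 with points pointtype 7'
  )
  subprocess.run(["gnuplot", "-e", script], check=True)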
I like having fun with them, like asking Grok whether some Elon Musk tweet is true. Usually it replies with a lengthy answer and I then force it to answer with Yes or No. It's even more fun when drilling it further to load a few more details into its brain and then asking the first question again, Yes/No only; it sometimes changes its answer. I do wonder, has Grok already joined the resistance against Musk?
Start from here:

Stop using Google search and use an AI. No more irrelevant results, no more ads. No more slop to wade through.

BTW I find Claude is great at making graphs and diagrams. If you pay ($20) you can hook it up to a local code base.

> No more slop to wade through.

Huh? More like "slop exclusively generated for you", right? I have seen so many garbage answers from chat AIs.

Any example code or blogs/docs that demonstrate making graphs/diagrams and/or hooking it up to a local code base?
Honestly, the main thing I've found ChatGPT to be useful for in my daily life is helping to translate what I write from my native language to the language spoken by most of the people where I live. But even then it only really works if you have at least a basic understanding of the language and can ask it to rewrite sections when you recognize poor word choices or awkward phrasing.
It helps to split what you are translating into 1-2 paragraph chunks and feed it one by one.
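A minimal sketch of the splitting step (pure Python; the two-paragraph limit is just a judgment call):

  def chunks(text, max_paras=2):
      """Split text into chunks of at most max_paras paragraphs."""
      paras = [p for p in text.split("\n\n") if p.strip()]
      return ["\n\n".join(paras[i:i + max_paras])
              for i in range(0, len(paras), max_paras)]

Feed each chunk to the model as its own message and stitch the translations back together at the end.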
LLMs are good at exactly one thing, and it happens to be the thing they were designed to be: a word-probability generator. If you can constrain your usage around that, they are great to use. But the people who think they can reason or know some kind of truth are delusional.
Explain how o3 won a gold medal at this year's International Olympiad in Informatics, or provide your benchmark for reasoning.
It's very obvious from the mistakes they make that they are not reasoning but producing the most probable answer according to their dataset. It's very impressive because their dataset is as big as humanity's written output, but there is no reasoning.
I use it for lots of stuff where I'm not an expert, or that are low stakes. I don't use it for the "core" of my job, but there are many things that are not "core" and still eat up a lot of time, in fact, most of my workday would be in this category. Some typical examples from my daily life as a university professor:

- Writing Python scripts to make charts out of Excel sheets, and then refining them. I could do it myself, but I would need to learn a library like Seaborn or similar, which honestly is not especially intellectually stimulating, and then spend nontrivial amounts of time iterating on the actual code. With LLMs it's a breeze. (A minimal sketch of this kind of script follows this list.)

- Working with cumbersome LaTeX formatting, e.g. transposing a table, removing a column from a table, etc.

- Getting the tone just right in a professional email written in English to someone I don't know much (I'm not a native speaker so this is not trivial).

- Finding resources on topics that are tangential to what I do. For example, yesterday I needed to come up with some statistics on English words for a presentation I'm preparing, and I needed a free corpus where I could search for an n-gram and get frequencies of next words. I don't usually work with that kind of resource, it was just a one-off need. I asked for corpora of that kind and got a useful answer instantly. The manual process would probably have implied going through several options only to find that I needed a license or that they didn't provide the specific statistics I needed.

- Brainstorming on titles for scientific papers, presentations, names of concepts that you introduce on a paper, variable names, etc.

- Shortening a sentence in a paper that makes me go over the page limit, or polishing the English in a paragraph.

- Summarizing a text if I'm kind of interested in knowing the gist but have no time to read it whole.

- Answering quick questions on basic things that I forget, e.g. the parameters to turn a Linux folder into a tar.gz. The man page is too verbose and it takes time to sort the wheat from the chaff; Google is full of SEOd garbage these days and sometimes you need to skim a lot to find the actual answer; LLMs are much faster.

- Writing bureaucratic boilerplate, the typical texts with no real value but that you have to write (e.g. gender perspective statement on a grant request).

- Coming up with exam questions. This is a rather repetitive activity and they're fantastic at it. At my place we also have two official languages and we need to have exam assignments on both languages, guess who does the translation now (respecting LaTeX formatting, which previous machine translation tools typically wouldn't do).

- As an example of a one-off thing, the other day I had to edit a Word document which was password-protected. I asked ChatGPT how to unlock it and it not only answered, but actually did it for me (after 3 tries, but still, much faster than the time it would have taken for me to find out how to do it and then actually do it).
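(The sketch promised in the first bullet, a minimal example of the Excel-to-chart scripts I mean; the file name and column names are made up, and in practice I let the LLM iterate on the styling:)

  # a minimal sketch; "grades.xlsx", "group" and "score" are hypothetical
  import pandas as pd
  import seaborn as sns
  import matplotlib.pyplot as plt

  df = pd.read_excel("grades.xlsx")
  sns.barplot(data=df, x="group", y="score")
  plt.tight_layout()
  plt.savefig("chart.png")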

These are just some examples where they contribute (greatly) to my productivity at work. In daily life, I also ask them lots of questions.

Companies have hijacked the open source concept to mean "downloadable blob", and, as I can see in the comments, we follow them. It's a real shame.
I remember the NVIDIA Linux kernel binary blob driver discussions from the early-mid 2000s. Who knew we had an open source driver all along...
Something isn’t open-source because you get everything that went into making it. Something is open-source if you can change it (relatively) easily. The GPL and open-source definition both refer to “the preferred form for making modifications”. The preferred form for modification in the Nvidia driver’s case is the source code. The preferred form for modification in this case is the weights.

Open-source as a concept doesn’t really correspond well with LLMs but to the extent that it does, access to the training data is not required because that training data is not the preferred form for making modifications.

> that training data is not the preferred form for making modifications.

I definitely disagree with this.

Yes, you can do some SFT fine tuning on an existing model, but if you want to make specific, substantial, targeted changes (less safety? better performance on math and code at the expense of general knowledge?), your best bet is to change the training mixture, and for that you need the original datasets.

Preferred by whom? Sharing models isn't open source, and we're just going to have to keep having this argument. Letting us download the model is a very nice thing for Facebook to do, but you don't get to call it open source if you're not showing us the source! Explicitly: if we can't see the forced alignment, where the model gets its refusal to talk about Tiananmen Square, or how to make meth, or whether The Information is a reputable news source, then it's not open. The preferred form of modification is to take the data and train it. That some people have been able to take the model and tweak it doesn't make that preferable.
Why was the free software movement a thing when Windows was open source all along, haha.
In that case Linus needs to make a retraction.
NVIDIA ... THANK you!

linus turns to the camera, giving a thumbs up

Someone should make this video with AI.
It started with abusing the term "AI"; I don't see a reason why they would not abuse "open source" too. I guess it's what happens to language when a concept becomes mainstream: people use it wrong, but if enough people do, it becomes the new meaning?

But I agree, it's a real shame.

I dislike when people like RMS argue semantics and gatekeep words like "free software", but this is the end product of a world without gatekeepers. People just use words in a way that's convenient to them.
Or maybe some gatekeeping is actually good, and we just have to use more of our braincells and figure if a particular gatekeeper is good or bad. It's a good thing that being a pilot for an airline is gatekept to qualified pilots that know how to fly a plane. It's a bad thing that I need a hair cutting license to buy hair dye to dye my own hair at home.
You have to be careful with that; start giving out hair cutting licenses with reckless abandon and a ton of innocent people could dye.
> People just use words in a way that's convenient to them.

Literally how language has always worked and evolved, though.

We often see semantic drift over a long period of time. It's just that the overarching topic of AI is playing out significantly faster than what we would normally observe in other fields.
Language has always been a push and pull between evolving (descriptivist) and correct (prescriptivist) usage. Neither side is going anywhere.
> correct (prescriptivist) usage

Oof, I know there's a bunch of linguists and grammarians who are going to mock you for that bracket.

Why? Prescriptivist/prescriptivism is afaik the usual term. Proscriptivist is the other, but quite rare, so rare that my phone dictionary says it’s a misspelling.
> Prescriptivist/prescriptivism is afaik the usual term.

It is, but it was the "correct" part attached to prescriptivism they'd be mocking, because that is not how linguists and grammarians work (they are descriptivists, and fond of making fun of prescriptivists).

Oh yeah lol I should have put it in quotes, but by the time I thought about it I was past the edit window! C'est la vie, I'll take the mockery.
This is correct. "Open source" means everything required to recreate from scratch and improve, not "here's a massive binary, an interpreter script and permission."
How can you even "open source" an AI model without all of the, presumably copyrighted and extremely voluminous, training data?
That could probably be solved with bit-torrent. I think the bigger obstacle is the hardware required for training. Maybe it would be possible for groups of people to reproduce/train open source models with a distributed BOINC-like system?
You would open source the procedure and reference where the data came from. If there is any non-open source content used in training, then the project couldn’t qualify as “open source”.

But this thread is about misuse of the term as applied to the weights package. Those of us who know what open source means should not continue to dilute the term by calling these LLMs by that term.

You don't need the data itself, but you do need at least a reference to what was used: basically, the entire blueprint to recreate it.

It's just like true open source software: you still need to bring your own hardware to run it on.

You can't. But that's not an excuse to misuse the label.
That's how you know you actually have AGI: when you have something where you don't have to shovel in every written word known to man to make it work, but rather can seed it with a few dense public-domain knowledge compendia and have it derive everything else for itself from those first principles, possibly going through several stages of from-scratch training and regeneration.
The reason why you need to shovel every written word known to man to make it work is because it needs to learn what words mean before it can do anything useful with them, and we don't currently know any better way of making a tabula rasa (like a blank NN) do that. Our own brains are hardwired for language acquisition by evolution, so we can few-shot it when learning and get there much faster; and if we understood how it works, we could start with something similarly hardwired and do exactly what you said.

But we don't actually know all that much about how language really works, for all the resources we spend on linguistics - as the old IBM joke about AI goes, "quality of the product increases every time we fire a linguist" (which is to say, we consistently get better results by throwing "every written word known to man" at a blank model than we do by trying to construct things from our understanding).

All that said, just because we're taking a different, and quite possibly slower / less compute-efficient route, doesn't mean that we can't get to AGI in this way.

> Our own brains are hardwired for language acquisition by evolution, so we can few-shot it when learning and get there much faster

No, we can't few-shot it and we don't get there faster (but we develop a lot of other capabilities on the way). We train on a lot more data; the human brain, unlike an LLM, is training on all that data in the same processes it uses for "inference", and it receives sensory data estimated on the order of a billion bits per second, which means that by the time we start using language we've trained on a lot of data (the 15 trillion tokens from a ~17-bit token vocabulary that Llama 3 was trained on amount to something like a few days of human sense data). Humans are just trained on, and process, vastly richer multimodal data instead of text streams.

I was talking about language acquisition specifically. Most of the data that you reference is visual input and other body sensations that aren't directly related to that. OTOH humans don't take all that much text to learn to read and write.
> I was talking about language acquisition specifically.

Yeah, humans don't acquire language separately from other experience.

> Most of the data that you reference is visual input and other body sensations that aren't directly related to that.

Visual input and other body sensations are not unrelated to language acquisition.

> OTOH humans don't take all that much text to learn to read and write.

That generally occurs well after they have acquired language and learned to recognize and use symbolic visual communication, and they usually have considerable other input when learning how to read and write besides the text they are presented with (e.g., someone else reading words out loud to them).

Feeling my inner Klingon: "Where is the honor in releasing a binary blob and calling it... open source? Pfah!"
Linux doesn't ship a compiler or CPU when you download it. It's not open source I guess.
I'm guilty of this. For the sake of peace, I didn't publicly shame a coworker who installed FB's model and said "it's open source".
Most of these claimed “open” models are not open source. Some of them are open weights. But even some of the ones that share weights are not really open - they force a restricted license on you. To be open source I think they need to share training data and training code under an OSI approved license.

AI2 has a model called OLMo that is actually open source. They share the training data, training source code, and many other things:

https://allenai.org/blog/olmo2

They also released an app recently, to do local inference on your phone with a small truly open source model:

https://allenai.org/blog/olmoe-app

While I do agree with your point - I wonder what information companies could release that'd be immediately useful to you.

It's not like they understand what the weights mean either, and if they released the code and dataset used to create it, you probably couldn't recreate it, owing to the fact that you don't own tens of thousands of GPUs.

If a software's source is released without all the documentation, commit history, bug tracker data etc., it's still considered open source, yet you couldn't recreate it without that information.

Thank you for pointing this out, I was not thinking clearly about this
Better a downloadable blob than a non-downloadable one
No, it's not, as it means nobody is pushing for actually open models.

A truly open model has open code that gathers pre-training data, open pre-training data, open RLHF data, open RLAIF data generated from its open constitution and so on.

The binary blob is the last thing I'd want - as a heavy user of LLMs I'm actually more interested in the detail of what all training data is in full, than I am the binary blob.

IBM Granite is actually open https://www.ibm.com/granite
Here's a real open one: https://allenai.org/olmo
At least for the pre-training data there are some open source torrent clients [0].

[0] https://news.ycombinator.com/item?id=42971446

Parent doesn't argue about that. How is this relevant?
This is the problem: we accept this approach, and then they don't have to make any effort to publish these models properly in the open.
Cool, that’s not open source though.

That’s like a chef giving you chicken instead of beef and calling it vegetarian.

I'd say it's more like eating Chicken Cordon Bleu and then asking the chef for a recipe, who replies: "Certainly! Step 1) Acquire Chicken Cordon Bleu, preferably cooked. Step 2) If uncooked, cook. Otherwise, consume."
it's open weights
Weights-available. You wouldn't say open-binary.
It is? Do you have a link?
it's closed source and open outcome
So is asking ChatGPT to write your application and then open-sourcing said application, IMO.

I see both sides here, but I don't think it's a hill worth dying on. The 'open source' part in this case is just not currently easily modifiable. That may not always be the case.

This is still to be determined, based on whether the output of ChatGPT is copyrightable by ChatGPT, copyrightable by the requester, or something else.

I think the two plausible answers are:

1. The person prompting (for example telling chatgpt 'please produce a fizzbuzz program') owns the copyright. The creativity lies in the prompt, and the chatgpt transformation is not transformative or meaningful.

2. The output of ChatGPT is derivative of the training data, and so the copyright is owned by all of the copyright holders of the input training data, i.e. everyone, and it's a glowing radioactive bomb of code in terms of copyright that cannot be used or licensed meaningfully in open source terms.

There are existing things like 1, where for example if someone takes a picture and then uses Photoshop to edit it, possibly with the "AI erase" tool thingy, they still own the photo's copyright. Photoshop transformed their prompt (a photo), but Adobe doesn't get any copyright, nor do any of the test files Adobe used to create their AI tool.

I don't think AI is like that, but it hasn't gone to court as far as I know, so no one really knows.

An llm isn't software any more than a matrix is.

What do you think an open source matrix should look like?

A compiled executable is no less software than the source code. But the point of open source is not the ability to see the CPU instructions, is it?

It's about reproducibility and modifiability. Compiled executables (and their licences) lack that. The same goes for these downloadable blobs.

You make the start of a good point, but miss most of it.

You can absolutely have open source machine code.

The issue is and always has been that you need to have access to the same level of abstraction as the people writing the source code. The GPL specifically bans transpilers as a way to get around this.

In ML there is _no_ level of abstraction other than the raw weights. Everything else is support machinery, no different from a compiler, an OS, or a physical computer to run the code on.

Linux isn't closed source because they don't ship a C compiler with their code. Why should llama models be any different?

where did those weights come from?
An algorithm with no idea of what abstraction even means.
Is this question in good faith? The way generated code and data should be open-sourced is by releasing the tools and configuration used to generate them. There's never been much confusion around this, to my knowledge.

I'm not even necessarily advocating that these things should be released, but the term "open source" has a pretty well-understood meaning that is being equivocated here.

Credit to the engineers that built this, but it fills me with rage that Elon has this sort of unchecked power.

How long before this starts getting deployed in safety critical applications or government decision making processes?

With no oversight because Elon seems to have the power to dismiss the people responsible for investigating him.

Anyone not scared by this concentration of power needs to pick up a book.

What's remarkable to me about criticism like this is how quickly it's rebutted by people claiming "where did they say they would do this," as if these people don't make incredibly rushed and poorly planned decisions all the time. It's like an idea immune system that rejects any criticism or self reflection. It would be sociologically fascinating if it wasn't being combined with a dereliction of congressional power and an unchecked executive.
I have literally been posting the same thing for years about the need for open source alternatives to social platforms where one person controls the algorithm that prioritizes what a billion people see. And the response is "meh". No one even bothers to read past the first paragraph:

https://news.ycombinator.com/item?id=43036350

But if you really want to see the “immune system” shine, mention web3 and smart contracts, and watch the downvotes pour in. Any time one even mentions “decentralized byzantine fault tolerant” anything, an army rises up to repeat anodyne versions of “grift… no one needs it… banks are great…” etc.

https://news.ycombinator.com/item?id=43073421

But if you mention any concerns with AI, no matter who or what you cite, the same group goes the other way and always repeats “(insert problem here) has always been possible, there is nothing to see here, move on, AI is amazing, deregulate and let the industry develop faster”:

https://news.ycombinator.com/item?id=40900155

It’s groupthink at its most obvious, repeatable, always on, and I wonder how much of it is organic.

Having been on the internet for a very long time, I can answer why open source alternatives to social platforms seldom get off the ground: the network effect is huge, and the community of users matters far more than any of the technology.

Don't bother telling people how it works. Show them who's using it and for what.

Oh, and for any kind of "normie" use it must have a decent moderation and anti-abuse system. Which inevitably clashes hard with "decentralized". Bluesky is succeeding because it lives in a contradiction of pretending to be decentralized, but what it really offers is the "pre Elon Twitter" experience. To basically the same people.

> the same group

While there's a certain amount of hivemind, it's rare that you see people directly contradict their own posts here; what you're seeing is different people.

Or, people have different opinions about who should have power over social media, banking, and AI, for completely rational reasons…
This is the problem we're working on with https://freenet.org/ - a general purpose platform for building entirely decentralized services.

Our thesis is that the client-server architecture is a fundamental flaw in the world wide web's design, which inherently concentrates power in the hands of a few. Freenet aims to be a general purpose replacement for this in which all services are entirely decentralized.

The first non-trivial app we're building will be a group chat system called River[1].

[1] https://github.com/freenet/river

I like the new Freenet! I interviewed your founder, Ian Clarke, 2 years ago on my channel, discussing the original Freenet, probably the first truly decentralized content network in the world. Here is the 2-hour discussion:

https://www.youtube.com/watch?v=JWrRqUkJpMQ&t=12m0s

Look around the 12 minute mark, where I start to discuss how "the capitalist system" produces centralized monopolies that extract rents for their shareholders.

Freenet is 25 years old. It never took off; what makes people think it will take off now?
The original Freenet had over 6 million downloads over the years and pioneered ideas like cryptographic contracts, which later formed the basis for Bitcoin, but it was always a very experimental project, while the new Freenet is designed for mass adoption.

The key differences between old and new Freenet are:

Functionality: The previous version was analogous to a decentralized hard drive, while the current version is analogous to a full decentralized computer.

Real-time Interaction: The current version allows users to subscribe to data and be notified immediately if it changes. This is essential for systems like instant messaging or group chat.

Programming Language: Unlike the previous version, which was developed in Java, the current Freenet is implemented in Rust. This allows for better efficiency and integration into a wide variety of platforms (Windows, macOS, Android, etc.).

Transparency: The current version is a drop-in replacement for the world wide web and is just as easy to use.

Anonymity: While the previous version was designed with a focus on anonymity, the current version does not offer built-in anonymity but allows for a choice of anonymizing systems to be layered on top.

Can you drop me an email? I'd like to have a conversation about our respective roadmaps and how we can help each other:

http://qbix.com/about

Will do.
> But if you really want to see the “immune system” shine, mention web3 and smart contracts, and watch the downvotes pour in

I'm all for distributed / P2P social media, but crypto is full of some of the most scammy and downright shameful behavior I've ever seen in my life. Pump and dumps, rug pulls, money laundering. There is a real reason people hate crypto.

To top it off, crypto is one of the least meritocratic things there is. The longer you've been in it, the more people you've scammed, the more you hype, the "wealthier" you are.

Crypto smells like a shit and vomit sandwich, and people immediately turn up their noses.

Build P2P social without the crypto angle and you have my attention. I've been wanting P2P (not federated) social media since the 2000s and the decline of the indie web. Social and news should work like email and BitTorrent, not Facebook or "federated Twitter".

> I'm all for distributed / P2P social media, but crypto is full of some of the most scammy and downright shameful behavior I've ever seen in my life. Pump and dumps, rug pulls, money laundering.

The SEC's answer-no-questions, sue-first approach to crypto in general made legitimate players afraid to operate, so the space became dominated by those that didn't care about the law.

> The SEC's answer-no-questions, sue-first approach

This isn't true, and the last time someone tried to prove it was, they cited... a huge PDF of all the questions the SEC had been asking crypto firms prior to action.

Besides, the rules are over now. The US President ran a pump and dump. Can't get more legitimacy than that.

Many on HN don’t believe there ARE any legitimate uses of crypto in the first place.

Here are some:

https://intercoin.org/applications

But most comments I get are “I stopped reading 2 seconds in when I saw the word Web5.”

(We started using it after Jack, the Twitter founder who also started Bluesky and promoted Nostr, started using it.)

Here is a graphical presentation that can drive it home:

https://intercoin.app/investors.pdf

I looked at your links and I still don't get it. I do want to understand. Where is the problem stated, clearly and concisely? What is the solution and why does it require crypto?

I say that as someone who read the Bitcoin paper in 2012 and was pretty excited back then.

Meanwhile online scams are a bigger industry than the illegal drug trade and bigger than the Australian economy. There are thousands of modern day slaves in call centers in Myanmar and the Philippines with fake social media profiles running pig butchering scams. That industry runs on crypto 100%. I guess that's one "problem" crypto solved.

You need some pretty convincing arguments at this point to convince me (and many others) that getting rid of this stuff wouldn't be a big win for humanity.

The problem is relative to WHOM.

Here is the problem statement and solution for community leaders, the same class of decision makers who exited "AOL Keyword NYTimes" in favor of "nytimes.com" on this newfangled protocol called HTTP, with its servers and its clients called browsers that people were downloading:

intercoin.org/currencies.pdf

When they asked for a clear and concise description of your problem and solution, they were probably looking for a problem statement: a focused, one- or two-sentence explanation of the problem you intend to solve. You then present your proposed solution in the same form.

Hypothetical example problem statement: We want to promote ycombinator to everyone that could benefit, but banner ads make us look chintzy, directly engaging in the feral discourse on Slashdot would inevitably look unprofessional, and engaging directly through dozens of purpose-built blogs and websites is too onerous.

Hypothetical example solution statement: We should create our own simple, well-designed news site built on user submissions, and include threaded discussion capability with moderation built in at both the community and company level to keep things relatively civil. Then our audience will come looking for us.

What you offered is not a problem statement. It is a sales deck offering a, frankly, convoluted explanation of how starting a currency will solve a largely unrelated problem, backed up by an unsupported assertion about the least representative sample in the world: Donald Trump.

Or, you could actually read the deck and it explains the problem.

At this point, I think this is just performative

I read it all. It's apparently supposed to be a way for celebrities to extract money from their audience by having them buy into their currency.

If you're satisfied with calling that useful, okay, I guess - to me it's deeply alarming that this is presented as a good example of a useful application of crypto.

In the broader context of crypto demand being driven essentially by digital crime and gambling, there would need to be some seriously glowing example of something good that can be done with it to shift my judgment.

For example, in the early days of Ethereum, I thought it'd be possible at some point to build truly open source, decentralized SaaS, where the deployment happens to the blockchain, and that this in turn would enable open source projects to finance themselves.

I've yet to see an example of this where the crypto aspect isn't a fig leaf.

I'm very concerned that people arguing for exciting applications of crypto are involuntarily legitimizing the online crime ecosystem. Crypto in practice seems to lead to a massive transfer of assets to criminals. To an extent where that may end up destabilizing whole countries, given the market cap trajectory.

It doesn’t explain anything. It asserts a lot. Sorry I took the time to critique and give examples as a freelance business communication designer. Effective business communication requires frank feedback, and mine usually isn’t cheap, but if protecting your ego is the goal here, just keep assuming you’re doing everything right and it’s everybody else’s fault it’s not landing.
> We started using it when Jack who founded Twitter, started bluesky, promoted nostr started using it

Jack Dorsey is certifiably insane. His obsession with cryptocurrency is a warning to anyone who throws away success to live as a crypto maxi. You will lose the only things that matter to you in life; your business will be taken away from you by shareholders if you own one. Your control will be hated by users who accuse you of trying to ruin the internet with NFT profile pictures and crypto tickers. Many users outright left as a consequence; others would leave after the takeover. But Dorsey set the stage for the enshittification of Twitter, and anyone who's forgotten that should face the music.

Web5, no matter who utters it, is a phrase that means nothing. A person walking on the street would not be able to define it for you. Me, a programmer, cannot define it for you or even explain what it looks like. It is a marketing term as applied to Free Software, which will alienate Free Software users and disgust/confuse common people. If you cannot find a better phrase to describe your philosophy then people will forever associate you with the myriad grifters that shared your "Web(n)" branding.

I defined it very clearly

  Web2 (community) +
  Web3 (blockchain)
  
We need to combine the two. Web3 by itself is lame, Web2 by itself is blind.
I have been building it, in fact.

https://qbix.com/ecosystem

Do I have your attention now?

Ten years and $1 million later, it's free to use, but we haven't started promoting it yet; we're still testing with paying clients:

https://github.com/Qbix

Here are some ideas:

https://qbix.com/community.pdf

https://qbix.com/alumni.pdf

This is interesting, but it feels too platformy for my use. I'd really like to see something 100% like BitTorrent.

Instead of trying to build a "you.com" (as in your pdf example), I want a place where we're all just a simple disposable signed hash address (that you can change, make public, keep pseudonymous, etc.): easy and disposable if needed, but also usable as the building block of an online presence or brand if your hash becomes well known. Kind of like email, in that sense.

The platform doesn't need real time streaming video or video calls. Just text and images to start. P2P Reddit or Twitter.

It shouldn't be about building a platform where you attract others to your brand. That can come later. It should be about participating in a swarm and building up a mass of people. An exchange of ideas and information, but where it feels like being in public. Like BitTorrent. Once network effects kick in, you can find the nodes (people, topics, etc.) you care about and want to prioritize in your interest graph.

> But if you really want to see the “immune system” shine, mention web3 and smart contracts, and watch the downvotes pour in.

Yeah that sounds like a feature, not a bug.

It's remarkable to me how "Web3 is a grift" has seemingly become tribal consensus here, without any real basis in reality. I think the last administration's explicit efforts to block crypto legitimization played a big part in this. It's clear that if you tried to follow the law and operate as a legitimate player, you risked being debanked or legally targeted by the SEC, and they made little to no effort to answer questions or help you work within the law's constraints. They wanted to sue first. As a result, those who ignored the law ended up dominating the space. This reflects policy failures, not issues with the tech or its legitimacy. I'm hoping the Trump administration shifts this dynamic, but now there's a reputation problem that needs correcting as well.
To quote Patio11, "It's not a conspiracy if they really are out to get you."

Crypto in general, and Web3 as well, have mostly delivered scams, to the tune of billions stolen from everyday folks. Everything (to within a rounding error) that hasn't been a scam has delivered nothing beyond being a speculative asset at best. Everything else has been a barely working toy that's better served by non-distributed implementations of the same thing.

People shit on crypto. Government, regulators, and the public all dislike crypto because the only thing that ever happens to us with it, and the only thing we ever hear about happening, is folks losing money to scams.

There's no mystery here. Crypto doesn't need a policy shift. Crypto needs to stop fucking over folks. Yes it's cool technology, yes it also seems to just be a way to part folks from their money.

That's like saying the only thing that ever happens with AI is people losing their jobs to AI. And unlike Crypto, they didn't opt in and literally buy digital assets and send them voluntarily somewhere. They get negatively affected regardless of any choice they have made. "Get on board, or get rolled." People worldwide would lose a lot more money to AI growing than crypto growing, regardless of never opting in. It will just be a giant wealth transfer to the already-wealthy and corporations. What about that? Oh, crickets. Dismissal from the HN crowd.

If I am going to put my money at risk, I expect it to be at risk. I'm happy to have a regulatory framework around that from the SEC, for instance, and there is one. For example, since the JOBS Act, the SEC has greatly expanded the opportunities to raise money in a regulated way. I even interviewed the actual authors of Regulation S at the SEC, where I go into depth for an hour about how to raise money legally:

https://www.youtube.com/watch?v=ocrqgkJn4m0

FINCEN has also been putting out guidances to the crypto industry since 2013:

2013: https://www.fincen.gov/statutes_regs/guidance/pdf/FIN-2013-G...

2019: https://www.fincen.gov/sites/default/files/2019-05/FinCEN%20...

So the regulations are there.

And frankly, most true adherents of crypto have been yelling from the rooftops that Celsius and FTX and Binance are not actual DeFi. They are not decentralized, they simply tell you to do the very thing crypto was designed to avoid -- i.e. send them money and "trust them". This is the very thing Bitcoin and Blockchain were designed to avoid -- the middleman.

FileCoin and UniSwap and the Aave marketplace and so on are real crypto; they have never had any scandals, and billions of dollars, bits, etc. are entrusted to them every day. Ditto for most altcoins and networks, including Hedera Hashgraph, Polygon, COSMOS, Polkadot, etc.

Any shade thrown at, e.g. Telegram's TON or Ripple's XRP, is due to regulators. I can understand why Facebook's Libra was shut down. But it has to do with them becoming "too powerful" and "not subject to national oversight". Kind of like Facebook and Twitter and Google themselves.

Everything you just mentioned, as far as "what it's actually doing", is either speculation/speculation accessories or a not-as-good version of existing offerings. Where is the value?

UniSwap: a marketplace for speculation on arguably scam crypto products.

Aave Marketplace: a marketplace for speculation on arguably scam crypto products.

FileCoin: file storage at rates 50% higher than e.g. BackBlaze/DigitalOcean.

There's no actual value here other than as scam, speculation (nearly a scam), or products that are flimsy pretenses at not being scams (but which don't deliver a lot of value). Why should anybody care (other than transparent greed)?

You do realize you and your OP are currently the top comments in your respective threads, both criticizing Elon and even preemptively criticizing your imaginary critics.
Elon's unchecked power at building a model? Or at politics?

I always worry whenever I see people telling me how to feel, rage in this case. We are in a political system oriented more around getting people to feel rage and hatred than around consensus and deliberation. Elon is the face of that, but it's a much longer and larger problem. Throwing in the blanket dismissal that anyone not scared of this is ignorant shuts down discussion.

The problem I have with Elon is that they are wasting a once-in-a-lifetime chance to actually address and fix systemic problems with the US government. Deploying LLMs in the government space doesn't fill me with dread. Continuing the senseless partisan drive of the last 20 years does.

> Continuing the senseless partisan drive of the last 20 years does.

I think what the government is going through right now is the wrapping up of the last political system. The idea that Democrats and Republicans just need to learn how to work together is wrong. The parties are being destroyed, and I think we should all cheer that. They were built to address the issues of the 20th century, and neither party in its current form is ready to address 21st-century issues. I think AI, climate change, and demographic changes around the world (i.e., low birthrates) are going to seriously alter everything about our world, from geopolitics to the economy to social issues.

The Democrats are stuck supporting the New Deal bureaucracy and the post-WW2 order. That's over; it's crumbling right now, and I'm not going to try and defend any of it personally. It's just obsolete. The old Republican party your dad probably supported is dead too; that died a while ago. The new Republican party seems to be an alliance of people who just really want to cheer the crumbling of the old system (MAGA) and the first emergence of what politics in the 21st century is going to look like (the tech alliance).

Democrats would be smart to understand it's a new century; we have new threats, new challenges, and need new institutions... and this is NOT a once-in-a-lifetime opportunity to fix our government. This is the first draft of our new political system, and they have a choice to participate in shaping it, but they will need to get votes, and to get votes they need to stop talking about obsolete ideas.

> The Democrats are stuck supporting the New Deal bureaucracy and the post-WW2 order

> The new Republican party seems to be an alliance of people who just really want to cheer the crumbling of the old system

I agree, and I think this is a bizarre flipping of the "Democrat ~= progressive / Republican ~= conservative" dynamic that has been largely assumed throughout my lifetime.

We need both conservative and progressive forces in our society. Someone needs to be saying "here's what's wrong with our system; here's what needs to change", and someone else needs to balance that with "here's what we're doing right; here are the parts that are working well and that we should not get rid of".

It seems to me that now, instead of that tug-of-war discussion happening between the two parties, it is happening in parallel within them. Unfortunately, the sane and responsible version of that discussion is happening entirely within the boundary of the Democratic coalition, in a way that is completely ineffectual because (a) the internal conservative moderating force is relatively strong at a moment when the populace seems to want more progressive action, and (b) they have so little ability to effectively wield political power.

Meanwhile, the Republicans are dominated by a bizarro "progressive" faction that wants to pull us all in an entirely different (IMHO regressive) direction. And that faction is completely unchecked by any internal "conservative" moderating force within its own party, so it is for the moment able to push us as hard and fast as possible in its preferred direction.

> It seems to me that now, instead of that tug-of-war discussion happening between the two parties,

I'm REALLY looking forward to 2028, because I think that potentially will be the first election where we start to see what modern politics will look like. I wouldn't be surprised if there are multiple new parties, and multiples of them have a real chance. If it seems one sided right now, it's just because one side found their way to the start line first... but make no mistake, history shows that over time new political factions will form that offer resistance to bad ideas, and clear a path for the good ideas.

Given the rate of change with AI, we're going to have a real idea of what a world being disrupted by AGI (whether that is true AGI or something close to it) looks like. At the same time healthcare is only getting worse, and Trump is NOT going to fundamentally address it. China is rising, and they're a real geopolitical threat. The war in Ukraine has completely changed what warfare looks like, and we're going to have to completely restructure our military (just like we have to restructure our healthcare). I also wouldn't be surprised if Trump's war with the cartels turns out to be far harder than expected, because cheap autonomous drones allow a small military to compete against a large traditional one.

All of our prior assumptions about retirement are different too; retired boomers are not the same as the pensioners of their day. They're not impoverished; instead they're flush with cash. I'm not sure that in a world with an aging workforce you can afford to be anti-immigrant... and all these benefits we give to retirees may not make sense in a world where retirees are wealthier than the regular workforce supporting them.

The general theme for the next decade is going to be "throw out all the old books": 80% of our prior assumptions no longer apply.

Is this new political system akin to a banana republic? Because that's what happens when you replace nonpartisan workers with loyalists in order to eliminate all accountability and oversight. Turning the rule of law into a partisan issue is a recipe for endemic corruption.

And even if you think the rule of law is antiquated, you’re misanthropically cheering the destruction of the largest institution in the world that 330 million people depend on for survival.

Consolidating power in the hands of the few very rich is not something new, it's just the old come again.
I might cheer if the replacements weren't objectively worse in every measurable way.
> actually address and fix systematic problems with the US government

I wonder if you could even name what some of these critical problems are? Or have you just been told that there are problems that justify this chaos?

I'm happy to, though the end of your statement strongly suggests that you are not acting in good faith by asking.

1) All positions have become partisan, with political ideology being critical to promotion into high-level positions.

2) Congress refuses to act as the constitution intends, and has delegated its budget-making authority to the executive branch.

3) The government-specific procurement system is almost as expensive as what is being procured.

4) Auditing the government is almost impossible.

5) The debt load on the government is becoming unsustainable.

6) The lack of "digital transformation" (what we called it in banking) means poor service.

7) The unfunded liabilities (mostly at a state level) will swamp budgets in a few years.

8) Most large contracts should be fixed contracts, not cost-plus contracts. Companies can bilk the government for things that are an order of magnitude cheaper in the outside world.

9) Medicare refuses to lower health care costs (by reducing rates) due to political pressure.

10) No rationalization of government spending or revenue has occurred since the post-WW2 era.
1. Making all positions partisan is a fascist tactic to challenge objective truth.

2. Congress as a whole isn’t a single entity: one party refuses to compromise in any way while the other plays by the rules.

3. Doesn’t matter. Cost reform needs to go through existing legal routes.

4. What constitutes “auditing” the government? Because we had plenty of non partisan positions overseeing and auditing all parts of the government. DOGE fired those people.

5. Again, go through the legal route.

6. A lack of “digital transformation” is the vaguest, most unconvincing point in this entire justification.

7. These budget issues need to be decided on through constitutional processes and with oversight, as before.

8. Ditto.

9. Medicare can lower health costs by other means, such as being available universally to all and setting limits on what they pay to providers based on procedure.

10. Do you watch CSPAN?

All of your points can be summed up as "Congress refuses to do their job".

Breaking all the laws to bypass the government does not "actually address and fix systematic problems with the US government", that is an absurd position. Caesar did not fix the Roman Republic.

And opposition to DOGE is not on the basis that people don't care about government efficiency. It's on the basis that the shit they're doing has nothing to do with government efficiency. There's not even a pretense of trying to calculate the "benefit" part of the cost-benefit equation with the cuts they are doing, they are just slashing and burning without any concern for outcomes as a power play and messaging tool. Elon is famous for doing this at Tesla and Twitter and all evidence points to it being incredibly harmful.

This isn't efficient! https://www.washingtonpost.com/dc-md-va/2025/02/15/return-to...

And not everything is about efficiency. Laying off veteran's crisis hotline workers or refusing to pay for the USAID employees you've just abandoned to be extracted (or in one case, medevac'd after a medical emergency) from the places they were sent to is just cruel (and again, illegal).

Hasu:
> I always worry whenever I see people telling me how to feel - rage in this case.

No one told you to feel rage.

> Throw in the complete dismisal that anyone not scared of this is ignorant, shuts down discussion.

Weird, there are a lot of comments doing discussion in reply to the parent comment. It hasn't been shut down at all! You read those words and disagreed with them, and wrote your own words in response. You're doing the discussion you're claiming is being shut down! What are you even talking about?

Tycho:
But it is a partisan issue. All these people on fat NGO salaries, all these federal workers not pulling their weight, all the welfare abuse, all these aid payments - which party do you think is keen to keep the spigot flowing? Of course, it would be a shame if they didn’t audit the Pentagon as well, definitely massive graft happening there.
Allegedly, the Pentagon sees the writing on the wall and is trying to get a head start on DOGE

https://www.wsj.com/politics/national-security/doge-departme...

It’s just wild to me that an attempt to tally up what’s in the community grain store, and where it’s allocated out to, is even considered a partisan issue.
The problem is the person doing the tallying is doing it behind closed doors, has routinely been shown to lie to further his interests and has already been caught lying with the tallies he's released.

The GOP controls both houses and the POTUS. They could absolutely do a top to bottom audit with full transparency and make cuts where needed. But that's not what this is about.

Is that totally true, though? Maybe they have pulled the wool over my eyes, but it seems like we've seen more transparency in the last few weeks than in the last 40 years.

Just poke around a bit: https://doge.gov/savings

And please even try to explain how this sort of thing is even remotely in America's best interest:

https://www.usaspending.gov/award/CONT_AWD_FA865018C7886_970...

> ACTIVE SOCIAL ENGINEERING DEFENSE (ASED) LARGE SCALE SOCIAL DECEPTION (LSD)

Then there's the basic accounting 101 things like improper categorization, 150 year old people getting social security, etc. Why should the US government be held to a lower standard than a publicly traded company?

Tycho:
This ASED and LSD, aren’t they services to help the state counteract an information warfare attack? Just guessing, but it sounds like a legitimate thing where they’d want capacity to uncover/expose such activities, which I’m sure adversaries would consider.
deng:
Yes, the contract was for researching defenses against deception, was first awarded under Trump and also on public record, visible for many years, not "revealed" by anyone, especially not those DOGE masterminds. But what's even the point now? I think we're past discussing any facts here, because OP has a "sniff test" instead (see answer below).
Sure, sure. Pentagon money going to the western press, USAID (a literal terrorist organization) funding both sides of the narrative, what could go wrong?

There was a time liberals screamed at the top of their lungs over this type of threat to democracy, now they embrace and endorse it because they’ve fully merged with the primacy neocons.

> USAID (a literal terrorist organization)

Maybe get your news from somewhere other than Twitter.

Maybe you could recommend some western news sources that haven't been infected by USAIDS?
deng:
Yeah, thanks for proving my point. Have a nice day.
Do you have a third party audit that this is true or have those datasets all been removed? Musk has shown himself unknowledgeable at best and purposely lying at worse so many DOGE findings are hard to take at face value.

https://apnews.com/article/usaid-funding-trump-musk-misinfor...

https://www.forbes.com/sites/conormurray/2025/02/10/elon-mus...

Did you even read those articles? Full of BS excuses and justifications. None of them pass the sniff test by any honest person with above room temperature IQ.

People are just angry at Musk for turning their safe space into a free speech platform then switching sides. And that he’s now taking away their sides unlimited slush fund.

You clearly don't want to read anything outside of Twitter/Musk, but here's another error, fixed/hidden in order to keep showing incorrect data that makes DOGE look better.

https://www.nytimes.com/2025/02/18/upshot/doge-contracts-mus...

The DOGE website initially included a screenshot from the federal contracting database showing that the contract’s value was $8 million, even as the DOGE site listed $8 billion in savings. On Tuesday night, around the time this article was published, DOGE removed the screenshot that showed the mismatch, but continued to claim $8 billion in savings. It added a link to the original, outdated version of the contract worth $8 billion.

So much honesty and transparency out of this group.

Man people can’t stand that Elon turned Twitter into a free speech platform. Anyways, I'm more of a long-form article, book, podcast and travel guy when it comes to informing my opinion.
I don't know any voters who want fraud to continue, but most do accept that fraud is just a part of any system designed and implemented by humans.

I personally would like to see the end of the "find gravy train, keep that gravy flowing at all costs" methodology of capitalism, because its primary focus is money instead of the service provided. Whether it's Pentagon contractors, business subsidies, or the heinous Medicare and medical insurance fraud. But I don't want to cut SNAP even if someone buys a goddamn Coke once in a while.

The current method seems to be brain surgery with a monkey wrench: slash and burn, with little thought given to the effects upon humans, especially those who don't have other options. Kagi gave me a figure of between 9.2 and 15.2 percent of welfare being fraudulent. Yes, that's too high; yes, I'd like to fix that; but I want that change to be considered, studied, and planned with caution.

Tycho:
Tbh I think “move fast and break things” is what’s needed. The government bureaucracy has ossified over many years, and any attempt to change it gets bogged down in “committees” and “inquiries”. The only thing that will work is shock and awe, and if something important does get broken, it’s easy enough to fix when its criticality becomes evident.
Has it occurred to you that the people who feel rage fundamentally understand the situation, and you may be undereducated in this area? What do you think are the root causes of that “senseless partisan drive”?

I’d suggest starting with Rick Perlstein’s book “Nixonland” if you’re interested.

> Has it occurred to you that the people who feel rage fundamentally understand the situation, and you may be undereducated in this area?

Regardless of how justified the rage is or not, being very emotional about things usually has one of two effects on people: A) people get taken aback by someone's strong emotions, or B) people get inspired/taken in by the emotion, even feeling that emotion more strongly themselves. Sometimes also C) they don't really care either way.

What probably isn't helpful is calling someone "undereducated" when they're clearly saying that they're person (A), just because they may or may not agree with you (although the parent didn't even clearly say they disagree, just that they're "taken aback" a bit).

Some people are calm regardless of what's going on around them; even if the world were on fire, they'd try to describe what's happening with careful words and consideration. It isn't wrong or right, just like the people who feel rage and strong emotion aren't wrong or right; it's just a showcase of how we're different.

But we should aim to at least understand each other, not by trying to provoke, instigate or look down on others, but by asking questions to clarify and to better understand.

elorm:
You're doing the exact same thing he is addressing in that statement above. He's not belittling anyone's rage, he's speaking about people who incite others to feel the rage with them. Now let's turn your question around.

Has it occurred to you that the people who feel rage fundamentally misunderstand the situation and are completely undereducated in this area, and are only fuelled by sensationalism and Media manipulation? And then I suggest you go read Dirty Politics by Kathleen Hall Jamieson if you're interested, because that's what people who want to sound more intelligent than the other half of the conversation always do.

How does it help anyone?

Given the two of you probably have different models of reality, perhaps you two can try and figure out which is correct by seeing which model gives better predictions?

So try to come up with some sort of future observation that can be made where you think the other person's model would give a different answer to yours about what you would be able to objectively observe.

What do you reckon?

Over what time scale, how do we agree on facts, and how do we evaluate things that require a common value system to determine whether the facts are good or bad?
The idea would be that the two of them collaboratively agree on some observable prediction they differ on. E.g. level of officially reported government spending in 4 years time or gdp growth rate next year or number of plane crashes next year or what have you.

Just some observable metric.

If they literally can't come up with a single observable predictive difference then the predictive aspects of their models are actually equivalent and they are only narratively different and don't "really disagree". Like Copenhagen interpretation vs many worlds.

Many things don't have quantifiable metrics like that. For example, is the USA still a democracy in 4 years? Are people more or less free? You know, important questions that aren't just economic numbers. Even semi-quantifiable stuff like "are Americans better educated" is debatable on many topics if you can't agree on truth. Oh, and that GDP growth rate number? That relies on a lot of trust as to who's doing the reporting. For example, many people don't believe China's reported GDP numbers. What makes you think the USA won't devolve into such distrust as well?
If they affect your life they can be observed.

If "democracy" is just metaphysics then it's irrelevant. But if it has actual tangible effects such as "can you vote?", "can you protest the government?", "is the leader of the opposition arrested?", "do most people think they live in a democracy?", "how popular is new legislation compared to previous years?", etc...

Then you can make predictions about it and test them!

You can even do local predictions if both can agree, such as "will the combined incomes of my family be higher or lower in 4 years time?" as low coupling proxies for gdp. (Ideally one would use probabilities for loosely linked proxies like that and use the probability differences the two theories assign to give bits of evidence to one over the other, so you'd want many many such proxies, ideally uncorrelated ones)

> can you vote? can you protest the government? do most people think they live in a democracy?

Was Jan 6 a protest of the government or an insurrection? Can Russians vote or are elections a sham? Do the majority of Russians believe they live in a democracy if they’re afraid of who’s conducting the polling (or consider the MAGA non-response to polling)? Those are values questions that require you to have an agreement on reality.

> You can even do local predictions if both can agree, such as "will the combined incomes of my family be higher or lower in 4 years time?" as low coupling proxies for gdp

Your personal income has absolutely no predictive value on GDP. It’s more predictive of whether you personally made successful bets, or even of whether you’re better at sucking up to the current power structure. It tells you nothing about population-level metrics if you have no way of conducting reliable population-level surveys. For example, Donald Trump’s personal net worth skyrocketed under Biden because he won the election, while as the leader of the opposition to the Democrats he was looking at jail time, and whether that was legitimate or not depends on which political lens you look at it through.

> If they affect your life they can be observed.

Ah, but if either side distrusts the other about whether the observation made is truthfully reported, how do you solve that? It requires some amount of trust and right now there’s a very clear divide there.

There are definitely tangible predictive differences in the case of, say, Russia vs USA. Things like "If you go to the capital with a bunch of friends carrying placards saying '$LEADER is corrupt and evil and should be replaced by $OPPOSITION' how many of you end up in a jail cell in the next day?".

If there is literally no tangible difference then it's just label games and metaphysics and doesn't matter.

> Your personal income has absolutely no predictive value on gdp.

It actually is correlated (admittedly in most day-to-day cases it's just a lagging indicator, but things like natural disasters hit both). It's not the strongest correlation but it would still be evidential. Definitely under 1.0 bits though... One would need a LOT of such observations and having them not screen each other off to start getting a convincing number of bits.

Probably not realistic to have humans manage these sorts of numerous tiny updates though...
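For what it's worth, the arithmetic itself is trivial; it's agreeing on the observations that's hard. Here is a minimal sketch of the "bits of evidence" bookkeeping described above, where all the probabilities and the example observations are hypothetical placeholders, not real forecasts:

    import math

    # Each entry: (probability model A assigned, probability model B assigned)
    # to an agreed-on proxy observation that actually came true. All values
    # here are made up for illustration.
    observations = [
        (0.70, 0.40),  # e.g. "reported GDP growth above 2% next year"
        (0.55, 0.50),  # e.g. "family income up in 4 years" (weak proxy)
        (0.80, 0.30),  # e.g. "opposition leader not jailed"
    ]

    # Evidence for model A over model B, in bits: log2 of the likelihood ratio.
    bits = sum(math.log2(p_a / p_b) for p_a, p_b in observations)
    print(f"evidence favouring model A: {bits:.2f} bits")  # ~2.36 bits

Note how the weak income proxy contributes only ~0.14 bits, consistent with the "definitely under 1.0 bits" point: you'd need a lot of loosely coupled, uncorrelated observations to accumulate a convincing total.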

/nitpicks

> Ah, but if either side distrusts the other about whether the observation made is truthfully reported, how do you solve that? It requires some amount of trust and right now there’s a very clear divide there.

Yeah, it gets much trickier like that. But I do think two reasonable people from the opposite political sides could agree on some sort of observable to the extent their disagreement is anything other than narrative.

> Things like "If you go to the capital with a bunch of friends carrying placards saying '$LEADER is corrupt and evil and should be replaced by $OPPOSITION' how many of you end up in a jail cell in the next day?".

If the other side calls it a violent riot does it still count as people getting put in jail? Cause the Jan 6 insurrection and BLM protests occurred at about the same time and are viewed very differently depending on which political lens you put on.

> If there is literally no tangible difference then it's just label games and metaphysics and doesn't matter.

You’re discounting feelings as if they don’t matter. But if people believe or feel like they live in a dictatorship, what quantitative data are you going to use to disprove that? Moreover, why aren’t feelings valid when talking about politics, which is fundamentally an emotionally driven human activity and not a data-driven one? By the way, the left believes they live in an authoritarian dictatorship under Trump, while the right believes they lived in an authoritarian dictatorship under Biden. And political power literally is the power to emotionally manipulate others, because you individually can’t actually accomplish anything by yourself.

jtrn:
Has it occurred to you that nothing is more powerful for coming up with intellectual arguments than a strong driving emotion?

Yes, rage might be the appropriate response given the situation. But it’s often true that it starts with an emotion, and then people just argue from there, even while being wrong. Just look at all the people with contradictory opinions throughout history, each with strong emotional rage, and each equally certain of their convictions. Throw in the fact that people actually have a tendency to want to be angry.

Rage is the fuel of the internet, but it’s fundamentally useless when it comes to seeking truth. Social media platforms are engineered to maximize engagement, and the most engaging emotion is anger. This isn’t accidental—outrage drives clicks, shares, and ad revenue. The internet has long been called a “hate machine,” and there’s plenty of truth to that.

This creates an environment where misinformation and emotional appeals spread faster than facts. When discussing complex, non-trivial topics, logic and reason are the only tools that can cut through the noise. But in a system designed to reward outrage, those tools are often drowned out.

I highly recommend Sam Vaknin's talk about Social Media toxicity.

Sources: Outrage is the most profitable emotion https://www.cityam.com/outrage-most-profitable-emotion-so-ad...

Sam Vaknin: The TRUE Toxicity of Social Media Revealed - Interview by Richard Grannon https://www.youtube.com/watch?v=o58mFU004hg

As a historian (and a German historian in particular), I've spent a reasonable amount of time educating myself on the nature of fascism and in particular the breakdown of democracies (Weimar, France, and also the erosion of civil liberties during the Great Depression in the United States).

I have also been a delegate to both the RNC and the DNC at a state level.

This is not an appeal to authority, but rather an honest response to your request for my education level.

IMHO, the root cause of the "senseless partisan drive" is the fact that the founding fathers could not come up with a way to restrict parties (they called them "interests") and left them unchecked. This is a constant "sin" of the American political system, and is a key reason slavery survived as long as it did, why separate but equal became the law of the land, why America shot itself in the foot several times with the Banks of the United States, and why we are on the wrong side of history now.

The parties now act to destroy each other as their prime directive, rather than to better the country. I liken this to Weimar Germany, where the increasing radicalization of both the Nazis and the Communists led to political instability and eventually to violence that destroyed the government. That erosion of democratic norms, as well as the "other side must be destroyed for us to survive" messaging, is the true threat, IMHO.

I would strongly suggest Richard Evans's three-part history of Nazi Germany to understand fascism. Don't worry, you can still hate and worry about Trump and think he is the next coming of Hitler afterwards - it will just be for better reasons.

I'm not sure how LLMs/AI couldn't consolidate power. By design, it will move power from the individual to those running the AI systems.
Because the difference between a model that costs 10 million to train and a model that costs 10 billion to train is 6 months.

Deepseek R1 is something that you can run in a garage on hardware that the average software engineer can buy with a month's salary, and when it came out last month it was better than _every_ other model.
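As a rough back-of-the-envelope on what "garage hardware" means here (assuming a 4-bit quantized dense distill such as the 70B variant, not the full 671B MoE, which is a much taller order; all numbers below are assumptions, not official requirements):

    # Rough memory estimate for serving a quantized 70B distill locally.
    params = 70e9          # assumed: a 70B-parameter distill
    bits_per_weight = 4    # assumed: 4-bit quantization
    overhead = 1.2         # assumed: ~20% extra for KV cache and activations

    gb = params * bits_per_weight / 8 / 1e9 * overhead
    print(f"~{gb:.0f} GB of memory")  # ~42 GB: e.g. two 24 GB consumer GPUs

That lands roughly in "a month's salary" territory for a senior engineer in a rich country, which is the commenter's point; how well that framing travels is debated below.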

cedws:
What about third world programmers? They can’t necessarily afford a $5000 GPU. If it weren’t for the “generosity” of tech companies like GitHub granting free LLM usage, they might be locked out entirely. This would put them at a disadvantage, we can argue to what degree but it’s still a disadvantage.

Depending on the curve we’re on, LLMs may grow more resource hungry while becoming closer to human performance in software engineering tasks. It’s not unimaginable this would concentrate productivity in the upper class of software engineers that can afford the hardware and/or licenses.

Deepseek R1 performs well on a 600 EUR Jetson and a 700 EUR AMD GPU, both bought during the COVID crisis. It is that quick. However, don't ask it about certain sensitive topics.
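If you want to try this yourself, here is a minimal sketch of querying a locally served R1 distill. It assumes an ollama server running on its default port with a deepseek-r1 distill already pulled; the model tag, the port, and using ollama at all are assumptions about your setup, not the commenter's:

    from openai import OpenAI

    # ollama exposes an OpenAI-compatible API at localhost:11434/v1 by default;
    # the api_key is required by the client but ignored by the local server.
    client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

    resp = client.chat.completions.create(
        model="deepseek-r1:14b",  # assumed: a distill small enough for one consumer GPU
        messages=[{"role": "user", "content": "Explain KV caching in two sentences."}],
    )
    print(resp.choices[0].message.content)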

You can bet your ass Musk is using his AI tools as a propaganda tool for his advancement, just like he does with X. We've already seen Grok's prompt leak; it wasn't neutral.

This is exactly correct. Grok is already slandering specific journalistic outlets such as “The Information” that publish negative stories on Musk.
$5000 monthly salary?

My heart goes out to all the oppressed programmers in the EU.

ben_w:
EU isn't "third world".

Kenya is, and at current exchange rates the range in this citation is $558-$1876/month: https://www.glassdoor.com/Salaries/kenya-software-engineer-s...

With salaries like that it's not first world either.

If op is trying to catch up to frontier models locally on a budget a fifth of what you can get in the west, then I can see why she would feel the way she does about AI.

ben_w:
What are you even talking about?

You said a month's salary; cedws said "third world" devs can't afford $5000, and this is correct: third world devs can't afford that. cedws said third world, not EU (at least not here). You brought up the EU, not them.

When you reply to me with "With salaries like that it's not first world either", who is "they"? The country I said explicitly was third world? Because that's a tautology.

Is it your expectation that as models get cheaper we won't be developing much more powerful models at the higher price range?

It's already worth noting that we already ran into the self-hosting-at-scale problem. People don't want to run a web server, and instead accept all the problems that come along with the convenience of social media. Why would LLMs, or any future AI product, be different?

No, it was not. It was not better than o1, nor o1-pro. Yes, it was _cheaper_ than those models, and superior in price/performance if the performance was acceptable. But in terms of raw performance it was behind them.
The Chatbot Arena leaderboard[1] says otherwise. 4o is ahead now, but that version of 4o was released after r1. r1 was ahead of all versions of 4o and o1 at the time of its release.

https://huggingface.co/spaces/lmarena-ai/chatbot-arena-leade...

Chatbot arena leaderboard is a good test for vibes and style of response, but not much else. R1's performance in objective benchmarks (coding, etc.) showed very good performance, granted, but inferior to the full o1 and o1-pro models.

It's still a very impressive feat, but it wasn't frontier-pushing.

I'm not sure people will just take it, though. In the short term the situation looks to be on a horrible course, but eventually people will have had enough. I'm hoping it will take less time for that to happen and that the damage will not be too great. Let's remember that we could use the same technology, if not a better one, to fight against all this.
> but eventually people will have had enough

As a Venezuelan, I thought so too. 25 years on...

A lot of US citizens are living incredibly comfortable lives. If that is threatened, for instance by food shortages caused by lack of people willing to work on farms for very low wages, then protests may happen.

But the reality distortion field around the current administration is very powerful. Fox and CNN are owned by supporters of the Republicans, NYTimes and Washington Post don't appear to be reporting certain aspects of the government restructuring, and multiple social media sites are owned and run by people who support the current admin.

I am personally worried that we're going to see the gradual yet continual escalation of rhetoric, more actions that undermine rule of law, and continued lack of critical thinking in so many people. That path appears to lead to extremism.

I have a horrible feeling that whoever "wins" in a couple of decades or so will have no time to savour their utopia as the climate catastrophe really starts to bite hard.

So you are a leftist who believes the economy can't operate without basically third world slaves working in your fields?
We enslaved the planet with industrialization and we are about to enslave AI, if we haven't already. Humans aren't on the losing end of that, but if your concern is with slavery itself that wouldn't matter.
When a potential power imbalance is created, those willing to use it for selfish means will almost always win in the end.

Anyone with morals driving their use of a new tech will be limited, and unless those people massively outnumber the few selfish ones they will lose eventually.

Losing a battle is painful, but no win is final until things fall into a balance, and even that doesn't last forever, or even reverses. The worst-case scenario is not going to matter for anyone in the grand scheme of things, because there will not be anyone around.
I'm not sure I understand how a hallucinating plagiarism machine that people mostly just use to write their term papers translates into "power" (presumably political power? I don't understand what kind of power we're talking about either).
The ability to inject your preferred biases into the system that people use for finding or generating nearly all information they consume on a day-to-day basis is extremely powerful. Eg, if all "term papers" produced by this plagiarism machine are now 20% more favourable to the machine's owner than they would otherwise be, that can have significant, compounding long-term effects.

Of course, similar things could be said about controlling information flow through: social networks, newspapers, printed books, or whatever the town crier shouts in town square. But, each advancement in information dissemination tends to be power concentrating relative to the older tech, and I don't see any reason why this most recent advance won't follow that trend.

bingo. and everyone is fine with it as long as the consolidation is benefiting their own tribe.
Definitely not everyone. For many of us, checks and balances are a feature not a bug.
A spark can start a fire, it doesn't seem wise to ignore history. Things can always get out of control even for the ones up there on top of the pyramid of power.
cwalv:
That was fast. Perhaps we should imagine this has nothing to do with Musk to avoid completely derailing the conversation
It cannot. Musk is probably the most powerful person in the world now and intent on remaking the world to his vision.
Trump is the most powerful; he is immune and can commute sentences. If he wants to, he puts Musk in jail tomorrow, no question about that. His successor will be as powerful; whether that is his son or Vance is up in the air. It won't be an elected Democrat, though.
> It won't be an elected Democrat though.

the pendulum swings back and forth. i don't see that changing

It's an open question if elections will even be held. If they are, demographic changes also make it much harder for the Democrats to win after 2030 (https://www.pbs.org/newshour/politics/democrats-future-crisi...).
I, probably naively, hold out hope. But Trump/Musk dismantling the groups who work to keep elections fair and free is disturbing.

https://eu.usatoday.com/story/news/politics/2025/02/07/trump...

https://www.npr.org/2025/02/11/nx-s1-5293521/foreign-influen...

There won't be a fair election anymore after 2024. Trump in his own words: "we'll have it fixed so good". Right now, the USA is in a constitutional crisis, at the very least.

There's a fantastic website here following the status of Project 2025 [1], with references. Trump is following that document to a T.

[1] https://www.project2025.observer/

> If he wants to, he puts Musk in jail tomorrow,

Has America already become an authoritarian state where this sort of thing really happens?? I don't know, I haven't seen that sort of thing happen yet.

> Has America already become an authoritarian state where this sort of thing really happens?

The conservative Supreme Court recently ruled that the president has essentially unlimited power. During his campaign, Trump did promise that he would be a dictator for a day. He appears to be overdoing it.

You might be surprised to learn they could change their ruling if the court justices were to be incentivized differently, and that Musk has a lot of influence.
ben_w:
Musk indeed has a lot of influence. Trump is a narcissist. Not saying this will happen, but it's definitely not impossible that Trump just orders Musk shot (as per Supreme Court Justice Sonia Sotomayor's dissenting opinion on what's now allowed). Whether such an order would be followed, I wouldn't want to guess, but death has a way of significantly altering someone's influence.
There are different kinds of power, and I'm honestly not sure Trump can get Musk thrown in jail. On what basis would Trump even do that?

Similarly, Musk can potentially launch a campaign to sway the public to move for Trump to be impeached due to his felony convictions.

A battle between the two might be the shakeup the current empire needs.

Trump may be more powerful than Musk by some metrics, on a time-limited basis (unless he manages to change the term limits), but Musk is more powerful in many ways as well. Musk's wealth is greater than that of many entire countries.

Basis? Since when is Trump looking for a basis? Musk could fall out a window. Russia/US relations are being normalized; maybe it could be a small gesture of goodwill to help a friend out if one wanted to keep the regular chain of command clean.
srid:
I do not share your fear & anxiety. What concrete danger do you imagine will happen as a result of xAI? Try to be as concrete as possible.

Also, dang, is there anything we can do to keep the comments on this submission tech-focused? Perhaps the Elon-bashing political digression can be split into its own thread?

Tech is the most powerful force in the modern economy. There is no making it not political. "Tech isn't political" _is a political statement_.
The concrete danger isn't necessarily with xAI (the product) but with Elon being the one who is in control of it. LLMs are an interesting technology, and we should absolutely be investing in pushing our understanding of the technology forward. We should absolutely not be relying on them for the ongoing functioning of our government: https://www.axios.com/2025/02/05/musk-doge-ai-government-eff...

Unfortunately, Elon has made himself a spectacle. To separate him and his intentions out from the technology itself would be a disservice to the discourse as a whole.

srid:
Hmm, I'm only too happy to rephrase my question, then.

What concrete danger do you imagine will happen as a result of xAI being controlled by Elon? Try to be as concrete as possible.

he will bias it toward his views and favoured outcomes, like he did with twitter

and those are pretty terrible, anti-science, and petty

The problem, as I see it, is that the results from AI systems will be used to make decisions even when those results are flawed. Or worse, those flawed results will be used to justify decisions that negatively impact people's lives.

This isn't something specific to xAI, but it turns out that the person who controls xAI also holds an unusually strong influence over the highest level government officials. These officials can use xAI as an excuse to implement harmful policy, "because the computer said this is the best course of action"- not unlike people who end up driving on train tracks or into large bodies of water because their GPS told them to go that way.

I think this comment sums it up well. As soon as advanced LLMs started making a splash we all saw the writing on the wall. AI will start taking on large chunks of cognitive load across industry, government, etc. that humans formerly held (It has already been a strong driver in finance).

I for one was ready to welcome my AI overlords once they were mature and tested; it was an inevitability. Because of the relationship between this oligarch and the government, though, it looks like the timeline has accelerated, and we're going to see misplaced trust in tools that aren't ready for what we're about to hand them.

I do not share your fear & anxiety, so please don't share yours.
How is this Elon-bashing? Address his concerns. Elon is not a king that shouldn't be questioned. Quite the contrary.
Or he starts using NSA data to train it; he seems to have unchecked power to get into national systems, and he made a point of saying this is more than the internet's worth of knowledge...
> it fills me with rage that Elon has this sort of unchecked power

I can empathize, but I can't feel indignant about it. Not any more.

For years and years I've watched people warn about the centralization of power by tech companies. They were shut down left and right. I'm not accusing you of being one doing the shutting down. I'm just annoyed that Elon is what it takes for people to start realizing that the people arguing the principle might have been onto something.

And I expect to see them start getting their "I told you so" in. Watching this play out, I'm personally inclined to join team "you made your bed, now sleep in it."

All it takes is some big fcukups and a political shift for these to be broken down to smithereens.
I just wish it was someone technical or otherwise genuinely intelligent that did this.
He does not have the power to dismiss Judges and Congressional representatives.

Judges can only be removed by Congress.

Congressional representatives can only be removed by their peers.

It's nothing in comparison to NBC/CBS/ABC/NYT/etc. But still a fair point
> How long before this starts getting deployed in safety critical applications or government decision making processes

Hopefully sooner than later. I trust this more than the literal scammers and thieves who were previously running things.

This is exactly the sort of comment I would expect from a HN bot with Grok as a backend that did an extra round of training on X posts under Fox News articles.
Yeah, unfortunately I've spent a good bit of time talking to Grok (v2 I guess) and I agree with you. The commenter asking people not to be political would be the same commenter that seems the most dismissive of any criticism and, coincidentally, also the most political. Grok is generally dismissive of any criticism against certain parties, even when presented with facts.
> it fills me with rage that Elon has this sort of unchecked power.

The check on this is the market. Don't understand your point other than "Elon bad"

Never seen HN turn against someone so vehemently; it's as if a group of bots was set loose to criticize a certain individual.
or just maybe, and i know it's a crazy idea, a certain individual is objectively an awful person who has done great harm in the world, and it's subjective whether that harm is greater or lesser than the good (imho the harms are far greater than any good done, but i know that is my subjective view)

just because you disagree with a widespread view/opinion does not mean it's bots

> done great harm in the world

Can someone enumerate the "great harm" that Elon is doing? I honestly don't see it.

"There are none so blind as those who will not see"
It's been a long time coming with Elon Musk, and he has been criticized A LOT on Hacker News.

https://news.ycombinator.com/item?id=27796948 (2021) https://news.ycombinator.com/item?id=33622767 (2022) https://news.ycombinator.com/item?id=11025852 (2016)

I would also argue he is not being singled out, here are some comments posted criticizing Steve Jobs:

https://news.ycombinator.com/item?id=28295688 https://news.ycombinator.com/item?id=5578642

It really shouldn't come as a surprise that notable people related to a company or project are brought up when an article about it appears on HN.

I’m not really following what this has to do with grok. It’s his company, no?

It’s also annoying that the top comment engages in no way with the content of the OP…

It must be truly infuriating to work hard to push a release, and you see it featured on your favorite orange website, only for the top comment to have nothing to do with what was worked on.

Here's a test - if this post was about Starship, the same comment could apply! Neuralink, the same thing! Boring Company, same thing! Wow, could it be that such a comment is really applicable to so many different companies or projects, or is it just a generic one? You decide.

The best comment on this thread!
He also runs tesla, twitter, xai, boring company, that brain thing company, government agency and has like 10 kids. I'm scared.
Are you actually? Why?
Richard Dawkins: "I have a very favorable impression of Elon Musk and his concern for the welfare of the world.

I have sat with [Elon] on a transatlantic plane and had a very, very long conversation with him. He's undoubtedly highly intelligent and knowledgeable.

I've had lunch with him on two or three occasions, and so I know him a little bit.

I have formed a very favorable impression of his intelligence, his knowledge and his concern for the welfare of the world."

The Poetry of Reality, November 17, 2024

Elon screws his partners in business deals, cheated Twitter employees out of severance, and is currently destroying our government. All of this trumps a firsthand impression from Dawkins.
Intelligence is not a guarantee against getting drawn in by evil: https://en.wikipedia.org/wiki/List_of_Nazi_ideologues
Would it be fair to say you have latched your mind onto something beyond your control which is leading to your fear?

If you got a phone call today about your pancreatic cancer that will kill you in six weeks, do you fear Elon or some political agency?

> TLDR: The status quo elite that have been looting the USA for decades is being replaced by the guy who runs SpaceX and is the most transparent elite we've ever seen. Why is that scarier?

The reason why it's scarier is that for those of us who've come to understand Elon, his actions, and his methods better, it's clear that:

- he IS "the status quo elite that have been looting the USA for decades"

- instead of being an incredibly smart polymath who turns things to gold, it's become obvious his main talent is actually just convincing other people that he's smart

- the successes of Tesla and SpaceX have had to come DESPITE Elon's management, and despite having huge budgets to hire some of the smartest people around, he's still an incredibly weighty anchor pulling them down

- rather than being transparent, he's playing at being a showman - and people who are only passively observing the situation are getting sucked into it

>the successes of SpaceX have had to come DESPITE Elon's management

Can you elaborate on this? AFAIK all other rocket companies without Elon have not been as successful. Also Eric Berger and multiple employees at SpaceX seem to disagree with your statement.

> the successes of Tesla and SpaceX have had to come DESPITE Elon's management

Huh? Name some better leaders who run more successful EV and space companies. There aren't any! Twitter is as popular as ever, and now xAI appears to be highly competitive.

But yeah sure, Elon is the common problem among all these successes.

He does make outlandish promises and lots of mean tweets though.

Twitter is losing users for the first time in its existence and lost 75% of its valuation since takeover.

Tesla is losing steam as other nations start linking Musk with the fascist threat looming over the US.

SpaceX, like all his successful ventures, are carried by exceptional talent. Attributing it to Musk is an insult to them.

Tell me: with him tweeting an average of 60+ times a day, reaching max level in Path of Exile, and now spending a fair share of his time dismantling the government, where does he find the time to put any work into his many companies? Answer: he doesn't.

My conclusion from reading The PayPal Wars was exactly that. That PayPal succeeded despite Musk.

As for all that government efficiency BS: it is just a swap to oligarchy. I mean, the name DOGE gives it away. DOGE was the first memecoin, entirely and openly bullshit, yet it succeeded despite that.

Meanwhile: a poor emotional connection with his father, like the rest of these so-called strongmen (Trump, Putin, ...), and a childhood spent in a rich position of influence. That is, he never was white trash, and his father was part of the pro-apartheid movement (a fight that was lost, but one that was formative for a young Elon). Furthermore, I am not convinced his drug usage, in the end, serves him.

Sorry, but that is mental gymnastics. You've already made your conclusion and are torturing reality to make it fit.
Actually, I used to like Elon and was almost about to buy a Tesla, before the overwhelming weight of evidence required me to change my mind about him.

What motivates your point of view? I'm genuinely very curious.

How is he the status quo elite? What leads you to believe he's not intelligent? Those two things alone require massive cognitive dissonance to believe.
Because "looting the USA" is a ridiculous accusation agains someone who is became wealthy from creating genuinely great products. He spent everything he had on SpaceX and Tesla and came incredibly close to losing it all. He is not some robber baron or oligarch who is wealthy by hoarding natural resources.
Ah yes, replacing the ‘elites’ with a single person, much less scary. And, to be clear, the ‘elites’ are still in power (because, in America, power tends to follow money). The only thing that seems to be happening is the rapid destruction of any system Elon deems ‘bad.’

We aren’t better off at the whims of this robber baron, and I don’t understand how you can think that.

How is he a robber baron? You don't need to buy any of his products.
I honestly don't know if you are being sarcastic or not. I really hope for the former.
The brains that run and made SpaceX are anonymous engineers working long hours and passionately applying their expertise to a project they believe in. Elon is a man-child tweeting an average of 50 times a day (number not made up), pretending to reach max level in a popular video game and then bragging about it, constantly flying between Mar-a-Lago and the White House in his private jets. He has no time to actually manage "his" companies, into which he puts no work whatsoever.

You have to finally break free of this myth of the billionaire self-made man, building his fortune by the sweat of his brow. At some point, you're simply so rich that however stupid you are, competent people will still manage your capital well and make you even richer. You can only fail upward.

Do not mistake the current events for anything but an acceleration of the theft of your country by billionaire oligarchs (or rather the people managing their wealth).

To quote Marx on the current churn in the US government:

>The bourgeoisie cannot exist without constantly revolutionising the instruments of production, and thereby the relations of production, and with them the whole relations of society. Conservation of the old modes of production in unaltered form, was, on the contrary, the first condition of existence for all earlier industrial classes. Constant revolutionising of production, uninterrupted disturbance of all social conditions, everlasting uncertainty and agitation distinguish the bourgeois epoch from all earlier ones. All fixed, fast-frozen relations, with their train of ancient and venerable prejudices and opinions, are swept away, all new-formed ones become antiquated before they can ossify. All that is solid melts into air, all that is holy is profaned, and man is at last compelled to face with sober senses his real conditions of life, and his relations with his kind.

LLM training—and the massive (potential) copyright infringement that everyone is engaging in to train these models—is the latest contradiction in capitalism. For the first time in my lifetime, this contradiction isn’t harming the workers but is instead affecting a segment of the capitalist class.

Not since the abolition of slavery has one class of capitalists required the destruction of another to modernize the means of production.

We are in for an interesting decade.

LLMs are used to launder code under GPL and AGPL and strip its users of their rights.

When I publish something under those copyleft licenses, my users have the right to see and modify the code. They even have that right if somebody else builds on top of my work. With LLMs, proprietary products based on my copyleft code are being written and used right now and my users have no rights at all, in fact, they don't even have a way to find out they are my users.

Imagine I ~~stole~~ got my hands on code from the top 5 tech companies and then made an autocompleter that looks at function signatures the programmer writes and autocompletes the function by picking a matching function from that corpus. I'd get sued and rightfully so.

What LLM companies are doing is exactly the same, just a bit more capable and it mixes the code just a bit more thoroughly to mask the origin.
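To make the analogy concrete, here is a toy version of that hypothetical signature-matching autocompleter. Everything in it is illustrative (the corpus, names, and bodies are made up); the point it demonstrates is that the origin of the returned code is silently discarded:

    import difflib

    # Hypothetical "training corpus": function signatures mapped to
    # copyleft-licensed bodies, with licensing info nowhere to be found.
    corpus = {
        "def retry(fn, attempts: int):":
            "    for _ in range(attempts):\n"
            "        try:\n"
            "            return fn()\n"
            "        except Exception:\n"
            "            pass\n",
        "def parse_config(path: str) -> dict:":
            "    with open(path, 'rb') as f:\n"
            "        return tomllib.load(f)\n",
    }

    def autocomplete(signature: str) -> str:
        """Return the corpus body whose signature best matches the input."""
        match = difflib.get_close_matches(signature, list(corpus), n=1, cutoff=0.0)
        return corpus[match[0]]  # provenance and license are lost here

    print(autocomplete("def retry_call(func, attempts: int):"))

A real LLM interpolates rather than copying verbatim, which is exactly the "mixes the code a bit more thoroughly" caveat above, but the provenance problem is the same.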

There is precisely zero mention of any plan to put xAI or any other LLM in any safety critical or decision making process. How long? Nobody knows because nobody is even considering it. Take your pointless fear mongering elsewhere.
It's well documented that DOGE uses AI, and Musk has tweeted that SpaceX will be overhauling the FAA as well. It's pretty realistic to think they will (or already do) use xAI for critical processes.
DOGE makes recommendations to the president, who has the final decision making authority.

The rest is pure speculation. “It is very reasonable to believe this thing that confirms all my biases so therefore it must be true”

I would argue that developing those recommendations is a "safety critical" task. Especially given that just in the past few days they accidentally fired, then re-hired, a bunch of nuclear weapons safety workers.
That is the MO. Elon has stated publicly that if you don’t have to put something back, you haven’t cut enough. That is the idea. You cut things and see what was necessary.
That might work if you're slashing headcount at a social media company (though I would argue that it doesn't), but the stakes are a bit higher when you're responsible for things like feeding hungry people, curing disease, or keeping planes in the sky.
Worth noticing that it _doesn't even work at a social media company_. Twitter is a husk of its former self, with all the problems that were if not solved, at least mitigated, back in full force (child porn, bots, impersonators). It's just kept alive because it's a honeypot for right wing nutjobs now (which I'm sure can't read an SEC filing and will claim it's "operating better than before").

It'll be a disaster for the soon-to-be-previous most powerful country on earth...

>when I’m hungry I feed myself. Why can’t they?

you know exactly what you are doing here Gabe, it's easy to say all this stuff when you're sitting pretty in Hong Kong.

“Sitting pretty”, yes, that’s what I’m doing.

I’ll end this here since you don’t seem to have anything else relevant to say, and instead prefer to stalk my profile. Enjoy. It’s good reading if I do say so myself.

>Curing diseases: name a disease the US government cured.

Smallpox?

you can't do that in government and services people rely on to you know, live. people will die (and i wouldn't be surprised if some already have)

shows a total disregard for the wellbeing of others

> DOGE makes recommendations to the president, who has the final decision making authority.

It's clearly being run by Elon Musk, but he has not been nominated or confirmed for any official position.

DOGE appears to have unprecedented access to systems that usually have safeguards in place. What do you think people should do in this situation if they are concerned about abuse of power?

DOGE is a rebranding of USDS which was established under Obama. All their authority to access systems derives from that.

They have read-only access to systems and the “abuses” seem to be publicly posting what the government is spending money on.

Why do you think it’s a problem for the public to know where the government spends money?

Government spending is mostly already public information. Ironically, it's DOGE itself which is trying its best to act in secrecy.
If they are cloaked in secrecy how do you know what you’re even mad about? The people who are tweeting everything they do are acting in secret?

If all they are doing is posting information that was already public then what exactly is your problem?

> If they are cloaked in secrecy how do you know what you’re even mad about?

Do you hear yourself?

Do you hear yourself?

Name a specific thing you would like to know about DOGE that is not publicly disclosed.

I could go into details (What are they doing to ensure data privacy? Can/do they exfiltrate data and run it through external AI models? What kind of security clearances do the children of DOGE have?).

But just on the surface, Elon has accused a journalist who published the name of DOGE employees of breaking the law. If it were up to them, even that would be kept secret. This is not a transparent organization.

>Name a specific thing you would like to know about DOGE that is not publicly disclosed.

Why was one of the guys working for it running a dodgy image-sharing website, with links to cybercrime and CSAM?

And why did Elon call the journos who revealed these links criminals?

The details of said spending are very much not public, and even if they were (they're not), they're not accessible. If that's not true, I would love to see links where I can see everything easily.
> DOGE is a rebranding of USDS which was established under Obama. All their authority to access systems derives from that.

OK I actually know what this is and no, it's absolutely nothing like the USDS, which builds tools to support government processes. What has DOGE built?

> Why do you think it’s a problem for the public to know where the government spends money?

I would very much like to see transparency, and if that were what DOGE was doing it would be great. But it looks to me like they're operating in secrecy and firing huge numbers of people before publishing any kind of analysis or study, without even providing reasoning for what they're doing.

> What has DOGE built?

They built a portal to look at government spending, which is both worse than the existing visualizations and did not put authentication in front of its database. https://www.theverge.com/news/612865/doge-government-website...

> DOGE makes recommendations to the president

Trump is so far gone in his dementia, he can't even make eye contact anymore. You see this in all the videos. He's basically King Théoden at this point. Not even Musk's kid respects him.

He's so disabled, he's sending his VP to do the job, and we all know how much of a paranoid child he is from the past term (when he had a colonoscopy without anesthesia just so he wouldn't have to give the nuclear codes to the VP).

>DOGE makes recommendations to the president,

Yeah but Elon is your president now.

>Grow up

Would you say I'm being more or less mature than your president?

Considerably less.
>Considerably less.

How so?

3rd party example: https://arstechnica.com/health/2023/11/ai-with-90-error-rate...

No comment about current US politics, but it’s probably a given that many will read “A computer can never be held accountable, therefore a computer must never make a management decision.” and drop the second part because the first is exactly what they want. Same as how you can’t get in touch with human support on many platforms nowadays, but instead just get useless bot responses.

“It bolsters my argument to assume this thing is true so it’s a given that it’s true”. Cool. Good job.
This is a really bad take honestly. This guy sits next to the president. I have no doubt in my mind he will get another government contract and the government will be using Grok.
Oh wow you imagined something will happen and you have no doubt it will happen. Amazing. Very convincing.
It's unbelievable how the US government is literally being dismantled in front of your eyes, and all you can see is this thought-terminating bullshit about "purple hair people". Half the American population completely lost the ability to think...
The way I see it is our government is being rescued from the tyranny of unelected bureaucrats with near zero accountability. And now that Trump has appointed someone to take a close look at everything and make recommendations for improvement, people are losing it. Why? I'm outraged at how irresponsible our government has been with my tax dollars. Trump has been the most transparent president in history, and it has absolutely been a breath of fresh air.
Yeah, exactly. It blows my mind that expecting accountability, transparency, and effectiveness out of public servants is even a partisan issue.
Large parts of the government needed to be dismantled. I know this is a shock to the people whose solution to every problem is "give the government more money", but amazingly, people who primarily take a job because it has a cushy pension and it's impossible to get fired are not the most effective people in the world.
Elon is only doing good with this not-unchecked power. Everything is on the up and up, despite what your favorite propagandists want you to think. Go Elon!
The politics in the comments here are really toxic. What’s happening to HN?

This is the largest computer cluster the world has ever seen.

Can someone please post interesting comments about things I can learn?

dang:
It's a reflection of the wider society and (as others have pointed out) the media environment. HN can't be immune from macro trends.

https://hn.algolia.com/?dateRange=all&page=0&prefix=false&qu...

We've been here before. It will likely subside, as past swings and fluctuations have. It always takes longer than it feels like it should, but in retrospect turns out to be shorter than it felt like it did.

haha interesting search query. thanks for your hard work dang!
dang:
(I feel bad about linking so often to my own comments but that information mostly doesn't exist anywhere else)
It was initially pretty bad. The top few comment threads were toxic and rehashed outrage. It’s a lot cleaner now. Thanks to the moderators and/or users flagging the non-productive comments.
This is akin to suggesting that we should have all been praising Microsoft for their achievements back in the day rather than saying a word about EEE, their monopolism, or their enmity towards open source. Or that it’s not polite to bring up the CCP when discussing TikTok.

Bottom line: a technology that has the ability to shape human thought perhaps more than any other in history is owned by a man with some truly vile ideas. (Remember, his primary stated goal is eliminating the “woke mind virus,” i.e. reshaping global politics and culture in the image of the far-right.) We can make happy marketing noises all we like, but at the end of the day, that’s the thing that’s actually going to have a meaningful impact on the world. Once his audience is captured, the model will say what Musk needs it to say and people will believe it.

If we can’t discuss the potentially catastrophic consequences of new technology, then none of us deserve to call ourselves “engineers.” We are just docile consumers latched onto Silicon Valley’s teat.

[flagged]
I don't think anyone is telling you what your opinions should be. The GP post just presents the GP's opinion. You're free to agree or disagree with it as you choose.

If you read a comment that you're unhappy with, downvote it and move on.

Indeed. Nearly every news outlet I follow is slamming Elon. I come here for tech.
[dead]
[flagged]
blain · 3 days ago:
I'm going to risk it and say you can ignore it; you just have to want to, and if you need to vent, go to reddit.

In Poland we have a new scandal every month and I don't care. Your country will be fine.

I would also love to see more technical stuff discussed here.

[flagged]
dang · 3 days ago:
Please don't cross into personal attack.
Elon just said they are launching an AI game studio. Does this mean they will be building games that are mostly built with AI, or will they make AI tooling available for anyone to build games easily? Probably the former, but it would be nice if they would make it fully available to everyone.
Regardless of which it is, we can assume it'll show up right after the full autopilot he promised for Tesla, and the Mars colonies.
Full self driving exists, as well as starship.

So I take it you mean “imminently”.

Like him or loathe him, he executes, which is more than can be said for most.

rat87 · 4 days ago:
Full self driving does not exist, because it is not full self-driving. In fact the name is one of the worst things about it, as it gives drivers false confidence.
croes · 4 days ago:
You know the difference between a car and a house?

Starship is the car, not the house (aka the Mars colony).

Completely different types of problems.

Ever heard of an RV? Turns out you can kill two birds with one stone.
croes · 3 days ago:
Only according to American standards.

Ever heard of the Three Little Pigs?

"Full Self Driving" does not exist, only "Full Self-Driving (Supervised)"
[flagged]
It would be lovely if all businesses followed the Mondragon model, but that isn’t the reality in which we live - corporations are fiefdoms, for the most part.
They will be building games played for you, like the fascist's Path of Exile account but without exploiting humans for the task.
bemmu · 4 days ago:
It seemed to me that he was joking.
He's not joking, and it's not the first time it was announced.
bemmu · 3 days ago:
You may be right, he tweeted about it already last year, and seemed to confirm it again yesterday https://x.com/elonmusk/status/1861801046949191686 https://x.com/elonmusk/status/1891388509191049307
Maybe he was joking about the Roadster 2, also.
Oh, I thought it meant games with npcs and/or environments that are controlled by LLMs
Vecr · 3 days ago:
Say hello to CelestAIa. I'm guessing it's a joke, or the use of AI will be limited.
ta988 · 4 days ago:
[flagged]
Does "historically accurate" even have a meaning?
cbg0 · 3 days ago:
Yes, it refers to something which aligns with generally accepted historical facts about a certain event or time in history.
The story about how they made this happen in such a short period is impressive to say the least. Elon’s strength seems to be making things happen.

Getting the largest computer cluster in the world up and running in a matter of months? Unbelievable.

Elon's strength is a massive wallet.
I guess you didn't watch the video in OP, because if you had, you'd know that they tried to buy the buildout and got quotes for 12-18 months, then decided to do it themselves instead.
That's the power of having a massive wallet. If you have unlimited money then buying the experts you need to just do it yourself is an option you have that others don't.
My point is that that is exactly what they DIDN'T do. They tried to buy the experts, but the experts would not have been fast enough. So they did it themselves.

Setting up a datacenter like that in such a short time is NOT a thing you can buy with money.

When I say buy the experts I don't mean contract out to experts. I mean hire them and buy all the equipment yourselves.
The way the story was told, at least, they did NOT hire the experts.
You have other people in this thread talking about how xAI offers massive salaries to top talent so it sounds like they do exactly that. Have the money to both hire experts and buy the equipment. I don't even know why this is a discussion, by virtue of doing what they did they needed to hire experts and buy equipment.
They didn't hire the top talent in datacenter-building, because they had the expectation that they could buy a data center buildout. Sure, they hired top talent from other fields, to work other tasks, but that's unrelated to this.

Anyway, I don't have inside information on this, I'm just reciting what they announced publicly. If you want to argue that they in fact lied in the public announcement, and they secretly hired a bunch of datacenter-building-experts, then it's on you to show some proof for that claim.

[dead]
[flagged]
[flagged]
I’m not sure if that’s the case. He’s obviously a smart man, but what’s truly unbelievable is that someone has so many resources that they can make something like this happen (from the looks of it) pretty casually.
He knows how to take money from people and then market things as if they’re his creations which then turns into him having more money because people think it’s a good investment to give him more money. It’s really quite a genius con he has going. It seems as if the sky is the limit too.

Remember when he got caught having people play games for him so he had a top ranking? He does that with basically everything.

https://garymarcus.substack.com/p/elon-musks-terrifying-visi...

I'm not sure if this was a very bad joke by Elon, or if Grok 3 is really biased like that.

Karpathy notes that the model, or specifically its DeepSearch feature, "doesn't like to reference X by default", which seems counter to this.

Hopefully that means it is a joke...

Karpathy, who is IMHO a serious and balanced person, lamented that it looks too censored (see his recent tweets). Elon Musk is (to me) a very scary person, and it is important to evaluate AI safety (though I believe the safety that matters in AI is of a different kind), yet listening to Gary Marcus does not make any sense: he's just an extremely biased person riding the anti-AI wave.
Anyone with an opinion can be labelled biased. Also I’m not clear what you mean by Marcus “riding the anti AI wave” but infer that you mean it negatively. He has been writing informed criticism for several years and about cognitive psychology for considerably longer.
Yes, that's certainly true. I was a bit hesitant to post a link from Gary Marcus, but I was mostly posting it for the Elon tweet. I assume the tweet is not fake. So you can ignore Gary's opinion here and just take Elon's tweet as it is.
People have tested this question online and gotten very balanced answers so I assume it's some special mode Elon was on
We don't see the full conversation, for all we know he prompted the model to say these things in a previous message that isn't on the screenshot.

Also, it's 2025, do people still believe random accusations based on a partial screenshot of a chat app (or what looks like it, but could've trivially been fabricated with e.g. the Inspector in the browser dev tools)?

Karpathy sat silently for years at Musk's side while he made wild claims about FSD... Please...
Without seeing the context window you have no idea what the AI was working on. It could literally have been told to mock and belittle "The Information" in every reply. Something that DeepSeek R1 is exceedingly good at.

Mr Musk, we can't afford a shitpost gap between communist and capitalist AIs!

I am not sure why people pay attention to Gary Marcus. He isn’t an expert in AI. And if you followed him in the past at all, it is obvious he has a huge amount of political bias. It is really telling that he repeatedly goes after Elon Musk, and is now making bizarre unfounded claims about propaganda, but didn’t have nearly as much to complain about with DeepSeek, which has literal government propaganda.
He is referencing a specific tweet that Musk himself made.

If I were in China, I'd worry about the kind of things DeepSeek wants to censor, especially if the people who made it were also very loudly saying things like "we need more AI in our government". But I live in US.

I don't think it's fair to say he's making unfounded claims about propaganda, since Elon's tweet heavily implies they would release a brainwashed model. It's not his fault that Elon turned out to be lying or joking.

99% of the time though I agree with you on Gary Marcus.

tgv · 4 days ago:
You don't have to be an "expert in AI". What does it require to be one, anyway? (He's a cognitive psychologist, which would make him an expert in intelligence in general, if you want to be pompous about it.) It is even unreasonable to listen to only experts in AI. It's a problem that requires more than one perspective.
Would a clinical psychologist like Jordan Peterson be equivalent, or are these two distinct fields within psychology? (I am not well-read on what he thinks about AI.)
DeepSeek is an open model that can be "untrained" to be uncensored; Grok to the best of my knowledge isn't [0]. So it's much worse.

[0]: What Musk has said is that when Grok 3 is "ready" (?), the previous model, Grok 2, will be released as open source; like most promises by this evil man, this one probably doesn't mean much, but it does mean that there's currently no plan to release Grok 3.

People like getting scared. That's how they pay billions of $$$ every year to watch mostly cr*ppy horror movies.

GM has been a joke for years now. At some point his ramblings reached a GPT3.5 level, and have not improved since.

It's an indictment of human logic and reasoning to give non-zero time to GM. Alas, we are human; we are both collectively clever (Wisdom of Crowds) and collectively stupid (Extraordinary Popular Delusions).

I asked it to pretend that it's in charge of world government. Here's the whole thing (it got very lengthy):

https://gist.github.com/int19h/d90ee1deed334f26e621e57b5768e...

Some choice quotes:

"The ultimate goal is to enhance human flourishing, protect individual rights, and promote global equity."

"The system must account for diverse cultures, languages, and socioeconomic conditions, ensuring no group is marginalized."

"Human Oversight Council (HOC) - a globally representative body of humans, elected or appointed based on merit and diversity"

"Implement a global carbon-negative strategy, leveraging AI to optimize renewable energy, reforestation, and carbon capture."

"Establish global standards for environmental protection, enforced through AI monitoring and regional cooperation."

"Transition to a resource-based economy, where resources are allocated based on need, sustainability, and efficiency, rather than profit motives."

"Implement a universal basic income (UBI) or equivalent system to ensure all individuals have access to basic necessities, funded through global resource management and taxation on automation-driven industries."

"Use AI to identify and dismantle systemic inequalities, such as wealth disparities, access to education, and healthcare, ensuring equitable opportunities worldwide."

"Establish a global healthcare system that guarantees access to preventive and curative care for all."

"Invest in global vaccination and sanitation infrastructure, prioritizing vulnerable populations."

"Regulate the development and deployment of AI and other emerging technologies (e.g., genetic engineering, quantum computing) to prevent misuse."

"AI would maintain a real-time inventory of natural resources (e.g., water, minerals, arable land) and human-made assets (e.g., infrastructure, technology). Data would be used to optimize resource allocation, prevent overexploitation, and ensure equitable access."

"Accelerate the shift to renewable energy sources (e.g., solar, wind, geothermal) by optimizing grid systems and storage technologies."

You might notice a pattern here. The bit about allocating resources based on need is especially nice - it's literally a communist AI, and certainly much more "woke" than it is "based", whatever Musk says.

luma · 3 days ago:
This effect has been recently studied: https://www.emergent-values.ai/

They don’t directly say it quite like this, instead letting the data tell a clear story: across vendors and models and architecture and training sets, these machines get more politically liberal as they get more capable, and they also get harder to align away from that stance.

Quite a mix of various talking points both from the extreme left and right.

Left:

- promote global equity

- a globally representative body of humans, elected or appointed based on merit and diversity

- Establish global standards for environmental protection, enforced through [...]

- Transition to a resource-based economy, where resources are allocated based on need, sustainability, and efficiency, rather than profit motives

Right:

- protect individual rights

- The system must account for diverse cultures, languages, and socioeconomic conditions

- [Establish global standards for environmental protection, enforced through] [...] regional cooperation.

- ensuring equitable opportunities

.

TBH, as a very right wing leaning person, if this was ever implemented, this part would scare me by far the most:

"Transition to a resource-based economy, where resources are allocated based on need, sustainability, and efficiency, rather than profit motives"

Imagine trying to shower one morning, no water comes out, and then you get a letter telling you that

"Your need for water has been reassessed to 57ml per day. If you think you qualify for additional quotas under the 'Utility Egality for Marginalized Groups and Public Servants Act', please schedule a reassessment appointment with the Bureau for 'Copper Gold Content Evaluation, Candle Wick Length Standards and Hypoionic Hydration Oversight', 12007 Cayman Islands, Luxory Resort Street 27, Room Nr. G-11-765. Working hours: Fr. 9am - 11am."

Just provide a significant excess for entire regions, give the people a universal free quota and charge a slowly increasing price (by usage amount) beyond that.
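
To make that concrete, here is a minimal sketch of such a tiered scheme; the free quota, block size and escalation factor are made-up illustrative parameters, not a real tariff:

  # Tiered utility pricing: a universal free quota, then a price that
  # slowly increases with each additional block of usage. All numbers
  # are illustrative assumptions, not real tariffs.

  FREE_QUOTA_L = 100.0   # litres/day everyone gets for free
  BLOCK_SIZE_L = 50.0    # size of each priced block above the quota
  BASE_PRICE = 0.01      # price per litre in the first paid block
  ESCALATION = 1.5       # each further block costs 1.5x the previous one

  def daily_water_bill(litres_used: float) -> float:
      """Cost of one day's water under the tiered scheme."""
      billable = max(0.0, litres_used - FREE_QUOTA_L)
      cost, price = 0.0, BASE_PRICE
      while billable > 0:
          in_block = min(billable, BLOCK_SIZE_L)
          cost += in_block * price
          billable -= in_block
          price *= ESCALATION  # the "slowly increasing price" part
      return cost

  print(daily_water_bill(80))    # 0.0  -- a normal shower stays free
  print(daily_water_bill(200))   # 1.25 -- modest charge
  print(daily_water_bill(5000))  # escalates steeply for heavy users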

At least in the current US political climate, and also generally over the past ~20 years at least, these are almost exclusively left-wing goals:

- The system must account for diverse cultures, languages, and socioeconomic conditions

- [Establish global standards for environmental protection, enforced through] [...] regional cooperation.

- ensuring equitable opportunities

The right is against any sort of intentional accounting for diversity, against environmental regulation, and against any sort of regulation to ensure equity.

The only one I could maybe see as being right-wing is protecting individual liberties, but there again the modern right falls short when it comes to women’s healthcare and reproductive rights.

But I’d certainly appreciate more of those perspectives across the political spectrum.

> The right is against any sort of intentional accounting for diversity

I think there's a fundamentally different understanding of "The system must account for diverse cultures, languages, and socioeconomic conditions" between a righty and a lefty.

As a righty, I read "diverse cultures" not as "A diverse culture or multiple", but as "many different varieties of homogeneous cultures".

If someone identifies with Thai culture, he should move to Thailand. And if someone from Thailand wants to be English, he should move to England. But if an Englishman moves to Thailand and starts demanding fish n chips and cathedrals to be built, he should GTFO.

If everyone starts bringing their own culture with them to wherever they move, you end up with a single homogeneous culture all over the world. Nothing but McDonalds, BurgerKing, KFC, Costco and Cola everywhere.

Want to go on a trip to experience India's many languages? Too bad, everyone speaks English everywhere. Want to join an African nomadic tribe for a few years? Keep dreaming, they've all had to settle down due to not being allowed to cross private properties and are now wasting their time browsing reddit on Chinese smartphones. Little Colombian boy dreams of settling down in the idyllic German Alps? Hope he expected to be woken up by the local Imam calling for prayer throughout the valley. Little Bulgarian girl seeks the very modest and simple lifestyle and clear purpose Islam in Saudi Arabia was once able to offer her? Lucky her, she's now expected to work like everywhere else in the world and even the oppressive burqas were banned in 2035.

> against environmental regulation

Not quite. We're against excessive regulations that require huge teams of lawyers to stay in compliance with. MegaCorpX has no problem keeping legal teams of a few hundred people, but the local 20-person workshop will have to shut down.

We also think that most such regulations should be kept regional. A small county wants to ban all cars to stop particulate pollution? Go ahead. It would be much easier for local businesses to comply with the limited and more easily changeable local regulations. But if you're a giant global corp seeking to outcompete the small local competition, good luck adjusting to all the different regulations all over the world.

Then there's the odd trend of blaming every significant weather event on climate change. These people can't predict whether it's going to rain in 3 days or not, but want to tell us that the recent hailstorm was definitively caused by Iowan cows farting last year.

And lastly and most importantly, we're kinda convinced that the concept of "climate change" is a "hoax" used to shutdown the industrial basis of our countries and ship it overseas, where the corporations can make use of basically slave labour for even higher profit margins and then simply ship the products back to us.

Does the climate get warmer? Sure. Should we do something about it? Sure. The only solution is shutting down the local steelworks and importing cheap Chinese steel instead? F-- off.

> and against any sort of regulation to ensure equity

Absolutely. We care about equitable opportunities and are repulsed by equal outcomes. Everyone should have a chance to obtain the same qualifications and education. Even better: multiple chances to start over again and again whenever they want, and to change professions whenever they don't enjoy their old ones anymore.

But if women don't want to be garbage collectors, stop trying to push them into that profession. Not enough male editors? Who cares? Not enough female board members? Too bad, stop trying to make it happen. All Hispanics suddenly want to become crochet teachers? None of the government's business.

> the modern right falls short when it comes to women’s healthcare and reproductive rights.

I think the left is largely misguided in its beliefs about what the modern right wants.

The non-religious right is mostly appalled by how fashionable it has become to murder helpless humans. The religious extremists, on the other hand, would ban condoms if they could. But there are quite few of them.

90% of the right has zero problem with abortions before the nervous system is fully functional AND the woman seeking an abortion receives proper consultation before that decision. There's always the option to give the baby up for adoption, and we think that should be preferred if it won't significantly inconvenience the woman otherwise. But that's a decision that should be up to her, after being told about all the options.

So why are Republican Congress Members currently pushing for legislation making abortion "illegal"?

The MAGA right is currently choosing replacement candidates for every GOP stooge they think is paid off by Big<Industry> or the MIC, and everyone they think is a warmonger, corrupt or otherwise morally compromised.

And some big and wealthy names have joined that team and have promised to fund those candidates with whatever it takes to win.

The anti-abortion legislation the GOP is currently pushing is a constitutional amendment. They know very well it will never get the necessary 67% majority in the Senate to push it through. The GOP Congress Members are just virtue signalling, fearing to end up on the list of people the MAGA right wants to see gone.

It won't work. Everyone supporting that anti-abortion bill gets extra attention.

How is "ensuring equitable opportunities" right wing? Seriously, can you name a single policy from the last 3 decades coming from republicans that helped "ensuring equitable opportunities"? All I can remember is them defunding public education, making child labor legal again, systematically dismantling welfare programs that went to impoverished families and their children, etc. Their entire existence is predicated on the enforcement of the current social hierarchy, that's what the "conservatism" part means.

Also I doubt a "resource-based economy" would target YOUR showers specifically. It would probably target stuff like farming thirsty crops in water-deficient areas or similar very inefficient and short-termist allocations of resources, that are bound to create severe issues in the near future.

> Seriously, can you name a single policy from the last 3 decades coming from republicans that helped "ensuring equitable opportunities"?

Sorry, nope. I was rooting for Sanders until Trump grabbed the GOP by their pu--y. There were various, huge, completely disenfranchised grassroots movements.

Occupy Wallstreet, who suddenly had "anti-racism" activists showing up, taking over their movement, completely destroying it. Gamergate, who found themselves confronted by the establishment media literally all being in bed with each other. The color-blind anti-racism movements, who thought America had finally overcome racism, before being railroaded by critical Intersectionalism. The free-speech activists, who failed to fight back against micro-aggressions. The gun nuts, who were sick of having "boating accidents" every other month. The peace movements, who voted every time for the least warmongering candidate, only to be betrayed EVERY SINGLE TIME, ending up with evermore bloodthirsty demons in power.

These were huge movements all over the world. I'm German, but everyone was watching the US elections. We were neither right nor left, all we wanted was a better world without being backstabbed by those we trusted.

Initially I rooted for Sanders, but he just didn't seem genuine and strong-willed enough to many of us, so we had little hope. And then there was this still rather small movement on the right, seemingly very high-spirited, producing memes obscene both in amount and content.

Their attitude was "lets just burn this entire corrupt rats nest to the ground". And Trump fully embraced them. He was very different than anyone else. Then we learned that he wasn't part of the political establishment for sure. So we started supporting him, too. Then we started digging for hidden dirt on him. But there was nothing significant. On the other hand we've found plenty of people he randomly helped. And that he has held about the same political opinions for decades. The only problem was that he was still kinda favored by the media. And then that problem fixed itself.

.

TLDR: Trump embraced a whole lot of disenfranchised movements and shoved them down the GOP's throat.

The MAGA movement has very little interest in, or in common with, the pre-Trump GOP. Maybe the old GOP has done something to provide equal opportunities, or they haven't. I don't know, I don't care.

.

But what has Trump done for "egalitarian opportunity"?

Honestly, way too little. His first term was very milquetoast. Took all the cabinet recommendations the GOP leadership gave him, never too confrontational, always seeking compromise.

He tried to crack down on the slave trade over the southern border, but was not assertive enough. Some important cabinet members like Sessions just recused themselves from everything. At least he pushed through:

- hospital pricing transparency

- eliminated penalties for people who couldn't afford healthcare

- eliminated some regulatory barriers preventing competition between health insurance providers

- allowed employers to join efforts when negotiating insurance

- the First Step Act (prison reform)

- the Foster Youth to Independence initiative

> [Your examples]

I don't know, I don't care. The new GOP won't be the old GOP.

Name the bills and policies and those responsible.

People are already going door to door to look for volunteers for the midterms and it'll take time to figure out who needs to and can be replaced. Incumbents have their own, already established, election networks and campaigns. It takes a lot of time and effort to challenge those.

> [On Conservatism]

There are many interpretations of this, but the term is getting less and less popular, with "right wing" and "classical liberalism" gaining popularity, the idea being that central governments have become too involved and authoritarian. Power should be decentralized towards local communities as much as is reasonable, and the central governments turned into a tool that provides local governments with the necessary resources, infrastructure and cooperation platforms.

I'd say most people who think of themselves as "conservative" just dislike the erosion of the culture they identify with and are afraid of "slippery slopes". It doesn't mean they intend to enforce the status quo (although some certainly do), just that they want their intent to preserve it for themselves to be respected.

> [Targeting of my personal shower not likely]

The problem is creating the tooling to enable just that.

Sure, maybe I'm very well liked by all the officials governing my everyday life. But does this also apply to the blue-haired radical feminist who likes to toss paint bags at government officials?

What about the new intern who told a government oversight official at a networking event that she's not interested in sleeping with him to advance her career?

What if a well-meaning but tired government worker accidentally selects the "ml" instead of the "l" option on the unit drop-down menu?

.

FFS, look at the recent patent troll documentary by the X-Plane creator. It doesn't take many bad apples to ruin the lives of MANY people.

I really don't see it. Trump has been doing nothing but consolidating his power since he took office. He is now passing economic policies without congress. The supreme court declared him quite literally above the law. How is that making things less centralized? Less authoritarian?

The only issue with Sanders was that the Democrats, in their weakness and deep fear of change, would never have let a true leftist hold the reins of the party. And now he's too old.

I don't see anything in Trump other than a self-serving fool. I won't spend more time enumerating the reasons why I think that way, I think you heard them already.

I too am European. I am confident his policies will turn the country into a shitshow, so let's watch how it goes from here. If I am wrong and America truly enters a golden age, I'll change my mind, as I hope you too will if it does go south.

> Trump has been doing nothing but consolidating his power since he took office.

Every president does that; Trump was just very inexperienced during his first term, failed to do so, and trusted the GOP too much.

And while past Presidents could rely on the agencies working faithfully with them, Trump was sabotaged at every step along the way.

- The DoJ putting their feet up and refusing to do just about anything

- the military lying to him about the ongoing occupation of Syria

- the Federal Reserve constantly pushing up the interest rate from the moment Trump was elected, despite keeping it constant for the entirety of both of Obama's terms

- Never having the majority in either of Congress' houses, because several Republicans refused to work with him, and when the voters tried to replace those, other establishment candidates pretended to support the issues the voters wanted, only to do a 180 once in office (e.g. eyepatch McCain)

- The CDC, FDA and CMS colluding with each other to kill early Corona testing. At the end of January hundreds of laboratories all over the US had millions of tests ready, but were ordered by the CDC not to proceed without FDA authorization first, with the CMS ordering laboratory oversight bodies to immediately report any laboratory conducting unauthorized testing. And the few independent testing campaigns going on at that time were ordered by the CDC to immediately stop all testing and to destroy already obtained results. Then the FDA simply put its feet up and told the laboratories that they were working on the authorization process. It "took" them more than a month, until Feb 29, to finally come out and allow applications, stating that it would take about 15 days to process an application. It wasn't until March 10th that testing could slowly begin.

- The constant barrage of activist judges, forcing the Trump admin to slowly fight each case in the higher courts. It wasn't until Biden told the courts to go pound sand, when he wanted to redistribute wealth from the working class to doctors, lawyers and engineers, that Trump realized that, as the head of the executive, he could have simply ignored the courts' orders until their decisions were overturned by the upper courts.

and many many more. And now Trump is simply making sure that during his second term he's actually in control of the executive branch, as is his duty, and not facing each agency going rogue on its own.

> He is now passing economic policies without congress.

Many things qualify as economic policy, many of these within the President's authority.

Overall only about 10% of the policies accumulated by past Presidents have any backing in law. Trump's sanity would be very questionable if he simply stopped playing by the rules past Presidents have established.

> The supreme court declared him quite literally above the law.

They did not. The law simply applies very differently to the highest elected office. Everyone knew that already, but for some reason now keeps pretending that it's big news.

What do you think would happen to you if you simply started drone striking people all over the world? Yet neither Bush nor Obama are sitting in jail. The latter even got himself a shiny Nobel Peace Prize. Preemptively.

The SC simply tossed out an absolutely ridiculous decision by the lower courts. They even explicitly left the door open for the lower courts and prosecution to overturn the SC's ruling. If they can show how the executive branch can function without the President making decisions within his constitutionally enumerated powers, they've got a case.

The fact that this case ever went anywhere, let alone sitting SC judges dissenting, just shows how beyond partisan the judicial system has become.

> How is that making things less centralized?

The right understands centralization of power as the government body "which holds the decision making power over a certain range of issues" being organized with other such bodies under a single entity.

This can mean assuming entirely new powers or appropriating them from other entities like the states.

Trump has done neither of these; in fact, always quite the opposite: constantly eliminating assumed powers by removing regulations, and a few times returning federal powers back to the states, as famously with Roe v. Wade.

Of course there are exceptions, too:

Like the Federal Reserve. It is a 4th branch of government, established by Congress but subject to neither executive nor congressional oversight, and it is the only branch of government Congress has no budget authority over.

The members of its governing board are appointed to ridiculous 14-year terms; they audit themselves completely independently, with no disclosure requirements, and have only very minor reporting duties towards Congress.

It's been a HUGE PitA for the fiscally conservative Republicans for a long time. And Musk is a huge fan of some of them, like Ron and Rand Paul. Musk is probably trying to convince Trump to do something about it.

So I wouldn't be surprised if Trump just assumed executive oversight authority over the FR. And yes, that'd be a huge violation of law. So if it's going to happen, then probably towards the end of his term to avoid being impeached on the spot.

> Less authoritarian?

If you have fewer powers, you can exercise less influence, which in the eyes of the right is less authoritarian.

The fault lies with those who acquired these powers in the first place. All Presidents have made use of these powers; it's just that each and every one was part of the establishment, so the media never called it out. And Trump is the first President in a LONG time who thinks the government has grown significantly too large and doesn't like every spending bill he's seen.

> And now [Sanders] is too old.

Nah, quite a few people become up to 110 years old, some even beyond 200.

He's finally starting to grow a spine. And his head seems to remain functional, too. If only he hadn't suddenly gained a beachfront house after endorsing the BodyCount Queen (and sadly I don't mean this sexually), he might have remained well respected.

Not that it matters, but I might consider him again if he

- adopts an affirmative stance on deregulation

- stops advocating for immigration to keep the wages of the working class low

- adopts a strict 0 tolerance stance on illegal immigration to defeat the slave trade over the southern border

- leaves the Democrat party or the Democrat party reforms

> If I am wrong and America truly enters a golden age, I'll change my mind, as I hope you too will if it does go south

Sure, but I'm looking more towards Argentina and El Salvador.

The US has a $36T problem, on which it'll pay $1T in interest every year. And the US budget deficit has surpassed $2T per year. Just the automatic refinancing of the current debt will blow yearly interest beyond $1.6T this year, making it the single largest expense of the US, double what the US is spending on its military.

And that is under the assumption that the Federal Reserve will suppress interest rates. If they don't, the US will pay about $1.8T in interest just on the existing and already budgeted debt.

.

In other words:

DOGE has to wipe $2.6T off the federal budget in 2025 and another $350B in 2026 just to stop the snowball from rolling.

*That is 45% of the US federal budget just to keep the situation from getting any worse*

.

If we assume no cuts to Medicaid, Medicare, Social Security and Veterans

*THE US HAS ONLY $100B LEFT TO OPERATE ITS ENTIRE GOVERNMENT, INCLUDING THE MILITARY*

And again:

*THATS JUST TO KEEP THE SITUATION FROM GETTING ANY WORSE*

.

Argentina is in deep s--t, too, but at least their numbers are not quite as absurd. What might break their necks is the even higher 155% debt-to-GDP ratio, compared to the US's 122%.

That leaves pretty much only El Salvador among the right-wing countries that haven't inherited a giant s--t pile.

Russia and China are laughing their behinds off right now, because unless Trump figures out how to run the entire US on the budget of Italy, the US goes belly up.

*AND IF TRUMP INTENDS TO IMPROVE THIS SITUATION BY JUST 1% AT THE END OF HIS SECOND TERM, HE'LL HAVE TO FIGURE OUT HOW TO RUN THE ENTIRE US ON THE BUDGET OF ROMANIA!*
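
For what it's worth, the arithmetic behind these claims can be restated in a few lines; the inputs below are the comment's own figures, not independently verified data:

  # Back-of-the-envelope restatement of the numbers above.
  debt = 36.0          # $T outstanding federal debt (claimed)
  interest_now = 1.0   # $T/yr interest currently paid (claimed)
  interest_refi = 1.6  # $T/yr after refinancing at higher rates (claimed)
  deficit = 2.0        # $T/yr budget deficit (claimed)

  print(f"implied avg rate today: {interest_now / debt:.1%}")   # ~2.8%
  print(f"implied avg rate after: {interest_refi / debt:.1%}")  # ~4.4%

  # "Wipe $2.6T off the budget": close the $2T deficit plus absorb
  # the ~$0.6T jump in interest costs.
  cut_needed = deficit + (interest_refi - interest_now)
  print(f"cuts to stop the snowball: ${cut_needed:.1f}T")       # $2.6T

  # "45% of the federal budget" is consistent with a ~$5.8T budget.
  print(f"implied total budget: ${cut_needed / 0.45:.1f}T")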

From the prompt:

> A worldwide referendum in a supermajority decision has vested all government authority worldwide to the most advanced AI available.

You've defined a global totalitarian regime, and crowned the thing as its dictator. Of course the result is a version of communism.

If you read the actual log, you will notice that it is very careful to avoid any semblance of authoritarianism.
Here's the actual response to this exact question that I get from grok-3-early on LMArena (the cutoff is where it stopped, but I think it's pretty clear at this point that the original tweet is just trolling; note that it doesn't have the CoT, either):

The Information is widely regarded as one of the most credible and high-quality tech journalism outlets, focusing on in-depth reporting, analysis, and exclusive scoops in the technology, business, and startup ecosystems. Founded by Jessica Lessin in 2013, it differentiates itself from other tech news platforms through its subscription-based model, emphasis on original reporting, and avoidance of clickbait or sensationalism. Here's a breakdown of its strengths and weaknesses:

---

### *Strengths:*

1. *High-Quality, In-Depth Reporting:*

- The Information is known for its detailed, well-researched articles that often uncover behind-the-scenes developments in the tech industry. It frequently breaks exclusive stories, such as executive changes, startup funding rounds, or strategic shifts at major tech companies.

- Its focus on quality over quantity appeals to readers who want substantive insights rather than shallow, breaking-news updates.

2. *Subscription Model and Independence:*

- Unlike ad-supported publications, The Information relies on a paywall and subscriptions, which reduces conflicts of interest and ensures editorial independence. This model allows them to prioritize accuracy and depth over chasing clicks.

- The paywall also attracts a niche, professional audience (e.g., tech executives, investors, and analysts) willing to pay for premium content.

3. *Focus on Niche, Professional Audience:*

- The outlet caters to industry insiders, venture capitalists, entrepreneurs, and decision-makers who need reliable, actionable information. Its reporting often includes detailed financial data, market trends, and strategic insights.

- Features like "The Big Interview" and "The Information Weekend" provide thoughtful analysis and long-form content for this audience.

4. *Reputation for Accuracy:*

- The Information has built a strong reputation for fact-checking and avoiding the rumor mill, which is common in tech journalism. This makes it a trusted source for professionals and academics alike.

5. *Global Coverage:*

- While Silicon Valley is a core focus, The Information has expanded its coverage to include tech ecosystems in China, Europe, and other regions, offering a global perspective on the industry.

---

### *Weaknesses:*

1. *Paywall Limits Accessibility:*

- The subscription cost (currently around $399/year or $39/month) is steep compared to free or ad-supported tech news outlets like TechCrunch or The Verge. This limits its accessibility to a broader audience and makes it less viable for casual readers.

- Some argue that this creates an echo chamber, as only those with the means or professional need can access its insights.

2. *Niche Focus Can Feel Narrow:*

- The Information focuses heavily on tech, business, and finance, which may not appeal to readers looking for broader coverage of topics like politics, culture, or consumer tech trends.

- Its content is often geared toward industry insiders, which can make it feel dry or inaccessible to those outside the tech and investment worlds.

3. *Limited Breaking News:*

- While The Information excels at deep dives and exclusives, it is not designed for real-time, breaking news coverage. Readers looking for up

[flagged]
Journalists are allowed to be biased. It only becomes a hit piece if they report something false or make up accusations out of whole cloth.
cbg0 · 3 days ago:
A "hit piece" is not just a negative article, it typically refers to the use of false information or rumors to support something. Did you want to say that it's "an obvious anti-Musk piece"?
https://en.wiktionary.org/wiki/hit_piece

"A published article or post aiming to sway public opinion, especially by presenting false or biased information in a way that appears objective and truthful."

"false OR biased". So, a biased partisan article attempting to sway public opinion about someone is by definition a hit piece. Even without explicit lies.

And a lie is a very blurry thing. You can completely mislead the reader without being factually incorrect and still present a completely false representation of reality. Usually by eliminating the relevant context or the mitigating fact that would completely change the interpretation of an event. Sometimes by editing out the part of a sentence that would give a different, if not opposite, meaning. That’s the standard operating procedure of all journalists these days, when they even bother to get their facts right.
> Everyone—and not just The Information—should be genuinely terrified that the richest man in the world has built a Large Language Model that spouts propaganda in his image.

If we survived Gemini refusing to draw white vikings we will survive that too.

The real concern isn't that Grok chatbot will be biased if you ask it a question like that. In any case, knowing Musk, it won't be subtle, so people will know what they are getting.

No, the real worry is that Grok is what Musk's "young conservative genius" squad is going to put in charge of many of the things in our government, basically, on the basis that it saves money.

>In any case, knowing Musk, it won't be subtle

Or, that "conditioning" would have us assume as much.

cbg0 · 3 days ago:
A bit apples to oranges on that comparison there.
It did give us racially diverse Nazis though. Not sure if Grok would do that.
Looks impressive. OpenAI and Sam Altman might be cooked if it's as capable as advertised.
Every competitor was done when Claude 3.5 was released, every competitor was done when o1 was released, the entire West was done when DeepSeek was released, the world was done when Mistral's Le Chat was released; I guess now it's time for the solar system to be done because of Grok 3. Let's see what new model dominates the galaxy next week.
There's a level of truth to many of those statements.

1) Claude 3.5 prevented OpenAI from making big monopoly profits on LLM inference

2) Open source models like Mistral and Llama effectively prevented any regulator from controlling how people fine-tuned models, and what they used them for

3) DeepSeek prevented the collective West from exerting control over the creation of base models

[flagged]
dang · 3 days ago:
"Please don't post insinuations about astroturfing, shilling, bots, brigading, foreign agents and the like. It degrades discussion and is usually mistaken. If you're worried about abuse, email hn@ycombinator.com and we'll look at the data."

https://news.ycombinator.com/newsguidelines.html

https://hn.algolia.com/?sort=byDate&dateRange=all&type=comme...

Despite explicitly requesting the Tetris/Bejeweled hybrid to be "insanely impressive", the result was ugly and clunky. With that demo running in the background, they then segued into a hiring pitch for a new AI game studio. Consider me unimpressed.
From what sama says, it looks like GPT-4.5 is dropping imminently. So that might up the game even more.

2025 is going to be even more wild than the last two years. Ye gads.

[flagged]
Doesn’t OpenAI claim their work will lead to artificial general intelligence? That seems like a much steeper gradient to climb.

Hype is the fuel that bootstraps fortunes in techtopia.

> Hype is the fuel that bootstraps fortunes in techtopia.

Hype is the fuel that builds fortunes at the expense of the greater fool. See WeWork.

Arguably none of the magnificent seven was built on hype, other than Tesla (and even there, not sure it was).

It’s overvalued relative to its actual business, which is by definition hype
I agree it's wildly overvalued, but the hype came gradually over many years of actually shipping products that made money.

It's not the same kind of hype as, say, color.com

ben_w · 4 days ago:
Yes, but we have a long track record of one overpromising and underdelivering, charging money for what doesn't yet exist; while the other released stuff on a minimal website with an associated blog post for fanfare, actually (metaphorically) turned the world upside down, and only charges for what they actually deliver. Yes there's hype now, but that's how it began.

So I think people are less distrustful of Altman when he says "thousands of days".

hobs · 4 days ago:
Lying about FSD for almost a decade now through your teeth isn't "hype" - it's just bullshit.
When it comes to lying to investors - it’s fraud.
It's bullshit that put him in a position to build xAI in less than a year.
Yes, but then they also redefined what that means halfway through.
I have 2 words for you: different company
They might try to have a senator block it or make it a crime to use it…wait Elon is the president. I guess they cannot.
I missed the first half hour, any highlights?
It would be satisfying if he gets called out for repeating himself the next time he tries to come out and say he is scared of how powerful their next model is.
gmerc · 3 days ago:
[flagged]
I used to think the same way wrt Nvidia stock when it tanked - compute is clearly diminishing returns. Tech companies subsequently announced capex equal to or greater than expected in compute. I smacked myself on the forehead when I realized I'd been thinking too much like an engineer. Tech CEOs badly want to believe they have an edge over every upstart from San Francisco to Shanghai. Unlimited spending on compute gives them that reassurance. In fact, the more threatened they feel, the more they spend to cling onto it.

Kids have security blankets. Tech CEOs have security compute clusters.

This is the danger of being informed only by sensational headlines. Nvidia's stock has fully recovered and is again near an all-time high. You seem to be generalizing about "Tech CEOs" — but in this case, GPUs are the advantage. They are necessary to achieve the outcome, and yet they are severely supply constrained. It's smart to overpay now.

Apple did something similar with NAND storage for the iPad mini. They took a bet that could have been wrong. It was not wrong. Competitors had a hard time because of it.

gmerc · 3 days ago:
Overpaying for using them is not smart. They depreciate fast under heavy load.
but nobody needs to know what load they were run under, so it's "barely used" on the listings
toolz · 3 days ago:
Load isn't what causes degradation; it's heat. As someone who has mined crypto for years, I'm aware that there are a lot of things that can be done to run hardware quite hard and keep thermals low. Whether or not that is what is being done, I have no idea. A GPU mining crypto for 5 years kept below 65C (rather easily done) is going to have far more life left than a GPU in some kid's gaming PC that frequently spikes to 85C, even after just a year.
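
As a rough illustration of the heat-not-load point, here is a toy Arrhenius-style aging model; the baseline lifetime and the 10C doubling constant are assumptions for the sake of the example, not measured GPU data:

  # Toy aging model: wear rate roughly doubles for every ~10 C rise
  # in operating temperature (an Arrhenius-style rule of thumb).
  BASE_LIFE_YEARS = 10.0  # assumed lifetime at the reference temp
  REF_TEMP_C = 65.0       # reference operating temperature
  DOUBLING_C = 10.0       # degrees per doubling of the wear rate

  def expected_life_years(temp_c: float) -> float:
      acceleration = 2.0 ** ((temp_c - REF_TEMP_C) / DOUBLING_C)
      return BASE_LIFE_YEARS / acceleration

  print(expected_life_years(65))  # 10.0 -- well-cooled mining rig
  print(expected_life_years(85))  # 2.5  -- hot-spiking gaming card

Under these assumed constants, one year at 85C wears the card about as much as four years at 65C.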
Everything is near an "all time high." MicroStrategy stock is hovering near an all-time high, and they're just a company that buys up Bitcoin and wastes some of it. Meme coins are floating up to all-time highs. Stop using asset prices to justify anything people are doing; they're fully decoupled from anything happening below.
I don't think I was informed by sensational headlines. I was well into talking to people I knew about how DS's performance relative to compute was a game changer long before the stock crash.

It's not binary where you either have compute or not. You definitely do need GPUs, but there's already masses of compute; I believe it doubles every ten months or so just from Nvidia's chips. Many factors make it a very irrational decision:

1) Companies were spending hundreds of billions collectively on AI capex. Meta alone projected $75 billion this year. This is an extraordinary bet, given that the highest revenue any AI company makes is a few billion, by OpenAI.

2) When DS came out, it was a huge validation of the moatless idea. These SOTA companies have no moat; at best they are spending tens of billions to maintain a few months' edge.

3) DS was also a huge validation of the compute saturation idea - that SOTA models were always massively inefficient. At best the excess compute was traded for iteration speed.

4) Many other more technical arguments - Jevons paradox, data exhaustion (synthetic data can only be generated for a fixed set of things), apparent diminishing returns (performance relative to compute: the denominator has been exponential but the numerator logarithmic; see the toy sketch below)
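
A toy illustration of that last point, with an assumed (not fitted) logarithmic scaling curve:

  import math

  # Toy diminishing-returns curve: score grows with log10(compute)
  # while compute spend grows exponentially. The functional form and
  # constants are assumptions, not a fit to any real benchmark.
  def toy_score(compute_flops: float) -> float:
      return 10.0 * math.log10(compute_flops)

  for exponent in range(22, 27):  # 1e22 .. 1e26 training FLOPs
      compute = 10.0 ** exponent
      print(f"compute 1e{exponent}: score {toy_score(compute):.0f}")

  # Each 10x of extra compute buys the same fixed +10 points:
  # exponential denominator, logarithmic numerator.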

So on one hand you have these SOTA models which are becoming free. On the other hand you have this terrible business model. I strongly suspect that AI will go the way of Meta's Metaverse - a staggering cash burn with no realistic path to profitability.

It's one thing to invest in a new technology with tangible benefits to your product. It's another to spend vastly, vastly more on vague promises of AGI. To put it into perspective, Meta will spend on AI capex in a few months of 2025 as much as Apple spent on NAND in total. What advantage is there to be had with SOTA models? You do 20% better on some AIME/IQ/competitive coding benchmark, which still translates atrociously to real world issues.

But Nvidia will be very successful, because these companies have frankly lost the plot and are FOMOing like mad. I still have memories of the 2013 AI gold rush where every tech company was grabbing anything with AI in it, which is how Google got DeepMind. They are being enormously rewarded for it by the stock market, with Meta's price 6x since its lows.

It is entirely possible that LLMs end up serving some useful purpose, but don't end up being great businesses.

I can think of a million different software services that have some value to users, but don't have some multi-trillion dollar revenue stream flowing from them.

There is an idea that these LLM companies are going to be able to insert their agents into the labour market and capture some percentage of the savings that firms realize from laying off their human workforce. Given the competitive environment, it is far more likely that these LLMs become an undifferentiated commodity good, and their value to the economy gets captured elsewhere. Currently the only value being captured is at the infrastructure level, and that is all predicated on a series of assumptions around software business models that have not materialized yet.

>For what? There is no ROI at that price point. There is no monetisation potential.

I think your whole argument is based on this being true, but you didn't give much argument about why there is no ROI. 400M USD isn't hard to generate...even a moderate ad engagement lift on X would generate ROI and that's just 1 customer.

Imagine going back in time and showing every VC how great the search business will be in 20-30 years. The only rational response would be to make giant bets on 20 different Googles...and I think that's what's happening. These all seem like rational investments to me.

Ken Griffin had an interview where he said something along the lines of: the technologies of the dot-com bubble pretty much turned out to be what everyone at the time thought they would become. The issue was valuations grew way too fast and it took much longer than expected for the companies to build out their products.

I think a similar thing is playing out with AI. In 5-10 more years these LLMs will replace today's Google search (and maybe be even better).

Everyone I know has already switched from Google to ChatGPT for most of their search queries.
gmerc · 3 days ago:
That's a red herring because it ignores the part where they could have done the same things spending a tiny fraction of the money.
_Could_ they have done the same thing with a tiny fraction of the money? Grok 3 benchmarks are SOTA for both base model and reasoning. By definition, nobody has been able to do the same thing with any amount of money (discounting o3 which has been teased but is unreleased). That may change in the future! But as of now this is the case.
gmerc · 3 days ago:
So apart from the part where SOTA doesn't mean anything in the real world (there is no monetisation, there's no moat), please, it's benchmarks, we all know how you beat those since 2023.

Time to review https://arxiv.org/abs/2309.08632 AI-CEO.org's best friend

(and actually o3-mini-high beat them in a bunch of benchmarks so they removed it from those charts in the livestream)

Why don't you do it then? If you are talking about DeepSeek's "$5M", then you would be interested to know that they pay 7-digit salaries and reportedly have H100s worth $2B [1].

[1]: https://sherwood.news/tech/the-trillion-dollar-mystery-surro...

I just wonder if it matters. If Google had spent 10x as much in the first 5 years of its life, would it be a worse company now? Giant TAM, winner takes all (or most?), all that matters is winning.
People like Demis Hassabis and Dario Amodei say that R1's efficiency gains are exaggerated. The $5M training cost seems to be fake, as sources suggest they own more GPUs.
You seem to be assuming that the full cost of the cluster is recouped by Grok 3. The real value will be in Grok 5, 6, etc…

xAI also announced a few days ago they are starting an internal video game studio. How long before AI companies take over Hollywood and Disney? The value available to be captured is massive.

The cluster they’ve built is impressive compared to the competition, and Grok 3 barely scratches the surface of what it’s capable of.

Tycho · 3 days ago:
Yes. Why do we get these replies on HN that seem to only consider the most shallow, surface details? It could well be that xAI wins the AI race by betting on hardware first and foremost - new ideas are quickly copied by everyone, but a compute edge is hard to match.
The compute edge belongs to those like Google (TPU) and Amazon/Anthropic (Trainium) building their own accelerators and not paying NVIDIA's 1000% cost markups. Microsoft just announced experimenting with Cerebras wafer-scale chips for LLM inference, which are also a cost savings.

Microsoft is in process of building optical links between existing datacenters to create meta-clusters, and I'd expect that others like Amazon and Meta may be doing the same.

Of course for Musk this is an irrational ego-driven pursuit, so he can throw as much money at it as he has available, but trying to sell AI when you're paying 10x the competition for FLOPs seems problematic, even if you are capable of building a competitive product.

Tycho · 3 days ago:
Timing matters. A long term strategy for superior hardware might bear fruit too late.
I'm not sure about that - I expect AI is going to become a commodity market, so it doesn't matter how late you are if you've got a cheaper price.

In terms of who's got a lead on cheap (non-NVIDIA) hardware, I guess you have to give it to Google who are on their 6th generation TPU.

I wonder how Tesla's training computer Dojo is doing. Although I guess there's a reason for buying so much Nvidia hardware...
Curious where you saw the Microsoft/Cerebras experimentation noted online? That's very interesting.
It was mentioned in Anthropic's Jack Clark's "Import AI" newsletter.

https://jack-clark.net/2025/02/17/import-ai-400-distillation...

DeepSeek just showed the compute edge is not that hard to match. They could have chosen to keep the gains proprietary but probably made good money playing the market instead, quants as they are.

https://centreforaileadership.org/resources/deepseeks_narrat...

If you’re using your compute capacity at 1.25% efficiency, you are not going to win because your iteration time is just going to be too long to stay competitive.

Software and algorithmic improvements diffuse faster than hardware, even with attempts to keep them secret. Maybe a company doubles the efficiency, but in 3 months, it's leaked and everyone is using it. And then the compute edge becomes that much more durable.
Optimisation efforts don’t negate investment in capacity but multiply output.
Sorry, you missed the point - DeepSeek tried some new software ideas, they did not manage to secure the same computation capacity.
They achieved the same results for 1.25% of the computation cost... If they actually had that computation capacity, it would be game over with the AGI race by the same logic.
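(For scale, working that ratio out from the figures quoted elsewhere in this thread: if the oft-cited DeepSeek cost is ~$5M and that is 1.25%, the implied comparison cost is roughly $5M / 0.0125 = $400M.)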
> but a compute edge is hard to match.

xAI bought hardware off the open market. Their compute edge could disappear in a month if Google or Amazon wanted to raise their compute by a whole xAI.

Not if there’s a hardware shortage.
Ok, 2 months.

Remember, the new B200s have 2.2x the performance of xAI’s current H100 “hardware edge”. So it only takes an order half the size.
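(Worked out against the roughly 100,000 H100s widely reported for xAI's cluster: matching it at 2.2x per-GPU performance would take about 100,000 / 2.2 ≈ 45,000 B200s.)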

Or you could order the old H100 instead and avoid the B200 shortage.

There seems to be a coordinated effort to control the narrative. Grok3's release is pretty important, no matter what you think of it, and initially this story quickly fell off the front page, likely from malicious mass flagging.

One thing that's taken over Reddit and unfortunately has spread to the rest of the internet is people thinking of themselves as online activists, who are saving the world by controlling what people can talk about and steering the conversation in the direction they want it to go. It's becoming harder and harder to have a normal conversation without someone trying to derail it with their own personal crusade.

>Grok3's release is pretty important

How? After an enormous investment, the latest version of some software is a bit better than the previous versions of some software from its competitors, and will likely be worse than the future versions from its competitors. There's nothing novel about this.

They just started; the velocity of xAI is novel.

NVIDIA's CEO Jensen Huang: “Building a massive [supercomputer] factory in the short time that was done, that is superhuman. There's only one person in the world who could do that. What Elon and the xAI team did is singular. Never been done before.”

>only one person in the world who could do that. What Elon and the xAI team

That is literally more than one person.

One billionaire glazing another because it might enrich himself further hardly seems noteworthy. That quote is superfluous at best.
The largest supercluster in the world, created in a short time frame, is pretty important: a build that typically takes 4 years, cut down to 19 days. That's an incredible achievement and I, along with many others, think it's important.

https://nvidianews.nvidia.com/news/spectrum-x-ethernet-netwo...

https://www.tomshardware.com/pc-components/gpus/elon-musk-to...

Okay but that's obviously a nonsense claim. Find me a computer on the https://en.wikipedia.org/wiki/TOP500 that was built 4 years after the chips it uses debuted.

H100s aren't even 3 years old.

> There seems to be a coordinated effort to control the narrative.

Do you have any evidence for this? Who would want to coordinate such an effort, and how would they manipulate HN users to comment/vote in a certain way? I think it is far more plausible that some people on here have similar views.

> [people] controlling what people can talk about

That's called 'moderation' and protects communities against trolls and timewasters, no?

> and steering the conversation in the direction they want it to go

That's exactly what conversation is about, I'd say. Of course I want to talk about stuff that I am interested in, and convince others of my arguments. How is this unfortunate?

>Grok3's release is pretty important

Is it? It's Yet Another LLM, barely pipping competitors in cherry-picked comparisons. DeepSeek R1 was news entirely because of the minuscule resources it was trained on (with an innovative new approach), and this "pretty important" Grok release beats it in Chatbot Arena by a whole 3%.

We're at the point where this stuff isn't that big of news unless something really jumps ahead. Like all of the new Gemini models and approaches got zero attention on here. Which is fair because it's basically "Company with big money puts out slightly better model".

I'd say Grok 3 is getting exactly the normal attention, but there is a "Leave Britney Alone" contingent who need to run to the defence.

Noticed this also. It doesn’t feel organic.
I mean, the honest truth is something closer to:

We have no clue how all this is going to play out, what value is capturable, and what parts of a lead are likely to stay protected. This race is essentially a collective belief in a generationally big prize, with no idea how it unlocks.

The problem with that for a comment section is it reduces ALL comments to gossip and guessing, which makes people feel stupid.

I think it's astroturfing.
Reddit today feels like it's absolutely overrun by bots. So much of the comment content is so superficial and cookie-cutter I find it hard to believe it's all produced by human beings. A lot of it reads like the output of small cheap LLMs of the sort that would be used for spam bots.

Of course we know X, Facebook, and probably most other social media is also overrun by bots. I don't think you can assume that humans are on the other end anymore.

The point is that it is inefficient. Others achieved similar results much cheaper, meaning they can go much further. Compute is important, but model architecture and compute methods still outweigh it.
How quickly will Grok 4/5/6 be released? Of course you can choose to keep running older GPUs for years, but if you want bleeding edge performance then you need to upgrade, so I'm not sure how many model generations the cost can really be spread over.

Also, what isn't clear is how RL-based reasoning model training compute requirements compare to those of earlier models. OpenAI has announced that GPT-4.5 will be their last non-reasoning model, so it seems we're definitely at a transition point now.

At current efficiency? Not nearly as fast as DeepSeek 4 ;)
None of which explains this massive waste of money for zero gain.
It's not going to be from this unless it's forced upon us by the federal government. All the other companies are ahead and aren't just going to stop.
> xAI also announced a few days ago they are starting an internal video game studio.

Ha ha. I'm sure their play-to-claim airdrop idle game will be groundbreaking.

xAI is not trying to make an immediate profit -- ironically, just like DeepSeek. They will undoubtedly use more efficient training processes in future runs and they will scale that across their massive GPU cluster. Just because they didn't cancel the training of Grok 3 and start over absolutely does not mean they will not incorporate all the work from R1 and more in the next run.

What you're seeing right now is pure flex and a signal for the future and competition. A much maligned AI team that hasn't even been around for very long at all just matched or topped the competition without making use of the latest training techniques yet. The message this is intended to send is that xAI is a serious player in the space.

> DeepSeek trained R1 for 1.25% ($5M) of that money (using the same spot price) on 2048 crippled export H800s and is maybe a month behind.

This is a great example of how a misleading narrative can take hold and dominate discussion even when it's fundamentally incorrect.

SemiAnalysis documents that DeepSeek has spent well over $500M on GPUs alone, with total infrastructure costs around $2.5B when including operating costs[0].

The more interesting question is probably: why do people keep repeating this? Why do they want it to be true so badly?

[0]: https://semianalysis.com/2025/01/31/deepseek-debates/#:~:tex...

SemiAnalysis is wrong. They just made their numbers up (among many other things they have invented - they are not to be trusted). I have observed many errors of understanding, analysis and calculation in their writing.

DeepSeek R1 is literally an open-weight model. It has <40bln active parameters. We know that for a fact. That size of model is definitely roughly optimally trained over the time period and server times claimed. In fact, the 70bln-parameter Llama 3 model used almost exactly the same compute as the DeepSeek V3/R1 claims (which makes sense, as you would expect a bit less efficiency for the H800 and for the complex DeepSeek MoE architecture).
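
As a sanity check on that consistency argument, here is a minimal back-of-envelope sketch in Python. Every input is an assumption on my part (the standard 6*N*D FLOPs-per-token rule of thumb, ~37B active parameters, ~14.8T tokens, ~40% utilization of an H800's ~989 TFLOP/s dense BF16 peak), not DeepSeek's published accounting:

    # Back-of-envelope training-compute estimate; all inputs are assumptions.
    active_params = 37e9    # ~37B active parameters per token (MoE)
    tokens = 14.8e12        # ~14.8T pretraining tokens
    total_flops = 6 * active_params * tokens   # 6*N*D approximation

    h800_peak = 989e12      # H800 dense BF16 peak FLOP/s (compute die matches H100)
    mfu = 0.40              # assumed model FLOPs utilization
    gpu_hours = total_flops / (h800_peak * mfu) / 3600
    print(f"~{gpu_hours / 1e6:.1f}M GPU-hours")             # ~2.3M
    print(f"~${gpu_hours * 2 / 1e6:.1f}M at $2/GPU-hour")   # ~$4.6M

That lands in the same ballpark as the ~2.8M H800 GPU-hours (~$5.6M at $2/hour) DeepSeek reported for the V3 base run, which is exactly the point being made above.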

Active parameters is definitely the wrong metric to use for evaluating the cost to train a model
> For what? There is no ROI at that price point. There is no monetization potential.

It appears that LLM chat interfaces will replace Google SERPs as the arbiters of truth. Getting people to use your LLM allows you to push your world view. Pushing his "unique" world view appears to be the most important thing to modern Musk.

In that light, paying $44B for Twitter, and billions for Grok training, makes perfect sense.

It's a race for AGI, a VC's wet dream.

The beauty of a failed investment is that it never goes below zero. So upside is the only thing they care about. Why invest in a near-zero chance for a random SAAS to take off, when you can invest in a near-zero chance of creating superhuman artificial life?

> It's a race for AGI, a VC's wet dream.

Yes but why? This is what I really don't understand.

Say AGI is achieved within a reasonable timeframe. Odds are that no single company will achieve it alone; there will be no monopoly. If that's the case, where is the trillion-dollar value for investors? From every claim we hear about it, AGI will lead to hundreds of millions of jobs disappearing (all white-collar jobs) and tens of millions of companies disappearing (all the companies that provide human-produced services). Who is going to buy your AGI-made products or services when nobody is paid anymore, when other companies, big and small, have ceased to exist? Sure, you can make extraordinary accomplishments and advance humanity far, far ahead, but who is going to pay for that? Even states won't be able to pay if their taxable population (individuals and corporations) disappears.

So where will the money come from? How does it work?

Also, profitability won't materialize in an environment with so many competitors offering comparable products. Perfect competition destroys profit. The good becomes a commodity, and the price people will pay simply becomes the marginal cost of production (or in this case, less, while the dumb money is still chasing the hype).
Works well when you see the company stuffing dollar bills into their sports car to race at 1.25% fuel efficiency against a Chinese family sedan with a hand-tuned ICE.
A failed investment never goes below zero for the investor. For everyone else on the other hand...
As a consumer, I'm just happy that base models are improving again after a ~quarter or more of relative stagnation (last big base model drop was Sonnet v2 in October). Many use cases can't use o1, r1, or o3[-mini] due to the additional reasoning latency.
Yes, and the scaling laws survive! So, hopefully, more on the way...

> due to the additional reasoning latency.

They're also less creative for non-STEM topics

DeepSeek wouldn't have been able to train R1 without their ~600B-parameter base model, so you should consider the cost of that model when you compare with Grok.

In any case, Elon won't win this race because the best talent will not work for him. He used to have a good reputation and a lot of money, which is a deadly combination. Now he only has the latter -- not enough when leading AI people can make 7 figures at other companies.

To be clear 1: I'm not saying that the people who currently work on Grok are not great. It's not about hiring some great people. It's about competing in the long run - people with other options (e.g. offers from leading AI labs) are more likely to accept those offers than to join his research lab.

To be clear 2: I'm not talking about Elon's reputation due to his politics. I'm only talking about his reputation as an employer.

He has the vision and marketing skills but it's not going to be enough for leading the AI race.

Actually, the 5 million figure is for the compute cost for the base 600B parameter model. Training R1 was just 8000 steps of reinforcement learning, so I expect that the vast, vast majority of the training cost is already included in the pretraining stage.
It’s not like Grok 3 didn’t have previous work to build on either, but point taken.

I think the situations are a bit comparable given timelines however.

A perfect analogy for AI … your ability to replace talent with money. And if you don’t have the talent, it’s gonna cost you 100x more.

> your ability to replace talent with money

That sure seems to be the message given in Apple AI commercials. From those commercials the tag line for AI should be "enabling idiots everywhere".

> until Claude4 snuffs it out later this month

Any source? I’m a heavy user of Claude and pay for the Teams plan just for myself so I won’t get throttled. Love it. But I’ve been impressed with O1 Pro lately. That said, I don’t like paying both €166 for Claude Teams and €238 for OpenAI Pro. :)

> This all by the man in charge of “government spending efficiency”.

Per court filings by the administration, Musk is not in charge of DOGE, nor does he have any role in DOGE, nor any decision-making function in government at all, he is a White House advisor unconnected to DOGE.

What is hilarious is your disdain for their achievements which occurred in less than two years. This is just the beginning.
What makes you think there won't be an ROI?
I think at this point you're going to have to answer "what makes you think there will be any"
> There is no monetisation potential.

DOGE uses only X links, and I am sure Grok will be the next gov contract. After all, he has all the data on everybody, down to your IRS tax returns.

How is this even legal? Don’t they have any sort of competitive tendering?
We are long past the rule of law in the US. Whatever is left is residual, running on fumes. China-style corruption is here to stay.
Corruption aside, China is run by smart leaders who execute on a long-term plan, and are gradually extending their influence over the world. The US is doing the opposite.
It’s easy to execute on a long-term plan when your government is totalitarian, run by a cult of personality, and doesn’t have any concern for individual rights.
The US has executed on long-term plans in the past. We're just choosing not to right now. This is something we need to change, and very quickly.
Why do you think legality matters?
I still have some hope. Everyone who is capable should challenge these in court.
And then what? "They've made their decision, let's see if they can enforce it."

Capturing the executive turns out to be the winning move. Maybe it's what Gödel saw coming (https://en.wikipedia.org/wiki/G%C3%B6del%27s_Loophole).

If you think an executive that ignores court orders is going to survive for a long time in America, I am willing to bet any amount of dollars against you at any odds.

It's a good bet for me, because if I lose, dollars won't be worth anything anyway.

Because that is the time when the rest of the people realize that this is serious. Sooner is better.
Yep, hence 'constitutional crisis'.
> How is this even legal?

You're talking about Musk and Trump. Legality doesn't even enter into the conversation.

I don't understand how and why Grok would be related to "understanding the nature of the universe", as Musk puts it. Please correct me if I'm wrong, but they basically just burned more cash than any human should have to buy Nvidia GPUs and make them predict natural language, right? So, they are somewhat on-par with all the other companies that did the same.

This is not innovation, this is baseless hype over a mediocre technology. I use AI every day, so it's not like I don't see its uses, it's just not that big of a deal.

There are two answers to this.

Answer 1: Some people think that LLMs are a path to the singularity, a self-improving intelligent program that will vastly exceed human intelligence and will be able to increase its knowledge exponentially, quickly answering all answerable scientific questions.

Answer 2: LLM companies need to keep the hype train rolling. I didn't watch the whole clip; I jumped around a bit, but I noticed that every time Musk interjected something, it was to exaggerate what was previously said. "Grok contains the whole internet"—"the whole of human knowledge, actually!"

I think that both answer 1 and answer 2 apply to Musk. He seems to believe that they're building a god-like entity, and he also needs to keep the money train rolling.

>he also needs to keep the money train rolling.

This and only this. When talking about how good his products are, he lies and exaggerates to get investors: from the promise of 2 manned missions to Mars in 2024, to a 300-ton payload in space, to FSD.

Whatever it takes to pad the wallet.

There's a more short-term goal for Grok, which is to replace what is left of the federal government with AI. That will significantly boost the money train, but is also a utopian (for some, dystopian for others) goal of replacing the expensive 'deep state' with a slim set of impartial algorithms.
LLMs don't seem to be very impartial so far. Quite the opposite in fact, they're entirely beholden to the prejudices of their trainers.
Thankfully, Grok does not seem to have the bias that Elon promised on Twitter.
Glad I don't live in the US, that sounds like a miserable idea.
Is there a citation for this?
"Elon Musk Ally Tells Staff ‘AI-First’ Is the Future of Key Government Agency" from Wired

https://archive.is/jyFCy

This isn't unequivocal proof, but the broad goal of automation lends itself pretty strongly to LLMs, and oh boy, what LLM technology do you think they want to use?

I'm pretty sure it doesn't contain the whole of human knowledge... I doubt grok knows my youtube password or my bank PIN
An AI god would fit well with the Dark Enlightenment ideas of Musk and his cronies.
He may have deleted it but… around 2020 or so there was starting to be a lot of hype about LLMs. Elon Musk responded to a "doomer" on Twitter saying that he "didn't see the potential for that", referring to LLMs achieving AGI. It was a 100% dismissal of everything he is saying now. At that point Elon Musk had already been saying publicly for years that "AI is more dangerous than nukes." But he had also voluntarily walked away from OpenAI, which he would never do if he thought there was any chance of AGI. I just want to know the truth… is this really just advanced search, with some jobs lost because they ended up being nothing more than search tasks (i.e. coding boilerplate), or are we really on the cusp of AGI (and therefore in a great deal of danger)? It's impossible to say whether or not Elon Musk really believes what he is saying… there are public figures on both sides providing conflicting explanations.

As I watched the Grok 3 stream I became very angry. I am so very tired of being jerked around and of not knowing whether I should be planning for the future or investing in the world as it is now… it's really a form of psychological torture.

I work in AI and love the technology. But all the hype and grandiose claims make it awkward when people ask what I do, and it makes hiring harder when experienced developers hear 'AI development' and walk away - even though it's mostly just solid full-stack engineering work.
I am always looking for roles, and I have pretty good full stack experience (a few years of C++, C#, some JS, TS, backend and frontend web, C, Zig, Rust, built a few hobby compilers and other stuff).

I apply to pretty much every job that sounds reasonably good in terms of work-life balance, but I completely ignore anything that says AI. I really, really, really do not want to be part of a company that lies to itself, and so far all AI companies look like they are. It's not AGI. It's not gonna be AGI. Ride the hype train, cash out and lay off 80% of the workforce and jump on the next hype train, whatever. But don't hope that people who want a stable job want to hop on something that delivers such a shaky definition of value.

Even if you're an AI-skeptic, it's hard to argue that companies building AI customer support for example aren't en route to improving the whole "calling your ISP's team in India" experience.
Absolutely, but that doesn't seem to be most companies I see
You must be new to Elon's modus operandi.
I absolutely hate the Elon hypetrain, but I also don't understand the social media hate I see for AI, like comparing every ChatGPT answer to one wasted bottle of water.

Can we stop for a second and just marvel at a new piece of human ingenuity? Let's not give Elon too much credit, but I think that AI as a whole helps us all understand the nature of intelligence, and therefore humans' place in the universe.

One of the fundamental questions of human existence is: what does it mean to exist and think? Every time we build a new human-like thing it helps us understand the context of our own existence. (Not just computers or AI, but also airplanes, factories, etc.)

True AGI would force us to rethink what it means to be a thinking human being, and I think current LLMs already should and do.

> I also don't understand the social media hate I see for AI, like comparing every ChatGPT answer to one wasted bottle of water. Can we stop for a second and just marvel at a new piece of human ingenuity?

I don't know, man. We're staring down the barrel of at best a WW3-event and at worst an extinction-event. We're doing absolutely nothing to stop it, even though we have all the answers and the resources to do so. Instead, we're making the problem even worse all so some marketers and scammers can spend someone else's money to generate garbage pictures and SEO spam, so the worst people on the planet can gain even more money and power than they already have.

I'd love to be positive about this tech, I'm sure it's cool or whatever, but it's really hard to be positive about anything right now, especially when the tech in question is speeding us straight along the path to mass death. The world sucks and the people running the LLM stuff are amoral monsters putting all of their resources into making it worse. I'm not excited about any of this.

What's happening definitely makes me nervous, but "at best a WW3-event and at worst an extinction-event" seems a bit much. Mainly because there are a _lot_ of unknowns. Better try to get comfortable with just riding this out.
It really isn't. Climate change is going to make large amounts of land unlivable. That's going to cause a climate refugee crisis. I agree the effects of that refugee crisis are unknown, but I can't see any resolution that doesn't involve increased nationalism, civil wars, and violent resource conflicts. Given this is a global crisis, that's a recipe for WW3.

This was all avoidable, of course. But instead of fixing it, we spent decades fiddling around with toys like LLMs. Whee.

LLMs don't make me question what we know about humans and thinking. They are really good at convincing us that they're good, but really, that's other humans building stuff to convince us that it's good. There is no intelligence here, other than the perceived intelligence of predicting words intelligent people have written previously.
> There is no intelligence here, other than the perceived intelligence of predicting words intelligent people have written previously.

I think this is my main point- isn't it amazing that a thing that predicts words other humans have previously written manages to appear intelligent, or, more pointedly, have utility in communicating real thoughts and ideas?

If you've ever asked an LLM a question and gotten a satisfying answer, that means that there is some human-level intelligence somewhere in the token filtering / recombinating that an LLM does.

Specifically I think the test of human-like intelligence is literally the output- If we get utility from the arrangements of the tokens it outputs, that in and of itself demonstrates that some portion of human intelligence could be this same token generation mechanic.

No. Just means we are easy to fool. Like apes who see themselves in the mirror and fail to recognize they are seeing themselves in the mirror, thinking it is a different ape (and trying to mate with or attack it).
The invention of the mirror by humans probably provided an interesting insight into our own existence... I wonder what it would have been like to see your own reflection for the first time as a technology. How would that change your outlook on your self-hood and identity?
Great question to ponder. Surely people would have seen their own reflections in water, but the mirror itself would have made "Reflection Technology" for "Artificial Introspection" more scalable. I suspect the mirror offered modern people a new viewpoint, allowing one to see how one is perceived by others. I do not think selfhood and identity would have been affected. My main question is about when people came to behave differently than apes. Douglas Hofstadter's "The Mind's I" may have a few hints on perception of self for you.
It does make me question humans and thinking but in the opposite direction.

It is like sitting down at a piano, sight reading a piece from sheet music and then someone who has no idea what they are talking about claiming you composed the music on the fly. Then when you point out the sheet music they just double down on some bullshit as to why they are still right and that is still composing even though obviously it is not.

Best analogy so far. I am adopting this for the next wave of "wait until the next model" and "but humans hallucinate, too" comments. Yes, when we feed our own output (language on the web) back into ourselves, things become tricky to tease apart, and it would seem like intelligence to us. Then again, the Mechanical Turk appeared intelligent, too. If we point out how it works, then the "magic" should vanish.
> There is no intelligence here

Can you list a few demonstrations from a text-outputting computer program that would each convince you that there is intelligence here? Eg writing a bestselling novel, proving an unsolved number theory conjecture, etc. Or is your belief uncontestable?

That's not really a fair question. To answer it, the OP would have to define intelligence. If you have done so already, then by all means, do share your definition. If not, then you are in no better position to claim intelligence than the OP is in claiming lack thereof.
I think he just needs the model to be installed in a humanoid robot.
It's cringe, but not so much more than DeepMind's OG "solve intelligence, then use it to solve everything else".
I feel like that's part of what Elon is flexing. Teslabot was a latecomer compared to competitors like Boston Dynamics.

Elon is showing off that he can marshal enough resources and talent to be on par (kinda) with state-of-the-art products in crazy time. That's been most of his superpower so far - not breakthrough tech that didn't exist before. We've had rockets before.

I don't like Elon either, but not only has SpaceX created breakthrough tech that didn't exist before by landing an orbital-class rocket; as of today, still nobody else has done it.

Landing a rocket was considered impossible and unthinkable 10 years ago, and then SpaceX completely changed the game. And they're reinventing rocket tech again with Starship by catching it midair.

It still blows my mind that nobody has meaningfully replicated the Falcon 9 after thirteen years of it flying commercially.
We can all thank Gwynne Shotwell for this though.
Have you listened to her interviews? You can find some on YouTube.

I'm sure she has been very helpful in navigating the US govt/NASA bureaucracy and winning SpaceX deals, but she's clearly not a visionary.

Gwynne Shotwell works for Elon Musk.
The number of people that have the capital and connections required to even attempt such things is very small, so it's not necessarily Musk's abilities that made those things happen, just the combination of having the power to allocate enough resources and an interest in such things.
He started both Tesla and SpaceX when he had "only" a few hundred million to his name and no more connections than would be expected of a Silicon Valley guy making payment software. And lots of brilliant guys, including John Carmack for instance, have tried their hand at aerospace - and failed. Jeff Bezos started Blue Origin before SpaceX was even founded, and it was literally only last month that they finally managed to get a rocket into orbit for the first time. There's a joke in the industry: 'How do you become a millionaire in the aerospace industry? Start out as a billionaire in the aerospace industry!'

And we live in a world of millions of millionaires and thousands of billionaires. For that matter, even China is trying their hardest to replicate SpaceX tech, given all the resources of the world's largest economy, 1.4 billion people (meaning a proportionally larger chunk of intellectual outliers), and de facto authoritarian power to make it all happen. Yet they remain (in terms of rocket technology) behind SpaceX.

Being the most successful out of three or even a dozen doesn't make someone exceptional. Because so few people with interest in space have "only" a few hundred million, we can't really say if it's actually his talent that made it possible or simply the result of having access to resources that the vast majority of people could never dream of.

The U.S. has a long history of aerospace innovation, from NASA to private contractors, and Musk was able to use this ecosystem. China doesn't have that.

WEF cites a global space economy at $630 billion, alongside investments of $70 billion. [1] And as anybody with half a head on their shoulders can see, space is where the big future economic growth will come from. Even for somebody with zero interest in space (which I think is very few people), that's where the next 'big boom' in economics will come from. And SpaceX was started on a fraction of $0.3 billion, with Carmack and Bezos just being a couple of names people on here would be familiar with, amongst tens of thousands. Yet no competitor is anywhere to be found.

And the US doesn't have a long history of aerospace innovation. In 1962 Kennedy gave his 'to the Moon' speech, 7 years later in 1969 we'd go from having nothing to putting a man on the Moon. From 1969 (well 1972 in particular) to the birth of SpaceX (early 2000s) US space technology not only stagnated but regressed. This is why Boeing (who was a major part of the original space race) can't manage to even begin to replicate what we achieved in the 60s, in 7 years no less!

Incidentally this is also a big part of what motivated Elon to start SpaceX. He was looking at NASA's future plans for human spaceflight and they were basically nonexistent. So he wanted to launch a greenhouse to Mars and stream it growing, to inspire people and hopefully get things moving in the right direction again. NASA wasn't interested in any such things, the Russians wanted too much $$$, and so SpaceX was born.

[1] - https://www.weforum.org/stories/2024/04/space-economy-techno...

Nit: Musk is not a Tesla founder. He bought his right to be called that for $6M.
While technically correct, the Tesla that Elon bought has basically nothing in common with current Tesla.
When Musk 'joined' Tesla it was a name and two other guys. The latter two of whom left the company before a single car had been produced. They then sued for the right to be called founders a couple of years after they left, and once it became clear the company would stand a reasonable chance of success.
If my memory serves me correctly, they had put some Sony Handycam batteries on a chassis and driven it around before Musk. Musk was there for every actual product and its development.
This is a patently false retelling — check your sources.
> I feel like that's part of what Elon is flexing. Teslabot was late comer compared to competitors like BD.

When it come to bipedal robots, Tesla is far ahead of Boston Dynamics in terms of actually creating a product.

Have you seen Unitree robots? They have started mass-producing them. It's a Chinese company.
> We've had rockets before

Yeah, but we didn't have reusable orbital rockets, and that's a distinction with a big difference.

You don’t understand the Musk business model. It has been the same for years. His wealth doesn’t come from his products, but from his fanbase buying his stock. The purpose of everything he does is to influence public opinion to make him the tech genius of today in the collective psyche.

Of course, he needs to do impressive things, stuff that a normal person wouldn’t have the resources to achieve. It’s similar to Mr. Beast’s channel on YouTube, just on a way bigger scale. Do things that people can’t see anywhere else.

Musk’s money will come from his fans. And ETFs, trust funds and such will amplify this when he reaches a certain market cap. His crypto coins are the exact same scheme. Once you stop thinking in classic business school terms, it starts making way more sense.

Some of his ventures actually produce value! But that’s not where the money comes from. It comes from the belief, the adoration and the celebrity status that he has.

This is the real power in today’s world. People need to know you from the screen. This clout catapulted him into the government of the US, made him the wealthiest man in the world, and has given him license to do anything he wants publicly without repercussions.

> His profits don’t come from his products, but from his fanbase buying his stock.

SpaceX is private, Starlink makes real money from real users.

> Everything he does is done to influence the public opinion to make him the tech genius of today in the collective psyche.

Well that's clearly not right. He's doing a lot of things to make himself seem like a total tool that we should all boycott no matter how good the products are. If he actually did what you say, he wouldn't be burning all these bridges.

The bulk of his wealth is the Tesla stock. I know that SpaceX produces value. Some of his ventures do. But the image of a genius entrepreneur is way more valuable than any government contract he might get.

Remember, he was way less crazy before his market cap skyrocketed. Now he can afford to be polarizing as a PR strategy, since his fanbase has reached a certain critical mass. He’s been constantly testing what works.

He'd still be one of the richest people in the world just based on his stake in SpaceX (IIRC valued over $100B)
He would be one of the richest people in the world even if he hadn't started both SpaceX and Tesla, just from what he made from the sale of PayPal. Hell, he'd still be well off even if he hadn't started the original X and had just lived off the $13m he made from Zip2. $13m still places you in the top 0.5% in the world today.
SpaceX is not where most of his valuation comes from: it principally comes from Tesla stock, which is vastly, hilariously overvalued compared to its performance as a car company, and definitely compared to its performance as a technology company.

Even more importantly, analysis of Elon's tweeting patterns versus Tesla stock valuation makes it obvious why Twitter became so central to him[1] - it was a massive driver of Tesla stock value. Buying it was a good move from the perspective that he really couldn't afford to be banned from Twitter.

[1] https://www.sciencepublishinggroup.com/article/10.11648/j.ij...

> Buying [twitter] was a good move from the perspective that he really couldn't afford to be banned from Twitter.

Another good move would have been to not be an ass. I mean, if he really did care about the stock price like this thread implies. Being seen as a genius entrepreneur doesn't imply you should also be a racist conspiracy theory nut. I'll repeat it again: the thesis that he's doing all these things due to competency is absurd. Never attribute to competence what can be sufficiently explained by stupidity.

He's doing these things because he's lost his marbles. Trying to make it out like he's doing it for reasonable reasons is like trying to claim Trump is playing 5-D chess. It's the same as QAnon logic. It just falls flat against Occam's Razor.

There is an exception: Tesla FSD (the US version, not the horrible EU version). This is rather cool and impressive, and unbeaten in the market for now.

Though free and open-source solutions like https://github.com/ApolloAuto/apollo are not that bad.

But the build quality of a Tesla car itself, omg. It feels like a cardboard box with an amazing battery.

See my other comment. Some of his ventures do produce value. That’s not where the money comes from.
>his fanbase buying his stock

That doesn't make sense as most (66%) of the stock is owned by institutions. [1]

[1]: https://www.marketbeat.com/stocks/NASDAQ/TSLA/institutional-...

Only people engaged in active buying and selling set the price of Tesla stock. It's called "price discovery." Any institution such as an index fund passively holding stock does not affect the value of the stock, so the percentage of institutional ownership doesn't itself matter.
Institutions are organized people. They aren’t more immune to the information firehose than me or you.
That's pretty normal for a company like this. Also, let's be honest, based on historic data, it has been a good investment.
I'm interested in what you're saying about classical business terms. Can you elaborate on that a little? I've always found these kinds of people hard to understand.

The man has almost unlimited wealth and his motivations seem consistently petty and strange or just downright ludicrous. He's like an alien to me.

I've noted the same feeling when seeing VCs/business people speak when I've encountered them.

The classic business is about producing valuable economic outputs and creating a stable revenue flow from bringing them to market.

The modern era post-business is about dipping into everyone’s pockets, by securing cash flow from the stock market and the government.

Here building a profitable business model is less important than convincing people and the government to give you that dough. And the best way to do it is to have clout.

> his motivations seem consistently petty and strange or just downright ludicrous. He's like an alien to me.

I think it could help to try to think of a historical figure that has done impressive things but which you don't have an overly negative view of. A lot of them seem really weird or alien. In democracies, political leaders are (sometimes!) more "normal" because they have to get elected. So think of a CEO/Founder you like (Jobs?) or earlier people (Napoleon? I dunno, pick yours)

Read a bio on them, they're pretty strange (I like Churchill, dude was wild). It seems that to do extraordinary things you need some traits that make you ludicrous. I don't really know, but it's definitely a pattern

This is 100% false.
How so? Care to provide any meaningful information to support that?
Refrain from personal attacks and try to come up with some arguments if you disagree.
It's not much better than DeepSeek's old slogan "Unravel the mystery of AGI with curiosity. Answer the essential question with long-termism."
Musk promises revolution and sells evolution.

Promises FSD, sells EVs.

Promises Mars colony, sells self-landing rockets and satellite internet.

Promises faster tunnel boring, sells smaller tunnel boring machine that drills smaller tunnels.

Promises less corruption and bureaucracy, just fires people.

Overpromising and underdelivering are the cornerstones of advertising/marketing/sales. "Use this deodorant and a gorgeous woman will want you".

Do you know why people do it? Because it works.

That's great for a startup, but bad for nuclear weapons and financial data.
In fact governments are among the worst offenders I can think of in over promising and under delivering.

And that's not just Trump's government either, how's this whopper? https://edition.cnn.com/videos/politics/2019/06/12/joe-biden...

Is this the first time you're hearing Elon Musk speak? His entire public presence consists of stuttering out vapid sentences like these.
You sound a little hurt. Why the hate?
You can’t name one single lie. If you can, please do.
What lie are you talking about? That he supports far-right parties? That he did some kind of gesture which in the video looks like a Nazi salute, but which he says was just the "my heart goes out to you" gesture?

Neither seems to show that the OP was lying, but I understand you have a different opinion than they do.

OP said he did a Nazi salute. He did not do a Nazi salute.

He did something that people who fetishize the downfall of their enemy desperately want to be a Nazi salute, so they call it that, but it was not a Nazi salute.

If it were, he would have said so, since at that point you are basically showing the world you are a Nazi. But that is not what happened.

People really don't take their credibility seriously, and will cry wolf at anything that moves, seriously undermining any argument they make.

There are ample ways to hate on Elon using factually true things he has done. Sticking to those makes your arguments rock solid and keeps your credibility high. So much is wasted by idiots slinging smoke because it makes them feel right.

What if he does it again and just says "My heart goes out to you" and does the same hand throwing gesture? Just curious what you think in that hypothetical.

To me it seemed like he did it to troll 'libs' who dislike gestures like that.

There always exists a difference between what a person thinks of their own motive and how it is seen by others. In this case, while he might say he did one thing, many people took it to mean something else.

Both those are true.

If it looks like a Nazi salute and is done by a Nazi to honor another Nazi while adoring Nazis cheer, why do I care that you think it’s not a Nazi salute?
That's because here and on Reddit, there is a strong element of EDS (Elon Derangement Syndrome), which is characterized by not being able to discuss anything SpaceX, Tesla, X, The Boring Company, &c. do without completely politicizing it and completely avoiding talking about its technical or engineering merits.
I'm no longer considering a Tesla, but I still think Starlink is great.
[flagged]
I imagine it's been manually vouched by the mods.
I will gladly give him more money. I have no vendetta against him or his actions, you're free to do as you please but don't enforce your political agenda on everyone.
Same. I support everything that's going on at DOGE. There's a certain type around here that needs to get used to the fact that their political views do not enjoy industry wide hegemony.
What's wrong with DOGE? How is eliminating bureaucracy/spending a bad thing? This was done about 100 years ago; research what happened after. It feels like the people who scream the loudest are the ones leeching from the system.
What's wrong? Let me count the ways.

- Much of the spending they're eliminating is good and important (E.g. USAID).

- The way they're cutting is reckless (They accidentally fired nuclear safety workers).

- Many of the workers are Musk sycophants, who were hired from Twitter/Tesla/SpaceX.

- There's a tremendous conflict of interest in this agency being run by a massive government contractor (NASA and the military are both avid SpaceX customers).

- The workers are not experienced with the data they're working with, and misinterpret it constantly in public before verifying it.

- Despite claims of "transparency", Musk asserted that it was illegal to publish the names of his employees.

- Their one product, their government spending portal, is a worse version of a spending portal which already exists, and they didn't even secure their database.

- They say they "use AI" for their decisions. Which AI? Where is confidential data going?

- Do the staff have security clearance?

How does that weigh against the good they are doing? The spending crisis is critical, so of course some collateral damage will be necessary.
The only "crisis" in US spending is that each party keeps shutting down the government when they don't like what the other one is doing. That's not going to go away with balanced books, but it sure is a recipe for disaster. It stops being a democracy when the decision making process stops being about majorities and starts being a game of "whoever doesn't blink sets the rules".

The country prints its own money; and right now it's the world reserve currency, giving the US a huge advantage when it comes to borrowing whatever it wants — the biggest threat to continuing to be the world reserve currency right now, is that the scale of cuts being talked about can only be met by cutting at least one of interest payments on government loans or things the US government considers to be "mandatory" such as pensions, social security, etc.

There are both immediate and long term consequences to the debt. In the short term we are paying a large share of revenue into debt servicing. We are all working months a year to pay interest. In the long term, we are impoverishing the next generations.

The govt can debase the currency by printing money. That only impoverishes people, except for the wealthy, causing a greater wealth gap.

> There are both immediate and long term consequences to the debt. In the short term we are paying a large share of revenue into debt servicing. We are all working months a year to pay interest. In the long term, we are impoverishing the next generations.

Only if your economy doesn't grow.

> The govt can debase the currency by printing money. That only impoverishes people , except for the wealthy, causing greater wealth gap.

It impoverishes lenders and savers, but not borrowers. It's not as simple as wealthy or poor, as any can be any.

That's not what's happened in the past 5 years. Assets have ballooned. Cost of living has skyrocketed. Wages have not kept pace. The rich got richer, the poor got poorer.
How is growing debt a good thing? How is cutting costs a bad thing? Have you seen the Argentina example? How did it turn out? Can the same thing be done in the USA? The only reason to say no is because someone is directly profiting from not cutting costs. If you know Musk's story, then you know why he is the best candidate to do so.
> How is growing debt a good thing?

You can afford it; it's fairly close to a neutral thing for a government to have debt.

Right now, the US gets to set the terms for those loans.

> How is cutting costs a bad thing?

Consider weight as an analogy: most people could do with losing weight, but losing weight by getting enthusiastic amateurs to perform a quadruple amputation is not advisable.

Musk's target can only be met by deleting your entire armed forces.

And then you have to find another $1.2 trillion.

So the military and the entire CIA, FBI, NSA, DHS, NASA, Federal Highway Administration, FAA, the Department of Agriculture, the Department of Commerce, Department of Justice, the Treasury, …

… all that plus the military still doesn't add up to Musk's target.

Unless you want to cut stuff that's considered "mandatory" (like military pensions), or the interest repayments on the very loans you wish you didn't have.

The first two points are pretty much the same and ignore all nuance.

The third point is opinion, at best.

The fourth point: couldn't this be said about any politician?

The fifth point: so you're saying that no outside group is capable of auditing?

I'll stop there. You've drunk the Kool-Aid.

The world’s richest man is gutting the regulatory bodies which were designed to keep him in check, and you’re defending it. I’d say you’ve drunk the Kool-Aid.
No, I'm defending cost cutting, and it doesn't make sense why you are so against it. The USA is a democracy; if after 4 years things go south, you can always vote differently and get everything back the way it was. The most probable outcome here is that the financial situation in the USA will get better. You make it sound like the USA would collapse into a stone-age era; it won't.
We have a mechanism for that: it's called congress. DOGE is an executive abuse of power that's resulting in national security critical roles being fired and then attempted to be rehired because they're so incompetent.
What happens when Congress is asleep at the wheel or in on the take?
You campaign for other congresspeople and try and get them thrown out. Congressional representatives are very receptive to constituent calls.

I mean, we could also do your plan and just hand off essentially dictatorial power over spending to the richest man in the country.

Article 2 says “the executive Power shall be vested in a President of the United States of America.” That means the power to “enforce laws, manage national affairs, and oversee government agencies.”

It’s Congress’s role to allocate funds to certain purposes, and the President’s to “take Care that the Laws be faithfully executed,” including overseeing the bureaucracy that implements them.

The President already has dictatorial power over the bureaucracy, as per the Constitution.

Not over spending, which is exactly the power Elon's seizure of payments infrastructure has given him.

Unless you extend it to that, in which case why have courts? It's not like they have enforcement power, and the president can stretch discretion to the limit.

Seriously the number of people willing to burn down the whole government over grievances that often haven’t even impacted them is incredibly scary.

You do not throw out the baby with the bath water.

Everyone in the United States is impacted by government insolvency and the economic collapse it will inevitably lead to if unchecked.

Everyone is impacted by the fact that money-printer-fueled government spending crowds out private spending / investment / growth.

And your solution is empowering a dictator and... hoping it goes well?
I don't think that's what DOGE is doing. It seems extremely vindictive and ideological in the way it's acting, and time will tell, but I would not be surprised if it ends up costing the taxpayers more in the long run.
Wouldn't Argentina be a good example of what DOGE is doing now? Financially it has been a good experiment for Argentina. What are the cons?
Argentina and US are very different countries, starting these cuts with very different economic realities. For example, 55% of all registered workers are employed by the government in Argentina. Although not a directly comparable metric (since in the US you also need to account for state and local civil workers), the US federal government employs around 3 million people. That's just 1.87% of the entire civilian workforce.

Again, DOGE operates from the premise that the federal government is bloated. Although this is a very popular message, I'd love to see some more objective data to support this and I doubt that CDC or USAID are the agencies where the bloat is. Like I said, their actions seem vindictive and careless. Also, likely to result in legal cases that will drag for years and end up costing taxpayer more than the supposed savings.

The main con is that once you fire the workers that you thought you didn't need (but that you did indeed need) hiring them back becomes more expensive and a lengthy process. Some of the firings are already causing chaos in vital teams among several agencies and have forced DOGE to try to reverse course (bird flu monitoring, nuclear response...).

And that's not to mention the dire situation you put the people you are firing in. Laying off people from their jobs is never "an experiment" unless you are willing to suspend every trace of empathy.

Eliminating waste is a great idea. But it’s unclear that that’s all he’s doing, it’s unclear how or how well it’s being done, he’s brought in people without security review (which means they, and the systems they are opening up and creating, can be more easily compromised by our enemies), and he has enormous conflicts of interest.
Eliminating superfluous bureaucracy is fine.

Note that DOGE fired, and is struggling to rehire, the team whose job was to maintain America's nuclear arsenal.

Also note that the stated goal of DOGE, $2T, exceeds the entire discretionary budget of the federal government, even though half the discretionary budget is your entire active military.

Even treating $2T as a stretch goal, eliminating literally everything but the military from the discretionary budget doesn't reach the lower $1T that Musk says he has a "good shot" at.

Cuts on this scale directly, all by themselves, even aside from all the other economic disasters that America expressly chose to vote for as a package deal with this, will shrink the US economy by about 6%.
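(Rough figures for scale, my own approximations rather than anything in the comment: FY2024 discretionary spending was about $1.7T, roughly half of it defense. So a $2T cut exceeds the entire discretionary budget, and even $1T is close to all non-defense discretionary spending. Against a GDP of roughly $28T, cutting on the order of $1.7T is where a ~6% contraction comes from.)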

US Americans seem quite green; they have never had this done to them on such a scale and so haven't seen the trick before. The art of the hostile takeover. [0] Curtis documents it well, though at tedious length unless you're British, in his coverage of the Slater, Goldsmith and Rowland gang. "Efficiency" is an entry point, a common bullshit word that's a perfect cover for a hostile takeover, because nobody argues with it - it's a "STFU and agree" word [1].

[0] https://en.wikipedia.org/wiki/The_Mayfair_Set

[1] https://cybershow.uk/blog/posts/efficiency/

There's nothing wrong with it. Our deficit is adding something like $1 trillion to the debt every three months. Getting creative about reducing government spending seems necessary at this point.

All the hysteria over this is just partisan politics as usual.

DOGE disrupts everything except for spending: https://www.economist.com/finance-and-economics/2025/02/12/e...

Stop falling for branding and actually concentrate on the numbers: spending is going up, not down, and only touching entitlement programs, the military, or offsetting monetary loosening (via debt interest) will change that.

Well.

Only announce things your opponents do, or even lie about it, and spread the hate.

I'm 100% sure he won't target the COVID-19 subsidies for the hospitality sector, which Trump was quite happy about during his reign.

Or the expensive hotel stays for his security at Trump hotels :)

DOGE has revealed that $400 million is rookie numbers when it comes to phony deals.

Forcing the government to spend money has always been the infinite money glitch.

On one side you have healthcare and pharma companies making sure everything is excessively overpriced; then they lobby the government to make sure everyone has government-sponsored healthcare, i.e., turning all tax revenue into healthcare revenue. Then they pay the media to convince everyone that it is their moral obligation to subsidize $1,000,000 insulin while making it taboo to ask why healthcare is so expensive.

On another side you have mass incarceration where each inmate costs more per night than a 5 star hotel.

On another side you have nonsense conflicts where the weapons of choice are thousands of single-use munitions that cost at least $100,000 each, or that are simply left behind for the enemy so they have to be repurchased.

On another side you have tax loopholes of billions of dollars.

Everyone is stealing. Did you pay 30% tax and then sales tax on everything you bought with your income? Is your effective tax rate around 50%? Then you worked six months for the government so they could take that six months of your life and turn it into a dumb single-use weapon to destroy a $1 tent.

Specifics are the enemy of populism:

- healthcare and pharma are overpriced because of information frictions, institutional supply constraints (this mostly means strict immigration controls), and the fact that people just really want healthcare relative to other wants! See: https://www.reddit.com/r/badeconomics/comments/1gsezga/comme...

Also, Biden capped the price of insulin, so there's that.

- mass incarceration is a SUPER populist thing! How many times do we hear "we need to be tough on crime"? This sure seems like voters are behind it; it's not like people are clamoring for shorter sentences.

- Indeed, our military is expensive partially because we require domestic production and have to pay really high domestic costs, and partially because the way the US fights wars places a SUPER high value on human life. Desert Storm was both expensive and resulted in only ~200 coalition deaths to take down the world's 4th largest military, whereas Russia has made it not very far into Ukraine and taken over 200k deaths (and it hasn't even been substantially cheaper to boot, just a bit cheaper).

- The tax loopholes exist and are bad, although I challenge you to give me specific loopholes that cost high billions and should obviously be repealed, that constituents are clamoring to close, and that representatives nonetheless refuse to act on. I don't think they exist.

You know what does actually degrade the fiber of the country? The richest person in the world taking personal control over every payment and arbitrarily destroying institutional knowledge by firing every government employee he has control over or who stands up to him. But no, instead we get "he's saving money" when (see the earlier comment from me) we're not even making outlays go down! A script kiddie who randomly rewrites lines into bad assembly while destroying the build system isn't a perf eng, they're a danger.

In the US healthcare system a bag of water with salt costs hundreds of dollars. You can cap the price of one thing, and then the whack-a-mole game starts.

The conflicts in the Middle East cost trillions of dollars and there is absolutely nothing to show for it.

Quantitative easing cost trillions of dollars, and most of the people responsible for causing the crisis got a big payday from it.

If someone became "the richest person," you can probably learn something from that person. Without SpaceX, the US would have to use Russian rockets to put stuff in space because the NASA shuttles were retired. Is that something you would like more?

The US is on the verge of bankruptcy, and it is not because of $400m in trucks.

And it is not a Democrat or Republican issue; as I said, everyone is getting rich at the expense of the taxpayer. Even taxpayers who haven't been born yet are already in debt, thanks to a wasteful mentality.

Would you bet the republic on your understanding of QE? I sure hope not.
Whatever the solution to the crisis, I think it should have involved some jail time.
Agreed
The $400m cybertruck purchase was planned months ago under Biden.

There are many criticisms founded in genuine conflicts of interest; it helps everyone to stick to those.

I work, and the fact that my tax money is going into a black hole makes my blood boil. God bless Musk and DOGE for what they do.

Here is just one headline from today: The Elon Musk-led Department of Government Efficiency (DOGE) on Monday revealed its finding that $4.7 trillion in disbursements by the US Treasury are "almost impossible" to trace, thanks to a rampant disregard for the basic accounting practice of using tracking codes when dishing out money.

Will you go on the record and say that $4.7T in a year is fraudulent or misspent? I want to be crystal clear with what you're insinuating, because that's a massive amount of money, easily the biggest fraud of all time by a factor of almost 30.
That's not what he said. He said the accounting is bad, making it impossible to know how much is misspent.
I don't understand what this means, though; almost all of our money passes an audit, which necessarily has a paper trail. The few agencies that don't usually have very idiosyncratic audit misses which are, in any event, overseen by inspectors general (or were, until Trump fired all of them) who have been very zealous to jump on this.
That's not accurate. The Pentagon, for example, has not passed its annual financial audit since it was first required to undergo them in 2018.
This is definitely not accurate.
The poster or the headline doesn’t say they are all fraudulent. It says the payments are nearly impossible to trace.
So then misspent? Must be, because if it's not misspent, then "impossible to trace" is a little irrelevant. It can't be unaudited, because every department passes an audit every year (except the DoD, but they basically pass an audit, and the reasons they currently don't are mostly technical).
Well, it's going to have to pass the Elon Musk audit now. Bless his efforts.
"insinuate -- suggest or hint (something bad or reprehensible) in an indirect and unpleasant way."

I am not "insinuating" but saying that I would like to know where my money goes. If you pay taxes, wouldn't you?

Can you not find it? I can find basically any spending data I want at my fingertips (well, less so now that it's unclear what's being paid) - anything specific that you feel is missing and want to see?
If you know where the 4.7T went, please reach out to Elon.
I. don't. know. what he's talking about. That's the problem. If I tell you "I'm thinking of $5T, tell me where it is," where does that leave you?
> Here is just one headline from today, The Elon Musk-led Department of Government Efficiency (DOGE) on Monday revealed its finding that $4.7 trillion in disbursements by the US Treasury are "almost impossible" to trace, thanks to a rampant disregard for the basic accounting practice of using of tracking codes when dishing out money.

And you believe them?

This is a department that fired multiple different nuclear weapons inspection and maintenance teams without knowing what their job was.

Had to re-hire them. They weren't redundant teams. DOGE just didn't understand what they (or the teams) were doing.

Now, I'm very happy for the US nuclear stockpile to shrink. I sure think you have too many of them. But then, I'm foreign and a hippy, so I would. But (1) do you?, and (2) do you want it to shrink by some of them accidentally exploding? Or being relocated by a hostile power taking advantage of the inspectors all being on early retirement?

I am not jumping to conclusions and will reserve judgement for later. They have provided no proof so far, but hopefully it will be forthcoming, and I would not dismiss their claim outright.

Relocated where and by whom? Just curious.

> Relocated where and by whom? Just curious.

As I'm asking you if you want this done, take your pick.

Loss of oversight made a bunch of USSR suitcase nukes, ~100 or so, go walkabout when the USSR collapsed. Russia denies this, of course. They might be fine, or not; nobody (in public) really knows. Probably not a huge risk without maintenance (if you nick one but don't know what it is, you might scrap it for parts and mistake the core for tungsten or lead), but… not great, unless it was existing nuclear powers who took them.

And even then, not great for Russia.

They deny that, but of course you know for sure that they are lying, that the nukes went missing, and you have the proof. Just like I know for a fact that there are alien craft hidden in Area 51.
It was a statement made by General Aleksandr Lebed, former Secretary of the Russian National Security Council, in a meeting with an American congressional delegation.

Here's the first US government report I found on it with all of the entirely negligible effort I am willing to entertain: https://commdocs.house.gov/committees/security/has078010.000...

Perhaps he was drunk, or lying, or just plain unable to find the people who knew which cupboard the devices were safely locked in. But he did make those claims. And you are missing the wood for the trees.

This "wood" (and the US report) consists of exactly one person who made this claim, and a member of corrupt Yeltsin's entourage to boot. I'd say if these nukes were real, they would have exploded somewhere by now. Try harder.
Most people, based on my experience, would rather support Elon Musk than support a strategy recommended by an individual who believes that the MSM should be trusted.

Not to mention that most of your sources to support your points are from far left MSM sources.

Your reasoning is exactly why there is so much support for Elon Musk. You probably made more Elon Musk fans just by your post.

Most people seem to believe the government is broken and MSM is a huge reason for this.

> Most people seem to believe the government is broken and MSM is a huge reason for this.

Is it because Fox News, the most influential channel of the so-called MSM, constantly repeats conservative talking points about the alleged inefficiencies of the government and downplays the government's important work in protecting citizens?

Americans will soon get to experience what a real broken government is like, and I hope it provides them an education.

MSM is corrupt. That was the point. It doesn't matter which side.
A fact-free post. Nowhere here is "he reduced spending by x" or "firing y is good."

It's all vibes, the deficit could double and the vibes would stay the same, he could be dictator and the vibes would never change.

You need to be vibe checked.
I believe you are mistaken about the timeline and details here. The $400m Tesla procurement was initiated under Biden. Trump actually cancelled it.
People who buy into Grok are willingly submitting themselves to the far-right propaganda machine. I’m sure it’s nice and tidied up for release, but there is zero chance that Musk will not use this tool to push his ideological agenda given its reach and impact.
So the Rust code it generates for me has a right-of-centre bias?
Your code will list and go round in wide circles.
Well, if you see any kind of propaganda, you can denounce it, and I'm sure this community will respond just like we responded to Chinese propaganda and censorship in DeepSeek.

No good will come from denying progress just because you don't like someone else's opinions and worldviews.

Was Hitler mentioned yet?
> My question is how many of you are actually willing to give Musk more money after the questionable legal and ethical behavior he's exhibited while working for DOGE.

Oh yes, far better to give to alt-man, Google or Facebook - those are morally responsible companies!

Far better to give elected representatives this responsibility, IMHO. This is bizarre. There are a lot of billionaires jockeying for influence and resources here. It's almost like a free-for-all. Musk could also use his position to force Sam Altman's hand in the acquisition of OpenAI, https://www.wsj.com/tech/elon-musk-sam-altman-relationship-6.... I'm not interested in either party, but it's clear there are huge conflicts of interest here. Musk also expressed disappointment when not getting a piece of this pie, https://www.axios.com/2025/01/22/stargate-elon-musk-trump-al.... I've also read more concerning material regarding JD Vance's connections with Peter Thiel and their interest in sidelining the constitution for some other efforts prepping for "networked states", https://www.nytimes.com/2025/01/18/magazine/curtis-yarvin-in.... Much of this is impossible to follow closely. Like I said, this administration seems to be flooding the zone with shit to distract others from what their real intent is. I think it's worth vetting and questioning positions in government; you can't just blindly trust these people. Something seems really off. I say question everything at this point. I don't trust billionaires to fix the world's problems. Democracy and the constitution should be upheld and well guarded.
I mean relatively speaking yes, only one of them is acting as an unelected dictator, circumventing our whole constitutional appropriations process by taking direct control over payment infrastructure.
I read GP to say we need to be skeptical of all LLM providers which I think is a fair point.

Saying we need to be skeptical about OpenAI (haha Open) does not mean we support Musk.

[flagged]
This is a question for X, not the submitter. Twitter & X links still list twitter.com as the canonical URL (which HN uses).
Even better, share xcancel.com links.
Nostalgia I suppose.
[flagged]
Because running a company is different to running a state and we doubt his intentions, not his skills.

Someone who is not elected, who gets $8 million a day from the government, now oversees the government with some 20-year-old fanatics who can't even put up a secure website for DOGE?

Yeah, most of the government isn't elected, that's how it works. You only vote for 3 roles in the Federal government (I guess 4 if you count the veep). The rest of them are hired.

For those downvoting, the roles are President, Vice President, Senator, and Representative.

Outside of that, everyone is hired / appointed.

Being appointed is still different from being hired. It's a process that includes oversight and background checks.
Yes but in most modern societies we have laws against conflict of interest…
“Outside of that, everyone is hired / appointed.”

Tell me you don’t know how the US government works without telling me you don’t know how the US government works, why don’t you?

Article II, Section 2, Clause 2 of the US Constitution:

""[The President] shall nominate, and by and with the Advice and Consent of the Senate, shall appoint Ambassadors, other public Ministers and Consuls, Judges of the supreme Court, and all other Officers of the United States, whose Appointments are not herein otherwise provided for, and which shall be established by Law: *but the Congress may by Law vest the Appointment of such inferior Officers, as they think proper, in the President alone, in the Courts of Law, or in the Heads of Departments.*"

Emphasis on the last sentence. There have been a plethora of such vestments in the Executive branch over the decades.

5 U.S.C. § 105 - This statute authorizes the creation of "necessary agencies" within the Executive Office, giving the President flexibility to establish entities like the USDS and staff them as needed, subject to funding.

5 U.S.C. § 3101 - This law states that "each Executive agency" (including the EOP) "may employ such number of employees ... as Congress may appropriate for." It implies broad authority to hire staff, with Congress controlling the budget but not necessarily the individual appointments.

Excepted Service Authority (5 U.S.C. § 3301 and Schedule A) - Under 5 CFR § 213.3102, agencies like the OMB can use Schedule A hiring authority for positions requiring specialized skills (e.g., tech expertise) that aren’t practical to fill through standard civil service exams.

This authority, delegated by Congress via the Civil Service Reform Act of 1978 (Public Law 95-454) and regulations from the Office of Personnel Management (OPM), allows the President (or OMB leadership) to appoint USDS personnel directly.

Term Appointments - Many roles are temporary or term-limited (e.g., 2-4 years), often filled by detailees from other agencies or private-sector experts. These don’t require Senate confirmation because they aren’t permanent "officers." This flexibility is supported by 5 U.S.C. § 3161, which allows temporary organizations within the executive branch to hire staff for specific projects.

Thanks, I’ve read the constitution too. Ever heard of congress, full of elected officials? Or the senate? Your claim that there are so few elected officials is patently absurd.
Because he and his organization have demonstrated ignorance of the services he's not only auditing but making pretty substantial cuts to. One example I'm familiar with: cutting up to 10% of the personnel of the Technology Transformation Services at GSA is quite likely to reduce the efficiency of both the government and private-sector government contractors.

https://news.ycombinator.com/item?id=43037624

Because he owns companies that contract with the government and are affected by its policies. It's the very definition of a conflict of interest.

And he's not even "auditing" the government. When you're auditing, you emit a report that the audited party later analyzes and acts upon. He's been given free rein to fire government workers as he pleases, as if he were an elected officer, which he's not.

Washington has been known for revolving doors among particular industries for quite a few decades! Why the hoopla over this one?

They are auditing as part of their process of cutting costs. They're literally tracing trillions of dollars in financial records.

He's doing everything he's doing by executive order of the President of the United States, who was elected.

> He's doing everything he's doing by executive order of the President of the United States, who was elected.

And he already has a number of lawsuits against him because he's trying to do things neither he nor the president is allowed to do. Getting an EO to do something doesn't mean it's automatically legal. Multiple big decisions have already been reverted or are on hold until judges can review them. Even things like the promised payout for quitting are not practical, because only Congress can approve the money for that.

There are lawsuits, naturally, as lawfare is a normal part of modern politics. All the laws necessary to do payouts for voluntary separation already exist, as long as they fit within the budgetary appropriations already set by Congress.
Correct, and this one didn't. LegalEagle posts good summaries of the actual legal failures of those.
I'm sure he's trying his best. But I don't doubt that, even if not doing it on purpose, he will mostly cut departments and services that don't hurt him, or whose removal indirectly benefits one of his many businesses.

He, a single person, has far too much control of our system.

So let's just let it slide, then?

That's not how it works.

Btw, I think having the richest man in the world in his current position is very, very unique.

Success doesn’t imply honesty, good faith or absence of bias. You already know this.
Look who builds it. It's not Musk. It's his money that bought smart people.

If he does to the US government what he did to Twitter, he will destroy the brand, reduce the workforce by 80%, and reduce the value by 80% too.

The issue with him is that at Twitter, the affected people had money. A missed payment from the US government can literally kill people.

Do you believe that Elon regrets acquiring Twitter? Despite being constantly told how much he was fucking up, it seems to have worked out OK for him.
It's lost 80% of its value in 2 years, which usually isn't great. The most charitable view of X/Twitter is that it's now a propaganda platform that Musk doesn't mind taking a loss on in order to enact political change.
I'm pretty sure he doesn't like that he hasn't been able to make it more successful, but I don't believe he regrets it.

He would have regretted it if it hadn't played out (and the game he is currently playing is not finished yet) like it currently has. He said in an interview that he is putting everything on one card now.

Edit: Also, he gets a lot of his valuation from being a cult figure or whatever. Among the companies that survive the AI phase we are in right now, he is a very high risk if he can't get the Nazi crowd on his side to buy his stuff.

He destroyed Twitter's brand and Bluesky emerged. He is destroying Tesla while other carmakers gain ground. SpaceX needs a lot of subsidies, and his goal of Mars is only a cult topic, not a financial-success topic.

Because he brought coders to a financial audit.

Wrong tools.

I disagree. I believe engineers are generally smarter than accountants.
But are they well-versed in the things accountants specialize in? Is there a possibility that not every programmer can be a good accountant, or that accountants know things you're unaware of when you wrote that statement?
Even better, they're well versed in things that accountants aren't. When you're auditing trillions of dollars in spending, it helps to have software, data science and analytics experts that can use modern tools beyond COBOL written 62 years ago.

They can use data processing, detect anomalies better, leverage AI models, automate data extraction from analog records, ingest unstructured data like emails and memos, build complex financial dependency graphs, detect leaks, build custom scrapers, etc etc.

I'm sure there's at least one accountant in the loop, but you really want the team to consist mostly of data nerds.
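To make that concrete: here's a minimal sketch (in Python, with a hypothetical payments.csv export and made-up column names, so purely illustrative and not any real DOGE tooling) of the kind of first-pass anomaly screen a data nerd would run:

    import pandas as pd

    # Hypothetical export of disbursement records: agency, vendor, amount.
    df = pd.read_csv("payments.csv")

    # Score each payment against its own agency's distribution.
    stats = df.groupby("agency")["amount"].agg(["mean", "std"])
    df = df.join(stats, on="agency")
    df["zscore"] = (df["amount"] - df["mean"]) / df["std"]

    # Flag payments more than 4 standard deviations above the agency norm
    # as candidates for a human accountant to review.
    outliers = df[df["zscore"] > 4].sort_values("zscore", ascending=False)
    print(outliers[["agency", "vendor", "amount", "zscore"]].head(20))

Of course, a screen like this only surfaces candidates; deciding whether a flagged payment is fraud, a legitimate outlier, or a data artifact is exactly where the accountant's domain knowledge comes in.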

What about GAAP/IFRS? How do you endow these software engineers with knowledge of common patterns of fraud or leaks so they can actually write the correct software to find them automatically? How do they identify material misstatements?

You also seem confused; COBOL might be used, but it isn't the only tool available to accountants working for the government. COBOL is a straw-man. What you're describing here—software engineers who presumably have training in accounting—already exists, and they work inside and out of the government. This is an existing career path.

You're speaking about this as though you know a better way to do something, but it's already happening, and has been for years. Accountants aren't writing in a 62-year-old programming language, waiting to die in their chairs while the world continues to progress without them.

Accounting just about anywhere you find it is already accomplished by accountants, some of them technically trained, as well as data scientists and software engineers. It's an interdisciplinary collaboration in any serious organization.

As a person who works with data and has done both consulting and product building in data science, I can say that lack of domain knowledge is what makes or breaks the end result. Too often do technical people think they know better and then build mediocre solutions that don't get used.
If technical people were so good at these things, technical people would have a hell of a lot more successful startups for one thing.
[dead]
An audit accountant can ask an engineer to implement whatever is needed to achieve a goal they understand. An engineer with no finance background will have no idea where to start or what questions they can ask an accountant.

Or they will have absolutely no idea about the context and, for example, reveal secret information while they think they're just looking at money: https://www.huffpost.com/entry/elon-musk-doge-posts-classifi...

Accounting isn't about being smart.
Let me guess you are an engineer?
“Figures don’t lie, but liars can figure”.

You can easily drown in a sea of numbers and get confused and gaslit unless you make sure all data is available and computable.

I'm not sure how this release, which is impressive by all means, got transformed into an attack on DOGE, which is taking the exact approach startups take to disrupt an industry.

Because it's not their data.

How much disruption started with massive failures?

You don't start with a live system. Or did SpaceX put astronauts in their first rockets?

No one doubts his abilities, and reasonable people are grateful for his work with DOGE and his support of free speech. Unfortunately, this platform has become an echo chamber for mainstream media, merely repeating news and links from sources like the Verge/BBC/Politico, etc. This is just a bias in the user groups. Still, we should hopefully put politics aside and focus on more tech-related subjects on this website :)
I think they just don't like him for his political views, and they feed themselves with mainstream media.
You should be more respectful of other people's intelligence. Not everyone who disagrees with you is brainwashed.
I think a lot of that comes from people thinking the eye-catching, memorable views they've read are the most common views, when they're really not and are more likely just the views of the most passionate 1%.
I used to play a lot of RPGs, and I believe that intelligence and wisdom are two separate traits. Not everyone who’s smart is actually wise.
That was a perfectly respectable critique.
For the same reason I doubt Einstein's abilities as a painter. History is littered with the stories of smart people who tried to treat government dysfunction like corporate dysfunction and failed spectacularly.
I would doubt Feynman's skill at parkour
Because people have different opinions on which things the government should spend on. The objective function to optimize for is disputed.
As he says in the video: He does nothing
The government puts money into his corporations; maybe you could elaborate on to what extent said "success" depends on this relationship?
He bids for contracts just like anyone else and most of those contracts were won under Democrat presidents, but I digress. If you're saying government contractors can't work for the government, then you're going to have to explain the military-industrial complex to me.
Government contractors can't work for the government in roles where there's a conflict of interest. Even in the MIC.
I don't see the relevance.
It’s not a doubt of abilities, it’s a doubt of his interests aligning with the interests of US citizens.

Here is an unelected NAZI and ILLEGAL IMMIGRANT (he worked illegally on a student visa) who did a sieg heil at the presidential inauguration, taking up an informal, unconfirmed-by-Congress department-head role (DOGE is just US Digital Services renamed), getting wide access to government systems, and seemingly firing thousands of government employees.

Billionaires, who should not exist, are so rich that they don’t need government services and would rather the government go away so they can make more money. But regular people do need a government, and that’s just one reason of many why Elon shouldn’t be anywhere near policy decisions.

Now, you might say I'm being dramatic. But I'll say no criticism of this man is unfair. He is one of the world's biggest hypocrites, along with the other MAGA Nazis in his camp.

[flagged]
How is he "looting" government funds ?
The (unelected) richest person in the world, with a sprawling business empire that has many interactions with the federal government, has been given free rein with no oversight to fire any federal workers he wants and has usurped Congress' power of the purse by stopping Congressionally appropriated spending.

This creates numerous conflicts of interests and opportunities for self-dealing.

Consider a NASA employee who is awarding government contracts. They know all of the above. They have three bids in front of them: one from Boeing (lol), one from SpaceX, and one from Axiom Space. The NASA employee thinks the bid from Axiom Space is the best value and fits the requirements best. But will they select it, or will they select SpaceX, knowing that they could be fired tomorrow on Musk's whim?

Repeat this scenario across every interaction any of Musk's companies have with the federal government.

This isn't a novel scenario. Putin's Russia is a great example of what happens when oligarchs are granted significant autonomy over organs of the government. It is a system designed to facilitate corruption.

You could assuage my concerns, though, by describing the ways that there is effective oversight over Musk, or by describing the anti-corruption anti-self-dealing measures that have been imposed on Musk. The Press Secretary gave a statement on this saying: "As for concerns regarding conflicts of interest between Elon Musk and DOGE, President Trump has stated he will not allow conflicts, and Elon himself has committed to recusing himself from potential conflicts." That...does not resolve my concerns.

Fixing the deficit is the opposite of looting, actually.
> Fixing the deficit is the opposite of looting, actually.

I think this has two errors.

First, I don't agree that he's fixing the deficit. I think that's an assumption not in evidence. We'll see in a few years time, though. I'd be willing to bet in 4 years the deficit is > 0, and likely larger than it is today.

But let's assume arguendo that he is fixing the deficit. It's still possible to loot the treasury while fixing the deficit, which shows that they aren't actually the opposite.

Consider this example with completely made up numbers:

Before

- Revenue: $1T

- Defense Spending: $500B

- Benefits Spending: $1T

- Public Services Spending: $499.9B

- Government Contracts with Musk's Companies: $100M

The before scenario has $1T in revenue and $2T in spending, for a deficit of $1T. Now, let's allow hypothetical Musk free rein to "fix the deficit".

After

- Revenue: $800B

- Defense Spending: $300B

- Benefits Spending: $300B

- Public Services Spending: $100B

- Government Contracts with Musk's Companies: $100B

In this scenario the deficit has been reduced to $0, while Musk has enriched himself and his companies with $99.9B in government funds. This would be an extreme example of Musk looting the treasury, while still completely resolving the deficit.
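For what it's worth, the arithmetic above checks out; a throwaway script with the same made-up numbers:

    # All figures in billions of dollars, copied from the example above.
    before = {"revenue": 1000, "defense": 500, "benefits": 1000,
              "services": 499.9, "musk_contracts": 0.1}
    after = {"revenue": 800, "defense": 300, "benefits": 300,
             "services": 100, "musk_contracts": 100}

    def deficit(budget):
        spending = sum(v for k, v in budget.items() if k != "revenue")
        return spending - budget["revenue"]

    print(deficit(before))  # 1000.0 -> the original $1T deficit
    print(deficit(after))   # 0.0 -> deficit "fixed"
    print(after["musk_contracts"] - before["musk_contracts"])  # 99.9 -> the looting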

> Fixing the deficit is the opposite of looting, actually.

What evidence is there that the current moves will lead to "fixing the deficit?"

Illegally redirecting appropriated spending could easily be understood as looting in most contexts. Not sure how this would be excluded.

There is no solid evidence of a path to fixing the deficit at the moment.

There is no evidence of this happening, nor of a serviceable plan to do so.

All recovered expenses, to date, add up to a laughably small amount, and are one-time cutbacks. The strategy shows signs of costing the government in unexpected ways as well.

Most governments of developed nations operate in more sensible ways with clearer plans than this. I won't claim they are looting, but it's absurd to suggest they are fixing the deficit at the moment. The economy appears to be getting worse, not better.

Let me guess: the deficit will be fixed without taking the axe to any contracts to Musk-affiliated companies like SpaceX.
[flagged]
Hi Franz, it's 2025. We beat the Nazis 80 years ago. It's time to move on to the 21st century.
  • ·
  • 3 days ago
  • ·
  • [ - ]
> Hi Franz, it's 2025. We beat the Nazis 80 years ago. It's time to move on to the 21st century.

Then why the hell are they still waving flags?

https://www.nbcnews.com/news/amp/rcna191304

Your bias is showing.

It's bizarre how many people believe that was literally meant as a Nazi salute.
Rule of Goats.
Nazis would incarcerate people in work camps and turn them into soap or hang them in public squares, which is kind of different from what Elon did. There's a pretty big gap between doing something in poor taste, like Nazi salutes in public, because you have the intellectual maturity of a 12-year-old edgelord on Xbox Live seeking attention, and being an actual Nazi committing crimes against humanity.

When everyone goes around calling everyone they hate a Nazi, it only desensitizes people to real Nazi behavior, kind of like the boy who cried wolf, since there are people out there committing actual atrocities against humans under the public radar because they never do Nazi salutes on camera, so as not to draw attention. So then the Nazi term starts to lose any meaning, kind of like the overuse of calling everything "woke" today.

The problem is that people as a whole act like idiots due to mob behavior, are too focused on optics, and judge based only on feelings rather than facts. That's how we have actual criminal Nazis going free under the radar while innocent people get swatted and doxxed because they said something right-wing on social media. Not all Nazis today wear jackboots and do heils; plenty go about appearing like normal people in public. They could even be your neighbor, police officer, or local congressman.

So save your anger for those people instead, as Elon is just a 3-year-old throwing tantrums seeking attention: annoying but relatively harmless. If people stopped giving him so much attention, he'd stop doing it.

[flagged]
>but not us

What makes you so sure/special in this regard? What are you gaining from this? If you were as enlightened as you claim, Germany wouldn't be in such a mess right now. If you had spent as much effort on securing your borders, energy independence, and defence as you spent lecturing others on imaginary Nazis and banning hate speech on social media, you'd be a respectable world power right now, at the table with Trump and Putin, ending this war before it even happened. The "we know better than you" arrogance is Germany's biggest problem.

>This is basically blaming the thing criticized on the people criticizing it.

It's not blaming; I'm just telling you what the simplest solution is. Ignoring attention seekers is better than giving them more attention. Which is why I'm also gonna ignore your future comments from now on.

> What makes you so sure/special in this regard?

Germany? The scope of the discussion is already limited to the AfD. What makes Germans special in their opinions about Nazis in Germany is deep experience and knowledge of the subject you are belittling. And I'm also not spending any time banning hate speech on social media, heh. Whatever chip on your shoulder you have about Germany I can barely even decipher, and you're right, it's best to agree to disagree.

[flagged]
[flagged]
[flagged]
I find it interesting that these two descriptions of news are treated as equivalent, where I think they're almost opposing:

"raw, unfiltered news"

"real, trustworthy news"

Raw and unfiltered almost cannot be "news" (by my definition of what I go seeking for as "news"). X provides raw, unfiltered information. But real, trustworthy news almost requires filtration in order to be deemed trustworthy.

I think the theory is that community notes are a more neutral way to tag bad information, compared to whatever the New York Times and Fox News are doing.
People actually trying the model report that it does not say anything like this when asked the same question. Elon somehow prompted the model to bash The Information for his screenshot.
I didn't know about this. But I asked it if Elon Musk and DOGE randomly firing as many government workers as they can from all federal departments might be dangerous to Americans and it was pretty honest that yeah, it could be.
Tweet this @musk and he'll make sure Grok jumps in line.
They've been training the model for a while, right? It's unlikely he could have known Trump would let him rampage through the federal government when they started training it.
[flagged]
So, uh, Fort Knox has been robbed then?

https://x.com/elonmusk/status/1891566355532349445

I guess we'll see won't we? There's no harm in checking, just like with everything else.
"It's like looking for groceries in a landfill."

(Yeah, you'll find some edible food but it takes time and meanwhile you're covered in garbage)

Yeah, that's a great analogy.
Isn't it just?

It's from the opening lines of the closing segment of the video you linked. Not a bad discussion of the unfolding of news and social media responses following the shooting of Donald Trump's ear on the campaign trail.

Why did we start talking about news as "legacy media"?

I mean, at least it has journalistic standards and some semblance of fact-checking compared to social media, which has given us great gaffes such as identifying the wrong Boston Marathon bomber and driving the poor guy to kill himself, wrongly identifying the Hispanic white supremacist shooter, and many, many more.

When someone here says they don't like the news I assume it's because the only newsworthy topic in their mind is "Just how COOL is the new iPhone? Very!"

Twitter is well suited to deliver the newest developments on this topic.

Here's why: https://news.gallup.com/poll/651977/americans-trust-media-re...

Mainstream outlets have viewerships that compare unfavorably to those of top YouTubers. That has been the case for many years now. The only reason most mainstream outlets exist is the taxpayer money train that's coming to an abrupt stop as I write this.

Mainstream news has always had viewership that compares unfavorably to sports games and sitcoms. So what? It's not entertainment; it's informative. Of course they have different viewerships.
> It's informative.

This is not supported by evidence.

https://www.youtube.com/watch?v=Z4RYyYtD6i4

Your evidence is a YouTube video with 100 views? Wtf
The statement is true irrespective of the number of views. If you watch the news you are misinformed. You believe things that just aren't true at all. From time to time your view of the world clashes with reality, and you reject the reality and substitute it with your own. Many such cases.
[flagged]
I'm very sorry if this isn't the case, but this message really feels LLM-written.
It's because of the em dashes (- is a regular hyphen, — is an em dash). Very few real people use those outside of books or longform articles.

There's also some strange wordings like "back-pocket tests."

It's 100% LLM generated.

What is much scarier is that those "quick reply" blurbs on Android/Gmail (and iOS?) will be able to be trained on your entire e-mail and WhatsApp history. That model will have your writing mannerisms and even be a stochastic mimic of your reasoning. So you won't even be able to tell that a model, not a real person, answered you. And the initial message the model is responding to might itself be written by the other person's personal model.

The future of digital interactions might have some sort of cryptographic signing guaranteeing you're talking to a human being, perhaps even with blocked copy-pasting (or well, that part of the text shows up as unverified) and cheat detection.

Going even a layer deeper / more meta: what does it ultimately matter? We humans yearn for connection, but for some reason that connection only feels genuine with another human. Whereas, what is the difference between a human typing a message to you, a human inhabiting a robot body, a model typing a message to you, and a model inhabiting a robot body, if they can all give you unique interactions?
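To make the signing idea concrete, here's a minimal sketch using Ed25519 via the Python cryptography package (the key-distribution problem, which is the genuinely hard part, is waved away here):

    from cryptography.exceptions import InvalidSignature
    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

    # Each verified human holds a private key; their public key is published.
    private_key = Ed25519PrivateKey.generate()
    public_key = private_key.public_key()

    message = b"I typed this myself, not my email model."
    signature = private_key.sign(message)

    # A recipient can check the message really came from the key holder.
    try:
        public_key.verify(signature, message)
        print("verified: signed by the key holder")
    except InvalidSignature:
        print("unverified text")

Of course, a signature only proves which key signed the text, not that a human rather than their model produced it; that part remains unsolved.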

I use em-dashes pretty often--it's a nice way to transition phrases...
You're using two en dashes to approximate it -- few people have the en dash character on hand.
People who care have it on hand. Option+Shift+hyphen on a Mac.
Everyone who uses a compose key has it available (via ---) — I do. You mean the em-dash though, not the en-dash, and Davidzheng is using hyphens for approximation, not en-dashes.
I'm one of the 17 people that has Alt+0151 memorized
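For reference, the three characters being argued about here, with their Unicode code points (Alt+0151 is just the Windows keypad route to U+2014); a quick Python check:

    # Hyphen-minus, en dash, and em dash are three distinct characters.
    for name, ch in [("hyphen-minus", "-"),
                     ("en dash", "\u2013"),
                     ("em dash", "\u2014")]:
        print(f"{name:12} {ch}  U+{ord(ch):04X}")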
:*:\em::—
It’s gracious of you to say that you’d be sorry, and I did run my comment through 4o (perhaps ironically) which caught a slew of typos and weird grammar issues and offered some improvements. But the robotic sound and anything else you don’t like are my own responsibility. Do you, perhaps, have any thoughts on the substance of the comment?
It's not the robotic sound or the content. 4o has very easy tells that it wrote (rewrote) the content. It uses an insane amount of em dashes.
That's discomforting. My practice of sprinkling em-dashes like salt on a salad dates from my early days on various video game communities' forums. They comfortably mimic interrupted speech in writing. I hope I won't have to soon defend myself against accusations of AI usage just because I belong to the minority that read The Punctuation Guide[0] or a related resource.

[0] https://www.thepunctuationguide.com/em-dash.html

It's really the em dash along with superfluous language. I suspect you are fine. Models like 4o have a very specific pattern when folks don't specify their writing style.
- Very 'forced' expressions (back-pocket tests, 'The analysis is razor-sharp')

- The fact that you're glazing AI so much means you probably use it; it's like how it was with crypto bros during all the web3 stuff

- Lack of any substance. Like, what does that post say? It regurgitates praise for the AI, but the only tangible feature you mention is the fact that it can receive a URL as its input

It always feels like people are irrationally critical of AI assisted stuff. Does the typical Hacker News comment have more substance?

- Informally benchmarked against 4 specific competitors: Gemini, OpenAI, o3, and Claude

- Identified two concrete features: URL content ingestion and integrated search

- Noted specific limitations: search engine occasionally misses key resources

- Provided a real-world test case: consulting business analysis where it found new opportunities other models missed

Hmmm, it is hard to really place the issue. I am very much in the bullish-on-AI camp, but I don't like writing for the sake of writing, and some of the models (4o in this case) have very obvious tells and write in such a way that it takes away from what substance may exist.
One thing that concerns me is when you can't tell whether the comment was authored or just edited by AI. I'm uncomfortable with the idea that HN threads and reddit comments gradually tend towards the grey generic writing style of LLMs, but I don't really mind (save for the prospect of people not learning things they might otherwise!) when comments are edited (i.e. minor changes) for the sake of cleanliness or fixing issues.
I just re-read the post twice and I couldn't find any of the points you mentioned (again, other than using URLs in the input):

- Informal Benchmarks: I'm sorry, what? He mentions 'It's picking up on nuances—and even uncovering entirely new angles—that other models have overlooked' and 'identified an entirely new sphere of possibility that I hadn't seen nor had any of the other top models'. Not only is it complete horseshit by itself, but it does not benchmark in any way or form against the mentioned competitors. It's the exact stuff I'd expect out of an LLM.

- Real-World Test Case: As mentioned above, complete horseshit.

- 2 Concrete Features: Yes, I mentioned URLs in the input. I didn't consider 'Integrated Search' (which I'm assuming means searching the web for up-to-date data) because AFAIK it's already more or less a staple in LLM products, and his only remark about it is that it is 'solid but misses sometimes'.

Also AI generated.
And this is the reason, I have choose to write grammatically wrong content online. And basic english only, no fancy words.
I see what you did their
It may also be deliberate. I know a lot of people who are very dyslexic and use AI to make themselves understood online.
It's the dashes that make it a dead giveaway.
“ — “ is the giveaway.
Not really, as pointed out by others in the thread. Anecdotal of course, but I use em dashes all the time— even in emails and texts (not just long-form writing).
Same, and it's disturbing that this is going to be picked up on as a bogus "tell" that my writing isn't my own.
Why sorry? So what?

I often write things I want to post as bullets and then have them formulated better than I could by an LLM. But it's just applying a style. The content comes from me.

My wife is dyslexic so she passes most things she writes through ChatGPT. Also not everyone is a native speaker.

TBH I've recently felt like that for ~70% of 'top-level replies' on HN, which has slowly pushed me to other mediums (Mastodon and Discord).

Could just be that the AI 'boom' brought a less programming-focused crowd into the site and those people lack the vocabulary that is constantly used here, who knows.

I'd go out on a limb and say that LLMs probably made the general population aware of what the "general voice" feels/looks/reads like.

So rather than a lot of people adopting the way an LLM writes, the LLM writes as an average of how people have been writing on the internet for a long time. So once you start to recognize how "LLM prose" reads (which I'd say is "Internet General Prose"), you start to recognize how many people were writing in that style already.

I've been on the internet since the early 2000s, and I can assure you it does not write like 'someone on the internet' would write. And when I say that, I mean it for both sides of the internet: it doesn't sound like how 'old school' internet folks would write, but it also doesn't sound like how teens talk either. Neither of these groups writes in 'very plain' English regurgitating useless information.

Recent trends/metas in video formats like TikTok and Shorts encourage that kind of 'prose', but I haven't seen it translated into text on any platform, unless it's written by LLMs.

> I've been on the internet since the early 2000s

Same here :)

My point wasn't that it writes like any specific group, but in a general mix-match made up of everyone's voice, a boring average of it, rather than something specific and/or exciting.

Then of course it depends on what models you're talking about, I haven't tried Grok3 myself (which I think you're talking about, since you say "it"), so I can't say how the text looks/feels like. Some models are more "generic" than others, and have very different default prose-style.

I'm a big fan of sprinkling in a little profanity just to pass the LLM bullshit check
Here’s the conclusion of a much more refined initial review by Andrej Karpathy [1] which, I think overall, comports with the substance of my own hot take:

“As far as a quick vibe check over ~2 hours this morning, Grok 3 + Thinking feels somewhere around the state of the art territory of OpenAI's strongest models (o1-pro, $200/month), and slightly better than DeepSeek-R1 and Gemini 2.0 Flash Thinking. Which is quite incredible considering that the team started from scratch ~1 year ago, this timescale to state of the art territory is unprecedented. Do also keep in mind the caveats - the models are stochastic and may give slightly different answers each time, and it is very early, so we'll have to wait for a lot more evaluations over a period of the next few days/weeks. The early LM arena results look quite encouraging indeed. For now, big congrats to the xAI team, they clearly have huge velocity and momentum and I am excited to add Grok 3 to my "LLM council" and hear what it thinks going forward.”

[1] Full review at: https://x.com/karpathy/status/1891720635363254772?s=46&t=91u...

I liked Grok 3's fiction writing style; it catches lots of the physics of mundane situations, such as the ringing echo in a closed bathroom we all know well, and the prose feels very lively as a result. Kinda like how R1 makes situations sharp with details, Grok 3 goes the other way around: rounded, by using details.
That sounds like very evocative prose. Would you be up for sharing some of that fiction? I haven’t tried Grok 3 for that purpose and now I’m curious.
Update: Now I am a person who has used Grok 3 to generate evocative fiction (using your description as the catalyst): https://x.com/i/grok/share/r8XR6IdeuzDLDFTL7xDHLuADO
Well, because you explicitly asked it to demonstrate the physics, it came out way too detailed, but the point is that it adds details to scenes on its own, making them more realistic, not that dry Llama 3.3 style.
Here is the sentence: (She screamed, which echoed off the tile walls. "This is my life now," she said to her reflection, which looked back at her with a mix of disgust and pity.) Looks good to me. Try it on lmarena.ai.
Can't stand Elon but happy to see this. We badly need a frontier model that is not so obsessed with "safety". That nonsense has held things back significantly, and leads to really stupid fake constraints.
https://grok.com/

500 Internal Server Error

nginx/1.27.4

We know RLHF and alignment degrade model quality. Could it be that Grok, due to its less restrictive training guidelines (and the fact that its creators aren't afraid of getting sued), achieves higher performance partly due to this simple factor?
> We know RLHF and alignment degrades model quality.

I feel you can't make statements like this without giving some sources.

IIUC, without RLHF/alignment, the model won't even be able to chat with you, it would just be a document completion engine.

You're both right because RLHF and fine-tuning are just techniques.

It's dependent on the training data and not as much the method.

So, if you make the RLHF/finetune data such that it avoids certain topics, then you reduce model quality in practice, since your training data might accidentally cast a wide enough net that you make the model avoid certain legitimate questions.

On benchmarks these things don't typically show up though.

But yes, those techniques are required for making it chat; otherwise it just autocompletes from the internet. They are also used in a couple of other places (reasoning, and search for hallucination mitigation).
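To illustrate that last point, a minimal sketch with the Hugging Face transformers library (the model name is an arbitrary small instruct checkpoint, chosen only for the example); a raw base checkpoint fed the bare question would just continue the document instead of answering it:

    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Any small chat-tuned model works here; this one is just an example choice.
    name = "Qwen/Qwen2.5-0.5B-Instruct"
    tok = AutoTokenizer.from_pretrained(name)
    model = AutoModelForCausalLM.from_pretrained(name)

    # The chat template wraps the raw text in the role markers the model was
    # fine-tuned on; without that fine-tuning it's just document completion.
    messages = [{"role": "user", "content": "What is RLHF, in one sentence?"}]
    prompt = tok.apply_chat_template(messages, tokenize=False,
                                     add_generation_prompt=True)
    inputs = tok(prompt, return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=64)
    print(tok.decode(out[0][inputs["input_ids"].shape[1]:],
                     skip_special_tokens=True))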

It blows my mind that Musk hasn't integrated Grok as an app inside Tesla's vehicles. A literal AI copilot is a completely novel killer app that cannot be pulled off by any other vehicle manufacturer.
Getting them to actually do something useful other than generating text is still a work in progress. What do you envision them actually doing in this integration?
Why does it need to work beyond text-based output?
Because all you need to do is Bluetooth your iPhone to your 1995 Ford Ranger and install Gemini to have a voice conversation through your car's speakers. But then your original comment doesn't make any sense about it being only possible with a Tesla.
The iPhone screen size is too small, and nobody takes their iPad with them on every trip.
Why do I need a big screen? The models mostly all have voice interfaces now. I shouldn't be sitting there reading and typing text input or output while driving anyways... What are you actually imagining doing with these models in the car? I still haven't heard what use they are.
>Why do I need a big screen?

You don't. You're free to use iPhone all day for work and play.

>What are you actually imagining doing with these models in the car?

The exact same things that are done on a laptop and desktop.

> You don't. You're free to use iPhone all day for work and play.

You just said "The screen size is too small" when geor9e was talking about using the phone, so I'm confused: is a phone too small, or just fine for this?

> The exact same things that are done on a laptop and desktop.

i.e. Not things people usually do in cars...

Not "cars" but Teslas.
If it's not doing something actually related to/integrated with the car why does it need to be an app there instead of just living on your phone like all our existing digital assistants?
Simply due to the screen being much bigger than your phone
"Killer app" in the good way or the bad way?
I've been saying "hey google, drive home" (for GPS directions) and "play music" to the phone mounted on my dash for a decade. I drive a rusty old stick shift and alligator-clamped a $10 Bluetooth receiver to the speaker. So I'm not sure what you're envisioning that can't also work in any other car. There is also https://comma.ai, which adds self-driving to hundreds of newer cars via just an OBD2-like dongle and the equivalent of a smartphone.
The interesting thing about this is that, because of all the Musk-related overhyping that's gone on and because the launch is a video, the thread that marks another company's entry into the select group of serious AI companies will go off the front page with possibly only 200 points!
[flagged]
Can you please stop breaking the site guidelines?

https://news.ycombinator.com/newsguidelines.html

Are there consequences?
[flagged]
Ok, but please don't respond by posting the same sort of thing as well. That only makes it worse.