https://static.simonwillison.net/static/2024/pelicans-on-bic...
Of the four, one was a pelican riding a bicycle. One was a pelican just running along the road, one was a pelican perched on a stationary bicycle, and one had the pelican wearing a weird sort of pelican bicycle helmet.
All four were better than what I got from Sora: https://simonwillison.net/2024/Dec/9/sora/
My company (Nim) is hosting the Hunyuan model, so here's a quick test (first attempt) at "pelican riding a bicycle" via Hunyuan on Nim: https://nim.video/explore/OGs4EM3MIpW8
I think it's as good as, if not better than, Sora / Veo.
What does it produce for “A pelican riding a bicycle along a coastal path overlooking a harbor”?
Or, what do Sora and Veo produce for your verbose prompt?
The pelican is doing some weird flying motion, motion blur is hiding a lack of detail, the bicycle is moving fast so the background is blurred, etc. I would even say Sora is better, because I like the slow motion and detail, though it did do something very non-physical.
Veo is clearly the best in this example. It has high detail but also feels the most physically grounded among the examples.
If you'd like to replicate this, the sign-up process was very easy and I was able to run a single generation attempt right away. Later, when I want to generate video, I'll probably use prompt enhancement; without it, the video appears to have lost any notion of direction. Most image-generation models I'm aware of do prompt enhancement. I've seen it on Grok+Flow/Aurora and ChatGPT+DALL-E.
Prompt: A pelican riding a bicycle along a coastal path overlooking a harbor
Seed: 15185546
Resolution: 720×480
Turning content blockers off does not make a difference.
Then there's Lightricks' LTX-1 model and Genmo's Mochi-1. Even the research model CogVideoX is making progress.
Open source video AI is just getting started, but it's off to a strong start.
oh, shit!
> Prompt: The sun rises slowly behind a perfectly plated breakfast scene. Thick, golden maple syrup pours in slow motion over a stack of fluffy pancakes, each one releasing a soft, warm steam cloud. A close-up of crispy bacon sizzles, sending tiny embers of golden grease into the air. [...]
In the video, the bacon is unceremoniously slapped onto the pancakes, while the prompt sounds like it was intended to be a separate shot, with the bacon still in the pan? Or, alternatively, everything described in the prompt should have been on the table at the same time?
So, yet again: AI produces impressive results, but it rarely does exactly what you wanted it to do...
But I'm also seeing some genuinely creative uses of generative video - stuff I could argue has genuine creative validity. I am loath to dismiss an entire technique because it is mostly used to create garbage.
We'll have to figure out how to solve the slop problem - it was already an issue before AI, so maybe this is just hastening the inevitable.
I have. It sucks. The world we're headed for maybe isn't one we actually wind up wanting in the end.
I like the idea of increasingly advanced video models as a technologist, but in practice I'm noticing slop and I don't like it. Having grown up on porn, when video models are in my hands, the addiction steers me toward using the technology only to generate it. It's a slot machine whose addictiveness is akin to the leap from the dirty magazines of old to the world of internet porn I witnessed growing up. So, porn addiction on steroids. I found it damaging enough to my mental health that I eventually sold my 4090. I'm a lot better off now.
The nerd in me absolutely loves generative models from a technology perspective, but just like the era of social media before it, they're a double-edged sword.
Hence, a certain % of the population will be negatively affected by this. I personally think it's worth raising awareness of.
Also I just don't want to live in a world where the things we watch just aren't real. I want to be able to trust what I see, and see the human-ness in it. I'm aware that these things can co-exist, but I'm also becoming increasingly aware that as long as this technology is available and in development, it will be used for deception.
Now, we have entire TV shows shot on green screen in virtual sets. Replacing all the actors is just the next logical step.
I do believe that humans are restless, and even when there is no longer any point to create, and it is far easier to dictate, we still will, just because we are too driven not to.
> at resolutions up to 4K, and extended to minutes in length.
https://blog.google/technology/google-labs/video-image-gener...
Anyway, I strongly suspect that the funny meme content that seems to be the practical use case of these video generators won't be possible on either Veo or Sora, because of copyright, political correctness, famous people, or other 'safety'-related reasons.
I was so excited to see Sora out - only to see it has most of the same problems. And Kling seems to do better in a lot of benchmarks.
I can’t quite make sense of it - what OpenAI were showing when they first launched Sora was so amazing. Was it cherry-picked? Or was it using loads more compute than what they’ve released?
It does pop up. Look at where his hand is relative to the jar when he grabs it vs when he stops lifting it. The hand and the jar are moving, but the jar is non-physically unattached to the grab.
This feels like a bit of a comeback as Veo 2 (subjectively) appears to be a step up from what Sora is currently able to achieve.
Some of the videos look incredibly believable though.
By the time the politician says it, you've been soaking in it for weeks or months, if not longer. That just confirms the bias that has been implanted in you.
And X is really egregious, where the owner shitposts frequently and often things of dubious factuality.
I left X precisely because it was flooded with Russian propaganda/misinformation.
- People were 'forced' into vaccinations
- Covid 19 was a testing ground for the next global pandemic so that "they" can control us
- Climate change is a hoax/Renewables are our doom
- Everything our government does is a step toward a totalitarian state
- Putin is actually the victim; it's all the fault of NATO and its imperialism
"Take this novel vaccine primarily for someone else's benefit or lose your job: it's your choice, you totally aren't being 'forced.'"
Putin's apologists always demand he be given the benefit of the doubt. That's akin to convicting a spy beyond a reasonable doubt. That standard is meant to favor false negatives over false positives when incarcerating people. Better to let a thousand criminals go free than to imprison an innocent person.
If we used that for spies, we'd have 1000 of them running around for each convicted one. Not to mention that they have a million ways to avoid detection. They rely on their training, on the resources of the state, and on infiltrators who sabotage detection efforts. The actual ratio would be much higher.
In the case of opinion manipulation, the balance is even more pernicious. That's because the West decided a couple decades ago to use the "it's just a flesh wound" approach to foreign interference.
The problem is that we're not just protecting gullible voters. We're also defending the reputation of democracy. Either democracy works, or it doesn't. If it doesn't, then we're philosophically no better than Russia and China.
But if it were possible to control the outcome of elections by online manipulation alone, that would imply that democracy doesn't really work. Therefore online manipulation "can't work." Officially, it might sway opinion by a few points, but a majority of voters must definitionally be right. And if manipulation makes little difference, then there's not much reason to fight it (or at least not too openly).
Paradoxically, when it comes to detecting Russian voter manipulation, the West and Putin are strange bedfellows. Nothing to see here, move along.
My sense is that the "hivemind" is, in a symbiotic way, both homegrown and significantly foreign-influenced.
More specifically: the core sentiment of the hivemind (basically: anti-war/anti-interventionist mixed with a broader distrust of anything the perceived "establishment" supports) is certainly indigenous -- and it is very important to not overlook this fact.
But many of its memes, and its various nuggets of disinformation do seem to be foreign imports. This isn't just an insinuation; sometimes the lineage can actually be traced word-for-word with statements originating from foreign sources (for example, "8 years of shelling the Donbas").
The memes don't create the sentiment. But they do seem to reinforce it, and provide it with a certain muscle and kick. While all the while maintaining the impression that it's all entirely homegrown.
And the farther one goes down the "multipolar" rabbit hole, the more often one encounters not just topical memes, but signature phrases lifted directly from known statements by Putin and Lavrov themselves. E.g. that Ukraine urgently needs to "denazify". The more hardcore types even have no qualms about using that precious phrase "Special Military Operation", with a touch of pride in their voice.
It's really genuinely weird, what's happening. What people don't realize is that none of this is happening by accident. It's a very specific craft that the Russian security services (in particular) have nurtured and developed, literally across generations, to create language that pushes people's buttons in this way.
The Western agencies and institutions have their own ways of propaganda, of course, but usually it's far more bland and boring (e.g. how NATO "fosters broader European integration" and all that).
Would we have the same kind of hivemind without Putin? There's always some kind of a hivemind -- but as applies to Eastern Europe, it does seem that the general climate of discourse was quite different before his ascendancy. And that it certainly took a very sharp, weird bend in the road after the start of the Special Military Operation.
I remember saying to someone at the time that I was pretty sure iPhone was going to get secure corporate email and device management faster than BlackBerry was going to get an approachable UI, decent camera, or app ecosystem.
These videos will be, and may already be, too realistic.
Our society is not prepared for this kind of reality-"bending" media. These hyperrealistic videos will be the reason for hate and murder. Evil actors will use them to influence elections on a global scale. Create cults around virtual characters. Deny the rules of physics and human reason. And yet there is no way for a person to instantly detect that they are watching a generated video. Maybe there is now, but in a year it will be indistinguishable from a real recorded video.
I'm thinking of simple cryptographic signing of a file, rather than embedding watermarks into the content, but that's another option.
I don't think it will solve the fake video onslaught, but it could help.
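To make the signing idea concrete, here's a minimal sketch. I'm using a shared-secret HMAC purely for brevity; real provenance schemes (C2PA, Leica's Content Credentials) use asymmetric certificates chained to the device maker, and the key name here is made up:

```python
import hashlib
import hmac

# Hypothetical shared secret; a real scheme would use the camera
# vendor's private key and publish the matching certificate.
SECRET_KEY = b"camera-vendor-secret"

def sign_file(data: bytes) -> str:
    """Return a hex signature over the file's SHA-256 digest."""
    digest = hashlib.sha256(data).digest()
    return hmac.new(SECRET_KEY, digest, hashlib.sha256).hexdigest()

def verify_file(data: bytes, signature: str) -> bool:
    """Constant-time check that the signature matches the data."""
    return hmac.compare_digest(sign_file(data), signature)

video = b"...raw video bytes..."
sig = sign_file(video)
assert verify_file(video, sig)               # untouched file verifies
assert not verify_file(video + b"x", sig)    # any edit breaks the signature
```

The hard part isn't the cryptography; it's key distribution and UX, as the Leica hack below shows.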
Cute hack showing that it's kinda useless unless the user-facing UX does a better job of actually verifying that the certificate represents the manufacturer of the sensor (the guy just uses a self-signed cert with "Leica Camera AG" as the name). Clearly cryptography literacy is lagging behind... https://hackaday.com/2023/11/30/falsified-photos-fooling-ado...
1. AI video watermarks that carry over even if a video of the AI video is taken
2. Cameras that can see AI video watermarks and put an AI video watermark on the videos of any AI videos they take
That’s not that much money.
I really find the threat to be overhyped.
The only thing this changes is not needing to pay human beings for work.
I feel this kind of hypervigilance will be mentally exhausting, and not being able to trust your primary senses will have untold psychological effects.
We're already in a world where "fake news" and "alt-facts" influence our daily lives and political outcomes.
In the grand scheme of understanding the world at large, our immediate senses are not particularly valuable. So we _have_ to rely on other streams of information. And the trend is towards more of those streams being digital.
The existence of "fake news" and "alt facts", doesn't mean we should accept a further and dramatic worsening of our ability to have a shared reality. To accept that as an inevitability is defeatist and a kind of learned helplessness.
Have you seen the Adam Curtis documentary "Hypernormalisation"? It deals with some similar themes, but on a much smaller scale (at least it is smaller in the context of current and near future tech)
I recently had an issue with my mobile service provider, and I was insanely glad when I could interact with a friendly and competent shop clerk (I know I got lucky there) in a brick-and-mortar store instead of a chatbot stuck in a loop.
This is like trying to hide Photoshop from the public. Realistic AI generated videos and adversary-sponsored mass disinformation campaigns are 100% inevitable, even if the US labs stopped working on it today.
So, you might as well open access to it to blunt the effect, and make sure our own labs don't fall behind on the global stage.
If it's kill or be killed, we should do away with medicine right? Only the strong survive. Why are we saving the weak? Sorry but this argument is beyond silly
Maybe Operation Timber Sycamore, which is bearing fruit in Syria right now, wouldn't have happened if the population were less trusting of the shit they see on TV.
I had not heard of Timber Sycamore until this comment. From a quick look at Wikipedia, I'm struggling to see the relevance here. Can you elaborate?
"Uh, Is that video of [insert your least favourite politician here] taking a bribe real or not? Well, I'm going to trust my instincts here..."
And no big tech company would run the ads you're suggesting, because they only make money when people use the systems that deliver the untrustworthy content.
I think we will need the same healthy media diet.
Just take a look at how many everyday things were "incredibly dangerous for society" - https://pessimistsarchive.org/
Even if you're not convinced that it's dangerous, at the very least it's incredibly annoying.
If someone dumped a trailer full of trash in your garden, you're not going to say "oh well, market forces compelled them to do that".
This quote suggests not: "maintaining complete consistency throughout complex scenes or those with complex motion, remains a challenge."
> a thief threatens a man with a gun, demanding his money, then fires the gun (etc add details)
> the thief runs away, while his victim slowly collapses on the sidewalk (etc same details)
Would you get the same characters, wearing the identical clothing, the same lighting and identical background details? You need all these elements to be the same, that's what filmmakers call "continuity". I doubt that Veo or any of the generators would actually produce continuity.
Not much. Low quality over-saturated advertising? Short films made by untalented lazy filmmakers?
When text prompts are the only source, creativity is absent. No craft, no art. Audiences won't gravitate towards fake crap that oozes out of AI vending machines, unrefined, artistically uncontrolled.
Imagine visiting a restaurant because you heard the chef is good. You enjoy your meal but later discover the chef has a "food generator" where he prompts the food into existence. Would you go back to that restaurant?
There's one exception. Video-to-video and image-to-video, where your own original artwork, photos, drawings and videos are the source of the generated output. Even then, it's like outsourcing production to an unpredictable third party. Good luck getting lighting and details exactly right.
I see the role of this AI gen stuff as background filler, such as populating set details or distant environments via green screen.
That's an obvious yes from me. I liked it, and not only that, but I can reasonably assume it will be consistently good in the future, something lots of places can't do.
You don't care about the absence of a lifetime of hard work behind your meal, or the efforts of small business owners inspired by good food and passion in the kitchen. All that matters to you is that your taste buds were satisfied?
Interesting. Perhaps we can divide the world into those who'd happily dine at "Skynet Gourmet", and those who'd seek a real restaurant.
I still believe there's a place for creative work; I just don't see why something created by something other than a human is inherently bad.
I can't speak to OpenAI but ByteDance isn't waiting for permission.
In theory that should matter to something like Open(Closed)AI. But who knows.
Why can't a silicon being train itself on Youtube as well?
A corporation "is a person" with all the rights that come along with that - free speech etc.
Like, a blind person with vision restored by silicon eyes?
Do I not have rights to run whatever firmware I want on those eyes, because it's part of my body?
Okay, so what if that firmware could hypothetically save and train AI models?
we can take this a step further: if your augmented eyes and ears can record people in a conversation, should you be allowed to produce lifelike replicas of people's appearance and voice? a person can definitely imagine someone saying/doing anything. a talented person with enough effort could even make a 3D model and do a voice impression on their own. it should be obvious that having a conversation with a stranger doesn't give them permission to clone your every detail, and shouldn't that also be true for your creations?
If it just reduces to an issue of data efficiency, AI research will eventually get there though.
They could also just acquire that other company.
From the creator's standpoint, signing away rights to one company is as good as gone.
> Veo sample duration is 8s, VideoGen’s sample duration is 10s, and other models' durations are 5s. We show the full video duration to raters.
Could the positive result for Veo 2 simply mean that raters prefer longer videos? Why not trim Veo 2's output to 5s for a better-controlled test?
I'm not surprised this isn't open to the public by Google yet; there's a huge amount of volunteer red-teaming still to be done by the public, as on other services like hailuoai.video.
P.S. The skate tricks in the final video are delightfully insane.
Closed models aren't going to matter in the long run. Hunyuan and LTX both run on consumer hardware and produce videos similar in quality to Sora Turbo, yet you can train them and prompt them on anything. They fit into the open source ecosystem which makes building plugins and controls super easy.
Video is going to play out in a way that resembles images. Stable Diffusion and Flux like players will win. There might be room for one or two Midjourney-type players, but by and large the most activity happens in the open ecosystem.
Are there other versions than the official?
> An NVIDIA GPU with CUDA support is required.
> Recommended: We recommend using a GPU with 80GB of memory for better generation quality.
https://github.com/Tencent/HunyuanVideo
> I am getting CUDA out of memory on an Nvidia L4 with 24 GB of VRAM, even after using the bfloat16 optimization.
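A quick back-of-the-envelope check explains the OOM. Assuming HunyuanVideo is roughly a 13B-parameter model (as its repo reports), the weights alone don't fit in 24 GB even in bfloat16:

```python
# Rough VRAM estimate (assumption: ~13B parameters for HunyuanVideo).
params = 13e9
bytes_per_param = 2  # bfloat16 uses 2 bytes per weight

weights_gb = params * bytes_per_param / 1e9
print(weights_gb)  # 26.0 -- more than a 24 GB L4 before counting
                   # activations, the VAE, or the text encoder
```

So bfloat16 alone can't save a 24 GB card; you'd need offloading or quantization below 16-bit.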
With the YouTube corpus at their disposal, I don't see how anyone can beat Google for AI video generation.
Since Llama3.3 came out it is my first stop for coding questions, and I’m only using closed models when llama3.3 has trouble.
I think it’s fairly clear that between open weights and LLMs plateauing, the game will be who can build what on top of largely equivalent base models.
It absolutely is. Moreover, the tools built on top of SD (and now Flux) are superior to any commercial vertical.
The second-place companies and research labs will continue to release their models as open source, which will cause further atrophy to the value of building a foundation model. Value will accrue in the product, as has always been the case.
Like a tanker that is still steering to fully align with the course people expect of it; they don't recognize that it will soon be there, and capable of rolling over everything that comes in its way.
If OpenAI claims they're close to having AGI, Google most likely already has it and is doing its shenanigans with the US government under the radar. Meanwhile Microsoft is playing the cool guy and Amazon is still trying to get its act together.
That, or they have a secret super human intelligence under wraps at the pentagon.
OpenAI might be well-capitalized, but they're (1) bleeding money, (2) without a clear path to profitability, and (3) competing head-to-head with a behemoth that can profitably provide a similar offering at, literally, 10-20x cheaper.
Google might be slow out the blocks, but it's not like they've been sitting on their hands for the past decade.
That’s the core issue, and they’ve also pissed off a non-zero percentage of top talent by ditching what still existed of Google culture and going full “Corporate Megacorp” a few years ago.
Google is having to pay a ton to retain the talent they have left and it’s often not enough.
Google's biggest threat isn't OpenAI. It's the FTC (which I admit is a very real danger).
* from a developer/platform perspective, at least. The "consumer" facing side of things (e.g. the AI Studio UI) is still pretty awful.
https://arstechnica.com/information-technology/2023/03/yes-v...
We're not even done with 2024.
Just imagine what's waiting for us in 2025.
SD Cards?
To really do well on this task, the model basically has to understand physics, and human anatomy, and all sorts of cultural things. So you're forcing the model to learn all these things about the world, but it's relatively easy to train because you can just collect a lot of videos and show the model parts of them -- you know what the next frame is, but the model doesn't.
Along the way, this also creates a video generation model - but you can think of this as more of a nice side effect rather than the ultimate goal.
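The next-step-prediction idea can be sketched with a toy character-level model: count what follows what, then predict the most likely continuation. (This bigram sketch is my own illustration; video models apply the same idea at vastly larger scale over frames and patches.)

```python
from collections import Counter, defaultdict

def train(corpus: str) -> dict:
    """Build a distribution over 'what comes next' from observed pairs."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(corpus, corpus[1:]):
        counts[prev][nxt] += 1
    return counts

def predict_next(model: dict, prev: str) -> str:
    """Return the most probable next symbol after `prev`."""
    return model[prev].most_common(1)[0][0]

model = train("the cat sat on the mat. the cat ran.")
print(predict_next(model, "h"))  # prints "e" -- 'h' is always followed by 'e' here
```

The "you know the next frame, but the model doesn't" trick is exactly this: the training data supplies its own labels.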
All these models have just “seen” enough videos of all those things to build a probability distribution to predict the next step.
This is not bad, nor does it make them inherently dumb; a major component of human intelligence is built on similar strategies. I couldn't tell you which grammatical rules are broken in a text, or which physical rules in a photograph, but I can tell something is wrong using those same methods.
Inference can take you far with large enough data sets, but sooner or later, without reasoning, you will hit a ceiling.
This is true for humans as well: plenty of people go far in life with just memorization and replication, and do a lot of jobs fairly competently, but not in everything.
Reasoning is essential for higher-order functions, and transformers are not the path to that.
We do an extensive amount of pattern matching and drop an enormous amount of sensory input very quickly, because we expect patterns and assume a lot about our surroundings.
Unlearning this is a hard skill to pick up. There are many forms of training, from martial arts to meditation, that attempt to achieve this.
The point is that this alone is not sufficient; the other core components are reasoning and understanding, and transformers learning on data are insufficient.
Parrots and a few other animals can imitate human speech very well; that doesn't mean they understand or are constructing the speech.
Don't get me wrong, I am not saying it is not useful, it is, but attributing reasoning and understanding to models that foundationally have no such building block is just being impressed by a speaking parrot.
Mimicking more patterns like emotion and motivation may be better user experience, it doesn't make the machine any smarter, just a better mime.
Your thesis is that as we mimic reality more and more, the differences will not matter; this is an idea romanticized by popular media like Blade Runner.
I believe there are classes of applications, particularly if the goal is singularity or better-than-human superintelligence, where emulating human responses, no matter how sophisticated, won't take you there. Proponents may hand-wave this away as moving the goalposts; it is only refining the tests to reflect the models of the era.
If the proponents of AI were serious about their claims of intelligence, then they should also be pushing for AI rights. There is no such serious discourse happening, only debates about human data-privacy rights: what AI models may use for learning, and where the models may be allowed to work.
It's beginning to happen. Anthropic hired their first AI welfare researcher from Eleos AI, which is an organization specifically dedicated to investigating this question: https://eleosai.org/
Think 5-10 years into the future, this is a stepping stone
But more than anything, it's useful as a stepping stone to more full-featured video generation that can maintain characters and story across multiple scenes. It seems clear that at some point tools like this will be able to generate full videos, not just shots.
Now, it may not be the best fit for those yet due to its limitations, but you've gotta walk before you can run: compare Stable Diffusion 1.x to FLUX.1 with ControlNet to see where quality and controllability could head in the future.
https://www.reddit.com/r/aivideo/comments/1hbnyi2/comment/m1...
Another more serious music video also made entirely by one person. https://www.youtube.com/watch?v=pdqcnRGzH5c Don't know how long it took though.
my templates all are waiting for stock videos to be added looping in the background
you have no idea how cool I am with the lack of copyright protections afforded to these videos I will generate, I'm making my money other ways
- gold everywhere is excessive - more Rococo (1730s-1760s) than Renaissance (1300-1600 roughly), which was a lot more restrained
- mirror way too big and clear. Renaissance mirrors were small polished metal or darker imperfect glass
- candelabras too ornate and numerous for Renaissance. Multi tier candleholders are more Baroque (1600-1750), and candles look suspiciously perfect, as opposed to period-appropriate uneven tallow or beeswax
- white paper too pristine (parchment or vellum would be expected), pen holders hilariously modern, gold-plated(??) desk surface is absurd
- woman's clothing looks too recent (Victorian?); sleeves and hair are theatrical
- hard to tell, but background dudes are lurking in what look like theatrical costumes rather than anything historically accurate
The prompt for the figure running through glowing threads seems to contain a lot of detail that doesn't show up in the video.
In the first example (close-up of DJ), the last line about her captivating presence and the power of music I guess should give the video a "vibe" (compared to prescriptively describing the video). I wonder how the result changes if you leave it out?
Cynically I think that it's a leading statement there for the reader rather than the model. Like now that you mention it, her presence _is_ captivating! Wow!
Now, examples of image or video generation models showing off how great they are should be stickman drawings or stickman videos. As far as I know, no model has been able to do that properly yet. If a model can do it well, it will be a huge breakthrough.
Another point to consider is that if my generative video system isn't good at maintaining world consistency, then doing a slow-motion video gives the illusion of a long video while being able to maintain a smaller "world context".
The website is horrible on resources.
Although I tried that and it has the same issue all of them seem to have for me: if you are familiar with the face but they are not really famous then the features in the video are never close enough to be able to recognize the same person.
50 cents per video. Far more when accounting for a cherrypick rate.
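The cherrypick effect on cost is easy to work out (the rates below are illustrative assumptions, not measured figures):

```python
# Illustrative assumption: $0.50 per generation attempt, and only a
# fraction `keep_rate` of attempts are good enough to use.
cost_per_generation = 0.50

def effective_cost(keep_rate: float) -> float:
    """Expected spend per usable video at a given keep (cherrypick) rate."""
    return cost_per_generation / keep_rate

print(effective_cost(1.0))   # 0.5  -- every attempt is usable
print(effective_cost(0.1))   # 5.0  -- keep 1 in 10
print(effective_cost(0.02))  # 25.0 -- keep 1 in 50
```

So a demo reel built from a 1-in-50 keep rate implies tens of dollars per shown clip, not fifty cents.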
...and that's when I realized how much cherry-picking we have in these "demos". These demos are about deceiving you into thinking the model is much better than it actually is.
This rewards not making the models available: people then compare their extrapolation of the demo images against the actual outputs of competing models. This can trick people into thinking Google is winning the game.
Google won.
Of course, it's orders of magnitude cheaper than making a video or an animation yourself.
Namely, it takes so few neurons to get a picture into our heads.
I guess end-of-the-world scenarios may lead us to create that superintelligence with a gigantic, ultra-performant artificial "brain".
Seriously, it sounds like something kids can have fun with, or bored deskworkers. But a serious use case, at the current state of the art? I doubt it.
Humanity has its ways of objecting to accelerationism.
When did you last ask people for directions, or other important questions, instead of Google?
You can wax poetic about wanting "the human touch", but at the end of the day, the market speaks -- people will just prefer everything automated. Including their partners, after your boyfriend can remember every little detail about you, notice everything including your pupils dilating, know exactly how you like it, when you like it, never get angry unless it's to spice things up, and has been trained on 1000 other partners, how could you go back? When robots can raise children better than parents, with patience and discipline and teaching them with individual attention, know 1000 ways to mold their behavior and achieve healthier outcomes. Everything people do is being commodified as we speak. Soon it will be humor, entertainment, nursing, etc. Then personal relations.
Just extrapolate a decade or three into the future. Best case scenario: if we nail alignment, we build a zoo for ourselves where we have zero power and are treated like animals who have sex and eat and fart all day long. No one will care about whatever you have to offer, because everyone will be surrounded by layers of bots from the time they are born.
PS: anything you write on HN can already have been written by AI, pretty soon you may as well quit producing any content at all. No one will care whether you wrote it.
People theoretically would care, but the internet has already set up producing things to be pseudo-anonymous, so we have forgotten the value of actually having a human being behind content. That's why AI is so successful, and it's a damn shame.
It's so we can, in a fraction of those cases, develop real relationships with the people behind the content! The whole point of sharing is to develop connections with real people. If all you want to do is consume independently of that, you are effectively a soulless machine.
If there's one thing that connects all media made in human history, it's that humans find humans interesting. No technology (like literally no technology ever) will change that.
Source? My experience has been that people at most might be “ok” at picking up completely generic output, and outright terrible at identifying anything with a modicum of effort or chance placed into it.
Bold of you to assume any effort is placed into content when the entire point of using AI in the first place is to avoid this.
I mean, I've seen people using it in that way, yes. These are normally the same people I saw copying and pasting the first Google result they found for any search as an answer to their customers/co-workers, etc. Or to whom you would say "Do not send this to the customer, this is my explanation to you; use your own words, this is just a high-level blah blah", and then five minutes later you see your response go out to a customer word for word, with zero modification or review for appropriateness.
I equally see a very different kind of usage, where it's just another tool for speeding up portions of work, not for producing a work in its totality.
Like sadly, yes, I've now seen sales people with random Chrome extensions that just attach AI to everything and let it do whatever the fuck it wants, which makes me want to cry... but again, these people were already effectively doing that; they are just doing it faster than ever.
If a fish could write a novel, would you find what it wrote interesting, or would it seem like a fish wrote it? Humans absorb information relative to the human experience, and without living a human existence the information will feel fuzzy or uncanny. AI can approximate that but can't live it for real. Since it is a derivative of an information set, it can never truly express the full resolution of its primary source.
What would be the point of paying for AI content if nobody did anything to produce it? Just take that shit!
Yeah in some broad sense, the same as we've always had: back in the 2010s it could have been generated by a Markov chain, after all. The only difference now is that the average quality of these LLMs is much, much higher. But the distribution of their responses is still not on par with what I'd consider a good response, and so I hunt out real people to listen to. This is especially important because LLMs are still not capable of doing what I care most about: giving me novel data and insights about the real world, coming from the day to day lived experience of people like me.
HN might die but real people will still write blogs, and real people will seek them out for so long as humans are still economically relevant.
Asking for directions is a bad example, because it takes very little time for both humans and machines to give you directions. Therefore it would be highly unusual for anyone to pay for this service (LOL)
Actually, typically human objection only slows it down and often it becomes a fringe movement, while the masses continue to consume the lowest common denominator. Take the revival of the flip phone, typewriter, etc. Sadly, technology marches on and life gets worse.
If you use 'proximity to wild nature', 'clean air', 'more space', then life has gotten worse.
But people don't choose between these two. They choose between alternatives that give them analgesics in an already corrupt society, creating a series of descending local maxima.
TikTok is one of the easiest platforms to create for, and look at how much human attention it has sucked up.
The attention/dopamine magnet is accelerating its transformation into a gravitational singularity for human minds.
I might be wrong, but AI videos are on the same path as AI generated images. Cool for the first year, then “ah ok, zero effort content”.
"The Human Security System is structured by delusion. What's being protected there is not some real thing that is mankind, it's the structure of illusory identity. Just as at the more micro level it's not that humans as an organism are being threatened by robots, it's rather that your self-comprehension as an organism becomes something that can't be maintained beyond a certain threshold of ambient networked intelligence." [0]
See also my research project on the core thesis of Accelerationism that capitalism is AI. [1]
[0] https://syntheticzero.net/2017/06/19/the-only-thing-i-would-...
Thanks for sharing that video and post!
One way to think about this stuff is to imagine that you are 14 and starting to create videos, art, music, etc in order to build a platform online. Maybe you dream of having 7 channels at the same time for your sundry hobbies and building audiences.
For that 14 year old, these tools are available everywhere by default and are a step function above what the prior generation had. If you imagine these tools improving even faster in usability and capability than prior generations' tools did …
If you are of a certain age you'll remember how we were harangued endlessly about "remix culture" and how mp3s were enabling us to steal creativity without making an effort at being creative ourselves, about how photobashing in Photoshop (pirated cracked version anyway) was not real art, etc.
And yet, halfway through the linked video, the speaker, who has misgivings, was laughing out loud at the inventiveness of the generated replies and I was reminded that someone once said that one true IQ test is the ability to make other humans laugh.
Inventive is one way of putting it, but I think he was laughing at how bizarre or out-of-character the responses would be if he used them. Like the AI suggesting that he post "it is indeed a beverage that would make you have a hard time finding a toilet bowl that can hold all of that liquid" as if those were his own words.
If this is "just another tool" then my question is: does the output of someone who has used this tool for one thousand hours display a meaningful difference in quality to someone who just picked it up?
I have not seen any evidence that it does.
Another idea: What the pro generative AI crowd doesn't seem to understand is that good art is not about _execution_; it's about _making deliberate choices_. While a master painter or guitarist may indeed pull off incredible technical feats, their execution is not the art in and of itself; it widens the range of choices they can make. The more generative AI steps into the role of making these choices, the more useless, ironically, it becomes.
And lastly: I've never met anyone who has spent significant time creating art react to generative AI as anything more than a toy.
Yes. A thousand hours gives you a much greater understanding of what it's capable of, its constraints, and how best to take advantage of them.
By comparison, consider photography: it is ostensibly only a few controls and a button, but getting quality results requires the user to understand the language of the medium.
> What the pro generative AI crowd doesn't seem to understand is that good art is not about _execution_ it's about _making deliberate choices_. While a master painter or guitarist may indeed pull off incredible technical feats, their execution is not the art in and of itself, it is widening the amount of choices they can make.
This is often not true, as evidenced by the pre-existing fields of generative art and evolutionary art. It's also a pretty reductive definition of art: viewers can often find art in something with no intentional artistry behind it.
> I've never met anyone who has spent significant time creating art react to generative AI as anything more than a toy.
It's a big world out there, and you haven't met everyone ;) Just this last week, I went to two art exhibitions in Paris that involved generative AI as part of the artwork; here's one of the pieces: https://www.muhka.be/en/exhibitions/agnieszka-polska-flowers...
The exhibition you shared is rather beautiful. Thank you for the link!
We were told that what we were doing didn't require as much skill as whatever the previous generation were doing to sample music and make new tracks. In hindsight, of course you find it easy to cite the prominent successes that you know from the generation. That's arguing from survivorship bias and availability bias.
But those successes were never the point: the publishers and artists were pissed off at the tens of thousands of teenagers remixing stuff for their own enjoyment and forming small yet numerous communities and subcultures globally over the net. Many of us never became famous so you can cite our fame as proof of skill but we made money hosting parties at the local raves with beats we remixed together ad hoc and that others enjoyed.
> The artists creating those projects clearly have hundreds if not thousands of hours of practice differentiating them from someone who just started pasting MP3s together in a DAW yesterday.
But they all began as I did, by being someone who "just started pasting MP3s together" in my bedroom. Darude, Skrillex, Burial, and all the others simply kept doing it longer than those who decided they had to get an office job instead.
The teenagers today are in exactly the same position, except with vastly more powerful tools and the entire corpus of human creativity free to download, whether in the public domain or not.
I guess in response to your "required skill and talent", I'm saying that skill is something that's developed within the context of the technology a generation has available. But it is always developed, then viewed as such in hindsight.
Yes, absolutely. Not necessarily in apparent execution without knowledge of intent (though often there, too), but in the scope of meaningful choices that they can make and reflect with the tools, yes.
This is probably even more pronounced with use of open models than the exclusively hosted ones, because more choices and controls are exposed to the user (with the right toolchain) than with most exclusively-hosted models.
Maybe it's just me who couldn't find it (the website barely works at all on Firefox for iOS)...
> VideoFX isn't available in your country yet.
Think about it: almost everyone I know rarely clicks on ads or buys from ads anymore. On the other hand, a lot of people, including myself, look into buying something advertised implicitly or explicitly by content creators we follow, say a router recommended by LinusTechTips. A lot of brands have started moving their ad spending to influencers too.
Google doesn't have much control over these influencers. But if they can build good video generation models, they can control this ad space too, without a human in the loop.
1) AI is a massive wave right now and everyone's afraid that they're going to miss it, and that it will change the world. They're not obviously wrong!
2) AI is showing real results in some places. Maybe a lot of us are numb to what gen AI can do by now, but the fact that it can generate the videos in this post is actually astounding! 10 years ago it would have been borderline unbelievable. Of course they want to keep investing in that.
This is a typical tech echo chamber. There is a significant number of people who make direct purchases through ads.
> But if they can get good video generations models, they can control this ad space too without having human in the loop.
Looks like it's based on a misguided assumption. Format might have a significant impact on reach, but the deciding factor is trust in the reviewer. The video format itself does not guarantee a decent CTR/CVR. It's true that ad companies find this space lucrative, but they're smart enough to acknowledge this complexity.
Even if it's not, TV ads, newspaper ads, magazine ads, billboards, etc. get exactly zero clickthroughs, and yet people still bought (and continue to buy) them. Why do we act like impressions are hunky-dory for every other medium, but worthless for web ads?
I remember saying this to a google VP fifteen years ago. Somehow people are still clicking on ads today.
Most people have claimed not to be influenced by ads since long before networked computers were a major medium for delivering them.