I know people have been scared by new technology for as long as there's been technology, but I've never fallen into that camp until now. I have to admit, this really does frighten me.
What’s wild to me is how Yann LeCun doesn’t seem to see this as an issue at all. Many other leading researchers (Yoshua Bengio, Geoffrey Hinton, Frank Hutter, etc.) signed that letter on the threats of AI and LeCun just posts on Twitter and talks about how we’ll just “not build” potentially harmful AI. Really makes me lose trust in anything else he says.
There with you. This is really worrying to me. This technology is advancing way faster than we're adjusting to it. I haven't even gotten over how amazing GPT-3.5 is, but most people already seem to be taking it for granted. We didn't have anything even close to this just a few years prior.
To make that statement a little more accurate, I'm afraid of the humans that will abuse this technology and of society's ability to adapt to it. There are some amazingly cool things that can come about from this: all the small indie creators that lack the connections and project management skills to make their ambitions come to life will be able to achieve their vision. That's really cool and I'm excited for that, but my excitement is smashed from knowing all the bad that will come with this.
Honestly, let's make it mainstream.
Get it to a point where it's more profitable to mass produce AI porn than exploit young women from god knows where.
This is so much better than all text-to-video models currently available. I'm looking forward to reading the paper but I'm afraid they won't say much about how they did this.
Even if the examples are cherry picked, this is mind blowing!
I was thinking exactly this but with the Bible. Not because I like the Bible but because I'd love to see how AI interprets one of the most important books in human history.
But yeah, the Silmarillion is basically a Bible from another universe.
I wonder if in the 1800s people saw the first photograph and thought… “well, that’s the end of painters.” Others probably said “look! it’s so shitty it can’t even reproduce colors!!!”.
What it was the end of was talentless painters who were just copying what they saw. Painting stopped being for service and started being for art. That is where software development is going.
I have worked with hundreds of software developers in the last 20 years, half of them were copy pasters who got into software because they tricked people into thinking it was magic. In the future we will still code, just don’t bother with the thing the Prompt Engineer can do in 5 seconds.
What it was the end of was talentless painters who were just copying what they saw. Painting stopped being for service and started being for art. That is where software development is going.
I think a better way of saying this is people who were just doing it for a job, not because of a lot of talent or passion for painting.
But doing something just because it is a job is what a lot of people have to do to survive. Not everyone can have a profession that they love and have a passion for.
That's where the problem comes in when it comes to generative AI.
And then the problem here is capitalism and NOT AI art. The capitalists are ALWAYS looking for ways to not pay us. If it wasn't AI art, it was always going to be something else.
It was exactly the same as with AI art. The same histrionics about the end of art and the dangers to society. It's really embarrassing how unoriginal all this is.
As the photographic industry was the refuge of every would-be painter, every painter too ill-endowed or too lazy to complete his studies, this universal infatuation bore not only the mark of a blindness, an imbecility, but had also the air of a vengeance. I do not believe, or at least I do not wish to believe, in the absolute success of such a brutish conspiracy, in which, as in all others, one finds both fools and knaves; but I am convinced that the ill-applied developments of photography, like all other purely material developments of progress, have contributed much to the impoverishment of the French artistic genius, which is already so scarce.
What it was the end of was talentless painters who were just copying what they saw. Painting stopped being for service and started being for art.
This attitude is not new, either. He addressed it thus:
I know very well that some people will retort, “The disease which you have just been diagnosing is a disease of imbeciles. What man worthy of the name of artist, and what true connoisseur, has ever confused art with industry?” I know it; and yet I will ask them in my turn if they believe in the contagion of good and evil, in the action of the mass on individuals, and in the involuntary, forced obedience of the individual to the mass.
The hardest part of coding is managing the project, not writing the content of one function. By the time LLMs can do that it's not just programming jobs that will be obsolete, it will be all office jobs.
This is still so bizarre to me. I've worked on 3D rendering engines trying to create realistic lighting and even the most advanced 3D games are pretty artificial. And now all of a sudden this stuff is just BAM super realistic. Not just that, but as a game designer you could create an entire game by writing text and some logic.
In my experience as a game designer, the code that LLMs spit out is pretty shit. It won't even compile half the time, and when it does, it won't do what you want without significant changes.
The correct usage of LLMs in coding imo is for a single use case at a time, building up to what you need from scratch. It requires several skills: talking to the AI so that it gives you what you want, knowing how to build up to it, reading the code it spits out so that you know when it goes south, and actually knowing how to build the bigger picture software from little pieces. But if you are an intermediate dev who is stuck on something, it is a great help.
That or for rubber ducky debugging, it's also great at that.
Keep in mind that this isn't creating 3D volumes at all. While immensely impressive, the thing being created by this architecture is a series of 2D frames.
Lol you don't know how cruel that is. For decades programmers have devoted their passion to creating hyperrealistic games and 3D graphics in general, and now poof it's here like with a magic wand and people say "yeah well you should have made your 3D engine look like the real world, not to look like shit" :D
Welcome to the club my friend... Expert after expert is having this experience as AI develops in the past couple years and we discover that the job can be automated way more than we thought.
First it was the customer service chat agents. Then it was the writers. Then it was the programmers. Then it was the graphic design artists. Now it's the animators.
Another programmer here. The bottleneck in most jobs isn't in getting boilerplate out, which is where AI excels. It's in that first and/or last 10-20%, alongside dictating what patterns are suitable for your problem, what proprietary tooling you'll need to use, what APIs you're hitting and what has changed in recent weeks/months.
What AI is achieving is impressive, but as someone that works in AI, I think that we're seeing a two-fold problem: we're seeing a limit of what these models can accomplish with their training data, and we're seeing employers hedge their bets on weaker output with AI over specialist workers.
The former is a great problem, because this tooling could be adjusted to make workers' lives far easier/faster, in the same way that many tools have done so already. The latter is a huge problem, as in many skilled worker industries we've seen waves of layoffs, and years of enshittification resulting in poorer products.
The latter is also where I think we'll see a huge change in culture. IMO, we'll see existing companies bet it all and die from supporting AI over people, and a new wave of companies focus on putting output of a certain standard to take on larger companies.
Writer here, absolutely not having this experience. Generative AI tools are bad at writing, but people generally have a pretty low bar for what they think is good enough.
These things are great if you care about tech demos and not quality of output. If you actually need the end result to be good though, you’re gonna be waiting a while.
Still waiting on the programmer part. In a nutshell, AI being, say, 90% perfect means you have 90% working code, i.e. 10% broken code. Images and video (but not sound) are way easier because human eyes kinda just suck. A couple of the videos they've released pass even at a pretty long glance. You only notice funny business once you look closer.
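To put rough numbers on the "90% perfect" point, here is a minimal sketch. It assumes, purely for illustration, that each generated line is correct independently with probability 0.9; neither the independence nor the 0.9 figure is measured, and the function name is made up for this example.

```python
def chance_all_correct(per_line_accuracy: float, num_lines: int) -> float:
    """Probability that every line is correct, assuming independent errors."""
    return per_line_accuracy ** num_lines


# Illustrative only: how quickly "90% per line" decays as the snippet grows.
for num_lines in (1, 10, 50):
    print(num_lines, round(chance_all_correct(0.9, num_lines), 3))
```

Under that assumption a ten-line snippet is fully correct only about a third of the time, which is one way to read "90% working code" as "broken code" in practice.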
I can't imagine that digital artists/animators have reason to worry. At the upper end, animated movies will simply get flashier, eating up all the productivity gains. In live action, more effects will be pure CGI. At the bottom end, we may see productions hiring VFX artists, just as naturally as they hire makeup artists now.
When something becomes cheaper, people buy more of it, until their demand is satisfied. With food, we are well past that point. I don't think we are anywhere near that point with visual effects.
It seems to me that AI won't completely replace jobs yet (though it may in 10-20 years), but it will reduce demand for workers through oversaturation and AI-boosted productivity. Moreover, AI will continue to improve. The work of a team of 30 people will be done by just 3.
Yeah. And it's not just how good the images look, it's also the creativity. Everyone tries to downplay this but I've read texts and seen those videos, and just from the prompts there is a "creative spark" there. It's not a very bright spark lol but it's there.
I should get into this stuff but I feel old lol. I imagine you could generate interesting levels with obstacles and riddles and "story beats" too.
If you read Japanese, it's really obvious the Tokyo one is AI; the signage largely makes no sense, has incorrect characters, has weird mixing of characters, etc.
There are tons of books. Afaik the main storyline was an extragalactic invasion by a super evil swarm. Also explains why the emperor built so many ships.
Except the gains technology and automation bring are rarely evenly distributed in society. Just compare how productive a worker is today and how much we make compared to 50 years ago.
1. Generally people want to work. People don't want to be exploited by capitalists in a capitalist society where they barely make rent. Humans are generally workers.
2. This isn't working less, this isn't productivity improvement. This is less humanity in art and all just so employers don't need to spend money on workers.
If the natural state of technology is that there aren't enough jobs to sustain an economy, then our economic system is broken, and trying to preserve obsolete jobs is just preserving the broken status quo that primarily benefits the rich. Over time I'm thinking more and more that instead of trying to prop up an outdated economic system we should just let it fail, and then we have no choice but to rethink it.
Oh yes yes I'm sure that we will totally rethink our economic systems that's absolutely what will happen and it will totally result in the utopia you're dreaming of. I'm sure that will happen I'm sure it's not just the ultra wealthy noting how they can make even more profit whilst everyone else suffers can't be that I'm sure the government will do something we all have faith in that we know it's obvious that will happen
The second one is easy as you don't need coherence between reflected and non-reflected stuff: Only the reflection is visible. The second one has lots of inconsistencies: It works kinda well if the reflected thing and reflection are close together in the image, it does tend to copy over uniformly-coloured tall lights, but OTOH it also invents completely new things.
Do people notice? Well, it depends. People do notice screen-space reflections being off in traditional rendering pipelines, not always, but it happens and those AI reflections are the same kind of "mostly there in most situations but let's cheap out to make it computationally feasible" type of deal: Ultimately processing information, tracking influence of one piece of data throughout the whole scene, comes with a minimum amount of required computational complexity and neither AI nor SSR do it.
The example videos are both impressive (insofar as they exist) and dreadful. Two-legged horses everywhere, lots of random half-human-half-horse hybrids, walls change materials constantly, etc.
It really feels like all this does is generate 60 DALL-E images per second and little else.
For the limitations visual AI tends to have, this is still better than what I've seen. Objects and subjects seem pretty stable from frame to frame, even if those objects are quite nightmarish.
This would work very well with a text adventure game, though. A lot of them are already set in fantasy worlds with cosmic horrors everywhere, so this would fit well to animate what's happening in the game.
I mean, it took a couple months for AI to mostly figure out that hand situation. Video is, I'd assume, a different beast, but I can't imagine it won't improve almost as fast.
It will get better, but in the meantime you just manually tell the AI to try again or adjust your prompt. I don't get the negativity about it not being perfect right off the bat. When the magic wand tool originally came out, it had tons of jagged edges. That didn't make it useless, it just meant it did a good chunk of the work for you and you just needed to manually get it the rest of the way there. With Stable Diffusion, if I get a bad hand I just inpaint and regenerate it until it's fixed. If you don't get the composition you want, just generate parts of the scene, combine them in an image editor, then have it use that as a base image to generate on top of.
They're showing you the raw output to show off the capabilities of the base model. In practice you would review the output and manually fix anything that's broken. Sure you'll get people too lazy to even do that, but non lazy people will be able to do really impressive things with this even in its current state.
If this goes well, future video compression might take a massive leap. Imagine downloading 2-hour movies with just a 20 KB file size because it's just a bunch of prompts under the hood.
And the largest ever decoder, since it'll need the whole model to work. I'm not particularly knowledgeable on AI but I'll assume this will occupy hundreds of gigabytes, correct me if I'm wrong there. In comparison, libdav1d, an AV1 decoder, weighs less than 2 MB.
Looks good but still has the AI hallmarks, rotating legs, f’ed up gait.. impressive though and it’s going to be wild to see what results from this latest pox on the tubes.
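The trade-off in the two comments above can be sketched as a break-even calculation. The 20 KB per movie and 2 MB decoder come from the thread; the 200 GB model size is just one reading of "hundreds of gigabytes", and the 2 GB size for a conventionally encoded movie is an assumption made up for this example.

```python
KB, MB, GB = 10**3, 10**6, 10**9

PROMPT_MOVIE_BYTES = 20 * KB    # from the thread: a movie stored as prompts
MODEL_DECODER_BYTES = 200 * GB  # assumption: "hundreds of gigabytes"
CLASSIC_MOVIE_BYTES = 2 * GB    # assumption: a conventionally encoded 2-hour movie
CLASSIC_DECODER_BYTES = 2 * MB  # from the thread: upper bound for libdav1d


def total_bytes(num_movies: int, per_movie: int, decoder: int) -> int:
    """One-time decoder download plus every movie downloaded."""
    return decoder + num_movies * per_movie


def break_even_movies() -> int:
    """Smallest number of movies at which the prompt approach is no larger overall."""
    extra_decoder = MODEL_DECODER_BYTES - CLASSIC_DECODER_BYTES
    saved_per_movie = CLASSIC_MOVIE_BYTES - PROMPT_MOVIE_BYTES
    return -(-extra_decoder // saved_per_movie)  # integer ceiling division


print(break_even_movies())
```

With these numbers the giant "decoder" only pays for itself after about a hundred movies, and the result moves linearly with whatever model size you plug in.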
The compute power it would take to do that in realtime at the framerates required for VR to be comfortable for two separate perspectives would be absolutely beyond insane. But at the rate hardware improves and the breakneck speed these AI models are developing maybe it's not as far off as I think.
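For a sense of scale, here is a back-of-the-envelope sketch of the frame budget. It assumes a 90 Hz headset and that each eye's view is generated as a separate full frame, one after another; both are illustrative assumptions, not figures from the thread.

```python
def frames_per_second_needed(refresh_hz: int, num_eyes: int = 2) -> int:
    """Total frames to generate each second when every eye gets its own frame."""
    return refresh_hz * num_eyes


def budget_ms_per_frame(refresh_hz: int, num_eyes: int = 2) -> float:
    """Milliseconds available per generated frame if frames are produced sequentially."""
    return 1000.0 / frames_per_second_needed(refresh_hz, num_eyes)


# Assumed 90 Hz refresh rate, two perspectives.
print(frames_per_second_needed(90), round(budget_ms_per_frame(90), 2))
```

That leaves only a few milliseconds per frame, which is the gap between offline video generation and anything usable in a headset.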
An AI generated VR world would be a single map environment generated in the same way you wait at loading screens when a game starts or you move to an entirely new map.
A text to 3D game asset AI wouldn't regenerate a new 3D world on every frame, in the same way you wouldn't ask AI to draw a picture of an orange cat and then ask it to draw another picture of an orange cat shifted one pixel to the left if you wanted the cat moved a pixel. The result would be a totally different picture.
I recently played a game where people found immortality and each individual just lived in their own personal virtual reality for thousands of years. It's kinda creepy seeing the recent advances in technology today lining up to that, minus the immortality part.
This is a base model. Just because it's 90% there on its own doesn't mean you can't improve on it by adding extra safeguards. For example you can get LLMs to be more accurate by asking another LLM to proofread the work. I am frankly amazed that the base models are this good to begin with. I was totally expecting to need way more safeguards from the get go, but we're getting a lot even without them. But I fully expect there to be AI tools that are specialized to identify where the base model messes up and then correct it.
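The "ask another LLM to proofread" idea can be sketched as a small generate-then-review loop. Everything here is hypothetical: `generate` and `review` stand in for calls to two models, and the toy stand-ins at the bottom only exist so the loop can be run without any real model.

```python
from typing import Callable, Optional


def generate_with_review(
    prompt: str,
    generate: Callable[[str], str],
    review: Callable[[str], Optional[str]],
    max_rounds: int = 3,
) -> str:
    """Generate a draft, then let a reviewer request fixes for up to max_rounds."""
    draft = generate(prompt)
    for _ in range(max_rounds):
        feedback = review(draft)  # None means the reviewer found nothing to fix
        if feedback is None:
            return draft
        draft = generate(
            f"{prompt}\n\nPrevious draft:\n{draft}\n\nFix this issue: {feedback}"
        )
    return draft  # best effort after max_rounds


# Toy stand-ins (not real models): the generator only gets it right when told to fix.
def toy_generate(prompt: str) -> str:
    return "2 + 2 = 4" if "Fix this issue" in prompt else "2 + 2 = 5"


def toy_review(draft: str) -> Optional[str]:
    return "the arithmetic is wrong" if "5" in draft else None


print(generate_with_review("What is 2 + 2?", toy_generate, toy_review))
```

The round limit is a design choice for the sketch: a reviewer that never approves should not keep the loop running forever.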
Sora is capable of creating “complex scenes with multiple characters, specific types of motion, and accurate details of the subject and background,” according to OpenAI’s introductory blog post.
The company also notes that the model can understand how objects “exist in the physical world,” as well as “accurately interpret props and generate compelling characters that express vibrant emotions.”
Many have some telltale signs of AI — like a suspiciously moving floor in a video of a museum — and OpenAI says the model “may struggle with accurately simulating the physics of a complex scene,” but the results are overall pretty impressive.
A couple of years ago, it was text-to-image generators like Midjourney that were at the forefront of models’ ability to turn words into images.
But recently, video has begun to improve at a remarkable pace: companies like Runway and Pika have shown impressive text-to-video models of their own, and Google’s Lumiere figures to be one of OpenAI’s primary competitors in this space, too.
It notes that the existing model might not accurately simulate the physics of a complex scene and may not properly interpret certain instances of cause and effect.
The original article contains 395 words, the summary contains 190 words. Saved 52%. I'm a bot and I'm open source!
The most obvious, immediate use is better CGI in shows and movies. Personally, I like to be entertained, so I consider myself as benefitting from this.
The less immediate use is AI with an understanding of time and space, real world physics, cause and effect,...