BioHacker News | Experiment with Gemini 2.0 Flash native image generation

▲Experiment with Gemini 2.0 Flash native image generation(developers.googleblog.com)

63 points by meetpateltech 9 hours ago | 4 comments

▲Filligree 1 hour ago

It just tells me 'content not permitted'.

For context, I was attempting to put a cup of hot chocolate into the hands of an anime character.

▲nmfisher 23 minutes ago

I had exactly the same issue. It seems practically useless for humanoid illustrations/characters.

▲megadata 1 hour ago

I think its definition of what counts as a "person" is way too broad. I've had similar problems, it's pretty shitty.

▲Filligree 1 hour ago

It allowed me to do a horse, but-

https://usercontent.irccloud-cdn.com/file/DkQ5SSdT/image.png

Pixelwave generates much better horse-people, in the event I wanted that. Admittedly it doesn't have an edit function; unfortunately the illustrations I want to edit all have people in them.

Realistic photos work better, though it still doesn't beat Flux.1-dev: https://usercontent.irccloud-cdn.com/file/ZsouXNpn/image.png

▲ilaksh 2 hours ago

Ever since OpenAI showed (but did not release) this type of multimodal output with 4o, I have been waiting for this to be available to the general public.

It seems like really combining visuals at the level of generation capability means language understanding is fully grounded in a richer world model.

I am hoping for a step up in real-world common-sense intelligence areas like those covered by SimpleBench. These are static images, though, so there might still be room for improvement as far as physics understanding goes.

Also, if they can get it to the point of being really accurate (probably with larger models), this unlocks whole industries in terms of being able to do useful work.

▲londons_explore 2 hours ago

If it can do diagrams, charts, etc., with any kind of accuracy, it would have far more impact.

E.g. "I suggest moving the boiler from point A to B on the below map of the factory to reduce piping costs and heat loss"

▲ilaksh 1 hour ago

It's on the right track, but lacks precision.

https://aistudio.google.com/app/prompts?state=%7B%22ids%22:%...

Previous try with some interesting introspection:

https://drive.google.com/file/d/1SCBbpDo1dAJBAz7bFABk4yBZBuz..., https://aistudio.google.com/app/prompts?state=%7B%22ids%22:%...

▲jcuenod 2 hours ago

I was really hoping that there would be more character consistency, given the fact they mention it in the blog. It also doesn't seem to reliably follow styles like "watercolor illustration" or "line and wash".
