Transforming Our Brainies: AI Image Manipulation Experiment - 2 / 5

Part 2: Testing Flux, cgdream.ai & Think Diffusion

In Part 1, we analyzed Manus’ recommendations, designed the experiment, and prepared the comparison.

In this Part 2 of our Brainy transformation series, we put Flux, cgdream.ai, and Think Diffusion to the test. These tools specialize in AI-powered image-to-image generation, promising to refine Brainy’s design while maintaining his core identity.

Our approach:

  1. Extract the key characteristics of Agatha Christie’s facial features.
  2. Upload the current version of Text Model Brainy.
  3. Apply facial features, detail enhancements, and corrections based on Manus’ feedback.
  4. Quickly compare the outputs for clarity, creativity, and brand alignment.

Which tool will deliver the most professional and polished result? Let’s find out!

Analysis

We used the website Reverse Prompt to extract the features from Agatha Christie’s picture: arafed black and white photo of a woman with her hand on her chin, meret elisabeth oppenheim, catalogue photo, georgia o'keefe, portrait image, inspired by Mildred Eldridge, pilar gogar, alice russell glenny, promotional portrait, margaret watts hughes, by Mildred Eldridge, agnes pelton

This was not what we were looking for. It did not describe the person’s facial features.

We tried another website, Flux AI: A captivating portrait of a mature woman with elegantly styled gray curls, seated thoughtfully with her chin resting on her hand. The setting is minimalistic, with a soft, gradient background that transitions from dark to light, emphasizing her facial features. The lighting is dramatic yet soft, highlighting the contours of her face and casting subtle shadows that add depth to her expression of contemplation and wisdom. The composition follows the rule of thirds, positioning her off-center to create a sense of balance and focus. The woman's attire is a simple black top, which contrasts beautifully with her luminous skin tone. The image conveys a timeless quality, reminiscent of classic black and white photography from the mid-20th century. It evokes a mood of introspection and grace, showcasing feminine strength and resilience. The overall style draws inspiration from portraitists like Yousuf Karsh and Richard Avedon, noted for their ability to capture the essence of their subjects through striking simplicity and emotive lighting techniques.

We also tried Flux AI Lab: black and white photograph, elegant older woman, close-up, white curly hair, wearing a dark suit, leaning on hand, thoughtful expression, soft lighting, vintage style, studio setting, 1930s fashion, monochrome tones, high contrast, classic portrait, fine details, timeless beauty, retro vibes, photographer: unknown, artistic composition, historical feel, sophisticated atmosphere

And Image Prompt: A middle-aged woman, likely Caucasian, is the central subject. She is positioned slightly off-center to the left of the frame, and her head is tilted slightly forward. Her gaze is directed slightly downward and away from the viewer. Her expression is contemplative, almost pensive, with a slight downturn of the mouth. Her hair is styled in a classic 1930s or 1940s fashion, with soft waves and curls. She's wearing a dark-colored, likely black, jacket or dress that extends to the mid-chest area, with a delicate chain necklace visible around her neck. Her posture is relaxed but not slouching. She appears to have a moderate build and is resting her right hand on her left cheek. The image is in black and white, with strong contrast and dramatic shadows, which create a portrait style. The lighting suggests a studio setting, with light positioned to highlight the face's features and create depth in the image. The overall atmosphere is one of quiet contemplation and introspection. The composition focuses on her face and upper body, creating a sense of intimacy and connection with the subject.

None of these prompts were good for what we wanted to do. They concentrated too much on composition, lighting, and clothing, but they did not describe Agatha Christie’s facial features. The only feature described was her hair; there was nothing about her eyes, her nose, or her lips. So we wrote the description ourselves and turned it into a prompt: Add the following human features: an elegant British woman in her sixties, deep blue eyes in an olive shape, a large straight nose, fine rose lips, medium cheekbones, curly white hair, large white eyebrows, and small wrinkles in the corners of her eyes. Add an old-style book into the hand of the character. Remove the feather.
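For readers who want to keep such a prompt reproducible, here is a minimal Python sketch of how the hand-written prompt could be assembled from a list of features. It is only an illustration: the names FEATURES, join_features, and build_edit_prompt are hypothetical, and none of the tools tested in this article require or expose such code.

```python
# Facial features written by hand, since the reverse-prompt tools skipped them.
# All names in this sketch are hypothetical helpers, not part of any tool.
FEATURES = [
    "an elegant British woman in her sixties",
    "deep blue eyes in an olive shape",
    "a large straight nose",
    "fine rose lips",
    "medium cheekbones",
    "curly white hair",
    "large white eyebrows",
    "small wrinkles in the corners of her eyes",
]


def join_features(features):
    """Join features as 'a, b, and c', the way the prose prompt lists them."""
    if not features:
        return ""
    if len(features) == 1:
        return features[0]
    if len(features) == 2:
        return features[0] + " and " + features[1]
    return ", ".join(features[:-1]) + ", and " + features[-1]


def build_edit_prompt(features, instructions):
    """Compose an edit-style prompt: the feature sentence, then extra instructions."""
    parts = ["Add the following human features: " + join_features(features) + "."]
    parts.extend(instructions)
    return " ".join(parts)


prompt = build_edit_prompt(
    FEATURES,
    [
        "Add an old-style book into the hand of the character.",
        "Remove the feather.",
    ],
)
print(prompt)
```

Keeping the features in a list makes it easy to change a single trait, for example adding "two" in front of the eyes, without retyping the whole prompt.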

Flux Image to Image

Test 1

We uploaded the original Text Model Brainy picture to Flux Image to Image and used the prompt above, with the Reference Image Strength parameter set to 0.90 and Raw Mode turned on.

Result from Flux Image to Image

The eye looks good, so does the (half-)nose, and there is a book. So some of the instructions made it through.

A few problems:

  • The style has changed from 3D model to cartoon.
  • Because the original Brainy only had one eye, the generated one also has one.
  • The feather was not removed.
  • We no longer see the feet.

We changed the Reference Image Strength to 0.5 and changed the prompt to: Add an old-style book open into the hand of the character. Remove the feather. Make all features symmetrical. Add the following human features: an elegant British woman in her sixties, two deep blue eyes in an olive shape, a large straight nose, fine rose lips, medium cheekbones, curly white hair, large white eyebrows, and small wrinkles in the corners of her eyes.

Unfortunately, we ran out of free credits to continue testing the tool.

Flux Conclusion

We cannot reach any conclusion from a single test. The starting number of free credits is not enough to evaluate the platform.

cgdream.ai

After creating an account on cgdream, we saw a very different interface: several parameters on the right side of the screen, and five slots for uploading elements on the left side:

  • Style
  • Structure
  • Reference
  • 3D model
  • Character

Test 1

We used Brainy as Character and Style, kept the default strength at 1, changed the model to SDXL (Juggernaut XL), raised the prompt guidance to 7, and activated the Creative mode. The previous prompt gave us this result:

First result from cgdream

So the prompt does not work as a modification of the image; it must be a full prompt that describes all of the characteristics. And the strength of 1 should be lowered to get a stronger similarity with the original image, which is counterintuitive.

Test 2

We moved the Brainy image from Character to Image, with a default strength value of 0.5. We kept Brainy as a Style reference. And we tried with this full prompt: A 3D mascot for a B2B brand in the shape of Brain. The brain has the following human features: an elegant British woman in her sixties, two deep blue eyes in an olive shape, a large straight nose, fine rose lips, medium cheekbones, curly white hair, large white eyebrows, and small wrinkles in the corners of her eyes. The Brain is holding an old-style book in its hand. It has thin arms and legs. The mascot is on a white background.
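Because cgdream needs a full description rather than an edit instruction, the prompt can be thought of as three parts: a subject, a list of features, and scene details. The following is a minimal sketch under that assumption. The MascotPrompt class and its render method are hypothetical names used for illustration and are not part of cgdream.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MascotPrompt:
    """Hypothetical container for the three parts of a full prompt."""
    subject: str
    features: List[str]
    details: List[str] = field(default_factory=list)

    def render(self) -> str:
        # This sketch assumes at least two features so that ', and ' reads naturally.
        if len(self.features) < 2:
            raise ValueError("provide at least two features")
        listed = ", ".join(self.features[:-1]) + ", and " + self.features[-1]
        sentences = [
            self.subject,
            "The brain has the following human features: " + listed + ".",
        ]
        sentences.extend(self.details)
        return " ".join(sentences)


full_prompt = MascotPrompt(
    subject="A 3D mascot for a B2B brand in the shape of Brain.",
    features=[
        "an elegant British woman in her sixties",
        "two deep blue eyes in an olive shape",
        "a large straight nose",
        "fine rose lips",
        "medium cheekbones",
        "curly white hair",
        "large white eyebrows",
        "small wrinkles in the corners of her eyes",
    ],
    details=[
        "The Brain is holding an old-style book in its hand.",
        "It has thin arms and legs.",
        "The mascot is on a white background.",
    ],
).render()
print(full_prompt)
```

Unlike an edit-style prompt, nothing here says "add" or "remove": every element we want in the picture is stated as a fact about the scene.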

Second result from cgdream

The results were better:

  • The style is consistent.
  • We have added human features (blue eyes, thin lips, a nose, and curly hair integrated into the brain).

However, we did not get all the elements we wanted:

  • The book is missing.
  • The glasses are unwanted.

Test 3

So we increased the prompt guidance to 9.

Third result from cgdream

Now we have a book, but we have lost the Brain effect. It is now a big head. The eyes are also too big, and so is the mouth. We also noticed that our prompt had been rewritten (likely due to the Creative flag). The prompt actually used was: The image features a whimsical 3D mascot designed for a B2B brand, taking the form of a brain with anthropomorphic characteristics. This brain is personified as an elegant British woman in her sixties, showcasing a sophisticated demeanor through her deep blue olive-shaped eyes, large straight nose, and fine rose lips. Her medium cheekbones and curly white hair add to her distinguished appearance, complemented by large white eyebrows and subtle wrinkles at the corners of her eyes, suggesting wisdom and experience.

In her right hand, she holds an old-style book, conveying a sense of knowledge and tradition, while her thin arms and legs give her a light, approachable look. The composition is set against a clean white background, enhancing the character’s vibrant colors and details, allowing the viewer to focus on the mascot’s charming features. The lighting is soft and even, illuminating the brain’s surface texture and the delicate features of the character, creating an inviting and friendly atmosphere that aligns well with the brand’s professional yet approachable identity.

Test 4

For the next trial, we removed the Creative flag to confirm that the tool would then use our original prompt.

Fourth result from cgdream

We could see the benefit of the Creative mode. The human features were not as good as in the previous two tests. We had to turn it back on.

Test 5

As a last attempt (we were running out of free credits), we simplified everything. We turned Creative back on and used the following prompt: A 3D mascot for a B2B brand in the shape of Brain. The brain has the human features of Agatha Christie in her sixties. The brain is holding an old-style book in its hand. It has thin arms and legs. The mascot is on a white background.

Fifth result from cgdream

The result looked more like a Lovecraftian creature than Agatha Christie. The model did need the key human features spelled out in the prompt.

cgdream Conclusion

The tool gave us enough free credits to test and learn, and we got interesting results. With a few more trials, we would probably be able to reach the result we wanted. We selected image 3 as the candidate for the final competition, as it was clearly female, had enough human features to be recognized, and had the book.
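With only a handful of free credits, it helps to write down what changed between runs. The sketch below records the five cgdream tests as plain data, using only what is stated in this article; anything not restated for a given test is left as None rather than guessed. The Run class and the candidates function are hypothetical names, and the filter checks only one of our three selection criteria (the presence of the book).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Run:
    """One cgdream test. None means the value is not restated in the article."""
    test: int
    prompt: str                      # short label, not the full prompt text
    guidance: Optional[int] = None
    creative: Optional[bool] = None
    has_book: Optional[bool] = None
    note: str = ""


RUNS = [
    Run(1, "edit-style", guidance=7, creative=True,
        note="prompt not applied as a modification"),
    Run(2, "full", has_book=False,
        note="consistent style, unwanted glasses"),
    Run(3, "full", guidance=9, creative=True, has_book=True,
        note="big head, brain effect lost"),
    Run(4, "full", creative=False,
        note="weaker human features"),
    Run(5, "simplified", creative=True,
        note="did not resemble Agatha Christie"),
]


def candidates(runs: List[Run]) -> List[int]:
    """Return the test numbers whose result is confirmed to include the book."""
    return [run.test for run in runs if run.has_book is True]


print(candidates(RUNS))
```

Run as written, this prints [3], which matches the image we selected. A fuller version would also record the other two criteria, which we judged by eye.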

Think Diffusion

After logging in, Think Diffusion creates a virtual machine running A1111 (the AUTOMATIC1111 Stable Diffusion web UI), which we could activate for 15 minutes. We needed to make the most of that time to understand the tool, run it, and compare results.

We used the img2img option with all default parameters and uploaded Text Brainy as a reference. We made a minor change to the creative prompt from cgdream that had given us our best result, replacing “whimsical” with “serious”: The image features a serious 3D mascot designed for a B2B brand, taking the form of a brain with anthropomorphic characteristics. This brain is personified as an elegant British woman in her sixties, showcasing a sophisticated demeanor through her deep blue olive-shaped eyes, large straight nose, and fine rose lips. Her medium cheekbones and curly white hair add to her distinguished appearance, complemented by large white eyebrows and subtle wrinkles at the corners of her eyes, suggesting wisdom and experience.

In her right hand, she holds an old-style book, conveying a sense of knowledge and tradition, while her thin arms and legs give her a light, approachable look. The composition is set against a clean white background, enhancing the character’s vibrant colors and details, allowing the viewer to focus on the mascot’s charming features. The lighting is soft and even, illuminating the brain’s surface texture and the delicate features of the character, creating an inviting and friendly atmosphere that aligns well with the brand’s professional yet approachable identity.

After three attempts, no image was generated, and there was no error message, UI indication, or explanation of the issue.

We tried different options and toggles for 10 minutes and then stopped the virtual machine.

Think Diffusion Conclusion

This was not a user-friendly tool. It was highly technical and offered many low-level parameters. It is unfortunate that the default parameters did not produce any result. Worst of all, we had no clue what was happening or how to fix the issue.

Part 2 Conclusion

After testing these three tools, only cgdream.ai allowed us to move forward in our quest to improve our Brainies. Flux did not give us enough credits, and Think Diffusion proved to be a very crude toolbox for experts who have many hours to invest in understanding the parameters.

Our cgdream candidate represented clear progress from our original design.

Second result from cgdream

Next, we will explore image manipulation tools (Fotor, PhotoEditor.ai, Phot.ai). Can any of these tools help us further improve our Brainies?
