Using AI Generated Imaging To Mimic Popular Photography And The Lessons Learned

Vincent T.
High-Definition Pro
Dec 11, 2023

AI imaging has advanced considerably. It now includes generating “fake” photographs with trained image generation models.

In this technique, users generate content from natural language input, either by text or speech. The computer then creates the output based on that input.

The main feature of these models is text-to-image generation, and the results are now quite acceptable. Earlier versions were slow and inaccurate, but advances in machine learning have improved that.

Now we have models trained on larger datasets that can render realistic-looking images. The average person may not be able to tell real from fake if the images look convincing enough.

I am going to present some examples using the Stable Diffusion (SDXL) image generation model with K_DPMPP_2M sampling. This combination can generate hyper-realistic images, especially of people.

I will create prompts that will try to mimic the style and work used in popular photography. Most of the work will be done by the image generation engine, so this is also a form of generative art.
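For readers who want to try something similar, here is a minimal sketch of what a text-to-image run with SDXL might look like in Python. It assumes the Hugging Face diffusers library, the public stabilityai/stable-diffusion-xl-base-1.0 checkpoint and a CUDA GPU, none of which are stated in this article, so the images shown here may well have been produced with a different tool. DPMSolverMultistepScheduler is used as the closest diffusers counterpart to DPM++ 2M sampling, and the step count and guidance scale are illustrative values only.

```python
def build_generation_kwargs(prompt, steps=30, guidance_scale=7.0):
    """Collect the arguments for one text-to-image call (values are illustrative)."""
    return {
        "prompt": prompt,
        "num_inference_steps": steps,
        "guidance_scale": guidance_scale,
    }


def generate(prompt, out_path="output.png"):
    """Render one image with SDXL. Requires torch, diffusers and a CUDA GPU."""
    # Imported lazily so the helper above works without the heavy dependencies.
    import torch
    from diffusers import DPMSolverMultistepScheduler, StableDiffusionXLPipeline

    pipe = StableDiffusionXLPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0",  # assumed checkpoint
        torch_dtype=torch.float16,
    )
    # Swap in the multistep DPM-Solver++ scheduler (assumed match for K_DPMPP_2M).
    pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
    pipe = pipe.to("cuda")

    image = pipe(**build_generation_kwargs(prompt)).images[0]
    image.save(out_path)
    return out_path
```

Calling generate() with one of the prompts in this article would write a single image to disk. Results differ from run to run unless a fixed random seed is also supplied.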

Urban Fashion Style

Using some inspiration from photojournalist Steve McCurry, I generated an urban fashion style look. I focused my prompt on:

female fashion model photography by Steve McCurry, casual style, urban scenery, 150mm

I wanted the style and scenery to be based on Steve McCurry’s work, or at least on the SDXL imaging engine’s interpretation of it.

The result was what I expected it would be, at least in terms of style if not necessarily technique.

The model has beautiful, realistic and detailed features that are typical of some of McCurry’s portrait shots. There are soft shadows, with subtle lighting in the background that makes the subject pop in the foreground.

Figure 1. Urban style fashion

The scenery in the image is also something I would expect. Since McCurry is a photojournalist, the setting would be more likely on location in the streets and not in a studio.

Since I gave emphasis to urban scenery, I get this nice street backdrop. It looks like one of the locations in South Asia where McCurry would shoot.

Overall I thought that the AI was able to produce an acceptable image. I did not describe any further how I wanted the model styled, so the imaging engine just came up with a random style based on my inputs.

Commercial Photography

The next set of photos is for commercial fashion photography. I did not specify any particular photographer for this. I wanted to see how the AI would interpret a more generic description of the style.

The first look I want is a sporty and athletic style. I used the following words in my prompt:

colored photography, magazine, catalogue, athletic fashion, sporty, e-commerce

The result was not quite what I expected. I was expecting something closer to a fashion catalog or e-commerce style, but this iteration gave me something different.

I could always refine the prompt, but this is what was generated. I did not specify any brand or particular style, just a generic sporty and stylish look, yet the AI included a known brand.

To me, the result looks more like a commercial image for billboard and poster size advertising. It also has an artistic feel to its composition, like that of a painting or digital illustration.

Figure 2. Sporty and athletic

The second look I generated is a boudoir style. I could not use the word boudoir itself, since the AI engine bans certain words whose explicit nature could be open to abuse.

Instead I put in the following words:

female model, couture, bedroom, soft light, delicate features, Victoria’s Secret

I was thinking along the lines of a Victoria’s Secret lingerie shoot. Since a filter prevents certain words from being used in the prompt, I had to use the closest description of the image I wanted.

The result was rather surprising. Even without a very descriptive prompt, the model seems to have learned from its training set what couture in the bedroom looks like.

Figure 3. Boudoir intimate style

The model was styled and posed like in a typical boudoir shoot. The amount of detail looks amazing for AI, even if it is not a real photo.

The lighting looks as if the photo was shot on a DSLR camera with a softbox, and the image has a shallow depth of field (DoF).

The image is also spot on in its details, pose and styling. It met my expectations.

In the last image generated, I tried a product shoot. Here is what I put in my prompt:

green grapes, colored commercial photography, splashing in water, studio lighting for food, hard light, beauty dish, hyper detailed

Figure 4. Product shoot with grapes

The result exceeded my expectations. For commercial photography, this is as good as it gets. In real life, a production shot like this for a commercial is difficult and takes time for the photographer. With AI it can be rendered in seconds.

Stunning Headshot Portraits

In this series of photos I go for stunning portraits in black-and-white, typically head-and-shoulders shots in a character or celebrity style of portraiture.

The first photographer I wanted AI to mimic is Annie Leibovitz. She is well known for taking amazing portraits of public figures and celebrities. For that I put in the following:

portrait photography by Annie Leibovitz, youthful, beauty, big smile, soft light

The result is a head-and-shoulders black-and-white portrait with high contrast. I got a wonderful smile and expression from the subjects, typical of her style.

The one thing I was not too fond of was the neck and chest area of the female model. I assume that the AI was constructing the image based on features of this part of the human body, but somehow exaggerated it.

Figure 5–6. Black and White Portrait Annie Leibovitz style

Next photographer up is Herb Ritts. He is a great fashion and portrait photographer, known for simple but strong compositions.

I put in the following:

female fashion model, photography by Herb Ritts, portrait, smiling, detailed photograph, head and shoulders

I was not expecting the result generated. First of all, I am not quite sure this looks like something Herb Ritts would typically shoot. It is not something an expert in photography would say was shot in his style.

Next, the portrait style Ritts uses is mostly softer lighting in natural or ambient light, with little or no shadows. The lighting looked hard, like a beauty dish or silver reflector was used as a light modifier.

I am not complaining about the look of the model or composition. Since Ritts shot mostly celebrities and models, that is what the subject looked like. It is elegant and nearly perfect, but that is what makes it different from a typical Herb Ritts portrait.

Ritts’ style is more raw and definitely has an analog film look (e.g. visible grain). The AI-generated photo looked clearly digital, like an image shot on a smartphone camera and edited by a neural network processor.

Figure 7. Black and White Portrait Herb Ritts style

The portraits turned out well, but the former came closer to its photographer’s style than the latter did. The latter photo just looked too digital for a Herb Ritts style portrait.

Classic Editorials

In this set, I generate images in the style of classic editorial photographers. These are photos of elegant fashion and styling, like those in popular magazines.

First up is to mimic the style of Mario Testino. I specify the following in my prompt:

female fashion model photography by Mario Testino, resort, summer, 150mm, realistic photograph, hard light

Testino has an edgy style, but the generated results do not. They look more like a typical magazine editorial.

Is this what Mario Testino would shoot? Well, based on what the model learned about Testino’s work from its training data, the pose and style may be correct, but I do not think the AI understands his lighting and composition.

From what I have seen of Testino’s work, the subjects are usually well lit. I could have specified soft or well lit instead of hard light. If you understand Testino’s style more, you can tweak the prompt to try and get the best result.
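Because the prompts in this article are comma-separated descriptors, a tweak like this can be tried by swapping a single descriptor and regenerating. The sketch below shows the idea in Python. The replace_descriptor helper is hypothetical and written only for illustration; it is not part of any image generation tool.

```python
def replace_descriptor(prompt, old, new):
    """Return the prompt with one comma-separated descriptor swapped for another."""
    parts = [p.strip() for p in prompt.split(",")]
    if old not in parts:
        raise ValueError(f"descriptor not found: {old!r}")
    return ", ".join(new if p == old else p for p in parts)


testino_prompt = (
    "female fashion model photography by Mario Testino, resort, summer, "
    "150mm, realistic photograph, hard light"
)

# Ask for softer lighting while leaving every other descriptor untouched.
softer_prompt = replace_descriptor(testino_prompt, "hard light", "soft light")
print(softer_prompt)
```

Changing one descriptor at a time makes it easier to see which word was responsible for a difference in the output.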

The scenes he creates also tell a story, but in these images it looks like the model is just posing. It looks like fancy snapshots for social media, rather than creative editorial images.

Figure 8–9. Mario Testino style 1 and 2

Next photographer is the great Helmut Newton. I thought why not have AI mimic a great photographer’s style and see how it would be interpreted.

Newton relates his subject to the rest of the image in a film noir style. Whether the theme is art, architecture, sensuality or lifestyle, the result can be provocative and at times controversial.

I put in the prompt:

female fashion model photography by Helmut Newton in casual style, resort, 35mm, film, sunlight

The result is something I would expect from Helmut Newton. The composition and style are there, but not so much the lighting (as with the Mario Testino images).

I also got black-and-white images even though I did not ask for them, so I think the AI is drawing on what it associates with his name. If you do a search on Helmut Newton’s work, many of the image results are black-and-white photos.

Figure 10–11. Helmut Newton style

The AI generated images in Newton’s racy style. There are bounds to how far the AI will take it, since there are restraints on what it can produce for general audiences.

Finally I am going to try the style of Pamela Hanson. Here are the words I put in my prompt:

female fashion model photography by Pamela Hanson, colored photography, similar to Christy Turlington as the model, high fashion, beauty

I got the result I was expecting. Pamela Hanson editorials are known for blending sophistication with beauty. I chose a model like Christy Turlington, since she has been featured in some of Hanson’s editorials and would make a good starting point.

What I did not expect is the similarity the generated female model had to the real model. I did notice the face did not look too natural, more digitally rendered than something that would come from film.

Figure 12–13. Pamela Hanson style

The AI can easily generate the images based on the style of the photographer, but it lacks context and understanding about the photographer’s techniques.

The Swimsuit Issue

The last set consists of swimsuit photos in the style of Sports Illustrated’s Swimsuit Issue. What better way to test AI than by generating images of the human body in swimsuits?

I did not specify any photographers for this set. Instead I put in the following:

female fashion model photography in bikini, sunlight, vibrant details, long legs, long hair, sports illustrated swimsuit issue

The first result looked like your typical swimsuit issue photo. The AI-generated model looks like the standard cookie-cutter swimsuit supermodel. However, from another perspective the model looks too unnatural, and the AI could have added more meat to her bones.

Figure 14. Swimsuit Look 1 (Swimming Pool)

The next photo generated had a similar pose, but different location background. The body proportions on the model do not seem to represent reality. The coloring also looks off on some parts of the image.

I did notice that in this swimsuit model iteration, the AI gave the body more muscle and depth. You can see the details in the model’s abs, torso, hair, arms and hands. The second model had a more athletic and toned body, compared to the first model.

Figure 15. Swimsuit Look 2 (Beach)

It seems that AI models like SDXL are capable of generating convincing human faces, but render other body parts less accurately. With more training data, this should keep getting more accurate.

The rendering of the shadows was quite realistic. It is similar to how ambient occlusion techniques are applied to make video game graphics look realistic. The shadowing adds contrast to the image in the way it would appear in reality.

The scenery looks great, which is also typical in swimsuit issues. There were some errors I noticed in the background (inspect the photos and you will see). This is AI generating the scenery in its own interpretation, based on the training data.

Overall, the images are fine but not the greatest in terms of photorealism or accuracy.

Final Thoughts And Takeaways

While the main images I have shown turned out great, a lot of the others I generated actually turned out horrible.

The rendering of human hands, eyes and feet is a known issue with many AI image generating models. Even SDXL is not perfect.

Figure 16. Everything looks fine until you look at the feet and the eyes.

Sometimes the results are just too cringeworthy to be acceptable.

Figure 17–18. Sometimes the AI renders extra fingers, extra limbs, feet for hands or totally deformed looking body parts.

This is why it is best to avoid generating full body images (as of the version of the model used). I got the best results from images that did not show hands, feet or full body renderings.

You can always edit the image yourself. Some of the errors can be touched up easily, but others will require more work.

I expect the rendering of human body parts to improve further as the training datasets grow. Good results are possible at the moment, but it takes time and repeated generations to get a decent one.

One way to improve the image generation is to use negative prompts like:

bad hands, bad feet, bad eyes, mutated, malformed …
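Many tools take the negative prompt as a separate field next to the main prompt. As a minimal sketch, the hypothetical helper below pairs the two in Python. The key names prompt and negative_prompt follow the argument names used by the diffusers pipelines, which is an assumption here, so other tools may name these fields differently.

```python
def with_negative_prompt(prompt, unwanted):
    """Pair a prompt with a comma-separated negative prompt."""
    # Drop blanks and duplicates while keeping the original order.
    seen = []
    for term in unwanted:
        term = term.strip()
        if term and term not in seen:
            seen.append(term)
    return {"prompt": prompt, "negative_prompt": ", ".join(seen)}


request = with_negative_prompt(
    "portrait photography by Annie Leibovitz, youthful, beauty, big smile, soft light",
    ["bad hands", "bad feet", "bad eyes", "mutated", "malformed", "bad hands"],
)
print(request["negative_prompt"])
# bad hands, bad feet, bad eyes, mutated, malformed
```

Keeping the unwanted terms in a list makes it easy to reuse the same negative prompt across several images.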

You can also give more weight to a particular word or phrase by wrapping it in parentheses with a value. The weight should be greater than zero but not more than 1.5 (according to the rules on prompt weight syntax in SDXL, though the exact syntax and limits can vary between tools). For example:

A (beautiful model:1.3) posing in a (red dress:1.2) smiling.

A higher weight gives that word or phrase more influence over the generated image.
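Since a mistyped weight is easy to miss, it can help to check a prompt before submitting it. The Python sketch below pulls out the (phrase:weight) pairs and rejects weights outside the range described above. The extract_weights helper and its regular expression are my own illustration and assume the simple, non-nested form shown in the example; real tools may parse weights differently.

```python
import re

# Matches "(some phrase:1.3)"; the phrase may not contain parentheses or colons.
WEIGHTED = re.compile(r"\(([^():]+):([0-9]*\.?[0-9]+)\)")


def extract_weights(prompt, max_weight=1.5):
    """Return (phrase, weight) pairs and reject weights outside (0, max_weight]."""
    pairs = []
    for phrase, raw in WEIGHTED.findall(prompt):
        weight = float(raw)
        if not 0 < weight <= max_weight:
            raise ValueError(f"weight {weight} for {phrase!r} is out of range")
        pairs.append((phrase.strip(), weight))
    return pairs


print(extract_weights("A (beautiful model:1.3) posing in a (red dress:1.2) smiling."))
# [('beautiful model', 1.3), ('red dress', 1.2)]
```

A prompt containing something like (red dress:2.0) would raise a ValueError instead of being passed along unnoticed.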

It is not that simple to fix though. Sometimes it can take several iterations on an image to get even decent looking hands or feet, so it is a time consuming process.

Optimize and fine-tune your prompt until you see the desired image. If you are not specific, the model falls back on generic defaults, which reflect the bias in what its training set sampled.

You need to focus on great prompts for control of your image. Since Stable Diffusion is driven by a text encoder that interprets your natural language prompt, it is important to be precise and detailed to get the best results.

Have fun and experiment.

Disclosure: All images presented were AI-generated, and no images of actual persons or the work of known photographers were used. Some editing was also performed on the images for presentation in this article, so they are not the raw output from the AI software.
