Fast and Easy Image Generation with Fabric and OpenAI

An abstract design featuring vibrant, swirling lines in shades of blue, pink, and purple, combined with geometric shapes and mathematical equations. In the lower right corner, a computer keyboard and some colorful pencils are illustrated against a textured beige background, creating a visually engaging and artistic composition.

Introduction

In my previous articles, I introduced Fabric and explained how this tool can be integrated into workflows on Mac, iPad, and iPhone. Fabric combines text input, such as a simple entry on the command line or the contents of a text file, with a template referred to as a pattern to build an optimized prompt for a given problem. This prompt is then submitted to a large language model (LLM), which returns a text output.

Image Generation

A workflow that I use quite often is the creation of an article image using a specific pattern tailored to the style of this blog. Until now, I have created the article images manually using Stable Diffusion or ChatGPT: I described the topic of the article in keywords, specified the desired image style and the type of composition, and then saved the result by hand. With Fabric, I can automate this process further by capturing these style and composition descriptions in a pattern. The text of an article can then simply be passed to Fabric, and the finished image is generated as a result.

Creation of an Article Image

Originally, I intended to implement the image generation in Python. However, the Python example on the OpenAI website failed to generate images in landscape format, so I adapted the curl example to meet my requirements and integrated it into a shell script. The following prerequisites must be fulfilled, each of which is covered below: a working Fabric installation, the command-line tool jq, and an OpenAI API key stored as an environment variable.

The Pattern

Let us begin with the pattern optimized for this blog. I duplicated an existing pattern folder as a template; the folder $HOME/.config/fabric/prompts/create_art_prompt suited my needs well. I renamed the copy to create_blog_image, which is also the name under which the new pattern is invoked in Fabric. I then replaced the contents of the file system.md in that folder with the following content.

# IDENTITY AND GOALS

You are an expert graphic designer and AI whisperer. You know how to take a concept and give it to an AI and have it create the perfect piece of drawing for it.

Take a step back and think step by step about how to create the best result according to the STEPS below.

# STEPS

- Think deeply about the concepts in the input.

- Think about the best possible way to capture that concept visually in a compelling and interesting way.

# OUTPUT

- Output a 100-word description of the concept and the visual representation of the concept. 

- Write the direct instruction to the AI for how to create the drawing, i.e., don't describe the drawing, but describe what it looks like and how it makes people feel in a way that matches the concept. Use the style, composition, colors, and mood descriptions below.

- Style: Vibrant and dynamic with a mix of modern digital and vintage elements.

- Composition: Flowing, ribbon-like elements intertwined with detailed sketches of mathematical equations and geometric shapes. The background features aged parchment paper with modern digital elements like a keyboard and computer code on a screen.

- Colors: Bright, neon colors such as blues, reds, and purples contrasted against warm sepia tones.

- Mood: Abstract and technological, with a futuristic feel emphasized by the bold, futuristic font of the word “FABRIC” integrated into the design.

- Include nudging clues that give the piece the proper style, e.g., "Like you might see in the New York Times", or "Like you would see in a Sci-Fi book cover from the 1980's.", etc. In other words, give multiple examples of the style of the art in addition to the description of the art itself.

# INPUT

INPUT:

With this pattern, Fabric can already generate a prompt that can be used in ChatGPT or Stable Diffusion:

cat $HOME/Documents/Blog/new_blog_post.md | fabric -sp create_blog_image

This completes the first part.
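For reference, the folder duplication described at the beginning of this section could be sketched in the shell roughly as follows. The helper name clone_pattern is hypothetical and not part of Fabric, and the paths in the example call are the ones named above; adjust them if your installation keeps its patterns elsewhere.

```shell
# Sketch: copy an existing Fabric pattern folder under a new name.
# clone_pattern is a hypothetical helper, not part of Fabric.
clone_pattern() {
  src="$1"   # existing pattern folder used as a template
  dst="$2"   # new pattern folder; its name becomes the pattern name
  if [ ! -d "$src" ]; then
    echo "Source pattern folder not found: $src" >&2
    return 1
  fi
  if [ -e "$dst" ]; then
    echo "Target already exists: $dst" >&2
    return 1
  fi
  cp -R "$src" "$dst"
}

# Example call, using the folders named in the text:
# clone_pattern "$HOME/.config/fabric/prompts/create_art_prompt" \
#               "$HOME/.config/fabric/prompts/create_blog_image"
```

The check for an existing target is there so that a pattern you have already edited is not silently overwritten.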

The Script

The script is intended to pass the generated prompt to DALL·E 3 and process the response. This response consists of a JSON payload that includes, among other things, a URL that points to the generated image and the revised_prompt, which is the prompt actually used by DALL·E 3.

As previously mentioned, the command-line tool jq is required to build the JSON request sent to OpenAI and to parse the JSON response. It can be installed using Homebrew:

brew install jq
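To illustrate the response structure described above, the following sketch pulls both fields out of a sample payload with jq. The sample JSON is invented and shortened for illustration (a real response carries a much longer URL), and the helper name response_field is hypothetical.

```shell
# Sketch: extract fields from an image generation response with jq.
# The sample payload is invented for illustration; real values differ.
sample_response='{
  "created": 1700000000,
  "data": [
    {
      "revised_prompt": "Two parrots on a rooftop at sunset",
      "url": "https://example.com/generated.png"
    }
  ]
}'

# Print one field of the first data entry; -r strips the JSON quotes.
# response_field is a hypothetical helper name.
response_field() {
  printf '%s' "$1" | jq -r --arg key "$2" '.data[0][$key]'
}

# Example calls:
# response_field "$sample_response" url
# response_field "$sample_response" revised_prompt
```

Passing the field name with --arg keeps the jq filter itself constant, so the same helper works for both fields.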

In addition to the image, the script will, at least in the test phase, save the prompt generated by Fabric as well as the revised_prompt, so that both can be analyzed to optimize the pattern. Later, these lines can be commented out or deleted.

So far, I have not found a solution for generating meaningful file names, so I use a timestamp to name them. Additionally, the image is opened at the end in the program designated for PNG files, such as the Preview app.
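One possible direction for more meaningful names, sketched here under the assumption that the article's file name already describes the topic, is to derive a slug from that name and combine it with the timestamp. The helper name slugify and the sample file name are hypothetical.

```shell
# Sketch: turn a title or file name into a safe file name component.
# slugify is a hypothetical helper, not part of the script in this article.
slugify() {
  printf '%s' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -e 's/[^a-z0-9]\{1,\}/-/g' -e 's/^-//' -e 's/-$//'
}

# Example: build an image name from an article file name and a timestamp.
article="New Blog Post.md"            # assumed sample file name
name=$(slugify "${article%.*}")       # drop the extension, then slugify
timestamp=$(date +"%Y%m%d_%H%M%S")
echo "image_${name}_${timestamp}.png"
```

This would require the file name to be handed to the script in addition to the piped prompt, so the script in this article sticks to the plain timestamp.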

This leads to the following script:

#!/bin/zsh

# Check if data is being piped into the script
if [ -t 0 ]; then
  # Standard input is a terminal, so nothing was piped in
  echo "No input was piped into the script." >&2
  exit 1
else
  # Read all piped input
  prompt=$(cat -)
fi

# Your OpenAI API key should be set as an environment variable
api_key="$OPENAI_API_KEY"

# Create the JSON payload
json_payload=$(jq -n \
  --arg model "dall-e-3" \
  --arg prompt "$prompt" \
  --argjson n 1 \
  --arg size "1792x1024" \
  '{model: $model, prompt: $prompt, n: $n, size: $size}'
)

# Get the current date and time for the filename
timestamp=$(date +"%Y%m%d_%H%M%S")

# Execute the curl command and save the response
response=$(curl -s https://api.openai.com/v1/images/generations \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $api_key" \
  -d "$json_payload")

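# Extract the image URL from the response
# (printf instead of echo, because zsh's echo interprets backslash escapes in the JSON)
image_url=$(printf '%s' "$response" | jq -r '.data[0].url')

# Extract the revised prompt from the response
revised_prompt=$(printf '%s' "$response" | jq -r '.data[0].revised_prompt')

# Stop if the response did not contain an image URL
if [ -z "$image_url" ] || [ "$image_url" = "null" ]; then
  printf 'No image URL in the response: %s\n' "$response" >&2
  exit 1
fi

# Make sure the target folder exists
mkdir -p "$HOME/Pictures/CreateImage"

# Save the image using the date and time in the filename
curl -s "$image_url" -o "$HOME/Pictures/CreateImage/image_${timestamp}.png"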

# Save the used prompt in a text file
echo "$prompt" > "$HOME/Pictures/CreateImage/prompt_${timestamp}.txt"

# Save the revised prompt in a text file
echo "$revised_prompt" > "$HOME/Pictures/CreateImage/revised_prompt_${timestamp}.txt"

# Display the image
open "$HOME/Pictures/CreateImage/image_${timestamp}.png"

# Output success messages
echo "Image saved as image_${timestamp}.png"
echo "Prompt saved as prompt_${timestamp}.txt"
echo "Revised prompt saved as revised_prompt_${timestamp}.txt"

# If the name of the created image file should be piped to the next step,
# comment out the success messages above and uncomment the following line:
# echo "image_${timestamp}.png"
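When a request fails, for example because of an invalid key, the API answers with an error object instead of image data. The following is a minimal sketch of how the message could be pulled out with jq, assuming the error is reported in an error.message field; the helper name check_response and the sample payload are made up for illustration.

```shell
# Sketch: detect an API error before trying to download an image.
# The error payload below is invented for illustration.
sample_error='{"error": {"message": "Incorrect API key provided", "type": "invalid_request_error"}}'

# Print the error message and return 1 if the response has an error object.
# check_response is a hypothetical helper name.
check_response() {
  # "// empty" yields no output when there is no error.message field
  message=$(printf '%s' "$1" | jq -r '.error.message // empty')
  if [ -n "$message" ]; then
    echo "OpenAI API error: $message" >&2
    return 1
  fi
  return 0
}

# Example use right after the curl call:
# check_response "$response" || exit 1
```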

What to Do with the Scripts

Scripts are text files and are usually not executable by default. To simplify the invocation of a shell script, I copy it into a directory that is included in the shell's search path. On a Mac, common choices are:

$HOME/Applications 
$HOME/bin
$HOME/.local/bin

When scripts are stored in one of these directories, they are accessible only to the current user. If a script is to be made available to all users on a computer, the appropriate paths are:

/usr/local/bin
/opt/bin

Administrator privileges are required for copying and manipulating files in these locations.

To check which directories are already included in the search path, you can use the command:

echo $PATH
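With a long search path, reading the output by eye gets tedious. As a small sketch, a single directory can be checked like this; the function name in_path is hypothetical.

```shell
# Sketch: report whether a directory is listed in the search path.
# in_path is a hypothetical helper name.
in_path() {
  # Wrapping both sides in colons makes sure only whole entries match
  case ":$PATH:" in
    *":$1:"*) return 0 ;;
    *)        return 1 ;;
  esac
}

if in_path "$HOME/Applications"; then
  echo "$HOME/Applications is in the search path"
else
  echo "$HOME/Applications is not in the search path"
fi
```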

I prefer the folder $HOME/Applications for my personal programs and scripts. Since this folder is also used for web apps created by browsers, such as those added to the Dock in Safari via “File → Add to Dock”, it is usually already present.

If this folder is not included in the search path, it can easily be added with the following commands:

echo 'export PATH=$PATH:$HOME/Applications' >> $HOME/.zshrc
source $HOME/.zshrc
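Because >> appends unconditionally, running the echo command a second time adds a duplicate line to the configuration file. A small sketch of a guard against that follows; the helper name append_once is hypothetical.

```shell
# Sketch: append a line to a file only if it is not already there.
# append_once is a hypothetical helper name.
append_once() {
  line="$1"
  file="$2"
  # -q: quiet, -x: match the whole line, -F: treat the line as fixed text
  if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
    printf '%s\n' "$line" >> "$file"
  fi
}

# Example call:
# append_once 'export PATH=$PATH:$HOME/Applications' "$HOME/.zshrc"
```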

The script can then be saved in the folder, for example, under the name CreateImage, and made executable with:

chmod +x $HOME/Applications/CreateImage

Another Environment Variable

To create an image through the OpenAI API, the API key must be stored as an environment variable in the shell configuration file. The script reads it from there. The advantage of this approach is that the key does not need to be written into the code of every program that calls OpenAI, which prevents accidental exposure of the key in a blog post like this or on GitHub.

# Added for OpenAI Apps
echo 'export OPENAI_API_KEY="Your OpenAI Key"' >> $HOME/.zshrc
source $HOME/.zshrc
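If the variable is missing, for instance in a shell that was opened before the configuration file was changed, the request is sent with an empty key. A minimal sketch of an early check that could precede the API call; the helper name require_api_key is hypothetical.

```shell
# Sketch: stop early if the API key is missing from the environment.
# require_api_key is a hypothetical helper name.
require_api_key() {
  if [ -z "${OPENAI_API_KEY:-}" ]; then
    echo "OPENAI_API_KEY is not set; add it to your shell configuration." >&2
    return 1
  fi
}

# Example use at the top of a script:
# require_api_key || exit 1
```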

The First Test Run

With this command, you can now test whether the image generation with the script works, initially without Fabric:

echo "Create an image of two parrots on a skyscraper roof" | CreateImage 

You should see a confirmation in the terminal that the image and both prompts have been saved. Additionally, the created image will open in the Preview app.

Now that all components have been created, the workflow can be tested with the draft of this article:

cat $HOME/Documents/Blog/article_draft.md | fabric -sp create_blog_image | CreateImage

With that result:

Further Optimization

This still requires too much typing, however, so the lengthy command-line invocation can be wrapped in a shell script:

#!/bin/zsh
# Check if an argument has been provided
if [ $# -eq 0 ]; then
    echo "Please provide a file."
    exit 1
fi
# Check if the provided argument is a file
if [ ! -f "$1" ]; then
    echo "The provided argument is not a file."
    exit 1
fi
# If an argument has been provided and it is a file, execute the command
cat "$1" | fabric -sp create_blog_image | CreateImage


If this script is saved in the $HOME/Applications folder, for example as make_article_image, and marked as executable, the invocation simplifies to:

make_article_image /path/to/article.md

For my use case, I am currently seeking a solution that I can invoke directly in Obsidian to create the image from the active note. There are several candidates that might support this, such as the “Templater” or “Shell Commands” plugins, but that will be the next step.

Conclusion

This example of how Fabric can be integrated into a useful workflow should be understood as just that: an example to build on for your own purposes. The pattern must be tailored to individual needs; I certainly want my blog's image style to be consistently reflected everywhere 😉. Optimizing such a pattern will undoubtedly require several iterations and some time. It is worthwhile to take a closer look at the two generated prompts to see which levers need to be adjusted, as the saying goes in German.

I hope these ideas are nonetheless helpful, and as always, I welcome comments, whether regarding potential errors, improvements, praise, or criticism.
