I know it’s bad form to start off with a disclaimer, but the truth is, I do not know what I am doing. I am just testing out two new ComfyUI nodes, PerturbedAttentionGuidance and PerpNegGuider.

About PAG and Perp-Neg

Pert-what and Perp-what?

What is Perturbed-Attention Guidance (PAG)?

Quoting from the official implementation repo of the paper “Self-Rectifying Diffusion Sampling with Perturbed-Attention Guidance”:

Perturbed-Attention Guidance significantly enhances the sample quality of diffusion models without requiring external conditions, such as class labels or text prompts, or additional training. This proves particularly valuable in unconditional generation settings, where classifier-free guidance (CFG) is inapplicable. Our guidance can be utilized to enhance performance in various downstream tasks that leverage unconditional diffusion models, including ControlNet with an empty prompt and image restoration tasks like super-resolution and inpainting.
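As I understand the paper, PAG works much like CFG, except the "bad" prediction it steers away from comes from the same model with its self-attention maps perturbed (replaced with identity), rather than from an unconditional pass. A rough numpy sketch of the update rule, with names of my own invention (this is my reading of the paper, not ComfyUI's actual implementation):

```python
import numpy as np

def pag_guidance(eps_base, eps_perturbed, scale=3.0):
    """PAG update rule: push the prediction away from the perturbed one.

    eps_base:      noise prediction from the unmodified model
    eps_perturbed: prediction with self-attention maps replaced by identity
    scale:         guidance strength (what the node's `scale` input controls)
    """
    return eps_base + scale * (eps_base - eps_perturbed)

# Toy 2-element "predictions" standing in for real latent tensors
eps = np.array([1.0, 2.0])
eps_pert = np.array([0.5, 1.5])
print(pag_guidance(eps, eps_pert, scale=1.0))  # -> [1.5 2.5]
```

Note there is no text prompt anywhere in this rule, which is why PAG also works for unconditional generation where CFG cannot.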

What is Perp-Neg (“perpendicular component of the negative prompt”)?

Quoting from the Perp-Neg sampling with Stable Diffusion repo, which accompanies the paper “Re-imagine the Negative Prompt Algorithm: Transform 2D Diffusion into 3D, alleviate Janus problem and Beyond” (my interest was piqued by the last sentence):

The proposed Perp-Neg alleviates the Janus problem in text-to-3D generations, and achieves view generation in a more precise way in the text-to-image generation task. Beyond that, it can also be used to generate natural 2D images while eliminating undersired [sic] attributes from the negative text descriptions.

(Aside: I am led to believe there is actually no such thing as a negative prompt)
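If I have the geometry right, the core trick of Perp-Neg is to subtract only the component of the negative guidance that is perpendicular to the positive guidance direction, so the negative prompt cannot cancel out what the positive prompt is asking for. A rough numpy sketch of that projection (names and scales are mine; this is a reading of the paper, not ComfyUI's code):

```python
import numpy as np

def perp_neg(delta_pos, delta_neg, pos_scale=7.5, neg_scale=1.0):
    """Perp-Neg guidance direction.

    delta_pos / delta_neg: (conditional prediction - unconditional prediction)
    neg_scale: what the PerpNegGuider node's `neg_scale` input controls
    """
    p = delta_pos.ravel()
    n = delta_neg.ravel()
    # Component of the negative direction parallel to the positive direction
    parallel = (np.dot(n, p) / np.dot(p, p)) * delta_pos
    # Only the perpendicular remainder is subtracted
    perpendicular = delta_neg - parallel
    return pos_scale * delta_pos - neg_scale * perpendicular

# Toy 2D example: the negative direction partly overlaps the positive one;
# only its perpendicular part [0, 1] gets subtracted
print(perp_neg(np.array([1.0, 0.0]), np.array([1.0, 1.0])))  # -> [7.5 -1.]
```

In plain CFG, that `[1.0, 1.0]` negative direction would also fight the positive `[1.0, 0.0]` direction head-on; here the overlapping part is discarded before subtraction.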

Installation

Nothing to install, just make sure ComfyUI is updated to the very latest version, since per the commit log these nodes were implemented around 15 and 16 April 2024.

If you, like me, use ComfyUI Standalone Portable for Windows, just run update_comfyui.bat found in the update folder.

Workflow

Using the same positive and negative prompt, and (hopefully) the same sampler, scheduler and seed, the workflow below compares the output images generated by SDXL (in this case, Lykon/dreamshaper-xl-lightning):

  • The blue nodes are just standard SDXL.
  • The green nodes pass the SDXL model through the PerturbedAttentionGuidance node first, at scale 1.0.
  • And the purple nodes connect the GUIDER output of the PerpNegGuider node, at neg_scale 1.0, to the SamplerCustomAdvanced node.

I noticed that having both of these nodes in the workflow, even when they are not connected, changes the output compared to when the workflow has only one of these node paths enabled. I suspect the nodes influence the model as long as they are present in the workflow. But I am not sure about this, and I cannot replicate it!

ComfyUI Perturbed-Attention Guidance (PAG) and Perp-Neg workflow

Comparing the original (blue) output vs PerturbedAttentionGuidance (PAG) (green) output: I can see a rainbow, which is what I wanted. More testing needed, but... in this example, PAG did improve prompt adherence!

Comparing the original (blue) output vs the PerpNegGuider (purple) output: What do you know, the brick wall texture is really gone. Again more testing needed, but... in this example, Perp-Neg did use the negative prompt to correctly condition the model!

[Image comparison slider: Base model vs. With PAG]

[Image comparison slider: With PAG vs. With Perp-Neg]

What do you think?