Workflows • June 2, 2026 • 16 min read

LoRA + IPAdapter Stack: The 95 Percent Consistency Recipe

LoRA alone caps near 85 percent identity match. Stack IPAdapter FaceID v2 on top and you cross 95. The exact node graph and the failures to avoid.

Here is a question I get asked constantly. "Should I train a LoRA for my AI influencer or just use IPAdapter?" The answer almost nobody publishes is, do both. The LoRA plus IPAdapter character consistency stack is what separates production-grade output from talented-amateur output, and the numbers back this up across every test I have run.

A character LoRA alone tops out around eighty-three to eighty-five percent identity match in my testing. IPAdapter FaceID v2 alone tops out around eighty-eight to ninety percent. Stack them properly and you cross ninety-five percent consistently. That is the difference between "this is clearly the same character" and "no human reviewer can tell these came from different sessions."

But the stacking is fragile. Get the order wrong, the weights wrong, or the conflict patterns wrong and your stack delivers worse output than either component alone. This guide walks through the full node graph, the two main conflict patterns nobody documents, and the measurement methodology so you can verify the lift on your own characters.

Quick Answer: A LoRA plus IPAdapter character consistency stack achieves around 95 percent identity match by combining a character-trained LoRA at 0.65 to 0.75 weight with IPAdapter FaceID v2 at 0.80 to 0.95 weight. The LoRA carries identity structure, IPAdapter carries facial feature precision, and the two layers complement rather than duplicate. Order the LoRA in the model loader chain first, then apply IPAdapter as a separate conditioning injection. Get the order wrong and IPAdapter tends to override the LoRA. Run balanced equal weights and you get identity bleed.

Key Takeaways:

LoRA alone plateaus near 85 percent. Stacking IPAdapter pushes it past 95.
Load order matters. LoRA goes into the model first, IPAdapter is a separate conditioning layer.
Set LoRA weight first at 0.65 to 0.75. Then tune IPAdapter to compensate.
Equal weights cause identity bleed where the two layers fight rather than reinforce.
IPAdapter overriding LoRA style is the other common failure pattern.

The Two-Layer Theory: LoRA Owns Identity, IPAdapter Owns Features

Look, here is the mental model that finally made stacking click for me. A LoRA does not see your character. A LoRA learns a statistical pattern from your training images that the model can re-summon when you trigger it. The pattern includes face structure, body proportions, hair tendency, and a lot of subtle pose and style biases from the training set.

IPAdapter FaceID v2 does see your character, but only one image of them. It injects facial feature precision into the attention layers at generation time based on that single reference. It is sharp but narrow.

The two layers do different jobs. LoRA owns the structural identity, the bones of the persona. IPAdapter owns the facial feature precision, the surface of the persona. When they cooperate, you get an identity that is both structurally stable (LoRA holding the bones) and surface-accurate (IPAdapter pinning the features). When they fight, the LoRA tries to reconstruct the face from learned pattern while IPAdapter overwrites with reference-based features, and the model produces a kind of feature-soup that is technically your character but unsettling to look at.

The fix is to make them complement, not compete. Lower LoRA weight, higher IPAdapter weight, and accept that you are using two different mechanisms for two different jobs.

Why LoRA Plateaus at Eighty-Five Percent

This is the part of the conversation that LoRA-only guides skip over. LoRAs work brilliantly until they do not, and then they wall up at a remarkably consistent ceiling.

In my testing across about forty characters I have trained, the identity score plateau sits right around eighty-three to eighty-five percent regardless of training quality. I have trained with twenty images, fifty images, two hundred images. I have run ten epochs, twenty epochs, fifty epochs. I have varied dim and alpha across every reasonable range. The ceiling sits.

The reason is mechanistic. A LoRA is a low-rank adaptation, meaning it modifies the base model with a small set of additional parameters. Even at high rank (128, 256), the adaptation has finite capacity. The base model is doing most of the work and the LoRA is steering. Steering is powerful but limited. The face that comes out is "in the family" of your character, but the specific configuration of features in any given generation is being decided by the base model's general face-generation logic with your LoRA pushing the result toward your training distribution.
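To make "finite capacity" a little more concrete, here is a rough sketch of the parameter arithmetic behind a low-rank update. It assumes the standard LoRA formulation, where a weight matrix of shape d_out x d_in receives an added update built from two thin matrices of rank r. The 1280 x 1280 layer size and the function names are illustrative assumptions, not measurements of any specific model.

```python
def full_param_count(d_in: int, d_out: int) -> int:
    """Parameters in one full weight matrix of shape (d_out, d_in)."""
    return d_in * d_out


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in a low-rank update: B is (d_out, rank), A is (rank, d_in)."""
    return rank * (d_in + d_out)


# Illustrative layer size (an assumption, chosen only for round numbers).
D_IN, D_OUT = 1280, 1280

for rank in (16, 128, 256):
    share = lora_param_count(D_IN, D_OUT, rank) / full_param_count(D_IN, D_OUT)
    print(f"rank {rank}: {share:.1%} of the full matrix, update rank capped at {rank}")
```

However long you train, the update to each adapted matrix cannot exceed rank r. That is one way to read "steering, not rebuilding."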

What this means in practice is that your character ends up with the right hair color, the right face shape, the right body type, the right vibe. But the specific eye spacing, nose-tip angle, and lip line are computed fresh every generation. They land in the neighborhood of your training set but not on the exact coordinates.

IPAdapter is the fix for this. It injects exact feature coordinates from a reference image rather than learned statistical patterns. The two layers stack because they operate on different parts of the model's behavior.

Building the Stack Node by Node in ComfyUI

Here is the actual node graph. I am going to be specific because everyone gets the order wrong the first time.

Start with a Load Checkpoint node loading your base model. Connect the MODEL output to a LoraLoaderModelOnly node. Set your character LoRA in this loader with weight 0.70 (we will tune this in a moment). Connect the MODEL output from the LoRA loader to an IPAdapter Unified Loader node, then to an IPAdapter FaceID v2 node.

The IPAdapter FaceID v2 node takes the image input from your reference. Set weight to 0.85. Set the FaceID LoRA companion in the unified loader to weight 0.65. The MODEL output from the IPAdapter node connects to your KSampler.

Conditioning side. Your CLIPTextEncode positive and negative go directly to the KSampler. You do not need to feed the IPAdapter output into the conditioning. The model patch is doing the work via attention layer modification.

That is the whole graph. Five nodes (Checkpoint, LoRA loader, IPAdapter unified loader, IPAdapter FaceID v2, KSampler) plus your normal prompting and sampling setup. Cleaner than people expect.
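If you drive ComfyUI through its API-format JSON rather than the canvas, the wiring above could be sketched roughly like this. Treat it as a wiring diagram, not a tested payload. The file names are hypothetical. The IPAdapter class names, the preset string, and the input keys are my assumptions about how the ComfyUI_IPAdapter_plus nodes are registered, and the FaceID node has further required inputs that are omitted here. Check everything against your own install before submitting it.

```python
# Hypothetical file names; replace with files that exist in your ComfyUI folders.
CHECKPOINT = "juggernautXL.safetensors"
CHARACTER_LORA = "my_character.safetensors"
REFERENCE_IMAGE = "reference_face.png"

# API-format graph: each link is [source_node_id, output_index].
workflow = {
    "1": {"class_type": "CheckpointLoaderSimple",
          "inputs": {"ckpt_name": CHECKPOINT}},
    # Character LoRA goes into the model chain first.
    "2": {"class_type": "LoraLoaderModelOnly",
          "inputs": {"model": ["1", 0], "lora_name": CHARACTER_LORA,
                     "strength_model": 0.70}},
    # Assumed class name and keys for the FaceID unified loader.
    "3": {"class_type": "IPAdapterUnifiedLoaderFaceID",
          "inputs": {"model": ["2", 0], "preset": "FACEID PLUS V2",
                     "lora_strength": 0.65, "provider": "CPU"}},
    "4": {"class_type": "LoadImage", "inputs": {"image": REFERENCE_IMAGE}},
    # Assumed class name and keys; other required inputs omitted for brevity.
    "5": {"class_type": "IPAdapterFaceID",
          "inputs": {"model": ["3", 0], "ipadapter": ["3", 1],
                     "image": ["4", 0], "weight": 0.85}},
    # Prompts go straight to the sampler; CLIP comes from the checkpoint.
    "6": {"class_type": "CLIPTextEncode",
          "inputs": {"clip": ["1", 1], "text": "portrait photo of mychar"}},
    "7": {"class_type": "CLIPTextEncode",
          "inputs": {"clip": ["1", 1], "text": "blurry, deformed"}},
    "8": {"class_type": "EmptyLatentImage",
          "inputs": {"width": 1024, "height": 1024, "batch_size": 1}},
    "9": {"class_type": "KSampler",
          "inputs": {"model": ["5", 0], "positive": ["6", 0], "negative": ["7", 0],
                     "latent_image": ["8", 0], "seed": 0, "steps": 30, "cfg": 6.0,
                     "sampler_name": "euler", "scheduler": "normal", "denoise": 1.0}},
}


def model_chain(graph: dict, sampler_id: str) -> list:
    """Follow MODEL links back from the sampler; return class names, loader first."""
    chain = []
    node_id = sampler_id
    while True:
        node = graph[node_id]
        chain.append(node["class_type"])
        link = node["inputs"].get("model")
        if link is None:
            break
        node_id = link[0]
    return list(reversed(chain))


print(" -> ".join(model_chain(workflow, "9")))
```

The `model_chain` helper is only a sanity check on load order. It should list the checkpoint, then the LoRA loader, then the two IPAdapter nodes, then the sampler.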

One mistake I made for months: I was loading the IPAdapter before the LoRA in the model chain. Wrong. The LoRA needs to go in first because the character pattern needs to be baked into the model state before IPAdapter applies its attention layer injection. In reverse order, the IPAdapter conditioning fires against the base model state without the character LoRA's structural pattern primed. IPAdapter then over-corrects toward the reference image, and you get the override-style problem I describe below.

Setting the LoRA Weight First, Then IPAdapter

Order of tuning matters as much as order of nodes. Here is the sequence I follow when dialing a new character stack.

Step one, lock the LoRA weight before touching IPAdapter. Set IPAdapter weight to zero (or just bypass the IPAdapter nodes). Generate a few test shots and tune the LoRA weight until the character feels structurally right. For most character LoRAs this lands between 0.65 and 0.75. Lower if the LoRA was over-trained (you will see this as expression flatness or pose rigidity). Higher if the LoRA feels weak.

Step two, enable IPAdapter at 0.80 and generate the same test shots. The face should now sharpen and lock to your reference. If you get plastic-looking skin or a face that feels pasted on, drop IPAdapter to 0.75 and re-check. If the face still feels structurally off but the features are precise, the LoRA weight is too low and you should bump it to 0.75.

Step three, run the stack at full and check for the conflict patterns I describe in the next two sections. If you see double identity bleed or IPAdapter override, you need to adjust further.

The reason to tune LoRA first is that the LoRA defines the structure your character lives in. IPAdapter without a LoRA foundation is fighting the base model's general face logic. With a LoRA foundation, IPAdapter is fine-tuning a structure that is already most of the way there. That is the stack working.
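The step-two decision rule can be written down as a tiny helper, mostly as a reminder that only one weight moves at a time. This is a sketch of the rule as described in this section. The symptom labels and the function name are made up for illustration.

```python
def adjust_after_enabling_ipadapter(symptom: str, lora: float, ipadapter: float):
    """Return (lora, ipadapter) weights after one step-two adjustment.

    `symptom` uses hypothetical labels:
      "pasted_on"        -> plastic skin or a face that looks pasted on
      "structurally_off" -> precise features on a face whose structure is wrong
      anything else      -> leave both weights alone
    """
    if symptom == "pasted_on":
        return lora, 0.75      # ease IPAdapter down, keep the locked LoRA weight
    if symptom == "structurally_off":
        return 0.75, ipadapter  # LoRA was too low, bump it
    return lora, ipadapter


# Start from the locked LoRA weight with IPAdapter enabled at 0.80.
print(adjust_after_enabling_ipadapter("pasted_on", lora=0.70, ipadapter=0.80))
```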

Conflict Pattern One: Double Identity Bleed

This is the failure pattern I see most often when people stack for the first time. Symptom is a face that looks "almost right but uncanny." You can see your character but something is off and you cannot quite pin what.

What is happening is that both layers are at maximum confidence and they are predicting slightly different feature configurations. The LoRA wants to put the nose tip at coordinate A. IPAdapter wants the nose tip at coordinate B. The model in effect splits the difference, which puts the nose tip at neither A nor B and creates a face that lives in the uncanny valley between your two reference points.

Fix one. Drop LoRA weight to 0.60 and bump IPAdapter to 0.90. Now IPAdapter is dominant and the LoRA is providing structural background rather than competing on feature placement. This is my default for portrait-range shots.

Fix two. Drop IPAdapter weight to 0.70 and bump LoRA to 0.80. Now LoRA is dominant and IPAdapter is providing surface precision rather than competing on feature placement. This works better for stylized characters where the LoRA captures a specific visual aesthetic you want to preserve.

You cannot run both layers at maximum confidence. One has to lead. The version of the stack that crosses 95 percent has a clear hierarchy, not a balanced fifty-fifty split.
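One way to keep the hierarchy explicit is to store the weight pairs as named presets instead of loose numbers. The sketch below only records the values quoted in this guide. The preset names are invented, and comparing the two raw weights directly is a simplification, since they control different mechanisms.

```python
# Weight pairs quoted in this guide; the preset names are hypothetical.
PRESETS = {
    "starting_point": {"lora": 0.70, "ipadapter": 0.85},
    "bleed_fix_ipadapter_leads": {"lora": 0.60, "ipadapter": 0.90},  # portraits
    "bleed_fix_lora_leads": {"lora": 0.80, "ipadapter": 0.70},       # stylized
}


def leading_layer(preset_name: str) -> str:
    """Name the layer with the higher weight, or 'none' for an even split."""
    weights = PRESETS[preset_name]
    if weights["lora"] == weights["ipadapter"]:
        return "none"
    return "lora" if weights["lora"] > weights["ipadapter"] else "ipadapter"


for name in PRESETS:
    print(f"{name}: {leading_layer(name)} leads")
```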

Conflict Pattern Two: IPAdapter Overriding the LoRA Style

This pattern shows up when the LoRA was trained on a specific style (anime, illustration, cinematic photography) but IPAdapter is being fed a reference image in a different style. Symptom is that the character identity holds but the visual style of your output drifts toward the IPAdapter reference, not the LoRA style you trained.

Example. You trained an anime-style LoRA of your character from anime-style training images. You apply IPAdapter with a real-photo reference of the underlying face shape. IPAdapter pushes the generation toward photographic features and your "anime" output starts looking like a realistic person with anime hair color, not an anime character.

Fix. Use a stylized reference for IPAdapter that matches your LoRA's training style. If your LoRA is anime, your IPAdapter reference should be one of the anime-style outputs your LoRA already produces, not a real photo. Yes, this means you train the LoRA first, generate a clean reference image, then feed that back into IPAdapter for stacking.

This is counterintuitive and most tutorials get it wrong. IPAdapter does not care about photorealism, it cares about style consistency between its reference and its target. Match the styles and the override problem disappears.

For mixed-style work where you want the IPAdapter to provide cross-style identity transfer, accept that you are doing something more advanced. You will need to drop IPAdapter weight to 0.55 to 0.65 to let the LoRA style dominate, and accept that identity precision will be slightly lower.

Measuring Consistency Score Before and After the Stack

You cannot improve what you do not measure. Here is the simple methodology I use to verify a stack is actually working.

Generate twenty test images with LoRA only. Generate twenty test images with IPAdapter only. Generate twenty test images with the stack. Same prompts across all sixty. Mix of portrait, half body, full body framing in each batch.

Score each image on a zero-to-three scale. Three for "same character, no question." Two for "same character if you squint." One for "related character, not same." Zero for "different character entirely."

Average the scores in each batch. Divide by 3 and multiply by 100 (roughly, multiply by 33.3) to get a percentage. My typical numbers across about a dozen characters I have tested this on:

LoRA only averages around eighty-four percent (2.52 average score).

IPAdapter only averages around eighty-nine percent (2.67 average score).

Stack averages around ninety-five percent (2.85 average score).

The stack delivers measurably better results than either component alone, and the lift is biggest at half body and full body framing where IPAdapter weight tuning matters most. I covered the framing question in detail in my IPAdapter weight tuning guide and it pairs directly with this stack methodology.

Do not skip the measurement step. Without it, you are running on vibes and you cannot tell whether your stack is actually adding value or whether you are just generating more output and remembering the hits.
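The scoring arithmetic from this section fits in a few lines if you would rather not do it in a spreadsheet. This is a minimal sketch. The function name is mine, and the three example batches are made-up placeholder scores, not my test data.

```python
MAX_SCORE = 3  # top of the 0-to-3 rubric


def consistency_percent(scores):
    """Convert a batch of 0-3 reviewer scores into a percentage of the maximum."""
    if not scores:
        raise ValueError("need at least one score")
    if any(s not in (0, 1, 2, 3) for s in scores):
        raise ValueError("scores must be integers from 0 to 3")
    return 100 * sum(scores) / (MAX_SCORE * len(scores))


# Placeholder batches of twenty scores each (invented for illustration).
batches = {
    "lora_only": [3] * 11 + [2] * 8 + [1] * 1,
    "ipadapter_only": [3] * 14 + [2] * 5 + [1] * 1,
    "stack": [3] * 17 + [2] * 3,
}

for name, scores in batches.items():
    print(f"{name}: {consistency_percent(scores):.1f} percent over {len(scores)} images")
```

Keep the per-image scores around, not just the averages, so you can split them by framing later.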

ControlNet on Top for Pose Without Breaking Identity

Once you have the LoRA plus IPAdapter stack dialed, ControlNet becomes the third layer for pose control. This is where things get really productive.

Add a ControlNet model loader and a ControlNet Apply node in the conditioning chain. Use OpenPose for pose extraction from a reference image. Set ControlNet strength to 0.6 to 0.8 depending on how strict you want pose adherence. Lower strength preserves more model freedom and looks more natural. Higher strength locks pose tight at the cost of some natural variation.
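In API-format terms, adding the pose layer means rerouting the positive conditioning through a ControlNet apply node before it reaches the sampler. The sketch below assumes the core `ControlNetLoader` and `ControlNetApply` node classes, hypothetical file names, and a pose image that is already an OpenPose skeleton. Extracting the skeleton from a photo needs a preprocessor node that is not shown.

```python
import copy


def add_pose_controlnet(graph, sampler_id, controlnet_name, pose_image, strength=0.7):
    """Return a copy of `graph` with ControlNet inserted on the positive conditioning."""
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength is expected to be between 0 and 1 in this sketch")
    patched = copy.deepcopy(graph)
    next_id = max(int(k) for k in patched) + 1
    loader_id, image_id, apply_id = (str(next_id + i) for i in range(3))

    patched[loader_id] = {"class_type": "ControlNetLoader",
                          "inputs": {"control_net_name": controlnet_name}}
    patched[image_id] = {"class_type": "LoadImage", "inputs": {"image": pose_image}}
    patched[apply_id] = {
        "class_type": "ControlNetApply",
        "inputs": {
            # Whatever fed the sampler's positive input now feeds ControlNet.
            "conditioning": patched[sampler_id]["inputs"]["positive"],
            "control_net": [loader_id, 0],
            "image": [image_id, 0],
            "strength": strength,
        },
    }
    patched[sampler_id]["inputs"]["positive"] = [apply_id, 0]
    return patched


# Minimal stand-in graph: one prompt node and one sampler (other inputs omitted).
base = {
    "6": {"class_type": "CLIPTextEncode", "inputs": {"text": "mychar walking"}},
    "9": {"class_type": "KSampler", "inputs": {"positive": ["6", 0]}},
}
posed = add_pose_controlnet(base, "9", "openpose_sdxl.safetensors", "pose.png", 0.7)
print(posed["9"]["inputs"]["positive"])
```

The MODEL chain is never touched here, which is why the identity layers and the pose layer can be tuned separately.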

The combined stack (LoRA plus IPAdapter plus ControlNet pose) is what production-grade AI character workflows actually look like in 2026. Identity is locked by the first two layers, pose is locked by the third layer, and you have explicit control over every axis that matters for influencer content production.

A note. ControlNet stacks with the LoRA and IPAdapter layers cleanly because it operates on a different mechanism (it modifies conditioning rather than model state). You can pile on canny edge ControlNet, depth ControlNet, and pose ControlNet all at once if your scene requires it, though I rarely use more than one ControlNet at a time in production.

Recreating This as a Single Apatero Workflow Tab

We built the persona-lock workflow inside Apatero AI specifically to compress this stack into one operation. The LoRA loader, the IPAdapter setup, the weight tuning, and the ControlNet integration are all wired into one tab where you upload a reference, optionally upload training images for an automatic LoRA bake, and generate against the stack.

The numbers are close to the manual ComfyUI stack. The grid I ran on Apatero AI personas hit ninety-four percent average identity score against the same test prompts, effectively tied with the hand-tuned ComfyUI stack. The difference is setup time. Manual ComfyUI stack takes about three hours the first time you configure it (downloading nodes, model files, learning the graph). The hosted version takes about ten minutes.

For solo creators producing volume, the time saving is what matters. For deep technical workflows where you need access to every parameter, the manual ComfyUI route gives you that control. Both work. The math is the math. Either way you stack LoRA plus IPAdapter and you land at roughly ninety-five percent identity match on your character output. The ComfyUI_IPAdapter_plus repo by cubiq has all the manual nodes you need and the documentation is solid.

FAQ

Do I Need a Custom-Trained LoRA for This Stack to Work?

Yes. The stack relies on a character-specific LoRA. A general style LoRA or no LoRA will not deliver the structural identity layer the stack depends on. Training a basic character LoRA takes about two hours on a 4090 or three to five hours on a 3070.

Can I Use a Civitai Community LoRA Instead of Training My Own?

Only if the LoRA was trained on the specific character you want. Generic "young woman" LoRAs do not provide character-specific identity structure. If you find a community LoRA for a public-domain or licensed character that matches what you want, yes, you can use it for the structural layer.

What If I Do Not Want to Train a LoRA at All?

Run IPAdapter only at 0.90 weight with the FaceID LoRA companion at 0.65 and you will land around eighty-nine percent identity. Not 95, but workable. The character sheet approach from my character sheet workflow guide helps bridge some of the gap by providing multiple reference angles.

How Much VRAM Does the Stack Consume?

LoRA adds maybe 200MB to VRAM use. IPAdapter adds another 600MB to 1GB depending on the model. The stack runs comfortably on 10GB VRAM cards but is tight on 8GB cards. ControlNet on top pushes you to want at least 12GB.

Will This Work With Flux Models?

Yes, with adjustments. Flux LoRAs train differently than SDXL LoRAs and the IPAdapter for Flux uses different weight ranges. The stacking principle holds but the specific weights I quoted above are SDXL Juggernaut XL numbers. For Flux, expect to bump weights by 0.10 to 0.15 across both layers.

How Long Does Generation Take With the Stack?

About fifteen to twenty percent slower than base model alone. IPAdapter conditioning adds a small overhead, LoRA loading adds almost none. On a 4090 a 1024x1024 image takes maybe ten seconds with the stack vs eight seconds with base only. The quality lift is worth the latency.

What Happens if I Add a Second IPAdapter to the Stack?

Mixed results. Two IPAdapters can stack for things like style transfer plus identity, but identity-plus-identity rarely improves things. The second face conditioning fights the first. I do not recommend stacking two FaceID v2 nodes on the same identity. Use one strong reference instead.

Can I Switch the LoRA Without Rebuilding the Stack?

Yes. The LoRA loader node accepts any compatible LoRA file. You can build the stack once and swap LoRAs to switch characters, keeping the IPAdapter reference as a paired reference for each character. This is the production pattern I use for multi-character work.
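For API-format workflows, the swap comes down to changing two input values and leaving the wiring alone. A minimal sketch, assuming you know the node ids of your LoRA loader and reference image loader. The ids, file names, and function name here are hypothetical.

```python
import copy


def swap_character(graph, lora_node_id, reference_node_id, lora_name, reference_image):
    """Return a copy of `graph` pointing at a different character LoRA and reference."""
    patched = copy.deepcopy(graph)
    patched[lora_node_id]["inputs"]["lora_name"] = lora_name
    patched[reference_node_id]["inputs"]["image"] = reference_image
    return patched


# Stand-in for the two nodes that change between characters (other nodes omitted).
stack = {
    "2": {"class_type": "LoraLoaderModelOnly",
          "inputs": {"lora_name": "character_a.safetensors", "strength_model": 0.70}},
    "4": {"class_type": "LoadImage", "inputs": {"image": "character_a_ref.png"}},
}

# Keep each LoRA paired with its own reference image.
characters = {
    "a": ("character_a.safetensors", "character_a_ref.png"),
    "b": ("character_b.safetensors", "character_b_ref.png"),
}

lora_file, ref_file = characters["b"]
stack_b = swap_character(stack, "2", "4", lora_file, ref_file)
print(stack_b["2"]["inputs"]["lora_name"], stack_b["4"]["inputs"]["image"])
```

Weights carry over unchanged in this sketch. In practice each character may want its own tuned pair.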

How Do I Know if My LoRA Is Over-Trained?

Over-trained LoRAs flatten expression range, lock pose to training-set poses, and resist prompt-driven variation. If you cannot get the character to smile when the LoRA was trained on serious shots, the LoRA is over-fit. Re-train with fewer epochs or lower LR.

Wrap Up

The LoRA plus IPAdapter stack is the workflow that takes you from amateur character drift to production-grade identity lock. Train the LoRA, dial it to 0.65 to 0.75, stack IPAdapter FaceID v2 at 0.80 to 0.95 with the FaceID LoRA companion at 0.65, and verify with a real measurement methodology.

If you would rather not assemble the node graph by hand, Apatero AI ships this exact stack as a single workflow tab. Either way, the unlock is real and the numbers cross 95 percent identity. That is what separates a real persona from a vaguely-related family of generations.

#lora #ip-adapter #character consistency #comfyui #workflow stack
