CountLoop

Iterative Agent Guided High Instance Image Generation

Anindya Mondal¹, Ayan Banerjee², Sauradip Nag³, Josep Lladós², Xiatian Zhu¹, Anjan Dutta¹
¹ University of Surrey   ² Computer Vision Center, UAB   ³ Simon Fraser University
[Figure: CountLoop teaser]

Abstract

CountLoop is a training-free framework for generating images that contain a large number of object instances, with precise layout control and high aesthetic quality. It combines iterative agent guidance, layout conditioning, and cross-instance texture consistency to achieve state-of-the-art results on high-instance image generation tasks.

Key Features

Pipeline Overview

[Figure: CountLoop pipeline]

Given a text prompt, the Layout Designer constructs a planning graph that encodes object attributes and spatial relations; the graph is then converted into a pixel-aligned layout. The image is synthesized under the guidance of instance masks and cumulative latent composition with an IP-Adapter. A Design-Critic evaluates the result and updates the planning graph, and this feedback loop repeats until the count and quality goals are met.
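The control flow described above can be sketched as a plain loop. This is a minimal illustration only, not the authors' implementation: the names `PlanningGraph`, `Critique`, `run_loop`, and `demo`, the callable signatures, and the `max_rounds` cap are all assumptions made for the example, and the diffusion model, IP-Adapter, and critic are replaced by toy stand-ins so the snippet runs on its own.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class PlanningGraph:
    """Hypothetical planning graph: nodes are instances, edges are relations."""
    nodes: Dict[str, dict] = field(default_factory=dict)              # id -> attributes
    edges: List[Tuple[str, str, str]] = field(default_factory=list)   # (subject, relation, object)


@dataclass
class Critique:
    """Hypothetical critic verdict on one generated image."""
    count_ok: bool
    quality_ok: bool
    feedback: str = ""


def run_loop(prompt, design, to_layout, synthesize, critique, revise, max_rounds=5):
    """Design -> layout -> synthesize -> critique, repeated until goals are met.

    Every stage is passed in as a callable, so this function only encodes the
    order of the stages and the stopping rule. `max_rounds` is an assumed
    safety cap; the page does not state one.
    """
    graph = design(prompt)
    image = None
    for round_idx in range(1, max_rounds + 1):
        layout = to_layout(graph)              # planning graph -> pixel-aligned layout
        image = synthesize(prompt, layout)     # layout-conditioned generation
        verdict = critique(prompt, graph, image)
        if verdict.count_ok and verdict.quality_ok:
            return image, graph, round_idx
        graph = revise(graph, verdict)         # feedback updates the planning graph
    return image, graph, max_rounds


def demo(target: int = 3):
    """Toy run: start with one instance and add one per round until `target`."""
    def design(prompt):
        return PlanningGraph(nodes={"obj0": {"category": "apple"}})

    def to_layout(graph):
        # One 32-pixel-wide box per instance, placed left to right.
        return {name: (32 * i, 0, 32 * (i + 1), 32) for i, name in enumerate(graph.nodes)}

    def synthesize(prompt, layout):
        return {"boxes": layout}               # stand-in for a generated image

    def critique(prompt, graph, image):
        n = len(image["boxes"])
        return Critique(count_ok=(n == target), quality_ok=True,
                        feedback=f"have {n}, want {target}")

    def revise(graph, verdict):
        graph.nodes[f"obj{len(graph.nodes)}"] = {"category": "apple"}
        return graph

    return run_loop("three apples", design, to_layout, synthesize, critique, revise)
```

Calling `demo()` returns the final stand-in image, the revised graph, and the number of rounds used. Passing the stages in as callables keeps the loop independent of any particular generator or critic, which is a design choice of this sketch rather than something the page specifies.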

Visual Results

Sample generations and benchmark comparisons. See paper for more details.

Benchmarks & Evaluation

Count Accuracy vs Number of Objects

Counting and Aesthetic Quality Across Four Benchmarks

Comparison of counting and aesthetic quality across four benchmarks. For every dataset we report Counting (split into F1 score and Accuracy) and Spatial (aesthetic quality).
Type     Model                 COCO-Count             T2I-Compbench          CountLoop-S            CountLoop-M
                               F1     Acc.   Spatial  F1     Acc.   Spatial  F1     Acc.   Spatial  F1     Acc.   Spatial
T2I      SDXL                  71.87  42.13  0.38     84.36  44.00  0.75     55.40  24.49  0.63     77.84  67.25  0.55
         FLUX                  84.73  54.19  0.53     90.75  57.00  0.78     49.08  29.59  0.65     79.99  78.00  0.58
         SD 3.5                83.97  50.56  0.46     88.56  50.00  0.76     54.96  33.67  0.64     79.91  77.19  0.56
         Counting Guidance     67.54  18.50  0.63     71.41  17.50  0.56     36.67  10.20  0.47     64.42  25.90  0.41
         GPT-4o                92.91  72.50  0.55     94.19  68.00  0.80     49.45  39.64  0.69     79.10  50.11  0.60
Agentic  GenArtist             75.40  45.50  0.45     85.33  55.82  0.70     51.00  30.56  0.60     77.87  70.34  0.57
         SLD                   90.34  69.90  0.70     91.50  65.50  0.77     55.04  40.07  0.75     82.46  74.35  0.65
         RPG-DiffusionMaster   84.89  60.73  0.60     91.32  60.00  0.75     51.89  34.38  0.70     80.16  71.46  0.62
L2I      LMD                   54.69  29.81  0.24     71.44  35.50  0.73     49.24  28.57  0.66     80.28  77.67  0.64
         MIGC                  73.82  36.11  0.36     71.47  33.00  0.65     54.16  25.17  0.65     81.06  79.08  0.62
         CountGen              58.99  50.00  0.61     63.75  19.78  0.75     48.18  41.40  0.72     72.00  45.33  0.69
         CountLoop (Ours)      98.47  93.33  0.93     95.38  78.50  0.79     60.00  55.00  0.97     85.43  83.67  0.73
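This page does not define how the counting F1 score and Accuracy are computed, so the snippet below is only a sketch under one common convention and may differ from the paper's protocol. It assumes that Accuracy is the percentage of images whose detected count exactly matches the requested count, and that F1 is pooled over instances, with min(predicted, target) counted as true positives, surplus instances as false positives, and missing instances as false negatives. The function name `counting_metrics` is hypothetical.

```python
from typing import Sequence, Tuple


def counting_metrics(predicted: Sequence[int], target: Sequence[int]) -> Tuple[float, float]:
    """Return (F1, accuracy) in percent for per-image object counts.

    Assumed convention (not taken from the page):
      - accuracy: share of images with predicted == target
      - F1: instance-level, pooled over all images
    """
    if len(predicted) != len(target) or len(target) == 0:
        raise ValueError("predicted and target must be non-empty and of equal length")

    tp = fp = fn = exact = 0
    for p, t in zip(predicted, target):
        tp += min(p, t)          # instances that were both requested and produced
        fp += max(p - t, 0)      # surplus instances
        fn += max(t - p, 0)      # missing instances
        exact += int(p == t)     # exact-count match for this image

    denom = 2 * tp + fp + fn
    f1 = 100.0 * 2 * tp / denom if denom else 0.0
    accuracy = 100.0 * exact / len(target)
    return f1, accuracy
```

For example, `counting_metrics([3, 5, 7], [3, 6, 6])` pools 14 true positives, 1 false positive, and 1 false negative, while only one of the three images has the exact count. Under this convention a model can score a high F1 while its exact-match Accuracy stays much lower, which is the pattern several rows of the table show.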

BibTeX Citation

    @article{mondal_countloop,
      title={CountLoop: Iterative Agent Guided High Instance Image Generation},
      url={https://openreview.net/forum?id=NZ0H1XtcZG},
      author={Mondal, Anindya and Banerjee, Ayan and Nag, Sauradip and Llad{\'o}s, Josep and Zhu, Xiatian and Dutta, Anjan},
      language={en}
    }