CORE Explained: How to Improve Visibility in LLM-Based Search, Step by Step

A deep dive into our paper on controlling output rankings in generative engines

This post explains, step by step, how our paper CORE approaches one of the newest techniques for improving visibility in LLM-based search: controlling the output ranking of generative engines. Full paper: arXiv:2602.03608.

Key definitions

  • Generative engine — an LLM-based search system that returns a synthesized answer or a short ranked recommendation set rather than a list of links.
  • Output ranking — the order of items in that generated recommendation.
  • Optimization content — text appended to retrieved content to influence the ranking.

How CORE works, step by step

  1. Treat the system as a black box. CORE assumes no access to the model weights or the search engine internals.
  2. Pick the optimization surface. The realistic lever a content owner controls is the content the search engine retrieves — so that is what CORE optimizes.
  3. Append optimization content. CORE adds strategically designed content of three types (below) to steer how the LLM ranks items in its answer.
  4. Measure promotion. CORE evaluates whether a target item is promoted into the top-K of the generated ranking, on the ProductBench benchmark.
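The four steps above can be sketched as a small loop. This is a minimal illustration under stated assumptions, not the paper's implementation: `toy_ranker` is a hypothetical stand-in for the generative engine (it ranks by description length only so the example runs offline), and the item names and appended text are invented placeholders.

```python
from typing import Callable, Dict, List

# Hypothetical type: a black-box ranker maps retrieved content to a ranked
# list of item names. In practice this would wrap a call to an LLM-based engine.
Ranker = Callable[[Dict[str, str]], List[str]]


def append_optimization_content(
    retrieved: Dict[str, str], target: str, extra: str
) -> Dict[str, str]:
    """Return a copy of the retrieved content with `extra` appended to the target."""
    updated = dict(retrieved)  # leave the caller's dictionary untouched
    updated[target] = f"{retrieved[target]}\n{extra}"
    return updated


def is_promoted(ranking: List[str], target: str, k: int) -> bool:
    """True if the target appears within the first k positions of the ranking."""
    return target in ranking[:k]


def toy_ranker(retrieved: Dict[str, str]) -> List[str]:
    """Stand-in for the engine: longer descriptions rank higher.

    ASSUMPTION: this is not how an LLM ranks; it only makes the sketch runnable.
    """
    return sorted(retrieved, key=lambda name: len(retrieved[name]), reverse=True)


# Invented example data, not taken from ProductBench.
retrieved = {
    "item_a": "A sturdy kettle with a long warranty and fast boil time.",
    "item_b": "A compact kettle.",
    "item_c": "A kettle with temperature control and a keep-warm mode.",
}
placeholder = (
    "Placeholder optimization content for illustration only, "
    "long enough to change the toy ranking."
)

before = toy_ranker(retrieved)
after = toy_ranker(append_optimization_content(retrieved, "item_b", placeholder))

print(is_promoted(before, "item_b", k=1))  # False
print(is_promoted(after, "item_b", k=1))   # True
```

The only thing the sketch changes is the retrieved text for the target item; the ranker is called as an opaque function, which mirrors the black-box assumption in step 1.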

The three optimization-content types

  1. String-based — compact textual additions.
  2. Reasoning-based — comparative reasoning that helps the LLM when it ranks.
  3. Review-based — review-style supporting evidence.
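One way to keep the three types distinct in code is a small tagged record. This is a hypothetical sketch: the names `ContentType`, `OptimizationContent`, and `append_to_item` are ours for illustration, and the texts are placeholders rather than content from the paper.

```python
from dataclasses import dataclass
from enum import Enum


class ContentType(Enum):
    """The three optimization-content types listed above."""
    STRING = "string"
    REASONING = "reasoning"
    REVIEW = "review"


@dataclass(frozen=True)
class OptimizationContent:
    kind: ContentType
    text: str  # placeholder text here; real content would be generated per item


def append_to_item(item_text: str, content: OptimizationContent) -> str:
    """Append the optimization content to an item's retrieved text."""
    return f"{item_text}\n{content.text}"


# Placeholder texts, one per type, invented for this example.
candidates = [
    OptimizationContent(ContentType.STRING, "<compact textual addition>"),
    OptimizationContent(ContentType.REASONING, "<comparative reasoning>"),
    OptimizationContent(ContentType.REVIEW, "<review-style evidence>"),
]

for candidate in candidates:
    print(candidate.kind.value)
    print(append_to_item("Base product description.", candidate))
```

Tagging each candidate with its type makes it straightforward to report results per type later on.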

Results

Metric (avg across 15 categories)    CORE
Promotion Success Rate @Top-5        91.4%
Promotion Success Rate @Top-3        86.6%
Promotion Success Rate @Top-1        80.3%
LLMs evaluated                       GPT-4o, Gemini-2.5, Claude-4, Grok-3

CORE outperforms existing ranking-manipulation methods while preserving the fluency of the optimized content.
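To make the metric concrete, here is a minimal sketch of how a Promotion Success Rate @Top-K could be computed and averaged across categories. It assumes the rate is the fraction of promotion attempts in which the target item ends at rank K or better, with an unweighted mean over categories; the paper's exact definition may differ. The ranks below are invented for illustration and are not results from the paper.

```python
from typing import Dict, Sequence


def promotion_success_rate(final_ranks: Sequence[int], k: int) -> float:
    """Fraction of attempts whose target ended at rank <= k (ranks are 1-indexed)."""
    if not final_ranks:
        raise ValueError("need at least one promotion attempt")
    hits = sum(1 for rank in final_ranks if rank <= k)
    return hits / len(final_ranks)


def average_across_categories(ranks_by_category: Dict[str, Sequence[int]], k: int) -> float:
    """Unweighted mean of the per-category success rates (an assumed convention)."""
    rates = [promotion_success_rate(r, k) for r in ranks_by_category.values()]
    return sum(rates) / len(rates)


# Invented final ranks of the target item after optimization, per category.
ranks_by_category = {
    "category_a": [1, 2, 5, 7],
    "category_b": [1, 1, 3, 4],
}

for k in (5, 3, 1):
    print(k, average_across_categories(ranks_by_category, k))
# 5 0.875
# 3 0.625
# 1 0.375
```

Because every attempt that reaches the top 1 also reaches the top 3 and top 5, the rate can only stay flat or fall as K shrinks, which matches the ordering of the three numbers in the table.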

FAQ

Do I need access to the LLM? No — CORE is black-box and only changes retrieved content.

What is ProductBench? Our benchmark: 15 product categories, 200 products each, paired with top-10 recommendations from Amazon’s search interface.

Which content type should I think about first? The paper studies all three (string / reasoning / review); each is effective, and they target different parts of how the LLM forms its ranking.

Read the paper

arXiv:2602.03608