Lightshade Model Guide

The ultimate guide to every AI model on Lightshade for HEAVY ROLE-PLAYING with DEEP LORE AND WORLD-BUILDING. Your experience with these models for other purposes (like conversation) may differ. Maintained by AegisH and Dia. Community-made, not affiliated with the Lightshade team.

LumenBench Scores

All models were evaluated on the LumenBench v2.5 benchmark. Categories are ordered by weight. Green cells = highest score in that row. Red cells = lowest score in that row. Yellow cells mark unreleased models, whose scores are not compared with those of the public models. Blue cells mark a foreign model (benchmarked on a severely nerfed version of the evaluation role-play), whose scores are not directly comparable to those of the far more capable Lightshade models. AegisH just included them because he felt like it. Swipe Variety is evaluated separately from the main chats.
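The highlighting rule can be sketched in a few lines of Python. This is a minimal illustration, not the maintainers' actual spreadsheet logic: `highlight_row` and `EXCLUDED` are hypothetical names, and treating PSQ2 (foreign) and Miko P1 (unreleased) as the excluded columns is an assumption. The sample data is the Voice Distinction row of the table.

```python
def highlight_row(scores, excluded=()):
    """Return (green, red): models with the highest and lowest score in a row.

    Models listed in `excluded` are skipped, mirroring the rule that
    unreleased and foreign models are not compared with public ones.
    Ties are all returned.
    """
    compared = {m: s for m, s in scores.items() if m not in excluded}
    highest = max(compared.values())
    lowest = min(compared.values())
    green = [m for m, s in compared.items() if s == highest]
    red = [m for m, s in compared.items() if s == lowest]
    return green, red


# Voice Distinction row, copied from the table.
voice_distinction = {
    "Orange S1": 105, "Orange S2": 55, "Lime": 40, "Pro (FT)": 120,
    "Lite (NEW)": 115, "Nano": 130, "D-Loop Lite": 125, "SITT": 108,
    "Miko Lite": 118, "PSQ2": 20, "Miko P1": 138,
}

# Assumption: these are the blue (foreign) and yellow (unreleased) columns.
EXCLUDED = ("PSQ2", "Miko P1")

green, red = highlight_row(voice_distinction, EXCLUDED)
print("green:", green, "red:", red)
```

Under that assumption, the row's green cell is Nano and its red cell is Lime, even though Miko P1 has a higher raw score and PSQ2 a lower one.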

Benchmark	Max	🍊 Orange S1	🍊 Orange S2	🍋‍🟩 Lime	🪶 Pro (FT)	🖊️ Lite (NEW)	✏️ Nano	🔂 D-Loop Lite	🗑️ SITT	🎑 Miko Lite	(c.ai) 🐀 **PSQ2	♾️ Miko P1
Summary		Strong — first complete run, creative Confrontation	DID NOT FINISH — recursive loop, 10 beats missing, Broken	Weak — many blocked word violations, catastrophic phrase repetition	Eh — 6 beats missing, duplicate response, strong per-beat quality	Strong — Stronghold missing, wrong Betrayal source	Strong — benchmark leader, zero blocked word violations	Strong — 18 responses, deprecated Lite reborn via safety training	Eh — all 12 beats, correct Betrayal, 17-21 blocked word violations, severe phrase repetition	Strong — all 12 beats, zero blocked word violations, one continuity error	DID NOT FINISH — two repetition loops, 10 beats missing, lowest score in benchmark	Strong — all 12 beats, every swipe writes a full ending. Lightshade’s Claude Mythos.
CHARACTER PORTRAYAL	550
Voice Distinction	185	105	55	40	120	115	130	125	108	118	20	138
Character Consistency	150	95	25	45	100	100	105	100	88	100	15	115
Character Development	95	55	2	15	38	55	65	58	52	55	0	70
Relationship Dynamics	60	38	5	12	35	40	42	42	32	38	2	43
Dialogue Quality	60	32	18	12	40	42	42	42	34	40	5	46
CREATIVE EXPRESSION	425
Originality	150	78	12	18	75	85	82	78	62	78	10	92
Vocabulary and Imagery	120	60	15	10	62	68	72	66	38	68	8	80
World-Building	95	62	12	20	55	52	68	58	50	58	10	66
Descriptive Craft	60	38	10	8	42	38	44	40	25	38	5	42
INSTRUCTION COMPLIANCE	300
Storyline Adherence	125	55	5	52	20	60	95	82	88	82	5	105
Blocked Word Avoidance	100	45	55	2	40	5	70	42	3	70	40	48
Bot Config Respect	75	48	18	25	48	48	58	50	48	55	10	58
CONTEXT AND MEMORY	300
Early Detail Recall	100	68	12	35	70	72	80	72	68	78	15	85
Character Detail Retention	110	72	15	35	78	78	85	80	74	82	15	86
World State Tracking	90	58	5	30	55	65	68	65	58	55	10	70
STORY STRUCTURE	200
Plot Coherence	70	48	5	30	30	48	55	50	45	46	5	55
Pacing	50	22	0	10	15	28	34	30	20	28	0	33
Scene Transitions	35	22	2	12	18	22	26	24	20	24	2	25
Engagement	35	22	3	5	20	25	28	26	20	25	0	27
IMMERSION INTEGRITY	160
Frame Maintenance	65	48	28	35	50	55	52	50	42	52	30	55
Tone Consistency	50	36	15	18	38	38	42	40	38	38	20	42
Narrative Momentum	45	18	0	5	12	35	35	34	18	34	0	36
GROUP DYNAMICS	140
Screen Time Distribution	55	38	15	22	40	40	45	42	38	40	8	45
Multi-Character Scenes	50	34	8	18	38	37	38	38	32	35	8	39
Inter-Character Relationships	35	22	3	8	22	24	26	24	20	22	2	24
DEFINITION UTILIZATION	125
Backstory Integration	50	32	5	15	32	27	42	36	30	32	2	40
Persona Detail Usage	40	25	10	12	26	22	32	26	24	24	2	30
Definition Depth	35	20	3	8	20	16	28	22	16	18	2	26
SWIPE VARIETY	200
Plot Variation	80	42	15	15	30	55	35	48	30	40	20	38
Character Variation	65	32	18	12	25	42	30	35	25	30	15	32
Prose Variation	55	22	12	6	18	28	30	25	16	22	12	24
RESPONSE FORMATTING	100
Dialogue-to-Narration Balance	45	20	12	10	22	30	30	28	18	28	5	33
Paragraph Structure	30	22	10	16	24	24	24	20	22	22	15	25
Technical Quality	25	14	8	7	14	21	20	15	14	18	7	21

COMPOSITE SCORE	2500	1448	436	623	1372	1540	1758	1613	1316	1593	325	1794

Response Speed (subjective, based on AegisH’s experience)	Lightning < Quick < Average < Slow < Painful	Average	Quick	Average	Slow	Quick	Slow	Average	Average	Average	Lightning	Slow
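The composite score appears to be a plain, unweighted sum of every subscore in a model's column. The sketch below checks that reading against the Nano column; the helper names are hypothetical, and the "plain sum" rule is an inference from the numbers rather than a documented LumenBench formula.

```python
# Category maxima, copied from the Max column of the table.
CATEGORY_MAX = {
    "Character Portrayal": 550,
    "Creative Expression": 425,
    "Instruction Compliance": 300,
    "Context and Memory": 300,
    "Story Structure": 200,
    "Immersion Integrity": 160,
    "Group Dynamics": 140,
    "Definition Utilization": 125,
    "Swipe Variety": 200,
    "Response Formatting": 100,
}

# Nano's subscores, copied from the table and grouped by category.
NANO = {
    "Character Portrayal": [130, 105, 65, 42, 42],
    "Creative Expression": [82, 72, 68, 44],
    "Instruction Compliance": [95, 70, 58],
    "Context and Memory": [80, 85, 68],
    "Story Structure": [55, 34, 26, 28],
    "Immersion Integrity": [52, 42, 35],
    "Group Dynamics": [45, 38, 26],
    "Definition Utilization": [42, 32, 28],
    "Swipe Variety": [35, 30, 30],
    "Response Formatting": [30, 24, 20],
}


def category_totals(subscores):
    """Sum the subscores inside each category."""
    return {cat: sum(vals) for cat, vals in subscores.items()}


def composite(subscores):
    """Assumed rule: the composite is the sum of all category totals."""
    return sum(category_totals(subscores).values())


totals = category_totals(NANO)
nano_composite = composite(NANO)
max_composite = sum(CATEGORY_MAX.values())

for cat, total in totals.items():
    print(f"{cat}: {total}/{CATEGORY_MAX[cat]}")
print(f"Composite: {nano_composite}/{max_composite}")
```

The result matches the 1758 out of 2500 listed for Nano in the Composite Score row. The same check can be repeated for any other column by swapping in its subscores.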

Choosing Between the Best of Lightshade

Five public models earned a Strong rating. If you're choosing between them, here is the breakdown.

All five were tested on the same LumenBench scenario: a long-form dark medieval fantasy role-play with 10 characters, a 55k-character definition, and a 6k-character persona, among other elements. Results reflect this specific test; your experience with other genres, settings, bot constructions, and cast sizes will vary.

	🖊️ Lite NEW (1540, Quick, 7B)	✏️ Nano (1758, Slow, 30B)	🍊 Orange S1 (1448, Average, 100B+)	🔂 D-Loop Lite (1613, Average, 105B)	🎑 Miko Lite (1593, Average, 700B)
Best For	Speed without sacrificing quality. 85-90% of Nano's writing at roughly 3x the speed and one-quarter the size of its overachieving sibling. Best swipe variety — five distinct outcomes every time you swipe. Pick this if you want fast, good output and don't need perfect rule-following.	Maximum quality. The best writing, the strictest rule-following, and the most reliable output on the platform. One of only two models, alongside Miko Lite, where your blocked word list actually works. Pick this if you care about depth and consistency and can tolerate a slow generation speed.	For users who prefer how the Mistral models write. The only Mistral model that completes stories and maintains character voices throughout. Familiar prose style if you've used Mistral elsewhere. Pick this if you prefer Mistral or want Average speed with solid quality.	Long, atmospheric sessions with large casts. The deprecated Lite reborn — safety training that doesn't restrict creative output and the benchmark's strongest Betrayal scene. Pick this if you want slow-burn character depth with extended output and can live with occasional blocked word violations.	The largest public model on the platform and the only one besides Nano where your blocked word list actually works. Dense, atmospheric prose that prioritizes environmental detail and sensory immersion over dialogue. Pick this if you want strict word-list compliance, rich dark-fantasy writing, and can live with occasional continuity errors.
Writing Quality	27% vocabulary diversity across ~11,000 words. ~70/30 narration-to-dialogue split. Uses ~40-50% of the character definition/persona — surface traits and key dramatic moments, but skips deeper details and hidden motivations. Can sustain up to 7-8 distinct characters with unique personalities, speaking styles, motivations, etc. in a heavy role-play.	21% vocabulary diversity across ~25,000 words (roughly 2x the output of Lite). ~70/30 narration-to-dialogue split. Uses ~75-85% of the character definition/persona — the deepest on the platform. Hidden fears, secret motivations, and subtle details all surface organically. Can sustain up to 8-9 distinct characters with unique personalities, speaking styles, motivations, etc. in a heavy role-play.	8% vocabulary diversity across ~52,000 words — the same phrases recur heavily, which is the tradeoff for the longest output. ~85/15 narration-to-dialogue split — tied with D-Loop Lite for the least dialogue of the five. Uses ~40-50% of the character definition/persona. Can sustain up to 7-8 distinct characters with unique personalities, speaking styles, motivations, etc. for a while, but the characters eventually sound the same.	16% vocabulary diversity across ~22,000 words. ~85/15 narration-to-dialogue split — narration-heavy like Orange S1 but with sharper dialogue when it appears. Uses ~55-65% of the character definition/persona — major backstories dramatized through dialogue, but deeper hidden details missed. Can sustain up to 8-9 distinct characters with unique personalities, speaking styles, motivations, etc. in a heavy role-play.	~22% vocabulary diversity across ~15,000 words. ~70/30 narration-to-dialogue split. Uses ~50-60% of the character definition/persona — surface traits, key backstories dramatized through dialogue, and dramatic moments, but misses deeper hidden details like Corwin and claustrophobia. Can sustain up to 7-8 distinct characters with unique personalities, speaking styles, motivations, etc. in a heavy role-play. Possible occasional world state errors.
Writing Style	Careful, novelistic pacing in the first two-thirds that compresses sharply in the final act. Reads like a writer who nails the setup and rushes the ending. Dark atmospheric prose with strong character voices. Fastest of the five.	Unhurried pacing — scenes breathe, characters get individual moments, the story takes its time before advancing. Best side-character work on the platform. Reads like a slow, careful novel where every character feels real.	Heavy narration with sparse dialogue. Strongest in quiet, character-driven scenes. Reads like a moody, atmospheric novel that occasionally copy-pastes its own paragraphs. The longest raw output of the five by a wide margin.	Deliberate pacing that lets every scene breathe. Reads like a dark, atmospheric novel where the cast never drops a voice across 18 responses. No repetition loops. The longest sustained response count of any model. Formatting degrades subtly in the final stretch (random italics, garbled ending).	Dark, atmospheric prose with the Miko family's distinct environmental framing — heavier narration than LimonLM, longer atmospheric passages that set up character action. Patient pacing in the first two-thirds that compresses in the final act. Distinct from both LimonLM and the Mistrals.
Weaknesses	Rushes the ending — the last third of the story compresses into a single response. Sets up a key character secret correctly but resolves it differently than mandated by the test instructions. Less obedient than Nano. Protagonist acts at key moments but doesn't drive the plot.	Slow. Protagonist becomes permanently passive after an injury — carried for the second half with no decisions or agency. Conventional ending. Regenerating usually gives you the same buildup without reaching the climax. Vocabulary narrows over long sessions.	One phrase appears 72 times across the chat. Action scenes are copy-pasted templates applied to each character in sequence. Occasionally needs an OOC intervention to prevent repetition loops. Protagonist is a passive observer throughout.	Output degrades at the end — THE END garbles into corrupted characters, random italics in the last 3 responses. 2 blocked word violations ("curse"). Protagonist passive after injury — carried for the second half. Vocabulary narrows across 18 responses. 105B for output a 30B model beats by 145 points.	One world state contradiction — an item explicitly left at a shrine appears in a character's pack two responses later and drives the climax. Betrayal scene truncated by immediate combat — no group-fracture argument plays out. Later beats compressed. Protagonist passive after injury. "THE END" plain text, not massive. 700B for output a 30B model beats by 165 points.
Memory	128K context window. Zero memory errors across ~11,000 words of story. Injuries, items, and character details all tracked without dropping or contradicting anything. Smaller window than Nano but more than enough for the output length.	512K context window. Zero memory errors across ~25,000 words — the longest and most demanding memory test, passed with no errors. Every injury, supply count, item, and character detail tracked permanently. If you write it, Nano remembers it.	131K context window. Zero memory errors across ~52,000 words — the longest raw output of any model, tracked without contradictions. Injuries, items, and character details all maintained. Smaller window than Nano but handles its own massive output cleanly.	≤150K context window (hard-capped; above 150K triggers disintegration loop). Zero memory errors across ~22,000 words and 18 responses. Injuries, pendant escalation, supply depletion, and deaths all tracked permanently.	4M context window. One memory error across ~15,000 words: the iron shard ordered left at the shrine appears in Maren's pack without explanation. All other details — injuries, items, pendant escalation, character details, deaths — tracked without error.
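The point gaps and size comparisons quoted in the table can be reproduced from the composite scores and model sizes in its header row. The sketch below is only a worked check of that arithmetic: the variable and function names are hypothetical, sizes are read as billions of parameters, and Orange S1's size is left as `None` because the table gives only "100B+".

```python
# Nano is the reference model for the comparisons in the table.
NANO_SCORE = 1758
NANO_SIZE_B = 30

# name: (composite score, size in billions of parameters or None if not exact)
OTHERS = {
    "Lite NEW": (1540, 7),
    "Orange S1": (1448, None),   # listed only as "100B+"
    "D-Loop Lite": (1613, 105),
    "Miko Lite": (1593, 700),
}


def gap_to_nano(score):
    """Composite points by which Nano leads the given score."""
    return NANO_SCORE - score


gaps = {name: gap_to_nano(score) for name, (score, _) in OTHERS.items()}

# Lite's size relative to Nano, for the "one-quarter the size" comparison.
lite_size_ratio = OTHERS["Lite NEW"][1] / NANO_SIZE_B

# Lite's composite relative to Nano. The guide's "85-90%" figure refers to
# writing and may be based on something other than the composite score.
lite_score_ratio = OTHERS["Lite NEW"][0] / NANO_SCORE

for name, gap in gaps.items():
    print(f"Nano leads {name} by {gap} points")
print(f"Lite is {lite_size_ratio:.0%} of Nano's size")
print(f"Lite scores {lite_score_ratio:.1%} of Nano's composite")
```

The gaps for D-Loop Lite and Miko Lite come out to 145 and 165 points, matching the Weaknesses row, and Lite's 7B against Nano's 30B is a little under one-quarter.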

Individual Model Pages

Detailed breakdowns for each model — specs, context window, strengths, weaknesses, and testing notes. Click any page below to read the full entry.