May 1, 2026 · 12 min read

The 7-check pre-publish protocol every short-form video should pass before you post

Seven binary checks that catch the structural problems behind most underperforming Reels, TikToks, and YouTube Shorts — before they go live. Runs in 90 seconds.

Most short-form video underperformance is structural. The hook promised something the video did not deliver. The reveal landed 8 seconds too late. The cover frame showed the setup instead of the outcome. These are fixable before you post.

This is a 90-second pre-publish protocol: seven checks, each with a measurable threshold. Run it on every video before it goes live.

The 7-check protocol

#	Check	Pass condition
1	Hook promise lands fast	Claim delivered within 3s on TikTok, 4s on Reels, 5s on Shorts
2	Reveal lands before the 60% mark	By 0:18 on a 30s video, 0:36 on a 60s video
3	First 2 seconds work sound-off	Visual creates curiosity or relevance without audio
4	One CTA, soft, in the final 10%	One CTA, in the last 3s of a 30s video
5	Spoken transcript contains target keyword	Keyword spoken in first 10s of voiceover
6	No dead seconds	Every second has a visual, audio, or informational change
7	Cover frame and caption match the hook's promise	Thumbnail and caption first line reflect the hook claim

1. The hook promise lands fast

Threshold: The specific claim, audience callout, or payoff is delivered within the platform ceiling. Three seconds on TikTok, four on Reels, five on Shorts.

How to test: Read the opening line of your voiceover aloud without watching the video. Is the claim specific enough that a stranger would either stay or scroll based on that one sentence? If the first sentence is "okay so today we're going to talk about..." the hook has already failed. If it is "I cut my grocery bill by $347 last month doing one thing," you are at the threshold.

What "landing fast" means in practice: the specific promise or audience callout is delivered before the platform's ceiling. The hook itself can run longer. A 5-second hook on TikTok passes as long as the substance arrives in the first 3 seconds. A soft ramp-up that takes 6 seconds to name the topic is not a 6-second hook; it is a broken hook.

Common failures: the video opens with a question that has no stakes, the promise is delivered at 0:08 instead of 0:03, the first sentence is a generic category frame rather than a specific claim.

Fix: Pull the claim forward. Whatever you were going to say at 0:06, say it at 0:02. Cut the warm-up.
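
As a rough illustration, the Check 1 thresholds reduce to a lookup and one comparison. This is a minimal sketch, not part of any tool: the names HOOK_CEILING_S and hook_lands_fast are hypothetical, the timestamp is one you note by hand, and treating the ceiling as inclusive is an assumption.

```python
# Platform ceilings from Check 1, in seconds.
HOOK_CEILING_S = {"tiktok": 3.0, "reels": 4.0, "shorts": 5.0}


def hook_lands_fast(platform: str, claim_time_s: float) -> bool:
    """Return True if the claim lands at or before the platform ceiling.

    claim_time_s is the timestamp (in seconds) at which the specific claim
    has been delivered. The inclusive comparison is an assumption.
    """
    return claim_time_s <= HOOK_CEILING_S[platform.lower()]


if __name__ == "__main__":
    # A claim delivered at 0:02 on TikTok, then one delivered at 0:08.
    print(hook_lands_fast("tiktok", 2.0))
    print(hook_lands_fast("tiktok", 8.0))
```

The same 0:04 claim would pass on Reels and Shorts but fail on TikTok, which is why the platform has to be an input.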

2. The reveal lands before the 60% mark

Threshold: On a 30-second video, the key insight or payoff lands by 0:18. On a 60-second video, by 0:36.

How to test: Watch the video and note the timestamp when you deliver the sentence a viewer would repeat to a friend. That is the reveal. If it falls past the 60% mark, the middle is too heavy.

This is the single most common structural problem in short-form video. Creators overinvest in context and problem, leaving the reveal squeezed into the final quarter. By the time it lands, most of the original audience has already dropped. The retention curve does not recover.

The 60% threshold is the point at which drop-off typically accelerates on a poorly structured video. Landing the reveal before that threshold means delivering the payoff while the majority of the audience is still watching.

Common failures: context beat runs 12 seconds on a 30-second video, the problem beat repeats what the hook already said, the reveal is implicit ("so that's why X matters") rather than explicit ("cut your menu to 9 items").

Fix: Find where your reveal is. If it lands past 60%, cut the setup between the hook and the reveal by half. The reveal earns attention; everything before it borrows it.
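
The 60% deadline is simple arithmetic, so it can be sketched for any video length, not just 30 and 60 seconds. The function names below are hypothetical, and the reveal timestamp is still something you identify by watching.

```python
def reveal_deadline_s(duration_s: float, percent: int = 60) -> float:
    """Latest timestamp (seconds) at which the reveal still passes Check 2."""
    return duration_s * percent / 100


def reveal_passes(duration_s: float, reveal_time_s: float) -> bool:
    """True if the reveal lands at or before the 60% mark (inclusive is an assumption)."""
    return reveal_time_s <= reveal_deadline_s(duration_s)


if __name__ == "__main__":
    for duration in (30, 45, 60):
        print(duration, reveal_deadline_s(duration))
```

For a 45-second video the deadline works out to 0:27.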

3. The first 2 seconds work sound-off

Threshold: With the audio muted, the first 2 seconds of visuals alone create curiosity or signal relevance to the intended viewer.

How to test: Mute the video and watch the opening 2 seconds. Does something happen visually that makes a viewer want to know more? Or is it a static talking head?

A significant fraction of short-form views happen on autoplay with ambient attention. The viewer hears the video before they look at it. The first visual frame — and the first 2 seconds of motion — is what converts ambient attention into active watching. A static frame with no visual signal for the first 2 seconds gives a distracted viewer a clean window to scroll before their attention fully engages.

This replaces the Check 3 from earlier versions of this protocol, which asked for a visual change in the first 8 seconds. The first 2 seconds is a tighter, more precise standard: the opening visual should be doing work from the first frame.

Common failures: the opening frame is the creator looking at camera with no text or visual context, the first motion is the creator adjusting their shirt, demonstration videos where the subject appears but nothing happens for 3 seconds.

Fix: Add a text overlay in the first 2 seconds that restates the hook claim or names the topic. Or cut directly to the action: if the video is a demonstration, start mid-action. The opening visual should carry the hook's weight, not just establish that a video exists.

4. One CTA, soft, in the final 10%

Threshold: One call to action, in the final 10% of the video. On a 30-second video, in the last 3 seconds. On a 60-second video, in the last 6 seconds.

How to test: Watch the video and count how many CTAs appear and where the first one is. A single CTA at the end that feels like an extension of the content passes. Two CTAs, or a CTA that fires before the reveal, fails.

Short-form CTAs convert better when they are soft and specific. "Follow for more hook pattern breakdowns" outperforms "like and subscribe." "Save this for the next time you post" outperforms "link in bio." The CTA should not feel like an ad appended to a piece of content — it should feel like a natural next step from the content just delivered.

Two common failure modes pull in opposite directions. The first is no CTA at all — the video ends on the reveal with no direction. That is fine for some videos and leaves growth on the table for others. The second is a hard-sell CTA that fires mid-video and interrupts the reveal. That tanks both the CTA conversion and the completion rate.

Common failures: CTA fires at 0:15 on a 30-second video, two CTAs (one mid-video and one at the end), the CTA references a product unrelated to the video's topic.

Fix: Move the CTA to the last 3 seconds. Keep it to one. Name the specific topic it connects to.
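
The count and placement parts of Check 4 can be sketched as below; the "soft" part remains a judgment call. The function names are hypothetical, and CTA timestamps are ones you log by hand. This sketch treats a video with no CTA as failing the strict threshold, which is an assumption, since ending on the reveal is acceptable for some videos.

```python
def cta_window_start_s(duration_s: float, percent: int = 10) -> float:
    """Earliest timestamp (seconds) that falls inside the final 10% of the video."""
    return duration_s - duration_s * percent / 100


def cta_check(duration_s: float, cta_times_s: list) -> bool:
    """Pass only if exactly one CTA exists and it starts in the final 10%.

    Assumption: zero CTAs is treated as a failure of the strict threshold.
    """
    if len(cta_times_s) != 1:
        return False
    return cta_times_s[0] >= cta_window_start_s(duration_s)


if __name__ == "__main__":
    print(cta_window_start_s(30))   # start of the CTA window on a 30s video
    print(cta_check(30, [15, 28]))  # two CTAs, one of them mid-video
```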

5. The spoken transcript contains the target keyword

Threshold: The target keyword appears in the first 10 seconds of spoken voiceover.

How to test: Listen to the first 10 seconds of the video with eyes closed. Does your target keyword appear spoken in that window?

This check applies to all three platforms but matters most on YouTube Shorts. YouTube indexes the spoken transcript for Shorts search ranking. If the keyword appears only in the title, description, and tags but not in the spoken audio, the ranking signal is weaker than if it appears in the transcript. Creators who write strong titles but oblique hooks leave ranking signal on the table.

On TikTok and Reels, the check matters for a different reason: the spoken keyword in the hook confirms to the algorithm that the video matches the content category it is being served to. It is a topic signal, not just a ranking signal.

The keyword does not need to be forced. A direct how-to hook naturally contains the keyword because it states the topic. "Here's how to write a hook that keeps 80% of viewers past 15 seconds" contains the keyword "hook" in the first sentence. The problem is oblique hooks that are clever but vague — the topic is implied rather than named.

Common failures: the hook is clever but does not name the topic, the keyword appears only in the title, the voiceover refers to "this" or "it" without naming what "it" is.

Fix: Name the topic in the first sentence of voiceover. If the keyword sounds awkward, the hook is probably too oblique — fix the hook first.
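
If you already have a timestamped transcript, the 10-second test can be approximated in a few lines. This is a sketch under stated assumptions: the (start_seconds, text) segment format is hypothetical rather than any platform's export format, a segment counts if it starts inside the window, and matching is a plain case-insensitive substring search.

```python
def keyword_in_opening(segments, keyword: str, window_s: float = 10.0) -> bool:
    """True if the keyword is spoken in a segment that starts before window_s.

    segments: iterable of (start_seconds, text) pairs (hypothetical format).
    Substring matching is an assumption; "hook" would also match "hooks".
    """
    needle = keyword.lower()
    return any(
        start < window_s and needle in text.lower()
        for start, text in segments
    )


if __name__ == "__main__":
    transcript = [
        (0.0, "Here's how to write a hook"),
        (4.2, "that keeps 80% of viewers past 15 seconds"),
        (12.0, "and the retention curve shows why"),
    ]
    print(keyword_in_opening(transcript, "hook"))
    print(keyword_in_opening(transcript, "retention"))
```

In the usage example, "hook" is spoken at 0:00 and passes, while "retention" first appears at 0:12 and would fail.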

6. No dead seconds

Threshold: Every second of the video contains at least one change: visual movement, a new audio layer, a tone shift, a text overlay appearing or disappearing, or new information in the voiceover.

How to test: Watch at 1x speed and note any second where nothing changes. Each such second is a scroll opportunity the video is not defending.

"Dead seconds" are not the same as pauses. A deliberate pause before a reveal is a pattern interrupt — it creates tension. A dead second is a moment where the video has simply stopped trying: the creator is still talking but nothing is changing on screen, or the visual is static while the voiceover ramps up to a point.

The density of edits does not need to be high. Two or three production changes per 30-second video (a cut, an overlay, an audio shift) is normal and sufficient, provided the voiceover keeps delivering new information in between. The issue is consecutive seconds with no stimulus change of any kind. Three static seconds in a row, particularly in the context beat or early problem beat, is where drop-off accelerates.

Common failures: talking-head segments of 6 to 8 seconds with no overlay, B-roll, or audio shift; demonstration videos that show the setup for 4 seconds before anything happens; outro sequences that run longer than the content supports.

Fix: For each dead second identified, add one stimulus change: a text overlay with a pull quote, a cut to a second angle, a music volume dip, or a direct address to camera. The change does not need to be dramatic. It needs to exist.
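
Given a hand-logged list of timestamps where something changes (a cut, an overlay, a new point in the voiceover), dead seconds and the longest static run can be found mechanically. The sketch below assumes whole-second video lengths and hypothetical function names; deciding what counts as a change is still up to you.

```python
def dead_seconds(duration_s: int, change_times_s) -> list:
    """Return each whole second [s, s+1) that contains no logged change."""
    live = {int(t) for t in change_times_s if 0 <= t < duration_s}
    return [s for s in range(duration_s) if s not in live]


def longest_dead_run(dead: list) -> int:
    """Length of the longest run of consecutive dead seconds (input sorted ascending)."""
    longest = run = 0
    previous = None
    for second in dead:
        run = run + 1 if previous is not None and second == previous + 1 else 1
        longest = max(longest, run)
        previous = second
    return longest


if __name__ == "__main__":
    changes = [0.2, 1.5, 5.1]  # hand-logged change timestamps on a 6s clip
    dead = dead_seconds(6, changes)
    print(dead, longest_dead_run(dead))
```

A longest run of three or more is the pattern the check flags as the place where drop-off accelerates.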

7. Cover frame and caption match the hook's promise

Threshold: The static thumbnail reflects the hook's specific promise, and the first line of the caption restates or supports that same promise.

How to test: Screenshot the cover frame. Read the first line of the caption. Now read the hook sentence. Do all three say the same thing? Or does the cover frame show the setup while the hook promises the outcome?

The cover frame and caption are the two pieces of copy a viewer reads before autoplay starts, particularly in search results, recommended feeds, and shares. They function as the meta-hook: if they contradict or dilute the hook, the viewer arrives at the video with the wrong expectation, and the hook has to work twice as hard.

The most common version of this failure is a cover frame taken from the opening of the video — the creator looking at camera, the workspace before the demonstration begins, the product box still closed. The hook promises an outcome. The cover frame shows the starting state. The mismatch costs click-through.

The caption version of this failure is an opening emoji or a generic phrase. "New video!" does not support a hook that says "I cut my grocery bill by $347." "Drop a 🙋 if you've felt this" does not support a hook about hook writing.

Common failures: cover frame is the literal first frame of the video, caption opens with a hashtag block, thumbnail shows the creator's face without text, caption first line is the video title verbatim instead of a supporting claim.

Fix: Set a custom cover frame from the reveal section of the video, or create a static graphic with the hook claim as text. Write the caption's first line as a keyword sentence that supports the hook.

Running the full protocol in 90 seconds

Seconds 0-20: Read the first voiceover line aloud (Check 1). Note the reveal timestamp (Check 2).

Seconds 20-40: Watch the opening 2 seconds sound-off (Check 3). Listen to the last 10% for CTA placement and count (Check 4).

Seconds 40-60: Listen to the first 10 seconds for the target keyword (Check 5). Watch at 1x for dead seconds (Check 6).

Seconds 60-90: Screenshot the cover frame, read the first caption line, compare both against the hook (Check 7). Count failures.

If more than two checks fail, fix before posting. If two fail, decide based on which two: Check 1 and Check 2 failing together is a rebuild. Check 6 and Check 7 failing together is a 5-minute fix.
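
The decision rule above can be written out as a small function. This is a sketch with a hypothetical name: the protocol only specifies the outcome for more than two failures and for the two named pairs, so returning "post" for zero failures and "judgment call" for every other case are assumptions.

```python
def verdict(failed_checks) -> str:
    """Map the set of failed check numbers (1 to 7) to a next step."""
    failed = set(failed_checks)
    if not failed:
        return "post"                # assumption: nothing failed
    if len(failed) > 2:
        return "fix before posting"
    if failed == {1, 2}:
        return "rebuild"
    if failed == {6, 7}:
        return "5-minute fix"
    return "judgment call"           # assumption: cases the protocol leaves open


if __name__ == "__main__":
    print(verdict([1, 2]))
    print(verdict([6, 7]))
    print(verdict([2, 4, 6]))
```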

After the protocol

The pre-publish protocol catches structural problems before the feed sees them. For post-publish analysis — hook score, segment breakdown, which beat landed where — paste the URL into lomero.app/analyze after posting. The hook scoring breakdown explains the 0-to-100 rubric. The diagnostic framework covers what to do when the numbers come back low.

The two layers are complements. The protocol prevents known structural errors. The post-publish analysis finds the ones the protocol cannot catch — weak specificity, a pattern mismatch, an audience signal that did not land.

Frequently asked questions

How long does this checklist take to run?

About 90 seconds on a video you already know well. Each check is a single binary judgment. The time cost is low. The cost of skipping it and posting a broken video is not.

Do I need Lomero to run this checklist?

No. All seven checks can be run manually. Lomero's hook score and segment breakdown make checks 1 and 2 faster and more precise, but the protocol works without the tool.

Should I run this on every video or just ones I am unsure about?

Every video. The checks that catch the most problems are the ones that feel unnecessary. Hooks that seem strong to the creator are often the ones with the softest payoff timing. The protocol is fast enough that there is no good reason to skip it.

What is the most commonly failed check?

Check 2 — the reveal lands after the 60% mark. Most creators overinvest in context and end up with the payoff buried in the last quarter of the video. Pulling the reveal forward by 4 to 6 seconds is the single most common fix that improves retention.

Does this checklist apply to all three platforms?

Yes, with minor timing adjustments. Check 1 ceiling is 3 seconds on TikTok, 4 on Reels, 5 on Shorts. Check 5 applies most directly to YouTube Shorts where spoken transcripts are indexed for search. The other five checks are platform-agnostic.

What if a video passes all 7 checks but still underperforms?

A structural checklist catches structural problems. It does not predict distribution, topic-audience fit, or algorithm state. A video that passes all 7 checks and still underperforms is likely a distribution or targeting issue, not a content issue.

Can I use this before recording instead of after?

Yes. Checks 1, 4, and 5 can be run on a script or outline before recording. The rest require a finished video. Running check 1 on your script before shooting catches the most expensive problem — the one that requires a full re-shoot.

Related: "Hook patterns that stop the scroll" covers the structures that pass Check 1, "Why your short-form video isn't converting" is the post-publish diagnostic companion, and "How hook scoring works" explains the automated analysis Lomero runs after posting. For a comparison of pre-publish analysis tools, see "Lomero vs Viral Roast".