CiteThis: Evidence-Based Research Platform

Starting point
April experiment: build something where every technical detail is intentional. Topic: evidence-based medicine (supplementation, sleep, ADHD, postpartum). Goal: not "another blog," but a research platform that AI engines can and will cite.
What I built
16 evidence-based protocols, each ~2500 words. Structured frontmatter, key definitions, methodology notes, FAQ sections, comparison tables, safety considerations, DOI links to primary studies.
33 tag landing pages with dedicated explainers (~2100 words each) and DefinedTerm schema markup. When someone asks ChatGPT "what is myo-inositol for PCOS", citethis.site offers a direct answer rather than just a list of articles.
Tech stack
- Framework: Astro 6 (SSG mode) — no client-side JavaScript, clean HTML for AI crawlers
- Content: Markdown content collections with TypeScript validation
- Search: Pagefind (static search, 60KB JSON)
- Deployment: Vercel (free tier, TTFB under 100ms)
- Design: Tailwind CSS, brutalist dark-mode, WCAG AAA
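To illustrate what "TypeScript validation" of content can catch, here is a minimal, dependency-free sketch. The field names (title, slug, tags, doi) and the rules are assumptions made for illustration, not the site's actual schema; in an Astro project this check would normally live in the content collection's schema definition rather than in a hand-written function.

```typescript
// Hypothetical frontmatter shape; these fields are assumed, not taken from the real site.
interface ProtocolFrontmatter {
  title: string;
  slug: string;
  tags: string[];
  doi: string[];
}

type ValidationResult =
  | { ok: true; value: ProtocolFrontmatter }
  | { ok: false; errors: string[] };

function isStringArray(value: unknown): value is string[] {
  return Array.isArray(value) && value.every((item) => typeof item === "string");
}

// Collects every problem instead of stopping at the first one,
// so a broken article reports all of its issues in a single build.
function validateFrontmatter(raw: Record<string, unknown>): ValidationResult {
  const errors: string[] = [];
  const { title, slug, tags, doi } = raw;

  if (typeof title !== "string" || title.trim() === "") {
    errors.push("title: required non-empty string");
  }
  if (typeof slug !== "string" || !/^[a-z0-9]+(-[a-z0-9]+)*$/.test(slug)) {
    errors.push("slug: must be lowercase kebab-case");
  }
  if (!isStringArray(tags) || tags.length === 0) {
    errors.push("tags: at least one tag required");
  }
  // Every DOI starts with the "10." directory indicator.
  if (!isStringArray(doi) || !doi.every((d) => d.startsWith("10."))) {
    errors.push("doi: every entry must start with '10.'");
  }

  if (errors.length > 0) return { ok: false, errors };
  return {
    ok: true,
    value: {
      title: title as string,
      slug: slug as string,
      tags: tags as string[],
      doi: doi as string[],
    },
  };
}
```

Returning all errors at once, rather than throwing on the first, keeps the feedback loop short when many articles are validated in one build.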
GEO-first architecture
6 layers of structured data on every page:
- ScholarlyArticle + MedicalWebPage schema with Person entity
- Dataset schema with explicit links to .md and .json versions
- FAQPage schema auto-generated from H3 questions in articles
- DefinedTerm schema for every tag
- CC-BY 4.0 license (explicit AI citation permission)
- Cite this protocol box with formatted citation + raw markdown endpoint
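The auto-generated FAQPage layer can be sketched roughly as follows. This is an assumed approach, not the site's actual code: it treats every H3 heading that ends with a question mark as a question and the plain text up to the next heading as its answer. The function names extractFaq and buildFaqSchema are hypothetical.

```typescript
interface FaqEntry {
  question: string;
  answer: string;
}

interface FaqPageSchema {
  "@context": "https://schema.org";
  "@type": "FAQPage";
  mainEntity: Array<{
    "@type": "Question";
    name: string;
    acceptedAnswer: { "@type": "Answer"; text: string };
  }>;
}

// Walks the markdown line by line; an H3 ending in "?" opens an entry,
// any other heading closes it, and non-empty lines in between form the answer.
function extractFaq(markdown: string): FaqEntry[] {
  const entries: FaqEntry[] = [];
  let current: FaqEntry | null = null;

  for (const line of markdown.split(/\r?\n/)) {
    const h3 = /^###\s+(.+?)\s*$/.exec(line);
    if (h3) {
      if (current) entries.push(current);
      const heading = h3[1] ?? "";
      current = heading.endsWith("?") ? { question: heading, answer: "" } : null;
      continue;
    }
    if (/^#{1,6}\s/.test(line)) {
      if (current) entries.push(current);
      current = null;
      continue;
    }
    if (current && line.trim() !== "") {
      current.answer = (current.answer + " " + line.trim()).trim();
    }
  }
  if (current) entries.push(current);

  // Questions without any answer text are dropped.
  return entries.filter((entry) => entry.answer !== "");
}

function buildFaqSchema(markdown: string): FaqPageSchema {
  return {
    "@context": "https://schema.org",
    "@type": "FAQPage",
    mainEntity: extractFaq(markdown).map(({ question, answer }) => ({
      "@type": "Question",
      name: question,
      acceptedAnswer: { "@type": "Answer", text: answer },
    })),
  };
}
```

The resulting object can be serialized with JSON.stringify and embedded in a script tag of type application/ld+json. A real implementation would likely strip markdown formatting from the answer text first; this sketch leaves it as-is.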
Plus: llms.txt manifest, robots.txt with AI crawlers whitelisted (GPTBot, ClaudeBot, PerplexityBot, Google-Extended, Applebot), /api/protocols.json RAG endpoint.
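As a rough illustration of the crawler whitelist, the sketch below generates a robots.txt body from the user agents named above. The function name buildRobotsTxt and the sitemap URL are assumptions for illustration; the site's real file may be written by hand or structured differently.

```typescript
// Crawler names are taken from the list above.
const AI_CRAWLERS: string[] = [
  "GPTBot",
  "ClaudeBot",
  "PerplexityBot",
  "Google-Extended",
  "Applebot",
];

// Emits one explicit "Allow: /" group per crawler plus a catch-all group,
// optionally followed by a Sitemap line.
function buildRobotsTxt(agents: string[], sitemapUrl?: string): string {
  const groups = agents.map((agent) => `User-agent: ${agent}\nAllow: /`);
  groups.push("User-agent: *\nAllow: /");

  const sections = [groups.join("\n\n")];
  if (sitemapUrl) sections.push(`Sitemap: ${sitemapUrl}`);
  return sections.join("\n\n") + "\n";
}

// Example call; the sitemap URL is a placeholder, not the site's real one.
const robotsTxt = buildRobotsTxt(AI_CRAWLERS, "https://example.com/sitemap.xml");
```

Listing each crawler explicitly is redundant with the catch-all group, but it makes the permission unambiguous to anyone reading the file.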
Self-audit and iteration
After launch, a GEO audit of my own site found 4 critical bugs: llms.txt linked to 404 pages (wrong slug field in the Astro template), the author schema said "jroh.cz" instead of "Jakub Roh", tags were duplicated (ADHD/adhd), and the FAQPage schema was missing.
The fixes took under an hour. After that I added methodology notes to all articles and wrote 33 tag explainers (~70K words of new content).
Average citability score rose from 42 to 48-49, and B-grade passages went from 1 to 4-5 per article.
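The duplicate-tag bug (ADHD/adhd) is the kind of issue a small build-time check can catch. The sketch below is one possible approach, not the fix actually applied: it compares tags case-insensitively, keeps the first spelling seen, and can report which spellings collide. The function names are hypothetical.

```typescript
// Comparison key: trimmed and lowercased, so "ADHD" and " adhd " collide.
function tagKey(tag: string): string {
  return tag.trim().toLowerCase();
}

// Keeps the first spelling of each tag and drops later case variants.
function dedupeTags(tags: string[]): string[] {
  const seen = new Map<string, string>();
  for (const tag of tags) {
    const key = tagKey(tag);
    if (key !== "" && !seen.has(key)) seen.set(key, tag.trim());
  }
  return Array.from(seen.values());
}

// Returns groups of distinct spellings that map to the same key,
// which a build script could use to fail with a readable message.
function findTagCollisions(tags: string[]): string[][] {
  const groups = new Map<string, Set<string>>();
  for (const tag of tags) {
    const key = tagKey(tag);
    if (key === "") continue;
    const variants = groups.get(key) ?? new Set<string>();
    variants.add(tag.trim());
    groups.set(key, variants);
  }
  return Array.from(groups.values())
    .filter((variants) => variants.size > 1)
    .map((variants) => Array.from(variants));
}
```

For example, dedupeTags(["ADHD", "adhd", "sleep"]) returns ["ADHD", "sleep"], while findTagCollisions on the same input returns [["ADHD", "adhd"]].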
Results
- 16 protocols + 33 tag landing pages, 117 static pages in total
- ~110K words of GEO-optimized evidence-based content
- 6 layers of structured data, functional llms.txt, auto-FAQ schema
- Cost: $0 infrastructure (Vercel free tier, Namecheap domain for a few dollars/year)
- Time: 3 days from idea to production
What this demonstrates
Most Czech companies run a blog without FAQPage schema, an Author entity, .md endpoints, or llms.txt, and then ask why ChatGPT doesn't cite them. The answer: they haven't done anything to allow it.
CiteThis is proof that a GEO-first architecture can be built fast and without a budget, if you know what you're doing. The detailed article about building it is here.
Project: citethis.site · Stack: Astro, TypeScript, Tailwind · License: CC-BY 4.0 (content), MIT (code)