Corpora
Corpus-family readiness overview for first-slice navigation. It lists corpus shells without asserting source coverage, search output, or corpus-level findings.
- Route Family: Corpora
- Robots Policy: Public route
- Sitemap Inclusion: included
- Source Gate: Landing honesty gate
- Receipt Pointer: none
- Closed Claim: Documentation only
This index is an honest readiness landing. It makes corpus routes findable without corpus coverage claims while keeping corpus evidence, search output, and source text closed behind source-audit gates.
Corpus detail remains a source-gated provenance shell, not a public corpus claim.
| Language | Corpus | State |
|---|---|---|
| hebrew | Hebrew Bible | Navigable, not public evidence |
| greek | Greek New Testament | Navigable, not public evidence |
| latin | Latin Vulgate | Navigable, not public evidence |
| latin | Classical Latin Seed | Navigable, not public evidence |
| english | English Public-Domain Bible Orientation | Navigable, not public evidence |
From Corpus Shell To Reviewed Evidence
This summary makes the corpus route family easier to scan without turning route inventory into corpus evidence. It keeps the replacement-platform path visible while search, source display, and corpus claims remain closed.
The index can introduce available corpus routes while the detail routes keep their source-audit locks.
Language breadth is navigation only. It is not a claim that coverage, search, morphology, or usage statistics are ready.
Each corpus still needs source, license, locator, checksum, display-policy, and receipt gates before evidence promotion.
Concordance, KWIC, collocations, frequency, and public corpus statistics remain unavailable until audited indexes exist.
- Do not infer corpus coverage from a listed shell.
- Do not reconstruct source or translation text from memory.
- Do not treat language breadth as morphology, usage, frequency, or equivalence evidence.
- Do not promote corpus detail routes while source and receipt gates remain open.
Acquisition Readiness Ladder
The corpora index is a non-operative map of acquisition readiness. It shows the required movement from candidate source to manifest, snapshot, passage audit, reader availability, and search availability while source-pending corpus routes remain noindex where the registry says so.
| Order | Stage | State | Registry Anchor | Required Gate | Safe Output Now | Blocked Output |
|---|---|---|---|---|---|---|
| 1 | Candidate source | Review pending | Source family and candidate IDs | Rights, source-family, repository, and locator candidates must be named without ingestion. | Candidate source metadata and blocker language. | No source text, translation, morphology, lexical, etymology, frequency, or corpus-coverage claim. |
| 2 | Acquisition manifest | Manifest only | source_acquisition_manifests | Manifest, source, corpus, work, candidate term, license, and display-policy rows must agree. | Manifest title, states, source family, candidate IDs, and no-ingest proof. | No route promotion, passage expansion, token rows, copied entries, or public evidence bundle. |
| 3 | Source snapshot | Source audited | source snapshot checksum | A normalized snapshot, checksum, repository handle, locator scope, and display policy must be pinned. | Snapshot ID, checksum state, line-count metadata, and withhold notice. | No primary-source rendering, translation rendering, index build, or public search surface. |
| 4 | Passage audit | Under source audit | passage row and review gates | Canonical reference, source IDs, display flags, audit notes, and blocker IDs must pass together. | Passage ID, canonical route, audit state, and display-disabled states. | No audited passage claim, source excerpt, translation text, morphology parse, or citable passage receipt. |
| 5 | Reader availability | Navigable, not public evidence | reader and work routes | Reader display, citation, source, translation, and receipt policies must agree before promotion. | Reader route pointer and readiness dashboard only. | No public reader launch, no promoted text/work page, and no searchable reading output. |
| 6 | Search availability | Not started | future index build report | A Logoi-owned chunking, locator, KWIC, lexical index, and validation report must exist. | Unavailable-state labels and index-build blockers. | No search results, KWIC rows, concordance sets, frequency charts, collocations, or completeness counts. |
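
The ladder above reads as an ordered, fail-closed sequence: a later stage counts only when every earlier stage has also cleared. The following is a minimal Python sketch of that reading. The names `Stage`, `highest_cleared_stage`, and `search_output_allowed` are hypothetical illustrations, not identifiers from the Logoi registry.

```python
from enum import IntEnum
from typing import Optional, Set


class Stage(IntEnum):
    """The six ladder stages, in the order given by the table (hypothetical names)."""
    CANDIDATE_SOURCE = 1
    ACQUISITION_MANIFEST = 2
    SOURCE_SNAPSHOT = 3
    PASSAGE_AUDIT = 4
    READER_AVAILABILITY = 5
    SEARCH_AVAILABILITY = 6


def highest_cleared_stage(passed: Set[Stage]) -> Optional[Stage]:
    """Return the last stage reached without skipping an earlier one."""
    cleared = None
    for stage in Stage:  # iterates in definition order, 1 through 6
        if stage not in passed:
            break  # fail closed: a gap blocks everything after it
        cleared = stage
    return cleared


def search_output_allowed(passed: Set[Stage]) -> bool:
    """Search output opens only when all six stages have cleared in order."""
    return highest_cleared_stage(passed) is Stage.SEARCH_AVAILABILITY
```

Under this sketch, a corpus with an audited snapshot but no agreed manifest still reports only the candidate-source stage, which matches the ladder's rule that stages cannot be skipped.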
Collection And Work Browsing
Browse shell only. The corpus index can deepen route browsing by showing collection, language, corpus, work, reader, source, and edition dependencies. The panels do not display source text, translation, morphology, lexical range, etymology, frequency, arbitrary search results, or corpus coverage claims.
- A top-level reader landing can organize available corridors and blocked layers.
- Language browsing is represented by route expectations and current work/corpus examples.
- Corpus browsing can show the corpus row, source dependencies, and work corridor.
- Work browsing can anchor a canonical work route and route onward to reader, passage, source, and edition shells.
- Reader browsing can show route state and next-route choices without rendering primary text.
- Source browsing can expose metadata and checksum posture as dependency facts.
- Edition browsing can make the dependency chain legible before evidence opens.
- Receipt browsing can show why publication and export are not ready.
Route-Family Expansion Lanes
These lanes make Greek, Hebrew, Latin, and English expansion visible as route planning only. They do not certify corpus completeness, source availability, translation, morphology, etymology, frequency, or indexed search.
| Lane | Family | State | Visible Now | Required Gate | Blocked Output |
|---|---|---|---|---|---|
| Language coverage | language | Under source audit | Greek, Hebrew, Latin, and English appear as planned route corridors and source-family dependencies. | Each language needs an accepted source family, license posture, locator policy, and source-gate report before evidence can appear. | No language-wide corpus completeness, representative coverage, morphology, lexical range, etymology, frequency, or indexed-search claim. |
| Work selection | work | Navigable, not public evidence | The Homer Iliad shell names the current first-slice work route and leaves future Hebrew, Latin, and English work picks as blockers. | Work identity, edition dependency, canonical path, source IDs, and passage scope must be approved before work pages become evidence. | No complete work text, translation, work-level concordance, source availability, or citable work receipt. |
| Locator policy | reference | Not started | Book-line, chapter-verse, and section-style locators are named as policy lanes only. | A cross-corpus locator grammar and reference parser validation report must exist before arbitrary references resolve. | No arbitrary passage lookup, adjacent passage loading, token-span join, KWIC row, search result, or reference completeness claim. |
| Rights and display blockers | rights | Withheld pending source-text display gate | License, edition, repository, checksum, and display-policy blockers remain visible as metadata. | Rights, attribution, source snapshot, passage audit, translation policy, and source-display approval must agree. | No source excerpt, translation text, gloss, copied entry, public display authorization, or public-source availability claim. |
| Unavailable text state | reader | Unavailable pending audit | Reader, text, and corpus routes show unavailable states for source text, translation, morphology, lexical, etymology, and search layers. | Unavailable states can change only after source, display, translation, token, index, and receipt gates pass together. | No source text, translation, morphology, etymology, lexical definition, frequency, indexed search, generated interpretation, or receipt export. |
Source-Readiness Breadcrumbs
Breadcrumbs show where a browsing route stands in the source-readiness chain. Each step is a dependency marker, not an authorization to render source display or derived evidence.
| Step | Route | State | Visible Now | Next Gate | Blocked Output |
|---|---|---|---|---|---|
| 1. Route family selected | /texts | Under source audit | Texts and corpora indexes can list route shells. | Route shell must remain distinct from source evidence. | No source text, translation, or search result opens from the index. |
| 2. Corpus dependency named | /corpora/homer | Navigable, not public evidence | Corpus ID, language slug, source IDs, and acquisition state. | Corpus scope, license, display, locator, and source rows must pass together. | No corpus coverage, completeness, or representative-breadth claim. |
| 3. Work identity named | /texts/greek/homer/iliad | Navigable, not public evidence | Work ID, canonical path, corpus relation, source pointers, and passage pointer. | Work boundaries, edition dependency, locator grammar, and passage scope must be audited. | No complete work text, translation relation, concordance, or work-level receipt. |
| 4. Edition dependency checked | /editions/grc-homer-archive-fixture-pending | License review pending | Edition label, source repository, version dependency, checksum state, and display policy. | Edition authority, license posture, attribution, and source-display policy must close. | No edition authority claim, copied entry, translation display, or ingestion claim. |
| 5. Source display withheld | /sources/grc-homer-archive-fixture-pending | Withheld pending source-text display gate | Withhold notice, source metadata route, and blocker IDs. | Source-display approval must pass before any text-bearing surface exists. | No source excerpt, reconstructed line, passage quote, or public source display. |
| 6. Reader route remains gated | /read/greek/homer/iliad/iliad-1-1-5 | Under source audit | Reader path, citation label, disabled controls, and route handoffs. | Reader display, translation policy, term-anchor policy, and receipt policy must agree. | No translation, morphology, token span, lexical range, etymology, frequency, or receipt export. |
| 7. Search remains absent | /study/concordance | Not started | Unavailable-state label and index-build blocker. | Logoi-owned locator, chunking, token, KWIC, and search validation must exist. | No arbitrary search result, corpus coverage count, frequency chart, or concordance row. |
Language, Corpus, And Work Expectations
Route expectations make future browsing shape legible while keeping current evidence posture fail-closed for /corpora.
| Family | Pattern | Example | State | Expected Role | Cannot Promise |
|---|---|---|---|---|---|
| language | /texts/{language} | /texts/greek is not a promoted standalone evidence route. | Under source audit | A language corridor should group corpus and work shells only after source-family policy exists. | It cannot imply language-wide source coverage, morphology, lexical range, or frequency. |
| corpus | /corpora/{corpus} | /corpora/homer | Navigable, not public evidence | A corpus route should show corpus identity, source dependencies, acquisition state, and work links. | It cannot claim complete corpus coverage, representativeness, search breadth, or citable source evidence. |
| work | /texts/{language}/{corpus}/{work} | /texts/greek/homer/iliad | Navigable, not public evidence | A work route should organize work identity, corpus relation, edition dependency, and reader/passage handoffs. | It cannot provide complete work text, translation, concordance, token rows, or public receipt export. |
| work | /texts/{language}/{corpus}/{next-work} | Next work route remains manifest-only until selected. | Unavailable pending audit | A future work route must start as metadata with source, locator, license, and display blockers. | It cannot be generated from memory or promoted before a source acquisition packet exists. |
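
The three route patterns in the table can be matched mechanically. The sketch below is a hedged illustration in Python. The slug character set (`[a-z0-9-]+`), the `ROUTE_PATTERNS` list, and the `classify_route` function are assumptions made for this example, not part of any documented Logoi router. Classifying a path only names its family and says nothing about evidence readiness.

```python
import re
from typing import Dict, Optional, Tuple

# Assumed slug grammar: lowercase letters, digits, and hyphens.
SLUG = r"[a-z0-9-]+"

# Patterns follow the table: language, corpus, and work route families.
ROUTE_PATTERNS = [
    ("work", re.compile(
        rf"^/texts/(?P<language>{SLUG})/(?P<corpus>{SLUG})/(?P<work>{SLUG})$")),
    ("language", re.compile(rf"^/texts/(?P<language>{SLUG})$")),
    ("corpus", re.compile(rf"^/corpora/(?P<corpus>{SLUG})$")),
]


def classify_route(path: str) -> Optional[Tuple[str, Dict[str, str]]]:
    """Return (family, slug parts) for a known route shape, else None."""
    for family, pattern in ROUTE_PATTERNS:
        match = pattern.match(path)
        if match:
            return family, match.groupdict()
    return None  # unknown shapes stay unclassified rather than guessed


if __name__ == "__main__":
    print(classify_route("/texts/greek/homer/iliad"))
    print(classify_route("/corpora/homer"))
```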
Visible Browse Blockers
| Open Blocker | What To Show |
|---|---|
| Display policy, source reviewer gate, passage audit, and route-specific approval are not complete. | Show source metadata, edition dependency, checksum posture, and a withhold notice. |
| No translation source, license, locator, alignment, or display policy is accepted for this route family. | Show translation as a closed dependency, not as text or paraphrase. |
| Token spans, morphology provider, locator joins, and confidence policy have no validated Logoi-owned output. | Show disabled term controls and the provider-review gate. |
| No source-backed lexical, etymological, or range record is reviewed for this browsing layer. | Route to word shells as source-pending planning surfaces only. |
| No Logoi-owned search index, KWIC output, frequency table, or coverage validation exists. | Show index-build blockers and route expectations without result rows. |
| One routed source corridor does not establish corpus-wide completeness or representative coverage. | Show corpus identity and acquisition readiness with explicit non-coverage language. |
| Receipt export, JSON evidence, sitemap promotion, and indexability review are still separate gates. | Show receipt dependency and public-promotion blocker only. |
Corpus route family
Public route. The corpus index is a platform shell for navigation, readiness, and blocker visibility. It does not open source text, translation, tokenization, lexical evidence, frequency, or index coverage.
Corpus To Search Readiness Path
Metadata-only handoff. This pathway shows the handoff sequence from corpus family to search readiness without opening source text, translation, morphology, lexical, etymology, frequency, receipt export, or index output.
- Corpus row, language slug, source IDs, and noindex state are visible as family metadata.
- Work ID, canonical path, corpus relation, source IDs, and passage pointer stay visible as organizers.
- Edition label, repository handle, license/access class, checksum state, and display policy remain separate.
- Stable passage route, canonical reference label, source snapshot ID, and audit state remain pointer-only.
- Reader route can be named as a future workbench while display, translation, token, and receipt gates stay closed.
- Study routes identify the future index lane only; no index inputs or result rows are present.
| Handoff | Required Gate | Safe Output Now | Held Back |
|---|---|---|---|
| Corpus family | Source registry, corpus scope, license posture, and route policy must agree before work routes can inherit evidence. | Family route, source-scope blockers, source IDs, and route-state labels. | No corpus coverage, representative breadth, source text, search result, or completeness claim. |
| Work identity | Work boundaries, edition relation, locator grammar, source IDs, and passage scope must be audited together. | Work route, corpus link, route status, source pointers, and blocked passage corridor. | No complete work text, translation, morphology, lexical claim, etymology claim, or citable work receipt. |
| Edition dependency | Edition authority, repository version, license, attribution, source snapshot, and display policy must pass as one dependency. | Edition route pointer, repository/version metadata, checksum state, and display-policy blockers. | No edition authority claim, copied entry, source excerpt, translation text, source display, or ingestion claim. |
| Passage audit | Canonical reference, source IDs, display flags, audit notes, blockers, and checksum records must pass together. | Passage route, audit state, source-snapshot pointer, withhold notice, and next gate. | No source excerpt, translation, morphology parse, token span, lexical range, or citable passage receipt. |
| Reader readiness | Reader display, source display, translation policy, term-anchor policy, and receipt policy must agree before launch. | Reader route pointer, display-gate state, disabled controls, and audit blockers. | No public reader launch, source line, translation line, token row, morphology row, or searchable reading output. |
| Search readiness | A Logoi-owned chunking, locator, KWIC, lexical index, search index, and validation report must exist. | Unavailable-state labels, index-build blockers, and route pointers. | No search results, KWIC rows, concordance sets, frequency charts, collocations, coverage counts, or completeness metrics. |
Scope Readiness Panel
This panel names the future PhiloLogic/Perseus replacement source-family shape while keeping source display, provider display, denominators, index coverage, and public export closed.
| Source Family | State | Edition/Provider Scope | Rights Posture | Omitted Scope | Denominator | Index Coverage Blocker |
|---|---|---|---|---|---|---|
| Perseus Homer seed (Greek epic seed) | Under source audit | Perseus canonical GreekLit metadata and a Homer-only edition handle. | Open-source provenance, license, attribution, and display policy remain under review. | No Odyssey, Hesiod, tragedians, adjacent lines, complete Iliad text, or TLG-class breadth. | Not started. No token denominator, work denominator, line coverage table, or inclusion basis. | Not started. No chunking plan, locator join, index checksum, KWIC build, or coverage validation. |
| Hebrew Bible provider corridor (Hebrew Bible) | Provider audit pending | Provider choice and chapter-verse locator policy are planning metadata only. | License, attribution, morphology provider, and display policy are unreviewed. | No Hebrew source text, translation, Strong's mapping, morphology rows, or chapter expansion. | Not started. No book, verse, token, lemma, or included-row denominator. | Not started. No provider audit, tokenization contract, locator coverage map, or search index. |
| Greek New Testament provider corridor (Greek New Testament) | License review pending | Edition/provider options remain unresolved before any source-family route expands. | Rights, attribution, source-display policy, and translation policy are pending. | No Greek NT text, apparatus, translation, morphology alignment, or canonical expansion. | Not started. No passage denominator, token denominator, inclusion list, or omitted-row reasons. | Not started. No locator parser, provider checksum, chunking proof, KWIC proof, or result policy. |
| Latin/Vulgate provider corridor (Latin and Vulgate) | Public-domain review pending | Public-domain candidate review and edition identity are future provider work. | Public-domain status, attribution, version handle, and display policy still need review. | No Latin text, Vulgate alignment, translation, morphology, or cross-language equivalence. | Not started. No source-row denominator, token denominator, passage denominator, or coverage basis. | Not started. No edition checksum, locator coverage, chunking plan, or index coverage report. |
| Licensed breadth corridor (TLG/PHI-class licensed breadth) | Licensed citation only | Licensed breadth is represented as citation-only planning, not as a display provider. | Licensed material stays outside display, storage, export, and search-index surfaces. | No licensed corpus text, bulk import, concordance corpus, frequency corpus, or export bundle. | Unavailable pending audit. No licensed denominator can be counted until rights and index policy are approved. | Unavailable pending audit. No licensed index, result cache, coverage report, public receipt, or API output. |

| Closed Layer | Blocked Output | Required Gate |
|---|---|---|
| Source display | No source display, quotation, excerpt, copied line, or adjacent passage rendering. | Source-display approval, rights proof, passage audit, and route-specific display policy. |
| Provider display | No provider-backed edition display, translation display, morphology display, or gloss display. | Accepted provider, edition identity, license, attribution, checksum, and display policy. |
| Denominators | No denominator, inclusion list, omitted-row total, token total, or completeness metric. | Corpus scope, inclusion rules, omitted-scope ledger, token policy, and validation report. |
| Index coverage | No search execution, KWIC output, concordance rows, frequency table, or coverage report. | Chunking, locator joins, checksums, index build, result policy, and coverage validation. |
| Public export | No public receipt export, JSON evidence export, API output, MCP output, or sitemap promotion. | Source, display, denominator, indexability, receipt hash, and public-export policy. |
Route-Family Planner Lanes
Language coverage, work selection, locator policy, rights/display blockers, and unavailable text state remain route-planning lanes until source, rights, locator, display, index, and receipt gates close.
Coverage Readiness Layer
This layer makes the corpus/text boundary explicit: route identity can be visible while acquisition, tokenization, indexing, coverage, and publication remain closed.
| Layer | State | Visible Now | Required Gate | Blocked Claim |
|---|---|---|---|---|
| Corpus identity | Under source audit | Corpus route, corpus row ID, language slug, source IDs, and review state. | Corpus identity must stay tied to one accepted source family, license posture, and locator scope. | No corpus coverage, representative breadth, or completeness claim. |
| Work identity | Under source audit | Work route, work row ID, canonical path, corpus link, and passage pointer. | Work boundaries, edition relation, locator grammar, and passage scope must be audited together. | No work completeness, source text, translation, or citable work receipt. |
| Source acquisition | Withheld pending source-text display gate | Acquisition manifest state, source metadata, snapshot state, checksum state, and blockers. | Rights, source snapshot, checksum, display policy, and passage audit must agree before display. | No copied source entry, translation, display permission claim, or ingestion claim. |
| Tokenization and indexing | Not started | Provider, token, locator, chunking, and index-build blockers. | A Logoi-owned tokenization, morphology, locator-join, and index validation report must exist. | No token rows, morphology features, search output, KWIC, frequency, or coverage denominator. |
| Coverage | Not proven | Negative coverage posture and readiness blockers. | Selected source scope, locator coverage, display policy, and index coverage must be audited. | No coverage counts, public corpus findings, result totals, or completeness metrics. |
| Publication gates | Blocked until receipt policy passes | Noindex route state, source gate state, receipt pointer, and public-promotion blockers. | Source, display, passage, indexability, immutable hash, and receipt policy must pass together. | No public-ready receipt, public evidence export, sitemap promotion, or citation-ready corpus report. |
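
The source acquisition layer above depends on snapshot and checksum state. The sketch below shows one way a pinned checksum could be verified in Python. It rests on several assumptions: the page does not name a hash algorithm or a normalization rule, so SHA-256, Unicode NFC, and newline normalization are illustrative choices only. The function names are hypothetical, and the sample input is placeholder text, not source material.

```python
import hashlib
import unicodedata


def snapshot_checksum(text: str) -> str:
    """Hash a snapshot after normalization (NFC and LF newlines are assumed rules)."""
    normalized = unicodedata.normalize("NFC", text).replace("\r\n", "\n")
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def snapshot_matches_pin(text: str, pinned_hex: str) -> bool:
    """Return True only when the snapshot hashes to the pinned value."""
    return snapshot_checksum(text) == pinned_hex.lower()


if __name__ == "__main__":
    # Placeholder content only; no source or translation text is reproduced here.
    pin = snapshot_checksum("placeholder line\n")
    print(snapshot_matches_pin("placeholder line\r\n", pin))
```

A matching checksum confirms only that the snapshot is unchanged. It does not open display, tokenization, or coverage, which remain separate gates in the table.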
Route Ladder
Each rung names a durable route while keeping evidence output pending, unavailable, blocked, or noindex.
| Rung | Route | State | Gate | Blocked Output |
|---|---|---|---|---|
| Corpus index | /corpora | Under source audit | Route inventory remains separate from corpus evidence. | No corpus coverage, source text, result set, or completeness claim. |
| Corpus detail | /corpora/homer | Navigable, not public evidence | Source registry, license, locator, and display policy must agree. | No Iliad-wide, Odyssey-wide, or Perseus-scale public evidence. |
| Work identity | /texts/greek/homer/iliad | Navigable, not public evidence | Work, edition, source, and passage IDs must resolve before promotion. | No complete work text, translation, morphology, or citable work receipt. |
| Reader sample | /read/greek/homer/iliad/iliad-1-1-5 | Under source audit | Reader display waits on source-display and translation gates. | No source line, translation line, token span, or morphology row. |
| Passage target | /passages/iliad-1-1-5 | Under source audit | Exact citation must keep source text and translation withheld until audited. | No quotable source text, translation, lexical range, or public citation bundle. |
| Source and edition | /sources/grc-homer-archive-fixture-pending | Under source audit | Provenance metadata may render; source display remains separately gated. | No copied source entries, no translation source, no ingestion claim. |
| Study tools | /study/concordance | Unavailable pending audit | Study output waits on a Logoi-owned index build and validation report. | No KWIC, search result set, frequency, collocation, or index coverage. |
| Receipt pointer | /receipts/soul-word-journey-v0 | Blocked until receipt policy passes | Public receipt promotion requires source gates, content hash, and export policy. | No citable public receipt, generated answer evidence, or export bundle. |
Source-Pending Evidence Panels
Corpus Readiness Controls
| Control | State | Disabled Reason | Required Gate |
|---|---|---|---|
| Source scope | Under source audit | Scope expansion needs an accepted source manifest and locator audit. | Manifest, source, corpus, and work rows must agree before scope expands. |
| Display gate | Withheld pending source-text display gate | Source text display is closed on corpus and text routes. | Open only through a dedicated source-display approval. |
| Translation gate | Unavailable pending audit | No translation source, license, or alignment policy is accepted here. | Translation source and display policy must be audited separately. |
| Tokenization gate | Not started | Token rows and morphology joins do not exist for this shell. | Provider, locator, token, and morphology alignment report required. |
| Index coverage | Not started | No Logoi-owned index build or coverage validation exists. | Chunking, KWIC, lexical, and search validation must pass before output appears. |
| Public promotion | Navigable, not public evidence | Navigable pages are not public evidence while gates are open. | Source, display, passage, indexability, and receipt policy must agree. |
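
The public promotion control requires source, display, passage, indexability, and receipt policy to agree. One way to express that fail-closed rule is sketched below in Python. `PROMOTION_GATES`, `can_promote`, and `open_gates` are hypothetical names for illustration, and the boolean gate map is an assumed data shape rather than a documented one.

```python
from typing import Dict, List

# Gate names taken from the "Public promotion" row; the tuple itself is illustrative.
PROMOTION_GATES = ("source", "display", "passage", "indexability", "receipt")


def open_gates(gate_states: Dict[str, bool]) -> List[str]:
    """List gates that are missing or not explicitly passed."""
    return [gate for gate in PROMOTION_GATES if gate_states.get(gate) is not True]


def can_promote(gate_states: Dict[str, bool]) -> bool:
    """Fail closed: promotion needs every gate present and True."""
    return not open_gates(gate_states)


if __name__ == "__main__":
    states = {"source": True, "display": False}
    print(can_promote(states))  # promotion stays blocked
    print(open_gates(states))   # names the gates still open
```

Treating a missing key the same as a failed gate keeps a navigable page from being promoted by omission.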
Text And Work Availability States
| Visible Now | Gate Or Limit |
|---|---|
| Stable route identity and route ladder. | Work-level source, display, passage, and receipt gates close. |
| Source and edition metadata pointers. | Metadata cannot be promoted into source evidence or coverage. |
| Withhold notice only. | A source-display gate authorizes route-specific rendering. |
| Unavailable-state label only. | Translation source, rights, alignment, and display policy pass. |
| Provider and locator blockers. | Token, morphology, locator, and confidence validation exists. |
| Authority and audit blockers. | Source-backed lexical and etymology records are reviewed. |
| Index-build blocker state. | Search, KWIC, frequency, and coverage validation exists. |
Display, Translation, Tokenization, And Index Blockers
| Blocker | State | Blocks | Required Gate |
|---|---|---|---|
| Display | Withheld pending source-text display gate | Primary source rendering and source-side quotation. | Route-specific source-display approval. |
| Translation | Unavailable pending audit | Translation text, literal gloss, Logoi gloss, and alignment. | Accepted translation source, rights, and alignment policy. |
| Tokenization | Not started | Token rows, morphology, lemma expansion, and token-level links. | Logoi-owned tokenization and morphology validation report. |
| Index coverage | Not started | Search, KWIC, frequency, coverage counts, collocations, and completeness claims. | Logoi-owned corpus index build and coverage validation. |
Closed Claim Layers
These overview routes make the P0 shell navigable while leaving evidence and export gates untouched.
The overview does not assert completeness, frequency, or representativeness.
Corpus detail pages keep their source-review blockers.
No source text is rendered by the overview route.
Perseus/PhiloLogic/TLG-class breadth waits on source, license, locator, and index gates.