Reading HT Words Under the Current Canon
Status: Canon-aligned learner-facing guide
Language: Hadokai Tubatonona / HT
Scope: Practical word reading, strict/casual recognition, lexicon-first interpretation, reverse-stripping as search method, and responsible meaning limits
This page is a learner-facing companion to the HT Deconstruction and Casual-to-Strict Resolution Doctrine.
Its purpose is to help readers approach HT word-forms methodically without guessing meaning from surface shape alone.
The older version of this page treated longest-match stripping as though it could determine meaning. Under the current canon, that is too strong. Reverse-stripping is a useful and canonically appropriate search method for right-built forms, but it does not by itself authorize a definition.
The core rule is simple:
Structure can show whether a form is possible. The lexicon determines whether meaning may be asserted.
1. Begin with the Form Itself
Before interpreting a word, first determine whether the form is strict or casual.
If the word contains =, it is being presented as strict romanization.
A strict word must already divide into complete CVC syllables:
opening consonant + vowel nucleus + closing consonant
If it contains = but does not divide cleanly into full CVC syllables, it is invalid as a strict HT form.
Example:
tu=na= is valid strict HT: tu= + na=
tu=na is invalid, because the final na is incomplete in strict mode.
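The strict-mode gate can be sketched in Python. This is a minimal illustration, not a canon tool: the names CONSONANTS, VOWELS, and is_valid_strict are assumptions made for this example, the inventory is the one listed in section 2, and the sketch assumes every romanized letter is a single character.

```python
# Illustrative sketch only; all names here are hypothetical, not canon.
CONSONANTS = set("bcdfghjklmnprsStvwyz=")  # ungU (=) occupies a consonant slot
VOWELS = set("aeiIouU")


def is_valid_strict(word):
    """Return True if the word divides cleanly into complete CVC syllables."""
    if not word or len(word) % 3 != 0:
        return False
    for i in range(0, len(word), 3):
        onset, nucleus, coda = word[i:i + 3]
        if onset not in CONSONANTS:
            return False
        if nucleus not in VOWELS:
            return False
        if coda not in CONSONANTS:
            return False
    return True


print(is_valid_strict("tu=na="))  # the valid example above
print(is_valid_strict("tu=na"))   # the invalid example above
```

The check only tests syllable shape. It says nothing about whether the form is attested or what it means.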
If the word does not contain =, it is being presented as casual romanization.
Casual romanization is allowed to be incomplete on the surface because it omits ungU, the = mark that stands in an empty consonant slot in strict form. The reader must determine whether the casual form can be resolved, confirmed, or only recognized as structurally possible.
2. Check the Canonical Inventory
Only canonical HT components may appear in an HT word-form.
Canonical consonants:
b, c, d, f, g, h, j, k, l, m, n, p, r, s, S, t, v, w, y, z, =
Canonical vowels:
a, e, i, I, o, u, U
If a claimed HT word contains a noncanonical component, it is invalid.
For example, a word containing q or x is not merely unknown. It is invalid as HT unless the character is outside the HT word-form itself, such as punctuation, formatting, or surrounding non-HT text.
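As a rough sketch, the inventory gate amounts to a membership test. The function name noncanonical_components is an assumption for this example, and the input is assumed to be the HT word-form alone, with punctuation and surrounding non-HT text already removed.

```python
# Illustrative sketch only; the function name is hypothetical.
# The check is case-sensitive because s/S, i/I, and u/U are separate entries.
CANON = set("bcdfghjklmnprsStvwyz=") | set("aeiIouU")


def noncanonical_components(word_form):
    """List every character of the word-form that is outside the inventory."""
    return [ch for ch in word_form if ch not in CANON]


print(noncanonical_components("tu=na="))
print(noncanonical_components("tuqa"))
```

An empty result means the form passes this gate only. A non-empty result means the form is invalid, not merely unknown.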
3. Check Casual Consonant Runs
In casual romanization, consonant runs are limited by the underlying CVC structure.
A valid casual form may have:
- one consonant at the beginning of a word;
- up to two consonants internally;
- one consonant at the end of a word.
An internal two-consonant run represents a syllable boundary:
coda of the previous syllable + onset of the next syllable
So:
tubrazna can resolve as tub | raz | na
But:
strolan is invalid because it begins with the consonant run str.
brata is invalid because it begins with the consonant run br.
tabr is invalid because it ends with the consonant run br.
There is no canon mechanism that permits longer consonant runs in ordinary HT word-forms.
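The run limits above can be checked mechanically. The following is a minimal sketch, assuming the casual form has already passed the inventory check. The names CONSONANT_RUN and casual_runs_ok are hypothetical.

```python
import re

# Illustrative sketch only; names are hypothetical.
# ungU (=) is left out of the class because casual forms omit it.
CONSONANT_RUN = re.compile(r"[bcdfghjklmnprsStvwyz]+")


def casual_runs_ok(word):
    """Return True if no consonant run exceeds the casual limits."""
    for match in CONSONANT_RUN.finditer(word):
        at_edge = match.start() == 0 or match.end() == len(word)
        limit = 1 if at_edge else 2  # one at a word edge, two internally
        if len(match.group()) > limit:
            return False
    return True


for form in ["tubrazna", "strolan", "brata", "tabr"]:
    print(form, casual_runs_ok(form))
```

This sketch tests run length only. The requirement that every syllable have a vowel nucleus is a separate check.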
4. Check the Lexicon Before Decomposing
If the whole word is directly defined in the lexicon, that entry governs.
The lexicon provides the canonical strict form, meaning, grammatical role, and notes.
A direct lexicon entry has higher authority than a decomposition attempt, provided the entry itself passes HT structural rules.
If the same casual surface points to more than one valid lexicon entry, the form is defined but lexically ambiguous. Report the available entries rather than choosing one without context.
5. Apply Forced Structural Rules
If the word is not directly defined, apply the structural rules that casual romanization still preserves.
CC boundary
Two adjacent consonants inside a casual word force a syllable boundary.
The first consonant closes the syllable on the left.
The second consonant opens the syllable on the right.
Example:
boldar becomes bol | dar
No ungU is required at the boundary because both consonant positions are already filled.
VV boundary
HT does not allow diphthongs.
Two adjacent vowels force a syllable boundary.
Each vowel belongs to its own syllable nucleus.
A VV boundary requires two ungU in strict form:
one to close the syllable on the left, and one to open the syllable on the right.
Example:
main becomes ma | in, then ma= + =in, producing ma==in
The doubled == is not one mark. It is two ungU occupying two separate consonant slots.
CVC completion
After forced boundaries are identified, every syllable must be completed to CVC.
Examples:
tu becomes tu=
al becomes =al
a becomes =a=
tun remains tun
ungU is not decorative. It appears only where the required consonant slot has no pronounced consonant.
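The three rules in this section can be combined into a small enumerator that lists every strict completion of a casual form. This is a sketch under stated assumptions, not a canon procedure: the name strict_candidates is hypothetical, and the sketch assumes that a single consonant between two vowels may attach to either neighbouring syllable, since only CC and VV boundaries are described here as forced.

```python
# Illustrative sketch only; names are hypothetical, not canon.
CONSONANTS = set("bcdfghjklmnprsStvwyz")
VOWELS = set("aeiIouU")
UNGU = "="


def strict_candidates(casual):
    """Return every way to complete a casual form into CVC syllables."""
    results = []

    def walk(i, built):
        if i == len(casual):
            if built:
                results.append("".join(built))
            return
        # Onset: a consonant here must open the syllable; otherwise ungU does.
        if casual[i] in CONSONANTS:
            onset, j = casual[i], i + 1
        else:
            onset, j = UNGU, i
        # Nucleus: exactly one vowel is required.
        if j >= len(casual) or casual[j] not in VOWELS:
            return
        nucleus = casual[j]
        j += 1
        # Coda option 1: leave the slot unpronounced and fill it with ungU.
        walk(j, built + [onset + nucleus + UNGU])
        # Coda option 2: take the next consonant as the closing consonant.
        if j < len(casual) and casual[j] in CONSONANTS:
            walk(j + 1, built + [onset + nucleus + casual[j]])

    walk(0, [])
    return results


for form in ["boldar", "main", "tu", "al", "a", "tun", "tuna"]:
    print(form, strict_candidates(form))
```

In this sketch, boldar and main each yield exactly one completion (boldar and ma==in), because the CC and VV boundaries leave no choice. A form such as tuna yields two (tu=na= and tun=a=). If the sketch's assumption holds, structure alone cannot select between them, and the lexicon has to.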
6. Use Reverse-Stripping as a Search Method
HT morphology is right-building. Words begin with an attested left-hand base and develop rightward through additive specification.
Because HT words build to the right, reverse-stripping is the appropriate search method for unattested right-built forms.
The method is:
- Check the whole form in the lexicon.
- If it is not defined, remove characters from the right until an attested left-hand base is found.
- Record that match as a candidate anchor.
- Treat the removed material as rightward candidate material.
- Recursively analyze the remainder.
- Continue searching for other possible matches rather than stopping at the first or longest match.
The important correction is this:
The longest match is a useful search priority, not a canon authority.
A form is confirmed only if the decomposition is complete, unique, attested, and semantically licensed by the lexicon.
If more than one complete decomposition exists, the parser reports the alternatives and does not choose among them.
If the structure is valid but the meaning is not licensed, the form remains structurally possible but canonically unresolved.
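The search can be sketched as a recursive prefix match that tries longer left-hand bases first but keeps every complete decomposition it finds. The function name decompositions and both toy lexicons are assumptions for illustration only. The entries doh, dokmak, and otze are borrowed from the worked example in section 8, while dok and mak are invented here purely to show a competing split and are not claimed to be attested.

```python
# Illustrative sketch only; the lexicons below are toy data, not the canon lexicon.
def decompositions(word, lexicon):
    """Return every complete split of word into attested left-to-right pieces."""
    if word == "":
        return [[]]  # nothing left over: the decomposition is complete
    found = []
    # Strip characters from the right: try the longest left-hand base first.
    for end in range(len(word), 0, -1):
        head = word[:end]
        if head in lexicon:
            # Keep searching instead of stopping at the first or longest match.
            for rest in decompositions(word[end:], lexicon):
                found.append([head] + rest)
    return found


toy_lexicon = {"doh", "dokmak", "otze"}
print(decompositions("dohdokmakotze", toy_lexicon))

# Hypothetical extra entries that create a second complete decomposition.
ambiguous_lexicon = toy_lexicon | {"dok", "mak"}
print(decompositions("dohdokmakotze", ambiguous_lexicon))
```

With the first toy lexicon the search returns one decomposition. With the second it returns two, and the alternatives are reported rather than ranked. In neither case does the search assert a meaning. Semantic licensing remains a separate question for the lexicon.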
7. Four Possible Outcomes
Every HT word-form should be classified into one of four resolution states.
Canon defined
The word is directly registered in the lexicon.
The lexicon supplies the canonical strict form and meaning.
Canon confirmed
The word is not directly registered as a whole entry, but it decomposes uniquely into attested canon elements.
The meaning is licensed by those lexicon entries.
Canon structured
The form uses valid HT components and can be completed into legal CVC structure, but the canonical strict form or meaning cannot be uniquely recovered.
No definition may be asserted.
Canon invalid
The form fails a mandatory structural gate.
This includes noncanonical components, malformed strict forms, impossible consonant runs, forms with no vowel nucleus, or other structures that cannot arise from valid HT CVC syllables.
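The four states can be expressed as a small decision function. This is a sketch, assuming the structural checks, lexicon lookup, decomposition count, and semantic-licensing judgment have already been made elsewhere. The names Resolution and classify, and the parameter names, are hypothetical.

```python
from enum import Enum


# Illustrative sketch only; names are hypothetical, not canon.
class Resolution(Enum):
    DEFINED = "canon defined"
    CONFIRMED = "canon confirmed"
    STRUCTURED = "canon structured"
    INVALID = "canon invalid"


def classify(structurally_valid, directly_defined,
             complete_decompositions, meaning_licensed):
    """Map the results of the earlier checks onto one resolution state."""
    if not structurally_valid:
        return Resolution.INVALID      # failed a mandatory structural gate
    if directly_defined:
        return Resolution.DEFINED      # a whole-word lexicon entry governs
    if complete_decompositions == 1 and meaning_licensed:
        return Resolution.CONFIRMED    # unique, attested, and licensed
    return Resolution.STRUCTURED       # possible form, no definition asserted


print(classify(True, False, 1, True).value)
print(classify(True, False, 2, True).value)
```

The structural gate is tested first because a direct lexicon entry governs only when the entry itself passes HT structural rules.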
8. Worked Example: dohdokmakotze
Suppose the word dohdokmakotze is not directly defined as a whole lexicon entry.
The reader may apply reverse-stripping as a search method.
Possible candidate components:
doh – area indicator
dokmak – bounded; within containment; contained
otze – empty; lacking; without
The older reading method would have moved directly from those components to an educated interpretation such as “an area without containment.”
Under the current doctrine, the reader must be more careful.
The form may be interpreted that way only if:
- doh, dokmak, and otze are all attested in the lexicon;
- the decomposition doh + dokmak + otze is complete;
- no competing complete decomposition is available;
- the suffixal function of otze licenses the meaning “lacking / without” in this construction;
- the combined meaning is supported by the registered component meanings.
If those conditions are met, the form may be canon confirmed.
If the components are attested but more than one complete decomposition exists, the form is canon structured until the lexicon or context resolves it.
If the structure is valid but the semantic relationship is not licensed by the component entries, the form remains canon structured and no definition should be asserted.
So the responsible interpretation is not:
“This must mean X.”
The responsible interpretation is:
“This form appears structurally possible. If the listed components are attested, uniquely decomposed, and semantically licensed, it may be confirmed as meaning X. Otherwise, the form remains unresolved.”
9. Context Can Select, But Not Invent
Context matters, but context does not create canon by itself.
Context may help choose between already valid readings.
Context may identify which lexicon entry is intended when a casual surface has more than one defined entry.
Context may clarify which licensed sense of a suffix is active.
But context may not invent a definition for an unattested structure.
Meaning is built from canon, not guessed from resemblance.
10. Practical Reading Summary
When reading an HT word:
- Determine whether it is strict or casual.
- Validate the component inventory.
- If strict, require complete CVC syllables.
- If casual, check consonant-run limits.
- Look up the whole word in the lexicon.
- If not defined, apply CC, VV, and CVC completion rules.
- Use reverse-stripping to search for attested right-built components.
- Check whether the decomposition is complete and unique.
- Confirm that the meaning is licensed by the lexicon.
- Classify the result as canon defined, canon confirmed, canon structured, or canon invalid.
The reader may analyze.
The reader may classify.
The reader may report candidates.
The reader may not guess meaning where canon has not supplied it.