Most people think meaning lives in words. But linguists know better:
Meaning lives in relationships.
Between words, between people, between context and culture. It's not what you say — it's what it does.
This insight isn’t just academic. It’s the missing piece in how we build and prompt AI systems today.
Because when meaning depends on context, and context gets stripped out of the data we feed LLMs, you get surface-level answers, hallucinations, or tone-deaf responses.
So how do linguists unlock meaning? Let's break it down.
1. Semantics vs. Pragmatics: What's Said vs. What's Meant
Linguists know:
Semantics is the literal meaning — the dictionary definition
Pragmatics is how meaning shifts depending on who's speaking, where, and why
Example: "Can you open the window?"
Semantically: a question about ability
Pragmatically: a polite request
An LLM will give you an answer to the literal question unless you embed the pragmatic intent clearly:
❌ Weak prompt: "Respond to: Can you open the window?"
✅ Linguist prompt: "Someone just asked 'Can you open the window?' Respond as if this is a polite request, not a literal question about your abilities."
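In code, this pragmatic framing can live in a small wrapper applied before the utterance ever reaches the model. A minimal sketch (the helper name is hypothetical, not from any library):

```python
# Hypothetical helper: wrap a raw utterance with explicit pragmatic framing,
# so an LLM treats it as a polite request rather than a literal question
# about its abilities.

def frame_as_request(utterance: str) -> str:
    """Embed the pragmatic intent (polite request) into the prompt."""
    return (
        f"Someone just asked: '{utterance}'\n"
        "Respond as if this is a polite request, not a literal question "
        "about your abilities."
    )

prompt = frame_as_request("Can you open the window?")
```

The point is that the pragmatic instruction travels with every utterance automatically, instead of depending on whoever writes the prompt remembering it.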
Linguists bake pragmatics into their prompts. Others don't even realize it's missing.
2. Presupposition: What's Assumed, Not Said
Linguists pay attention to the unstated assumptions built into language — the cognitive baggage every sentence carries.
Example: "If you stop wasting time on meetings…"
Presupposes: You are currently wasting time on meetings.
A linguist sees this as a loaded sentence. An LLM might replicate this bias in a rewrite unless explicitly told to neutralize it.
Linguistic insight: Presupposition triggers include:
Definite descriptions ("the problem with remote work")
Temporal clauses ("when you finish the project")
Implicative verbs ("managed to complete" vs. "completed")
In prompting:
❌ "Rewrite this email about the productivity issues."
✅ "Rewrite this email, but remove any assumptions about productivity problems. Keep it neutral."
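The trigger list above can even be turned into a rough lint pass over your prompts. A toy sketch, with deliberately narrow regex patterns (real trigger detection would need a parser):

```python
import re

# Toy detector for the presupposition triggers listed above.
# The patterns are illustrative, not exhaustive.
TRIGGER_PATTERNS = {
    "definite description": r"\bthe (problem|issue|trouble) with\b",
    "temporal clause": r"\b(when|after|before) you\b",
    "implicative verb": r"\bmanaged to\b|\bstop(ped)?\s+\w+ing\b",
}

def find_presupposition_triggers(text: str) -> list[str]:
    """Return the names of trigger types found in `text`."""
    return [name for name, pat in TRIGGER_PATTERNS.items()
            if re.search(pat, text, re.IGNORECASE)]

hits = find_presupposition_triggers("If you stop wasting time on meetings")
```

Here `hits` flags the implicative construction ("stop wasting"), so you know the sentence smuggles in the assumption before you ask a model to rewrite it.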
3. Deixis: Language That Points to Context
Words like here, there, this, that, now, you, me are deictic — they don't mean anything without context. They're linguistic fingers pointing at invisible referents.
Example prompt: "Turn this into a message for them."
For whom? Who is "them"? What is "this"?
Linguists know these expressions break when context disappears. They prompt with full contextual anchoring:
❌ Context-free: "Turn this into a message for them."
✅ Context-rich: "Rewrite the following delay notification email for customers in Germany who ordered custom furniture. Be apologetic but reassuring about quality."
Advanced linguistic insight: Deixis operates on multiple levels:
Person deixis: I/you (who's speaking to whom?)
Spatial deixis: here/there (where is the reference point?)
Temporal deixis: now/then (when is the reference point?)
Discourse deixis: this/that (which part of the text?)
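The four levels above map directly onto a small "deixis check" you could run on a prompt before sending it. A sketch with illustrative word lists (not a complete inventory of deictic expressions):

```python
# Toy lint pass: flag deictic words in a prompt that have no explicit referent.
# Word lists follow the four levels of deixis; they are illustrative only.
DEICTIC_WORDS = {
    "person": {"i", "you", "me", "we", "them", "us"},
    "spatial": {"here", "there"},
    "temporal": {"now", "then", "today", "tomorrow"},
    "discourse": {"this", "that", "these", "those"},
}

def flag_deixis(prompt: str) -> dict[str, list[str]]:
    """Return the deictic words found in `prompt`, grouped by level."""
    tokens = [w.strip(".,!?\"'").lower() for w in prompt.split()]
    return {level: sorted({t for t in tokens if t in words})
            for level, words in DEICTIC_WORDS.items()
            if any(t in words for t in tokens)}

flags = flag_deixis("Turn this into a message for them.")
```

Every flagged word is a pointer into context the model doesn't have, and a candidate for replacement with an explicit referent.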
4. Speech Act Theory: Language as Action
J.L. Austin's insight: We don't just describe the world with language — we change it. Every utterance performs an action.
The three levels:
Locutionary: What is literally said
Illocutionary: What action is performed (promising, threatening, requesting)
Perlocutionary: What effect is achieved (persuading, frightening, inspiring)
Example: "I promise to deliver this by Friday."
Locutionary: Statement about future delivery
Illocutionary: Making a commitment
Perlocutionary: Creating trust, setting expectations
In prompting:
❌ "Write a response to this complaint."
✅ "Write a response that performs three speech acts: acknowledge the problem (validation), commit to a solution (promise), and restore confidence (reassurance)."
5. Grice's Maxims: The Unspoken Rules of Conversation
Philosopher-linguist H.P. Grice proposed that humans communicate cooperatively by following four principles:
Quantity: Don't say too much or too little
Quality: Be truthful
Relation: Be relevant
Manner: Be clear and orderly
Example of violation:
Customer: "When will my order arrive?"
Bad response: "Shipping is a complex process involving multiple logistics partners and weather considerations..." (violates Quantity and Manner)
Linguists intuitively apply Gricean principles to prompts:
❌ Violates maxims: "Write a summary of this article."
✅ Follows maxims: "Write a 3-bullet executive summary of this 2,000-word article for VPs who have 30 seconds to scan."
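The maxim-following prompt above is really a template with the Gricean parameters made explicit. A minimal sketch of that idea (the function is hypothetical):

```python
# Sketch: encode Gricean constraints as explicit prompt parameters,
# so the model receives the communicative intent, not just the task form.
def gricean_summary_prompt(word_count: int, bullets: int,
                           audience: str, reading_time: str) -> str:
    # Quantity (bullets, reading_time), Relation (audience),
    # and Manner (executive-summary format) are all made explicit.
    return (
        f"Write a {bullets}-bullet executive summary of this "
        f"{word_count}-word article for {audience} who have "
        f"{reading_time} to scan."
    )

prompt = gricean_summary_prompt(2000, 3, "VPs", "30 seconds")
```

Parameterizing the prompt this way forces you to answer the Gricean questions (how much? for whom? in what form?) before the model ever sees the task.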
The model now gets the communicative intent, not just the task form.
6. Sociolinguistic Variation: Register, Code-Switching, and Power
Linguists understand that language varies systematically based on:
Register: Level of formality
Field: Subject matter and expertise level
Tenor: Relationship between speakers
Mode: Channel of communication
Example: The same information, different registers:
Academic: "The data demonstrates a significant correlation..."
Business: "Our analysis shows a strong connection..."
Casual: "Turns out these two things are totally linked..."
Advanced prompting:
❌ "Make this more professional."
✅ "Rewrite this for a formal business proposal to C-suite executives. Use elevated register but avoid academic jargon. Show expertise without condescension."
7. Metalinguistic Awareness: Thinking About Language Itself
Linguists are trained to:
Spot syntactic ambiguity ("Visiting relatives can be boring")
Anticipate pragmatic failure
Distinguish surface form from deep structure
Recognize when language itself becomes the topic
They see that:
❌ "Make it more fun" is vague (what dimension of fun? humor? energy? playfulness?)
✅ "Add conversational markers, use unexpected metaphors, and vary sentence rhythm for musicality" is precise.
Linguistic example: "The chicken is ready to eat."
Surface ambiguity: Is the chicken ready to consume food, or ready to be consumed?
Deep structure disambiguation required
In prompting:
❌ "Make this clearer."
✅ "Eliminate ambiguous pronoun references, break compound sentences at conjunctions, and define technical terms on first use."
8. Morphosyntactic Patterns: The Hidden Grammar of Meaning
Linguists recognize that meaning emerges from structural patterns, not just word choice.
Example - Nominalization: "The implementation of the solution" vs. "We implemented the solution"
The first hides agency (who implemented?) and process (how?). The second reveals both.
In prompting:
❌ "Make this more active."
✅ "Convert nominalizations to active verbs, restore hidden agents, and show clear cause-and-effect relationships."
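Nominalizations are regular enough in form that a crude heuristic can surface most of them. A toy sketch (suffix-matching only; a real pass would use a parser):

```python
import re

# Toy nominalization spotter: flags "<noun>-suffix of" patterns, a common
# signature of hidden agency. Heuristic only, not a full grammatical analysis.
def find_nominalizations(text: str) -> list[str]:
    return re.findall(r"\b\w+(?:tion|ment|ance|ence) of\b", text)

hits = find_nominalizations("The implementation of the solution")
```

Each hit is a candidate for conversion back to an active verb with a named agent ("we implemented the solution").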
What This Means for AI and Prompting
To make LLMs truly useful, you need to:
Disambiguate intention (separate semantic content from pragmatic goals)
Anchor context (eliminate deictic confusion)
Specify speech acts (what should this language do?)
Follow conversational principles (quantity, quality, relation, manner)
Control register and variation (match language to social context)
Think metalinguistically (be explicit about language choices)
Structure for meaning (use syntax to clarify relationships)
That’s what linguists do.
But there’s more. When you move from language to retrieval systems — like RAG (Retrieval-Augmented Generation) or custom GPTs — the need to preserve relationships doesn't go away. It becomes technical.
Structuring Meaning in RAG and Custom GPTs
Giving relationships (or semantic structure) to documents when doing RAG or creating a custom GPT is crucial for meaningful, contextual responses. Here’s how to do it well:
1. Use Metadata to Add Context and Relationships
Tag every document, chunk, or section with metadata that signals its role, author, source, hierarchy, date, etc.
```json
{
  "title": "Refund Policy",
  "section": "Returns",
  "parent_doc": "Terms and Conditions",
  "doc_type": "policy",
  "locale": "en-US"
}
```
This enables:
Hierarchical reasoning (e.g., “Returns” is part of “Terms”)
Filtering (e.g., only fetch legal docs)
Improved re-ranking (more relevant retrieval)
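The filtering step is simple to implement once chunks carry metadata like the example above. A minimal sketch with an in-memory list standing in for a vector store (the sample chunks and helper are illustrative):

```python
# Sketch: metadata-filtered retrieval. Chunks carry tags like those shown
# above; filtering narrows the candidate set before any semantic ranking.
chunks = [
    {"text": "Refunds are issued within 14 days.",
     "meta": {"doc_type": "policy", "section": "Returns",
              "parent_doc": "Terms and Conditions"}},
    {"text": "Our office is in Berlin.",
     "meta": {"doc_type": "about", "section": "Company",
              "parent_doc": "Homepage"}},
]

def filter_chunks(chunks, **criteria):
    """Keep only chunks whose metadata matches every criterion."""
    return [c for c in chunks
            if all(c["meta"].get(k) == v for k, v in criteria.items())]

policy_chunks = filter_chunks(chunks, doc_type="policy")
```

In a production vector database the same idea appears as metadata filters applied alongside the similarity query.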
2. Chunk Smartly Using Structure (Not Just Tokens)
Don’t break documents arbitrarily. Chunk them by logical structure:
Paragraphs
Headings (H1, H2…)
Lists
QA pairs
This mirrors how linguists think in units of meaning — not token counts.
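For markdown sources, structure-aware chunking can be as simple as splitting on headings. A minimal sketch (heading-level nesting and edge cases are ignored for brevity):

```python
# Sketch: chunk a markdown document by headings rather than a fixed token
# count, so each chunk is a unit of meaning.
def chunk_by_headings(markdown: str) -> list[dict]:
    chunks, current = [], {"heading": None, "body": []}
    for line in markdown.splitlines():
        if line.startswith("#"):
            # A new heading closes the previous chunk.
            if current["body"] or current["heading"]:
                chunks.append(current)
            current = {"heading": line.lstrip("# ").strip(), "body": []}
        elif line.strip():
            current["body"].append(line.strip())
    chunks.append(current)
    return chunks

doc = "# Returns\nRefunds within 14 days.\n# Shipping\nShips in 2 days."
parts = chunk_by_headings(doc)
```

Each resulting chunk keeps its heading, which can then go straight into the metadata described above.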
3. Embed Relationships in the Vector Store
When storing embeddings:
Use parent-child linking (e.g., this paragraph belongs to this section)
Consider graph-based storage (e.g., Neo4j) for complex relationships
Use hybrid search (semantic + metadata filters)
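All three ideas can be shown in a few lines with toy data. The 2-D vectors below are hand-picked stand-ins for real embeddings, and the store layout is illustrative:

```python
# Sketch: parent-child linking plus hybrid search (metadata filter, then
# semantic ranking) over toy embeddings.
store = [
    {"id": "p1", "parent": "returns-section", "vec": (1.0, 0.0),
     "meta": {"doc_type": "policy"}},
    {"id": "p2", "parent": "shipping-section", "vec": (0.0, 1.0),
     "meta": {"doc_type": "policy"}},
    {"id": "p3", "parent": "blog-post", "vec": (0.9, 0.1),
     "meta": {"doc_type": "marketing"}},
]

def hybrid_search(query_vec, doc_type):
    """Metadata filter first, then rank by dot-product similarity."""
    candidates = [c for c in store if c["meta"]["doc_type"] == doc_type]
    return max(candidates,
               key=lambda c: sum(q * v for q, v in zip(query_vec, c["vec"])))

best = hybrid_search((1.0, 0.0), doc_type="policy")
```

Note that the blog post ("p3") is semantically closest to the query but is excluded by the metadata filter, and the winner's `parent` link lets you fetch its full section for context.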
4. Add Inline Linking or Summaries in Chunks
In deeply connected documents (like specs, glossaries, or regulatory texts), add:
Inline references to related content
Short summaries like:
See also: Data Storage Requirements (Section 3.2)
This keeps the model grounded in structured context — like discourse deixis, but for machines.
5. Use Tools that Support Structured Relationships
If you're using:
LangChain → SelfQueryRetriever or ParentDocumentRetriever
LlamaIndex → Knowledge Graph Index or Composable Graphs
Custom GPTs → Use knowledge files + instructions to guide document relationships
Example:
“If a user asks about returns, prioritize the Refunds section from the T&C document.”
Tip: Give Instructions to Leverage Relationships
Your system prompt should teach the model what to do with structure:
“When answering questions about product pricing, prioritize data from the ‘Pricing Overview’ section and relate it to the ‘Subscription Tiers’ document.”
To help an LLM "see" relationships between documents:
Tag richly with metadata
Chunk by meaning, not token size
Link structures clearly
Use retrieval tools with hierarchy support
Guide the model with smart system prompts
That’s how you preserve meaning across language and architecture.
Language Is a Cognitive Technology
It’s not just communication; it’s how we think.
Linguists have been decoding this tech for centuries.
And now, with AI models becoming the new interface to knowledge and decision-making, it’s never been more important to get meaning right: in prompts, in documents, in how we structure the data that LLMs learn from.
Prompt engineering isn’t about talking to machines.
It’s about translating human meaning into patterns machines can dance with.