feat(phase-19): track F production RAG lessons 64-69 by rohitg00 · Pull Request #213 · rohitg00/ai-engineering-from-scratch

rohitg00 · 2026-05-26T18:46:18Z

Summary

Adds six deep capstone sub-lessons for Phase 19 Track F (Production-grade RAG):

64 chunking-strategies-advanced - fixed, sentence, recursive-split, semantic (embedding-clustered), structural markdown; recall@k benchmark over a three-doc fixture.
65 hybrid-retrieval-bm25-dense - BM25 from scratch (Robertson/Sparck Jones formulation), deterministic dense retriever, reciprocal rank fusion (Cormack/Clarke/Buettcher 2009 formula), tunable per-modality weights.
66 reranker-cross-encoder - small torch nn.Module cross-encoder, paired tokenizer with type ids, tiny supervised training pass, two-stage retrieve-then-rerank pipeline with latency timing.
67 query-rewriting-hyde - HyDE, multi-query expansion, decomposition; per-strategy fixture queries and a deterministic mock LLM that runs offline.
68 rag-eval-precision-recall - precision@k, recall@k, MRR, nDCG, faithfulness, answer relevance with a mock LLM-as-judge; three pipeline variants graded against the qrels.
69 end-to-end-rag-system - composes 64-68 into one Pipeline class; ingests a fixture corpus, runs a query trace, executes the eval, exits zero on threshold-pass.
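For orientation, the BM25 scoring that lesson 65 builds from scratch can be sketched in a few lines of pure Python. This is an illustrative sketch, not the code in the PR: the `bm25_scores` helper, the whitespace tokenizer, the defaults `k1=1.5` and `b=0.75`, and the `+1` inside the logarithm (a common variant that keeps IDF non-negative) are all assumptions.

```python
import math
from collections import Counter


def bm25_scores(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Score every doc against the query with BM25 (hypothetical helper)."""
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    avgdl = sum(len(t) for t in tokenized) / n_docs
    # Document frequency: number of docs that contain each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # Assumed IDF variant: the +1 keeps the value non-negative.
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Length normalization: long docs need more hits for the same score.
            norm = tf[term] + k1 * (1.0 - b + b * len(toks) / avgdl)
            score += idf * tf[term] * (k1 + 1.0) / norm
        scores.append(score)
    return scores


docs = [
    "bm25 ranks documents by term frequency",
    "dense retrieval uses embeddings",
    "reciprocal rank fusion merges rankings",
]
scores = bm25_scores("bm25 term frequency", docs)
best = max(range(len(docs)), key=scores.__getitem__)
print(best, [round(s, 3) for s in scores])
```

Only the first document shares any term with the query, so it is the only one with a non-zero score.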

Every lesson ships docs (~1200-1500 words with mermaid), main.py with the algorithm from scratch, an exhaustive unittest suite, and a 6-question quiz.

Stack

numpy and torch (torch only in lessons 66 and 69 for the cross-encoder).
No langchain, llama-index, chromadb, faiss, sentence-transformers, or rank-bm25.
Deterministic mock embedding and mock LLM throughout; no network calls.

Test plan

Every demo exits 0 (six runs verified).
Every test suite passes (16 + 15 + 12 + 13 + 24 + 20 = 100 tests).
No em-dashes, no AI vocabulary, all code fences language-tagged.
No edits outside phases/19-capstone-projects/; site/, root README, and catalog.json untouched.
Atomic per-lesson commits, no Co-Authored-By footers.

coderabbitai · 2026-05-26T18:46:31Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 7b17ffe2-a51c-4456-9ef5-ecb83201cfc9

📥 Commits

Reviewing files that changed from the base of the PR and between 7a2949f and c3ea8c3.

📒 Files selected for processing (4)

phases/19-capstone-projects/66-reranker-cross-encoder/docs/en.md
phases/19-capstone-projects/68-rag-eval-precision-recall/code/main.py
phases/19-capstone-projects/68-rag-eval-precision-recall/docs/en.md
phases/19-capstone-projects/69-end-to-end-rag-system/code/main.py

✅ Files skipped from review due to trivial changes (2)

phases/19-capstone-projects/66-reranker-cross-encoder/docs/en.md
phases/19-capstone-projects/68-rag-eval-precision-recall/docs/en.md

📝 Walkthrough

This PR adds six complete, progressive capstone lessons (64–69) for building a RAG system from first principles, plus curriculum metadata updates. Each lesson includes implementation code, unit tests, English documentation, and quiz assets. The lessons form an integrated curriculum: text chunking strategies, hybrid BM25+dense retrieval, cross-encoder reranking, query rewriting (HyDE/multi-query/decomposition), offline evaluation metrics, and a fully integrated end-to-end pipeline with threshold-based validation.

Changes

Curriculum Foundation and New Lessons

Layer / File(s)	Summary
Catalog metadata and lesson registry `catalog.json`	Updated totals: `code_files` 487→519. Updated `19-capstone-projects` phase `lesson_count` 17→23. Expanded `code_files` lists for lessons 09–11 (TypeScript modules). Added metadata entries for lessons 64–69 with associated code/test file counts.
Lesson 64: Chunking Strategies `phases/19-capstone-projects/64-chunking-strategies-advanced/code/main.py`, `code/tests/test_chunkers.py`, `docs/en.md`, `quiz.json`	Five chunking algorithms (fixed-window, sentence-packing, recursive splitting, semantic clustering via mock embeddings, structural markdown) with `Chunk` span objects, `DenseIndex` retrieval, and `eval_recall` measuring overlap-based recall@k. Tests validate chunk overlap, boundary behavior, embedding determinism, and recall monotonicity. Docs explain strategy selection rules and failure modes.
Lesson 65: Hybrid Retrieval `phases/19-capstone-projects/65-hybrid-retrieval-bm25-dense/code/main.py`, `code/tests/test_hybrid.py`, `docs/en.md`, `quiz.json`	BM25 indexing with term frequency and IDF, `DenseIndex` with deterministic mock embeddings, reciprocal rank fusion with optional weights, `HybridRetriever` orchestrating both modalities. Tests validate tokenization, BM25 ranking (field weights and literal matches), dense retrieval determinism, RRF fusion behavior, and end-to-end hybrid output structure.
Lesson 66: Cross-Encoder Reranker `phases/19-capstone-projects/66-reranker-cross-encoder/code/main.py`, `code/tests/test_reranker.py`, `docs/en.md`, `quiz.json`	`CrossEncoder` neural module with embeddings, multi-head attention, mean pooling, and scalar output; paired tokenization with type IDs; `train_tiny()` supervised training on query-doc triples; two-stage retrieve-top-N-then-rerank pipeline with `BiEncoder` mock retriever and latency measurement in ms. Tests validate tokenization, forward-pass shape, training convergence, reranking top-k behavior, and pipeline latency reporting.
Lesson 67: Query Rewriting `phases/19-capstone-projects/67-query-rewriting-hyde/code/main.py`, `code/tests/test_rewriters.py`, `docs/en.md`, `quiz.json`	Deterministic `MockLLM` with synonym tables for HyDE hypothetical generation, multi-query paraphrase expansion, and query decomposition; `Rewriter` base class with `HyDERewriter`, `MultiQueryRewriter`, `DecomposeRewriter` implementations; `retrieve_with_rewriter()` runs variants over `HybridRetriever` and fuses via RRF. Tests validate generation semantics, rewriter output shape, retrieval ranking changes, and baseline-relative gold promotion.
Lesson 68: RAG Evaluation `phases/19-capstone-projects/68-rag-eval-precision-recall/code/main.py`, `code/tests/test_eval.py`, `docs/en.md`, `quiz.json`	Retrieval metrics (precision@k, recall@k, MRR, nDCG with graded relevance); answer metrics (claim extraction, faithfulness/relevance via `MockJudge` token overlap); three pipelines (baseline, hybrid with synonym expansion+title match, hybrid+rerank); `evaluate_pipeline()` aggregates metrics across fixture `QRELS`. Tests validate metric bounds, claim splitting, judge-based scoring, and pipeline metric comparisons.
Lesson 69: End-to-End RAG System `phases/19-capstone-projects/69-end-to-end-rag-system/code/main.py`, `code/tests/test_pipeline.py`, `docs/en.md`, `quiz.json`	Complete `Pipeline` class: document ingestion with recursive chunking, `HybridIndex` retrieval with optional HyDE rewriting, `CrossEncoder` reranking with overlap blending, `generate_answer()` with refusal threshold and citation anchors; `run_eval()` computes recall/precision/MRR/faithfulness against metric thresholds; `run_demo()` self-terminates with exit code 0 (pass) or 1 (fail). Tests validate chunking, index search, answer generation/refusal, metric computation, and threshold-based exit codes.
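The retrieval metrics named in the Lesson 68 row can be sketched as below. This is a minimal sketch under stated assumptions, not the lesson's code: `mean_reciprocal_rank`, `reciprocal_rank`, `retrieved_per_query`, and `gold_per_query` are names taken from the review comments, while the other helper names, the signatures, and the linear gain used for nDCG are hypothetical choices.

```python
import math


def precision_at_k(retrieved: list[str], gold: set[str], k: int) -> float:
    """Fraction of the top-k results that are relevant."""
    if k <= 0:
        return 0.0
    return sum(1 for d in retrieved[:k] if d in gold) / k


def recall_at_k(retrieved: list[str], gold: set[str], k: int) -> float:
    """Fraction of the relevant docs that appear in the top k."""
    if not gold:
        return 0.0
    return sum(1 for d in retrieved[:k] if d in gold) / len(gold)


def reciprocal_rank(retrieved: list[str], gold: set[str]) -> float:
    """1 / rank of the first relevant doc, or 0.0 if none is retrieved."""
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in gold:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(retrieved_per_query: list[list[str]],
                         gold_per_query: list[set[str]]) -> float:
    """Average reciprocal rank, with an explicit length guard."""
    if len(retrieved_per_query) != len(gold_per_query):
        raise ValueError(
            f"length mismatch: {len(retrieved_per_query)} retrieved lists "
            f"vs {len(gold_per_query)} gold sets"
        )
    if not retrieved_per_query:
        return 0.0
    pairs = zip(retrieved_per_query, gold_per_query)
    return sum(reciprocal_rank(r, g) for r, g in pairs) / len(retrieved_per_query)


def ndcg_at_k(retrieved: list[str], grades: dict[str, int], k: int) -> float:
    """nDCG with graded relevance; assumes linear gain (grade / log2(rank + 1))."""
    dcg = sum(grades.get(d, 0) / math.log2(rank + 1)
              for rank, d in enumerate(retrieved[:k], start=1))
    ideal = sorted(grades.values(), reverse=True)[:k]
    idcg = sum(g / math.log2(rank + 1) for rank, g in enumerate(ideal, start=1))
    return dcg / idcg if idcg else 0.0


retrieved = ["d3", "d1", "d7"]
gold = {"d1", "d2"}
print(precision_at_k(retrieved, gold, 3))  # one hit in three results
print(recall_at_k(retrieved, gold, 3))     # 0.5
print(reciprocal_rank(retrieved, gold))    # 0.5
print(ndcg_at_k(retrieved, {"d1": 2, "d2": 1}, 3))
```

The length guard in `mean_reciprocal_rank` turns a mismatch between the two inputs into a `ValueError` instead of a silently truncated average.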

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 2.49% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title 'feat(phase-19): track F production RAG lessons 64-69' clearly summarizes the main addition of six new capstone lessons (64-69) for Phase 19 Track F focused on production RAG systems.
Description check	✅ Passed	The description provides a comprehensive overview of the six lessons being added, their contents, technology stack, test coverage, and constraints—all directly related to the changeset's purpose of adding production-grade RAG lessons.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.


coderabbitai

Actionable comments posted: 4

🧹 Nitpick comments (6)

phases/19-capstone-projects/65-hybrid-retrieval-bm25-dense/code/main.py (3)

95-96: ⚡ Quick win

Consider adding strict=True to zip() for safer iteration.

The zip(self.docs, scores) at line 95 would benefit from strict=True to ensure the lists match in length, catching potential indexing bugs.

🔧 Proposed fix

-        ranked = sorted(zip(self.docs, scores), key=lambda x: -x[1])
+        ranked = sorted(zip(self.docs, scores, strict=True), key=lambda x: -x[1])
         return [(d, s) for d, s in ranked[:k] if s > 0]

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@phases/19-capstone-projects/65-hybrid-retrieval-bm25-dense/code/main.py`
around lines 95 - 96, The zip of self.docs and scores in the ranking code
(ranked = sorted(zip(self.docs, scores), ...)) can silently truncate if lengths
differ; change the zip call to use strict=True (i.e., zip(self.docs, scores,
strict=True)) so a ValueError is raised when lengths mismatch, ensuring bugs are
caught early in the ranking flow that builds ranked and the return list slice.

120-121: ⚡ Quick win

Consider adding strict=True to zip() for safer iteration.

Same issue as in lesson 64: zip(a, b, strict=True) makes the dimension-match assumption explicit.

🔧 Proposed fix

 def cosine(a: list[float], b: list[float]) -> float:
-    return sum(x * y for x, y in zip(a, b))
+    return sum(x * y for x, y in zip(a, b, strict=True))

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@phases/19-capstone-projects/65-hybrid-retrieval-bm25-dense/code/main.py`
around lines 120 - 121, The cosine function uses zip(a, b) which silently
truncates mismatched-length vectors; change the zip call in function cosine to
zip(a, b, strict=True) so mismatched dimensions raise an error, ensuring
explicit dimension checks during iteration.

143-159: ⚡ Quick win

Consider adding strict=True to zip() at line 154.

Although line 151 validates that len(weights) == len(rankings), using zip(weights, rankings, strict=True) at line 154 provides an additional runtime safety check and makes the contract explicit.

🔧 Proposed fix

     score: dict[str, float] = defaultdict(float)
     by_id: dict[str, Doc] = {}
-    for w, ranks in zip(weights, rankings):
+    for w, ranks in zip(weights, rankings, strict=True):
         for rank, (doc, _) in enumerate(ranks):
             score[doc.doc_id] += w * (1.0 / (k + rank + 1))
             by_id[doc.doc_id] = doc

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@phases/19-capstone-projects/65-hybrid-retrieval-bm25-dense/code/main.py`
around lines 143 - 159, The rrf function currently zips weights and rankings
without the strict flag; even though len(weights) is validated earlier, add
strict=True to the zip call (i.e., use zip(weights, rankings, strict=True)) in
the loop that iterates over weights and ranks to enforce the one-to-one contract
at runtime and surface any unexpected length mismatches immediately; update the
zip invocation inside rrf accordingly.

phases/19-capstone-projects/64-chunking-strategies-advanced/code/main.py (1)

172-173: ⚡ Quick win

Consider adding strict=True to zip() for safer iteration.

In Python 3.10+, using zip(a, b, strict=True) catches length mismatches at runtime. While the vectors from mock_embed should always match in dimension, adding strict=True makes the assumption explicit and helps catch bugs if the code changes.

🔧 Proposed fix

 def cosine(a: list[float], b: list[float]) -> float:
-    return sum(x * y for x, y in zip(a, b))
+    return sum(x * y for x, y in zip(a, b, strict=True))

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@phases/19-capstone-projects/64-chunking-strategies-advanced/code/main.py`
around lines 172 - 173, The cosine(a: list[float], b: list[float]) function uses
zip(a, b) which silently truncates if lengths differ; update the zip call to
zip(a, b, strict=True) to raise an error on length mismatch so dimension
mismatches are caught at runtime and prevent subtle bugs.

phases/19-capstone-projects/64-chunking-strategies-advanced/code/tests/test_chunkers.py (1)

44-48: ⚡ Quick win

Consider itertools.pairwise() for iterating over adjacent chunks.

zip(chunks, chunks[1:]) is correct as written: the second argument is one element shorter by design, and zip stops at the shorter input. Adding strict=True to that exact call would raise ValueError for any non-empty list, so a strict variant has to trim the first argument as well (chunks[:-1]). itertools.pairwise() (Python 3.10+) expresses the same iteration more directly.

🔧 Proposed fix (Option 1: strict)

     def test_overlap_visible(self) -> None:
         chunks = fixed_window("d", SAMPLE_PROSE, size=60, overlap=20)
         # Adjacent chunks must share at least the overlap region size by construction
-        for a, b in zip(chunks, chunks[1:]):
+        for a, b in zip(chunks[:-1], chunks[1:], strict=True):
             self.assertGreaterEqual(a.end - b.start, 20)

🔧 Proposed fix (Option 2: pairwise)

Option 2 needs pairwise imported from itertools at module level.

     def test_overlap_visible(self) -> None:
         chunks = fixed_window("d", SAMPLE_PROSE, size=60, overlap=20)
         # Adjacent chunks must share at least the overlap region size by construction
-        for a, b in zip(chunks, chunks[1:]):
+        for a, b in pairwise(chunks):
             self.assertGreaterEqual(a.end - b.start, 20)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@phases/19-capstone-projects/64-chunking-strategies-advanced/code/tests/test_chunkers.py`
around lines 44 - 48, In test_overlap_visible, zip(chunks, chunks[1:]) pairs
adjacent chunks correctly but leaves the intent implicit; either use
zip(chunks[:-1], chunks[1:], strict=True), so both arguments have the same
length, or replace the loop with itertools.pairwise(chunks) (import pairwise at
module level). Do not add strict=True to zip(chunks, chunks[1:]) as is, because
the arguments differ in length by one and the call would raise ValueError. Keep
the assertion a.end - b.start >= 20 and ensure imports are adjusted if you
choose pairwise.

phases/19-capstone-projects/67-query-rewriting-hyde/code/main.py (1)

302-304: ⚡ Quick win

Deduplicate multi-query rewrites before retrieval fusion.

MultiQueryRewriter.rewrite() can emit duplicate queries (especially fallback cases), which causes duplicate rankings to be fused multiple times and biases RRF scores.

Proposed fix

     def rewrite(self, query: str) -> RewriteResult:
-        rewrites = [query] + self.llm.paraphrase(query, n=self.n)
+        raw = [query] + self.llm.paraphrase(query, n=self.n)
+        rewrites: list[str] = []
+        seen: set[str] = set()
+        for r in raw:
+            key = r.strip().lower()
+            if key in seen:
+                continue
+            seen.add(key)
+            rewrites.append(r)
         return RewriteResult(strategy=self.name, rewrites=rewrites)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@phases/19-capstone-projects/67-query-rewriting-hyde/code/main.py` around
lines 302 - 304, MultiQueryRewriter.rewrite can emit duplicate queries; change
the construction of the rewrites list in rewrite(self, query) to remove
duplicates while preserving order before returning RewriteResult. Specifically,
deduplicate the list [query] + self.llm.paraphrase(query, n=self.n) (keeping the
first occurrence of each string) and then pass that deduplicated list into
RewriteResult(strategy=self.name, rewrites=rewrites) so duplicate paraphrases
don't bias RRF fusion.
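Two order-preserving ways to deduplicate a list of rewrites are sketched below. The helper names and the sample queries are hypothetical. The normalized variant mirrors the `strip().lower()` key used in the proposed fix; the exact variant only removes byte-identical strings.

```python
def dedupe_exact(queries: list[str]) -> list[str]:
    """Drop exact duplicates; dicts keep insertion order, so order survives."""
    return list(dict.fromkeys(queries))


def dedupe_normalized(queries: list[str]) -> list[str]:
    """Drop duplicates that differ only by case or surrounding whitespace."""
    first_seen: dict[str, str] = {}
    for q in queries:
        # setdefault keeps the first spelling seen for each normalized key.
        first_seen.setdefault(q.strip().lower(), q)
    return list(first_seen.values())


raw = ["what is bm25", "What is BM25 ", "how does bm25 rank", "what is bm25"]
print(dedupe_exact(raw))       # ['what is bm25', 'What is BM25 ', 'how does bm25 rank']
print(dedupe_normalized(raw))  # ['what is bm25', 'how does bm25 rank']
```

Each surviving rewrite then contributes exactly one ranking to the fusion step, so no query is counted twice.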

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@phases/19-capstone-projects/66-reranker-cross-encoder/docs/en.md`:
- Around line 48-49: Update the documentation to match the actual CrossEncoder
implementation: change the statement that the model uses "one attention head" to
reflect that CrossEncoder defaults to 4 heads (refer to CrossEncoder and its
constructor/defaults), and replace the claim about batching via torch.stack with
a note that _batch_encode builds batches using torch.tensor(...) (refer to
_batch_encode). Ensure the wording describes the real architecture (single
transformer block with default 4 attention heads) and the actual batching
mechanism to avoid teaching drift.

In `@phases/19-capstone-projects/68-rag-eval-precision-recall/code/main.py`:
- Around line 62-67: mean_reciprocal_rank currently uses zip which silently
truncates when retrieved_per_query and gold_per_query differ; add an explicit
length check at the start of mean_reciprocal_rank that compares
len(retrieved_per_query) and len(gold_per_query) and raise a ValueError (or
return an explicit error) with a clear message including both lengths if they
differ, then proceed to compute the average using reciprocal_rank as before.

In `@phases/19-capstone-projects/68-rag-eval-precision-recall/docs/en.md`:
- Around line 97-98: Update the documentation text about gold_answer_substring
to match current evaluator behavior: remove the statement that faithfulness is
judged by the gold answer substring and instead state that faithfulness is
evaluated by comparing extracted claims to retrieved context (as implemented in
code/main.py), and note that all metrics (faithfulness, relevance, etc.) are
printed together in the same results table; apply the same correction to the
other occurrence of the outdated wording.

In `@phases/19-capstone-projects/69-end-to-end-rag-system/code/main.py`:
- Around line 474-487: Pipeline.query currently records strategies from
rewriter.pick_strategy but only performs a single retrieval; when strategy ==
"hyde" it uses rewrite_hyde and search_with_hypothetical, but for "multiquery"
and "decompose" it still calls self.index.search. Update Pipeline.query to
branch on the returned strategy: if "hyde" use self.rewriter.rewrite_hyde(...)
and self.index.search_with_hypothetical(...); if "multiquery" call the
multi-query flow (e.g., get multiple sub-queries from a rewriter method like
rewriter.generate_multi_queries(...) or similar and call a corresponding index
method such as self.index.search_multiquery(...) or run multiple
self.index.search(...) calls and aggregate results); if "decompose" call the
decomposition flow (e.g., self.rewriter.decompose(...) then perform per-part
searches and merge results or call self.index.search_decompose(...)); otherwise
fall back to self.index.search(...). Ensure you reference and call the
appropriate rewriter and index methods (rewriter.pick_strategy,
rewriter.rewrite_hyde, rewriter.decompose / rewriter.generate_multi_queries and
index.search_with_hypothetical, index.search_multiquery / index.search_decompose
or perform multiple index.search calls) and preserve latencies/top_n handling.
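The strategy branching requested for Pipeline.query can be sketched with toy components. Everything below is a hypothetical stand-in rather than the lesson's code: `ToyIndex`, `fuse`, and `run_query` are invented names, the index ranks by shared-word count, and the HyDE branch simply searches the hypothetical text as a query. Only the strategy labels "hyde", "multiquery", and "decompose" come from the comment above.

```python
from collections import defaultdict


class ToyIndex:
    """Hypothetical stand-in for the lesson's index: ranks by shared words."""

    def __init__(self, docs: dict[str, str]) -> None:
        self.docs = docs

    def search(self, query: str, top_n: int = 3) -> list[str]:
        terms = set(query.lower().split())
        scored = [
            (len(terms & set(text.lower().split())), doc_id)
            for doc_id, text in self.docs.items()
        ]
        hits = [(s, d) for s, d in scored if s > 0]
        hits.sort(key=lambda pair: (-pair[0], pair[1]))
        return [d for _, d in hits[:top_n]]


def fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Unweighted reciprocal rank fusion over lists of doc ids."""
    score: dict[str, float] = defaultdict(float)
    for ranks in rankings:
        for rank, doc_id in enumerate(ranks):
            score[doc_id] += 1.0 / (k + rank + 1)
    return sorted(score, key=lambda d: (-score[d], d))


def run_query(index: ToyIndex, strategy: str, query: str,
              rewrites: list[str], top_n: int = 3) -> list[str]:
    if strategy == "hyde" and rewrites:
        # Assumption: the hypothetical answer is searched as plain text.
        return index.search(rewrites[0], top_n)
    if strategy in ("multiquery", "decompose") and rewrites:
        # One retrieval per rewrite, then fuse the rankings.
        return fuse([index.search(q, top_n) for q in rewrites])[:top_n]
    # Unknown strategy or no rewrites: fall back to a single search.
    return index.search(query, top_n)


index = ToyIndex({
    "d1": "bm25 scores lexical matches",
    "d2": "dense embeddings capture meaning",
    "d3": "rank fusion merges bm25 and dense results",
})
single = run_query(index, "none", "bm25 versus vectors", [])
fused = run_query(index, "decompose", "bm25 versus vectors",
                  ["what is bm25", "what is dense retrieval"])
print(single)  # ['d1', 'd3']
print(fused)   # ['d3', 'd1', 'd2']
```

In this toy run the decomposed flow surfaces `d2`, which the single search misses, and promotes `d3`, which both sub-queries retrieve.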


ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: f23cb9da-fb46-446c-9f19-534791c14c19

📥 Commits

Reviewing files that changed from the base of the PR and between c1374e1 and 7a2949f.

📒 Files selected for processing (25)

catalog.json
phases/19-capstone-projects/64-chunking-strategies-advanced/code/main.py
phases/19-capstone-projects/64-chunking-strategies-advanced/code/tests/test_chunkers.py
phases/19-capstone-projects/64-chunking-strategies-advanced/docs/en.md
phases/19-capstone-projects/64-chunking-strategies-advanced/quiz.json
phases/19-capstone-projects/65-hybrid-retrieval-bm25-dense/code/main.py
phases/19-capstone-projects/65-hybrid-retrieval-bm25-dense/code/tests/test_hybrid.py
phases/19-capstone-projects/65-hybrid-retrieval-bm25-dense/docs/en.md
phases/19-capstone-projects/65-hybrid-retrieval-bm25-dense/quiz.json
phases/19-capstone-projects/66-reranker-cross-encoder/code/main.py
phases/19-capstone-projects/66-reranker-cross-encoder/code/tests/test_reranker.py
phases/19-capstone-projects/66-reranker-cross-encoder/docs/en.md
phases/19-capstone-projects/66-reranker-cross-encoder/quiz.json
phases/19-capstone-projects/67-query-rewriting-hyde/code/main.py
phases/19-capstone-projects/67-query-rewriting-hyde/code/tests/test_rewriters.py
phases/19-capstone-projects/67-query-rewriting-hyde/docs/en.md
phases/19-capstone-projects/67-query-rewriting-hyde/quiz.json
phases/19-capstone-projects/68-rag-eval-precision-recall/code/main.py
phases/19-capstone-projects/68-rag-eval-precision-recall/code/tests/test_eval.py
phases/19-capstone-projects/68-rag-eval-precision-recall/docs/en.md
phases/19-capstone-projects/68-rag-eval-precision-recall/quiz.json
phases/19-capstone-projects/69-end-to-end-rag-system/code/main.py
phases/19-capstone-projects/69-end-to-end-rag-system/code/tests/test_pipeline.py
phases/19-capstone-projects/69-end-to-end-rag-system/docs/en.md
phases/19-capstone-projects/69-end-to-end-rag-system/quiz.json

rohitg00 and others added 7 commits May 26, 2026 19:22

feat(phase-19/64): add chunking-strategies-advanced deep capstone

6bba3d3

feat(phase-19/65): add hybrid-retrieval-bm25-dense deep capstone

81ccdb7

feat(phase-19/66): add reranker-cross-encoder deep capstone

515b442

feat(phase-19/67): add query-rewriting-hyde deep capstone

a93c55f

feat(phase-19/68): add rag-eval-precision-recall deep capstone

69425eb

feat(phase-19/69): add end-to-end-rag-system deep capstone

177d9bc

chore(catalog): auto-regen

7a2949f

vercel Bot deployed to Preview May 26, 2026 18:46 View deployment

coderabbitai Bot reviewed May 26, 2026

View reviewed changes

rohitg00 added 3 commits May 26, 2026 20:58

fix(phase-19/66): align cross-encoder docs with implementation (heads, batching)

e130eef

fix(phase-19/68): guard MRR length mismatch + align docs with faithfulness pipeline

3fd3399

fix(phase-19/69): execute multiquery and decompose strategies via RRF fusion

c3ea8c3

vercel Bot deployed to Preview May 26, 2026 20:01 View deployment

rohitg00 mentioned this pull request May 26, 2026

ci(curriculum): also auto-fix README counts on main push; PR check advisory #217

Merged

Merge remote-tracking branch 'origin/main' into feat/phase-19-track-f

13044a0

# Conflicts: # catalog.json

vercel Bot deployed to Preview May 27, 2026 09:13 View deployment

rohitg00 wants to merge 11 commits into main from feat/phase-19-track-f