literature-review-ai
- repo: Waaangjl/literature-review-ai
- lang: Python
- year: 2026 (active)
A pipeline that takes one prompt — a topic, a discipline, a depth — and returns a structured literature review with traceable citations. Built when I was TA'ing International Climate Finance and the volume of student abstracts crossed the line from "demanding reading" into "demanding reading I can no longer do alone."
The honest framing
There's a kind of literature work where you actually need to think: synthesizing across schools, noticing which assumptions a field has stopped questioning, finding the move that nobody made. No tool replaces that.
There's a different kind of literature work that is just retrieval and structure: "give me the canonical 30 papers in topic X, group them by methodological tradition, summarize what each is doing in one paragraph, and flag the three that contradict each other." That part is search-and-summarize, and a model — given the right scaffolding — does it well.
This tool only does the second kind. The point of building it was to free up time for the first.
How it works
The pipeline is roughly as follows; illustrative sketches of each stage appear after the list.
- Decompose the prompt. "Lit review on blended finance for climate adaptation in emerging markets, MPA-level, ~25 papers" gets parsed into a query plan: 4–6 sub-topics, the kind of sources expected (peer-reviewed / grey lit / policy reports), date bounds, geographic scope.
- Retrieve. Papers come from a combination of OpenAlex, Semantic Scholar, and (where the user has API access) JSTOR. Every paper carries its DOI; nothing without one passes the gate.
- Filter for relevance. A model reads each abstract against the original prompt and assigns a 0–1 relevance score. The top N go through.
- Cluster. Hierarchical clustering on the abstracts groups papers into 3–6 thematic clusters. The labels are model-generated, then human-editable.
- Synthesize. For each cluster, the model writes a paragraph that names the cluster's central question, the papers that frame it, the consensus, and the dissent. With citations inline.
- Output. A .md file, optionally with a BibTeX export. The citations link to the actual DOIs; every claim is checkable.
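The repo doesn't publish its query-plan schema, so here is a minimal sketch of the decomposition step. The QueryPlan fields, the decompose() helper, and the llm callable are all illustrative assumptions, not the repo's actual interfaces:

```python
import json
from dataclasses import dataclass, field

@dataclass
class QueryPlan:
    """Structured form of one review prompt (field names are assumptions)."""
    topic: str
    sub_topics: list[str]       # the 4-6 narrower queries derived from the topic
    source_types: list[str]     # e.g. ["peer-reviewed", "grey-lit", "policy-report"]
    year_min: int | None = None  # date bounds, if the prompt implies any
    year_max: int | None = None
    regions: list[str] = field(default_factory=list)  # geographic scope

def decompose(prompt: str, llm) -> QueryPlan:
    """Ask a model to split the free-text prompt into a query plan.
    `llm` is any callable mapping a string prompt to a string reply."""
    raw = llm(
        "Split this literature-review request into JSON with keys "
        "topic, sub_topics, source_types, year_min, year_max, regions. "
        "Reply with only the JSON.\n\n" + prompt
    )
    return QueryPlan(**json.loads(raw))
```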
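The retrieval step can be shown concretely against OpenAlex, whose REST API is public; the DOI gate falls out naturally. The fields used here (doi, display_name, publication_year, abstract_inverted_index) are documented OpenAlex fields, but the function itself is a sketch, not the repo's code:

```python
import requests

OPENALEX = "https://api.openalex.org/works"

def _abstract(work: dict) -> str:
    """OpenAlex ships abstracts as an inverted index; flatten it back to text."""
    inv = work.get("abstract_inverted_index") or {}
    positions = {p: w for w, ps in inv.items() for p in ps}
    return " ".join(positions[i] for i in sorted(positions))

def retrieve(query: str, per_page: int = 25) -> list[dict]:
    """Fetch candidate papers for one sub-topic query, applying the DOI gate:
    anything without a DOI is dropped before it reaches the relevance filter."""
    resp = requests.get(
        OPENALEX, params={"search": query, "per-page": per_page}, timeout=30
    )
    resp.raise_for_status()
    papers = []
    for work in resp.json()["results"]:
        if not work.get("doi"):  # the gate: no DOI, no entry
            continue
        papers.append({
            "doi": work["doi"],
            "title": work.get("display_name", ""),
            "year": work.get("publication_year"),
            "abstract": _abstract(work),
        })
    return papers
```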
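For the relevance filter, the simplest scheme is to ask the model for a bare number and clamp it. Whether the repo scores this way is an assumption; `llm` is the same hypothetical callable as in the decompose sketch:

```python
def relevance_filter(papers: list[dict], prompt: str, llm, top_n: int = 25) -> list[dict]:
    """Score each abstract against the original prompt on [0, 1], keep the top N."""
    for paper in papers:
        raw = llm(
            "On a scale from 0 to 1, how relevant is this abstract to the "
            "request below? Reply with only the number.\n\n"
            f"Request: {prompt}\n\nAbstract: {paper['abstract']}"
        )
        try:
            paper["relevance"] = max(0.0, min(1.0, float(raw.strip())))
        except ValueError:
            paper["relevance"] = 0.0  # an unparseable reply counts as irrelevant
    papers.sort(key=lambda p: p["relevance"], reverse=True)
    return papers[:top_n]
```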
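The repo says "hierarchical clustering" without naming a vectorizer; TF-IDF plus scikit-learn's agglomerative clustering is one standard instantiation (embeddings would likely do better):

```python
from sklearn.cluster import AgglomerativeClustering
from sklearn.feature_extraction.text import TfidfVectorizer

def cluster_abstracts(abstracts: list[str], n_clusters: int = 4) -> list[int]:
    """Group abstracts into thematic clusters (3-6 in the pipeline; 4 is an
    arbitrary default here). Returns one cluster label per abstract."""
    vectors = TfidfVectorizer(stop_words="english", max_features=5000).fit_transform(abstracts)
    model = AgglomerativeClustering(
        n_clusters=n_clusters, metric="cosine", linkage="average"
    )
    return model.fit_predict(vectors.toarray()).tolist()
```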
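Per-cluster synthesis is mostly prompting. A sketch with illustrative wording; the instruction to cite inline as [doi] is a stand-in for however the repo actually formats citations:

```python
def synthesize(cluster_papers: list[dict], prompt: str, llm) -> str:
    """Write one cluster's paragraph: central question, framing papers,
    consensus, and dissent, with inline DOI citations."""
    refs = "\n".join(
        f"- {p['title']} ({p['year']}) [{p['doi']}]" for p in cluster_papers
    )
    return llm(
        "Write one paragraph for a literature review on the request below. "
        "Name the cluster's central question, the papers that frame it, the "
        "consensus, and the dissent. Cite papers inline as [doi].\n\n"
        f"Request: {prompt}\n\nPapers:\n{refs}"
    )
```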
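The BibTeX export can lean on DOI content negotiation: doi.org returns a BibTeX entry when asked for application/x-bibtex. That mechanism is standard Crossref/DataCite behavior; whether the repo takes this route is an assumption:

```python
import requests

def bibtex_for(doi_url: str) -> str:
    """Fetch a BibTeX entry for one DOI via content negotiation.
    OpenAlex returns DOIs as full https://doi.org/... URLs, which works here."""
    resp = requests.get(doi_url, headers={"Accept": "application/x-bibtex"}, timeout=30)
    resp.raise_for_status()
    return resp.text
```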
What the tool is honest about
- It will miss the canonical paper that everyone in the field knows but doesn't get cited cleanly. Knowledge graph holes are real.
- It cannot judge methodological quality. It can tell you a paper exists and what it claims; it cannot tell you whether the regression spec is right.
- It is most useful as a Pass 1: a first sweep that you, the human, then refine, contest, and extend with the move it missed.
Why I open-sourced it
A lot of TAs and graduate students do exactly this kind of survey reading. If the tool saves a few of them an afternoon — and frees that afternoon for the kind of thinking the tool can't do — that's a fair trade.
Clone
git clone https://github.com/Waaangjl/literature-review-ai
cd literature-review-ai
pip install -r requirements.txt
python review.py --topic "your topic" --depth medium --out review.md
Repository: github.com/Waaangjl/literature-review-ai