PaperRank
PaperRank is Feynman’s first evidence-graph research workflow. It helps you decide what to read first by ranking papers for a topic and showing why each paper received its score.
Usage
feynman rank "mechanistic interpretability sparse autoencoders"
feynman rank "scaling laws" --limit 20 --json
feynman rank "scaling laws" --limit 20 --expand-citations 2 --json
feynman rank "scaling laws" --limit 20 --full-text-top 3 --json
feynman rank "scaling laws" --limit 20 --critique-top 5 --json
feynman rank "scaling laws" --limit 20 --preference-file preferences.json --json
feynman rank "scaling laws" --limit 20 --reproduction-notes reproduction-notes.json --json
feynman rank "scaling laws" --limit 20 --synthesize --json
Output
Each run writes files prefixed with a topic slug under outputs/:
- <slug>-paper-rank.md: readable ranked brief
- <slug>-papers.jsonl: normalized paper records
- <slug>-scores.jsonl: component scores, evidence, and matched source spans
- <slug>-score-audit.md: per-paper score math, normalized contribution weights, field roles, evidence gaps, and source excerpts
- <slug>-rank-sensitivity.json: rank stability under balanced, influence-heavy, method/reproducibility-heavy, frontier-heavy, and topic-heavy weighting profiles
- <slug>-citation-graph.json: local seed/citation-neighborhood graph and PageRank-style values
- <slug>-graph-explorer.html: interactive citation graph explorer with paper roles, scores, links, and local citation edges
- <slug>-field-map.json: local topic/concept clusters plus foundation, frontier, bridge, methodology-anchor, and reproducibility-anchor roles
- <slug>-critique.md: optional research critique when --critique-top N is used
- <slug>-score-calibration.json, <slug>-calibration-template.json, and <slug>-calibration-guide.md: optional calibration outputs when --preference-file is supplied
- <slug>-reproduction-ledger.json, <slug>-reproduction-notes-template.json, and <slug>-replication-plan.md: optional reproduction outputs when --reproduction-notes is supplied
- <slug>-synthesis-packet.json and <slug>-synthesis-prompt.md: optional model-synthesis handoff when --synthesize is used
- <slug>-model-synthesis.md: optional generated synthesis with the selected model and selection source when --synthesize is used and the model call succeeds
- <slug>-rank.provenance.md: source accounting, formula, and verification caveats
Score Components
ReadFirstScore is a weighted average over available components:
- 30% topical relevance
- 20% citation impact
- 20% graph prestige when local citation edges exist
- 10% citation velocity
- 10% methodology quality
- 10% reproducibility
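The weighted average over available components can be sketched as follows. This is a minimal illustration, not PaperRank's implementation: the key names, the assumption that each component is a value in [0, 1], and the choice to renormalize by the sum of the available weights are all assumptions made for the example.

```python
# Default weights from the list above. The key names are hypothetical.
DEFAULT_WEIGHTS = {
    "topical_relevance": 0.30,
    "citation_impact": 0.20,
    "graph_prestige": 0.20,
    "citation_velocity": 0.10,
    "methodology_quality": 0.10,
    "reproducibility": 0.10,
}


def read_first_score(components, weights=DEFAULT_WEIGHTS):
    """Weighted average over the components that are available.

    `components` maps a component name to a value in [0, 1], or to None
    when the component is unavailable. Unavailable components are dropped
    and the remaining weights are renormalized (an assumption).
    """
    available = {
        name: value
        for name, value in components.items()
        if value is not None and name in weights
    }
    total_weight = sum(weights[name] for name in available)
    if total_weight == 0:
        return None  # nothing to average
    weighted_sum = sum(weights[name] * value for name, value in available.items())
    return weighted_sum / total_weight
```

Under these assumptions, a paper with no graph prestige is averaged over the remaining 80% of the weight, so a missing component neither rewards nor penalizes it.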
PaperRank uses OpenAlex work metadata for citation counts, normalized citation percentile, references, abstracts, URLs, and open-access state. Graph prestige is computed over referenced_works edges. By default that graph is built from the seed result set. Use --expand-citations N to add up to N outgoing cited works and incoming citing works per seed paper before computing PageRank-style graph prestige. Ranked outputs still score the seed papers; expansion papers are recorded as graph context in <slug>-citation-graph.json and can be inspected in <slug>-graph-explorer.html. When the graph has no local citation edges, graph prestige is marked unavailable and excluded from the score instead of being guessed.
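As an illustration of PageRank-style prestige over local citation edges, here is a small power-iteration sketch. The function name, the damping factor of 0.85, the iteration count, and the handling of papers with no outgoing local edges are assumptions; the source does not specify PaperRank's exact variant.

```python
def graph_prestige(paper_ids, edges, damping=0.85, iterations=50):
    """PageRank-style values over local citation edges.

    `edges` is an iterable of (citing_id, cited_id) pairs, for example
    derived from each paper's referenced_works. Edges that leave the local
    node set are ignored. Returns None when no local edge exists, so the
    caller can mark the component unavailable instead of guessing.
    """
    nodes = list(dict.fromkeys(paper_ids))  # de-duplicate, keep order
    node_set = set(nodes)
    out_links = {node: set() for node in nodes}
    for citing, cited in edges:
        if citing in node_set and cited in node_set and citing != cited:
            out_links[citing].add(cited)
    if not any(out_links.values()):
        return None

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Papers without outgoing local edges spread their rank uniformly.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        new_rank = {
            node: (1.0 - damping) / n + damping * dangling / n for node in nodes
        }
        for node in nodes:
            targets = out_links[node]
            for target in targets:
                new_rank[target] += damping * rank[node] / len(targets)
        rank = new_rank
    return rank
```

A paper cited by several local papers ends up with a higher value than the papers citing it, and the values sum to 1 across the local graph.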
Methodology and reproducibility are deterministic screening signals over metadata, abstract text, URLs, and enriched full text when requested. Use --full-text-top N to fetch source-specific full text for the highest-ranked candidates with a fetchable access route, extract canonical paper sections, attach section-specific paper-body spans, answer checklist-style rubric items, and rescore. Raw full text is not written to papers.jsonl; the paper record stores enrichment status, access candidates, fullTextLength, and section boundaries, while score evidence stores the matched spans.
When a marker is found, the score JSONL keeps a span with the source, field, marker, character offsets, section name when available, and surrounding text. The score JSONL also includes rubric answers such as present, partial, missing, or not_evaluated for limitations, reproducibility path, experimental details, statistical significance, and compute resources. These spans and rubric answers are designed to show why attention was routed to a paper, not to replace claim validation or reproduction work.
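To make the span idea concrete, the sketch below records a marker match with character offsets and surrounding text. The record keys, the case-insensitive matching, and the 40-character context window are hypothetical; the real score JSONL schema may differ.

```python
import re


def find_marker_spans(text, marker, source, field, context=40):
    """Return one span record per case-insensitive occurrence of `marker`.

    Offsets index into the original `text`. All key names are illustrative.
    """
    if not marker:
        return []
    spans = []
    for match in re.finditer(re.escape(marker), text, flags=re.IGNORECASE):
        start, end = match.start(), match.end()
        spans.append(
            {
                "source": source,
                "field": field,
                "marker": marker,
                "start": start,
                "end": end,
                # Surrounding text lets a reader check the match in context.
                "text": text[max(0, start - context) : end + context],
            }
        )
    return spans
```

Keeping offsets alongside the excerpt means a reviewer can jump back to the exact position in the abstract or section where the signal was found.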
The graph explorer is generated by default so the run is inspectable without hand-opening every JSONL file. It lets you search and filter seed and expanded nodes, click a paper, and inspect its local citation links, score summaries, field roles, critique judgments, and source URLs. It does not embed raw full-text bodies.
The score audit is generated by default for the direct “why did this rank here?” question. It shows each paper’s component scores, normalized weights, contribution to the final score, field role, critique status, visible source-span evidence, missing components, and rubric checks to verify. It is the readable companion to <slug>-scores.jsonl.
The rank-sensitivity artifact is generated by default to show how much the rank order depends on the chosen weights. It reruns the same component signals under alternate weighting profiles and records each paper’s profile ranks, score range, rank range, and stability label. Treat stable papers as robust to these tested assumptions and volatile papers as requiring closer manual inspection before treating the order as decisive.
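The rerun-under-alternate-weights idea can be sketched like this. The profile weights, the tie-break by paper ID, and the stability threshold (`stable_span`) are assumptions chosen for the example, not PaperRank's actual profiles or labels.

```python
def rank_under(weights, papers):
    """Rank paper IDs (1 = first) by a weighted average of their components."""

    def score(components):
        usable = {
            name: value
            for name, value in components.items()
            if value is not None and weights.get(name, 0) > 0
        }
        total = sum(weights[name] for name in usable)
        if not total:
            return 0.0
        return sum(weights[name] * value for name, value in usable.items()) / total

    # Sort by descending score; break ties by paper ID for determinism.
    ordered = sorted(papers, key=lambda pid: (-score(papers[pid]), pid))
    return {pid: position + 1 for position, pid in enumerate(ordered)}


def rank_sensitivity(papers, profiles, stable_span=1):
    """Record each paper's rank per profile, its rank range, and a label."""
    ranks = {name: rank_under(weights, papers) for name, weights in profiles.items()}
    report = {}
    for pid in papers:
        by_profile = {name: ranks[name][pid] for name in profiles}
        lowest, highest = min(by_profile.values()), max(by_profile.values())
        report[pid] = {
            "ranks": by_profile,
            "rank_range": [lowest, highest],
            "stability": "stable" if highest - lowest <= stable_span else "volatile",
        }
    return report
```

A paper whose rank barely moves across profiles is labeled stable; one that swings from first to last is labeled volatile and deserves a closer manual look.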
Calibration stays explicit without becoming default clutter. Without --preference-file, the ranked brief and provenance record that default weights are a transparent product hypothesis, not fitted preferences. With a filled preference file, PaperRank writes calibration artifacts, accepts rankedPaperIds and pairwise preferences, evaluates whether the preferred paper ranks ahead of the comparison paper, and reports default/profile agreement rates. Preferences whose paper IDs are outside the current run are counted as ignored rather than silently dropped.
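The pairwise check described above can be sketched as follows. The function name, the input shape, and the output keys are hypothetical; only the behavior (preferred paper should rank ahead, out-of-run IDs counted as ignored) comes from the text.

```python
def pairwise_agreement(rank_by_id, preferences):
    """Compare pairwise preferences against a ranking.

    `rank_by_id` maps paper ID to rank (1 = first). `preferences` is an
    iterable of (preferred_id, other_id) pairs. Pairs that mention a paper
    outside the current run are counted as ignored, not silently dropped.
    """
    agreed = evaluated = ignored = 0
    for preferred, other in preferences:
        if preferred not in rank_by_id or other not in rank_by_id:
            ignored += 1
            continue
        evaluated += 1
        if rank_by_id[preferred] < rank_by_id[other]:
            agreed += 1
    return {
        "agreed": agreed,
        "evaluated": evaluated,
        "ignored": ignored,
        "agreement_rate": agreed / evaluated if evaluated else None,
    }
```

Running the same function once per weighting profile would give the per-profile agreement rates to compare against the default.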
The field map is generated by default to show the local structure of the run. It clusters seed and citation-neighborhood papers by OpenAlex topics and concepts, then assigns ranked seed papers roles such as foundation, frontier, bridge, methodology anchor, and reproducibility anchor. Those roles come from score signals, local citation degree, graph prestige, recency, and visible method/reproducibility evidence. They are local research-navigation labels, not a global taxonomy of the field.
Reproduction evidence stays separate from ranking. Without --reproduction-notes, PaperRank records in the ranked brief and provenance that no completed reproduction notes were supplied. With a filled notes file, PaperRank writes a reproduction ledger, notes template, and replication plan. It accepts note statuses reproduced, partially_reproduced, failed, and not_runnable, plus central claim, result, metric, expected value, observed value, discrepancy, code/data/environment hints, commands, and check date. Notes whose paper IDs are outside the ranked seed set are counted as ignored. The ledger records externally supplied reproduction notes; it does not execute experiments and it does not embed raw full text.
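A rough sketch of how supplied notes might be folded into a ledger is shown below. The four statuses come from the text; the `paperId` and `status` keys, the ledger shape, and the decision to raise on an unknown status are assumptions for illustration only.

```python
ALLOWED_STATUSES = {"reproduced", "partially_reproduced", "failed", "not_runnable"}


def build_ledger(notes, ranked_seed_ids):
    """Collect externally supplied notes; nothing is executed here.

    Notes for papers outside the ranked seed set are counted as ignored.
    Rejecting unknown statuses with an error is an assumption.
    """
    ledger = {
        "entries": [],
        "ignored": 0,
        "status_counts": {status: 0 for status in sorted(ALLOWED_STATUSES)},
    }
    for note in notes:
        if note["paperId"] not in ranked_seed_ids:
            ledger["ignored"] += 1
            continue
        status = note["status"]
        if status not in ALLOWED_STATUSES:
            raise ValueError(f"unknown reproduction status: {status!r}")
        ledger["status_counts"][status] += 1
        ledger["entries"].append(note)
    return ledger
```

The ledger only summarizes what a person reported, which keeps it independent of the ranking scores.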
Use --critique-top N to generate research-critique strengths, concerns, and follow-up questions for the top ranked papers. The critique is deterministic and grounded in PaperRank evidence: component scores, warnings, source spans, and section-aware rubric answers. It is a triage aid for deciding what to verify next, not an external review decision.
Use --synthesize to generate a bounded model-synthesis handoff. The packet contains ranks, component-score explanations, field roles, critique summaries, rubric gaps, source-span excerpts, and source references, while omitting raw full-text bodies. Use --synthesis-top N to choose how many ranked papers enter the packet. Feynman then asks the recommended available non-Pro research model to write <slug>-model-synthesis.md from that packet. Pass --synthesis-model provider/model or --model provider/model to select another non-Pro model for that run. The CLI output, generated synthesis, JSON summary, and provenance record the actual model plus whether it came from the recommendation path or an explicit override. The generated synthesis is useful for read-first narrative and next actions, but the deterministic packet, scores, field map, and provenance remain the audit trail.
Scientific Basis
PaperRank keeps bibliometric influence separate from paper quality. Citation-network influence is based on PageRank-style bibliometrics such as Eigenfactor. Citation expansion follows OpenAlex citation fields and filters: referenced_works gives outgoing citations, while cites:<work> finds incoming citing works. Field-map clusters use OpenAlex topics and concepts from the fetched papers. Citation velocity is separate because lifetime citation counts favor older papers. Methodology, reproducibility, critique questions, rank sensitivity, score calibration, completed reproduction evidence, field roles, and model synthesis stay separate from citation popularity because ML-paper checklists ask about experimental detail, transparency, data/code access, limitations, statistical significance, and compute resources.
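For readers who want to see what the two citation directions look like against the OpenAlex API, the sketch below only builds request URLs and performs no network calls. The helper names and the placeholder work ID are hypothetical, and the exact requests PaperRank issues are not shown in this document.

```python
from urllib.parse import urlencode

OPENALEX_WORKS = "https://api.openalex.org/works"


def outgoing_citations_url(work_id):
    """URL for one work record; its referenced_works field lists outgoing citations."""
    query = urlencode({"select": "id,referenced_works"}, safe=":,")
    return f"{OPENALEX_WORKS}/{work_id}?{query}"


def incoming_citations_url(work_id, limit):
    """URL for works that cite `work_id`, using the cites: filter."""
    query = urlencode({"filter": f"cites:{work_id}", "per-page": limit}, safe=":,")
    return f"{OPENALEX_WORKS}?{query}"


# "W123" is a placeholder work ID, not a real paper.
print(outgoing_citations_url("W123"))
print(incoming_citations_url("W123", 2))
```

With --expand-citations 2, the equivalent of both lookups would be needed per seed paper: one to read referenced_works and one to find citing works.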