A lightweight, pure-Python wrapper around pybeh for analyzing and plotting free-recall behavioral data with pandas and seaborn. It re-expresses pybeh's matrix-based analyses as functions that take tidy, long-format pandas DataFrames of presentation/recall events, and adds within-subject confidence-interval helpers for plotting.
A minimal subset of pybeh is vendored (pybeh_pd._pybeh), so no separate
pybeh install is required.
Given a long-format events DataFrame (one row per presented/recalled item), the
pd_* functions compute, per subject:
- pd_crp: lag conditional response probability (lag-CRP; the temporal contiguity effect).
- pd_temp_fact: temporal clustering factor (Polyn, Norman & Kahana, 2009).
- pd_sem_crp: semantic CRP, binned by a similarity space.
- pd_dist_fact: distance/similarity clustering factor.
- pd_min_crp / pd_min_temp_fact: repeated-presentation-aware variants.
- pd_sem_crp_list, pd_dist_fact_list, and *_sub: per-list / per-subject aggregating variants.
- CI helpers for within-subject error bars: cousineau (Cousineau–Morey–O'Brien), loftus_masson_analytic, and the Loftus–Masson Kahana ports.
The underlying matrix builders (make_recalls_matrix, get_all_matrices, ...)
and pybeh primitives (crp, temp_fact, ...) are also exposed.
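The within-subject CI helpers listed above address a common problem: in a repeated-measures design, ordinary standard errors mix between-subject variability into the error bars. The sketch below shows one common formulation of the Cousineau normalization with Morey's correction in plain NumPy. It is an illustration under assumptions, not pybeh_pd's implementation: the function name `within_subject_sem`, the subjects-by-conditions array layout, and the data are all hypothetical, and the library's `cousineau` helper may differ in signature and detail.

```python
import numpy as np


def within_subject_sem(scores):
    """Within-subject SEM per condition (hypothetical helper, not pybeh_pd's API).

    scores: 2-D array-like, rows = subjects, columns = conditions.
    """
    scores = np.asarray(scores, dtype=float)
    n_subj, n_cond = scores.shape
    # Cousineau normalization: remove each subject's own mean, restore the grand mean.
    normalized = scores - scores.mean(axis=1, keepdims=True) + scores.mean()
    # Morey correction: normalization shrinks the variance by (n_cond - 1) / n_cond.
    correction = np.sqrt(n_cond / (n_cond - 1))
    return normalized.std(axis=0, ddof=1) / np.sqrt(n_subj) * correction


# Made-up data: 3 subjects x 2 conditions, every subject shows the same +0.2 effect.
scores = np.array([[0.2, 0.4], [0.5, 0.7], [0.8, 1.0]])

between_sem = scores.std(axis=0, ddof=1) / np.sqrt(scores.shape[0])
within_sem = within_subject_sem(scores)

print("between-subject SEM:", between_sem)  # large: subjects differ in overall level
print("within-subject SEM: ", within_sem)   # near zero: the effect is identical in every subject
```

To turn the SEM into a 95% confidence-interval half-width, multiply it by the t critical value for `n_subj - 1` degrees of freedom, for example `scipy.stats.t.ppf(0.975, n_subj - 1)`.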
Pure Python — install from source with pip:
pip install . # from a clone
pip install git+https://github.com/pennmem/pybeh_pd.git # from GitHub
pip install -e . # editable (for development)
Requires Python 3.11–3.13 and numpy / pandas / scipy (installed automatically).
import pandas as pd
import pybeh_pd as pb
# One subject, two 4-item lists. WORD rows are presentations; REC_WORD rows are
# recalls. `itemno` is the item id; recalls reference the presented item numbers.
events = pd.DataFrame({
    "subject": "subj1", "session": 0,
    "list": [0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1],
    "type": ["WORD"] * 4 + ["REC_WORD"] * 3 + ["WORD"] * 4 + ["REC_WORD"] * 3,
    "itemno": [11, 12, 13, 14, 12, 13, 14, 21, 22, 23, 24, 23, 22, 21],
})
# Expects one subject's events; group by subject for many-subject data:
# events.groupby("subject").apply(pb.pd_crp, itemno_column="itemno")
crp = pb.pd_crp(events, lag_num=3)
print(crp[["lag", "prob"]])
# lag 0 is NaN; forward (+1) and backward (-1) adjacent transitions dominate.
print("temporal clustering factor:", pb.pd_temp_fact(events))
The default column names are subject / session / list (the trial index),
type (with "WORD" presentations and "REC_WORD" recalls), and itemno; all
are overridable via keyword arguments (e.g. itemno_column="item_num").
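To make the lag-CRP in the quick start concrete, the sketch below computes it by hand for the same two lists, using only the standard library. It is a simplified illustration, not pybeh_pd's implementation: the function name `lag_crp` and its list-of-lists inputs are hypothetical, and the handling of repeats and intrusions (they simply break the transition chain) is an assumption that may not match the library's rules.

```python
from collections import Counter


def lag_crp(presented, recalled, max_lag):
    """Hand-rolled lag-CRP (hypothetical helper, not pybeh_pd's API).

    presented: one list of item ids per trial, in study order.
    recalled:  one list of item ids per trial, in output order.
    Returns {lag: probability}; lags that were never available map to NaN.
    """
    actual = Counter()    # transitions actually made, by lag
    possible = Counter()  # transitions that were available, by lag
    for study, recall in zip(presented, recalled):
        pos = {item: i + 1 for i, item in enumerate(study)}  # 1-based serial positions
        seen = set()
        prev = None
        for item in recall:
            sp = pos.get(item)
            if sp is None or sp in seen:
                # Assumption: intrusions and repeats break the chain.
                prev = None
                continue
            if prev is not None:
                # Every serial position not yet recalled was a possible target.
                for target in pos.values():
                    if target not in seen:
                        possible[target - prev] += 1
                actual[sp - prev] += 1
            seen.add(sp)
            prev = sp
    return {
        lag: actual[lag] / possible[lag] if possible[lag] else float("nan")
        for lag in range(-max_lag, max_lag + 1)
    }


# Same data as the quick-start events: forward recall in list 0, backward in list 1.
presented = [[11, 12, 13, 14], [21, 22, 23, 24]]
recalled = [[12, 13, 14], [23, 22, 21]]

crp_by_lag = lag_crp(presented, recalled, max_lag=3)
for lag, prob in crp_by_lag.items():
    print(f"{lag:+d}: {prob}")
```

In this sketch, lags -1 and +1 each come out at 2/3 (two of three available transitions taken), lags -2 and +2 at 0, and lags 0 and ±3 are NaN because no such transition was ever available.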
See examples/ for notebooks demonstrating FR1, catFR1, repFR1, and
the Loftus–Masson confidence intervals on real datasets.
This repo uses pixi for a reproducible dev environment:
pixi run build # confirm the package imports
pixi run test # run the test suite
pixi run test-cov # with coverage
Or with plain pip/pytest:
pip install -e ".[test]"
pytest
The test suite includes a golden-master behavior-lock regression suite, per-function unit tests, and integration tests on a committed 5-session sample of real ltpFR2 free-recall data (no external data access required).
GitHub Actions runs the suite across Python 3.11–3.13 and both numpy 1.x and 2.x (pandas 2.x/3.x), plus a wheel/sdist build and a pyright + ruff lint job.
MIT.