We develop statistical and mathematical models to make sense of large-scale population genomic data at multiple levels. New types of data from collaborators inspire new types of theory and vice versa. Population genetics provides an incredible tool for uncovering the past and inferring the key times and places at which natural selection acted or demography changed.
Specific research areas:
- Investigating the dynamics of adaptive immune systems through the lens of evolution in:
- vertebrates, where the diversity of the T cell repertoire in a single individual evolves on both short timescales (during and shortly after infections/vaccinations) and long timescales (during aging).
- microbes, where the CRISPR adaptive immune system is widespread and apparently highly effective, yet viral pathogens (e.g., phages) persist.
- Inference of error and contamination in ancient DNA — how can we properly account for these challenges when making inferences based on population genetic theory? How can we use population genetic approaches to detect (or ideally, rule out) the presence of contamination?
- Understanding mutation rate evolution at both the level of the hominid phylogeny and the level of pathogens moving through our populations. How much variation do we see across the human genome? How does our epidemiological response to a pathogen affect its evolutionary trajectory?
- Xiao W, Weissman JL, and Johnson
PLF. (2024). Ecological drivers of CRISPR immune systems.
mSystems, page e0056824. [doi]
- Rasband SA, Bolton PE, Fang Q, Johnson
PLF, and Braun MJ. (2023). Evolution of the growth hormone gene
duplication in passerine birds. Genome biology and evolution, 15.
[doi]
- Johnson PLF, Bergstrom CT, Regoes RR, Longini IM, Halloran
ME, and Antia R. (2022). Evolutionary consequences of delaying intervention
for monkeypox. Lancet, 400:1191-3. [doi]
- Whittier CA, Nutter FB, Johnson PLF, Cross P, Lloyd-Smith
JO, Slenning BD, and Stoskopf MK. (2021). Population structure, intergroup
interaction, and human contact govern infectious disease impacts in mountain
gorilla populations. American Journal of Primatology, page e23350.
[doi]
- Zarnitsyna VI, Akondy RS, Ahmed H, McGuire DJ, Zarnitsyn VG, Moore M,
Johnson PLF, Ahmed R, Li KW, Hellerstein MK, and Antia R.
(2021). Dynamics and turnover of memory CD8 T cell responses following yellow
fever vaccination. PLoS Computational Biology, 17:e1009468. [doi]
- Weissman JL, Dogra S, Javadi K, Bolten S, Flint R, Davati
C, Beattie J, Dixit K, Peesay T, Awan S, Thielen P, Breitwieser F,
Johnson PLF, Karig D, Fagan WF, and Bewick S. (2021).
Exploring the functional composition of the human microbiome using a
hand-curated microbial trait database. BMC Bioinformatics, 22:306.
[doi]
- Yiu HH, Schoettle LN, Garcia-Neuer M, Blattman JN, and
Johnson PLF. (2021). Selection influences naive CD8+ TCR-β
repertoire sharing. Immunology, 162:464-75. [doi]
- Zarnitsyna VI, Johnson PLF, Blattman JN, and Antia R.
(2021). Modeling immunopathology during persistent viral infections. In C
Molina-París and G Lythe, editors, Mathematical, Computational and
Experimental T Cell Immunology, pages 109-20. Springer International
Publishing, Cham. ISBN 978-3-030-57204-4
- Weissman JL, Stoltzfus A, Westra ER, and Johnson
PLF. (2020). Avoidance of self during CRISPR immunization.
Trends in Microbiology, 28:543-53. [doi]
- Weissman JL and Johnson PLF. (2020).
Network-based prediction of novel CRISPR-associated genes in metagenomes.
mSystems, 5. [doi]
- Weissman JL, Fagan WF, and Johnson PLF.
(2019). Linking high GC content to the repair of double strand breaks in
prokaryotic genomes. PLoS Genetics, 15:e1008493. [doi]
- Weissman JL, Laljani RMR, Fagan WF, and
Johnson PLF. (2019). Visualization and prediction of CRISPR
incidence in microbial trait-space to identify drivers of antiviral immune
strategy. The ISME Journal, 13:2589-602. [doi]
- Weissman JL, Yiu HH, and Johnson
PLF. (2019). What bacteria do when they get sick. Frontiers
Young Minds, 7. [doi]
- Weissman JL, Fagan WF, and Johnson PLF.
(2018). Selective maintenance of multiple CRISPR arrays across prokaryotes.
The CRISPR Journal, 1:405-13. [doi]
- Loreille O, Ratnayake S, Bazinet AL, Stockwell TB, Sommer DD, Rohland N,
Mallick S, Johnson PLF, Skoglund P, Onorato AJ, Bergman NH,
Reich D, and Irwin JA. (2018). Biological sexing of a 4000-year-old Egyptian
mummy head to assess the potential of nuclear DNA recovery from the most
damaged and limited forensic specimens. Genes, 9. [doi]
- Weissman JL, Holmes R, Barrangou R, Moineau S, Fagan WF,
Levin B, and Johnson PLF. (2018). Immune loss as a driver of
coexistence during host-phage coevolution. The ISME Journal,
12:585-97. [doi]
- Akondy RS*, Johnson PLF*, Nakaya HI, Edupuganti S,
Mulligan MJ, Lawson B, Miller JD, Pulendran B, Antia R, and Ahmed R. (2015).
Initial viral load determines the magnitude of the human CD8 T cell response
to yellow fever vaccination. Proc Natl Acad Sci U S A, 112:3050-5.
[doi]
- Schroeder H, Ávila Arcos MC, Malaspinas AS, Poznik GD, Sandoval-Velasco M,
Carpenter ML, Moreno-Mayar JV, Sikora M, Johnson PLF,
Allentoft ME, Samaniego JA, Haviser JB, Dee MW, Stafford TW, Salas A, Orlando
L, Willerslev E, Bustamante CD, and Gilbert MTP. (2015). Genome-wide ancestry
of 17th-century enslaved Africans from the Caribbean. Proc Natl Acad Sci
U S A, 112:3669-73. [doi]
- Malaspinas AS, Lao O, Schroeder H, Rasmussen M, Raghavan M, Moltke I,
Campos PF, Sagredo FS, Rasmussen S, Gonçalves VF, Albrechtsen A, Allentoft
ME, Johnson PLF, Li M, Reis S, Bernardo DV, DeGiorgio M,
Duggan AT, Bastos M, Wang Y, Stenderup J, Moreno-Mayar JV, Brunak S,
Sicheritz-Ponten T, Hodges E, Hannon GJ, Orlando L, Price TD, Jensen JD,
Nielsen R, Heinemeier J, Olsen J, Rodrigues-Carvalho C, Lahr MM, Neves WA,
Kayser M, Higham T, Stoneking M, Pena SDJ, and Willerslev E. (2014). Two
ancient human genomes reveal Polynesian ancestry among the indigenous
Botocudos of Brazil. Curr Biol, 24:R1035-7. [doi]
- Fu Q, Li H, Moorjani P, Jay F, Slepchenko SM, Bondarev AA, Johnson
PLF, Aximu-Petri A, Prüfer K, de Filippo C, Meyer M, Zwyns N,
Salazar-García DC, Kuzmin YV, Keates SG, Kosintsev PA, Razhev DI, Richards
MP, Peristov NV, Lachmann M, Douka K, Higham TFG, Slatkin M, Hublin JJ, Reich
D, Kelso J, Viola TB, and Pääbo S. (2014). Genome sequence of a
45,000-year-old modern human from western Siberia. Nature,
514:445-9. [doi]
- Johnson PLF, Goronzy JJ, and Antia R. (2014). A population
biological approach to understanding the maintenance and loss of the T-cell
repertoire during aging. Immunology, 142:167-75. [doi]
- Prüfer K, Racimo F, Patterson N, Jay F, Sankararaman S, Sawyer S, Heinze
A, Renaud G, Sudmant PH, de Filippo C, Li H, Mallick S, Dannemann M, Fu Q,
Kircher M, Kuhlwilm M, Lachmann M, Meyer M, Ongyerth M, Siebauer M, Theunert
C, Tandon A, Moorjani P, Pickrell J, Mullikin JC, Vohr SH, Green RE, Hellmann
I, Johnson PLF, Blanche H, Cann H, Kitzman JO, Shendure J,
Eichler EE, Lein ES, Bakken TE, Golovanova LV, Doronichev VB, Shunkov MV,
Derevianko AP, Viola B, Slatkin M, Reich D, Kelso J, and Pääbo S. (2014).
The complete genome sequence of a Neanderthal from the Altai Mountains.
Nature, 505:43-9. [doi]
- Jónsson H, Ginolhac A, Schubert M, Johnson PLF, and
Orlando L. (2013). mapDamage2.0: fast approximate Bayesian estimates of
ancient DNA damage parameters. Bioinformatics, 29:1682-4. [doi]
- Orlando L, Ginolhac A, Zhang G, Froese D, Albrechtsen A, Stiller M,
Schubert M, Cappellini E, Petersen B, Moltke I, Johnson PLF,
Fumagalli M, Vilstrup JT, Raghavan M, Korneliussen T, Malaspinas AS, Vogt J,
Szklarczyk D, Kelstrup CD, Vinther J, Dolocan A, Stenderup J, Velazquez AMV,
Cahill J, Rasmussen M, Wang X, Min J, Zazula GD, Seguin-Orlando A, Mortensen
C, Magnussen K, Thompson JF, Weinstock J, Gregersen K, Røed KH, Eisenmann V,
Rubin CJ, Miller DC, Antczak DF, Bertelsen MF, Brunak S, Al-Rasheid KAS,
Ryder O, Andersson L, Mundy J, Krogh A, Gilbert MTP, Kjær K,
Sicheritz-Ponten T, Jensen LJ, Olsen JV, Hofreiter M, Nielsen R, Shapiro B,
Wang J, and Willerslev E. (2013). Recalibrating Equus evolution using the
genome sequence of an early Middle Pleistocene horse. Nature,
499:74-8. [doi]
- Fu Q, Mittnik A, Johnson PLF, Bos K, Lari M, Bollongino R,
Sun C, Giemsch L, Schmitz R, Burger J, Ronchitelli AM, Martini F, Cremonesi
RG, Svoboda J, Bauer P, Caramelli D, Castellano S, Reich D, Pääbo S, and
Krause J. (2013). A revised timescale for human evolution based on ancient
mitochondrial genomes. Curr Biol, 23:553-9. [doi]
- Johnson PLF, Yates AJ, Goronzy JJ, and Antia R. (2012).
Peripheral selection rather than thymic involution explains sudden
contraction in naive CD4 T-cell diversity with age. Proc Natl Acad Sci U
S A, 109:21432-7. [doi]
- Gargis AS, Kalman L, Berry MW, Bick DP, Dimmock DP, Hambuch T, Lu F, Lyon
E, Voelkerding KV, Zehnbauer BA, Agarwala R, Bennett SF, Chen B, Chin ELH,
Compton JG, Das S, Farkas DH, Ferber MJ, Funke BH, Furtado MR, Ganova-Raeva
LM, Geigenmüller U, Gunselman SJ, Hegde MR, Johnson PLF,
Kasarskis A, Kulkarni S, Lenk T, Liu CSJ, Manion M, Manolio TA, Mardis ER,
Merker JD, Rajeevan MS, Reese MG, Rehm HL, Simen BB, Yeakley JM, Zook JM, and
Lubin IM. (2012). Assuring the quality of next-generation sequencing in
clinical laboratory practice. Nat Biotechnol, 30:1033-6. [doi]
- Johnson PLF, Kochin BF, Ahmed R, and Antia R. (2012). How
do antigenically varying pathogens avoid cross-reactive responses to
invariant antigens? Proc Biol Sci, 279:2777-85. [doi]
- Johnson PLF and Hellmann I. (2011). Mutation rate
distribution inferred from coincident SNPs and coincident substitutions.
Genome Biol Evol, 3:842-50. [doi]
- Johnson PLF, Kochin BF, McAfee MS, Stromnes IM, Regoes RR,
Ahmed R, Blattman JN, and Antia R. (2011). Vaccination alters the balance
between protective immunity, exhaustion, escape, and death in chronic
infections. J Virol, 85:5565-70. [doi]
- Burbano HA, Hodges E, Green RE, Briggs AW, Krause J, Meyer M, Good JM,
Maricic T, Johnson PLF, Xuan Z, Rooks M, Bhattacharjee A,
Brizuela L, Albert FW, de la Rasilla M, Fortea J, Rosas A, Lachmann M, Hannon
GJ, and Pääbo S. (2010). Targeted investigation of the Neandertal genome by
array-based sequence capture. Science, 328:723-5. [doi]
- Green RE, Krause J, Briggs AW, Maricic T, Stenzel U, Kircher M, Patterson
N, Li H, Zhai W, Fritz MH, Hansen NF, Durand EY, Malaspinas AS, Jensen JD,
Marques-Bonet T, Alkan C, Prüfer K, Meyer M, Burbano HA, Good JM, Schultz R,
Aximu-Petri A, Butthof A, Höber B, Höffner B, Siegemund M, Weihmann A,
Nusbaum C, Lander ES, Russ C, Novod N, Affourtit J, Egholm M, Verna C, Rudan
P, Brajkovic D, Kućan Ž, Gušić I, Doronichev VB, Golovanova LV,
Lalueza-Fox C, de la Rasilla M, Fortea J, Rosas A, Schmitz RW,
Johnson PLF, Eichler EE, Falush D, Birney E, Mullikin JC,
Slatkin M, Nielsen R, Kelso J, Lachmann M, Reich D, and Pääbo S. (2010). A
draft sequence of the Neandertal genome. Science, 328:710-22. [doi]
- Reich D, Green RE, Kircher M, Krause J, Patterson N, Durand EY, Viola B,
Briggs AW, Stenzel U, Johnson PLF, Maricic T, Good JM,
Marques-Bonet T, Alkan C, Fu Q, Mallick S, Li H, Meyer M, Eichler EE,
Stoneking M, Richards M, Talamo S, Shunkov MV, Derevianko AP, Hublin JJ,
Kelso J, Slatkin M, and Pääbo S. (2010). Genetic history of an archaic
hominin group from Denisova Cave in Siberia. Nature, 468:1053-60.
[doi]
- Johnson PLF and Slatkin M. (2009). Inference of microbial
recombination rates from metagenomic data. PLoS Genetics,
5:e1000674. [doi]
- Green RE, Malaspinas AS, Krause J, Briggs AW, Johnson PLF,
Uhler C, Meyer M, Good JM, Maricic T, Stenzel U, Prüfer K, Siebauer M,
Burbano HA, Ronan M, Rothberg JM, Egholm M, Rudan P, Brajković D, Kućan
Ž, Gušić I, Wikström M, Laakkonen L, Kelso J, Slatkin M, and Pääbo S.
(2008). A complete Neandertal mitochondrial genome sequence determined by
high-throughput sequencing. Cell, 134:416-26. [doi]
- Johnson PLF and Slatkin M. (2008). Accounting for bias
from sequencing error in population genetic estimates. Mol Biol
Evol, 25:199-206. [doi]
- Briggs AW, Stenzel U, Johnson PLF, Green RE, Kelso J,
Prüfer K, Meyer M, Krause J, Ronan MT, Lachmann M, and Pääbo S. (2007).
Patterns of damage in genomic DNA sequences from a Neandertal. Proc Natl
Acad Sci U S A, 104:14616-21. [doi]
- Cross PC, Johnson PLF, Lloyd-Smith JO, and Getz WM.
(2007). Utility of R0 as a predictor of disease invasion in structured
populations. J R Soc Interface, 4:315-24. [doi]
- Getz WM, Lloyd-Smith JO, Cross PC, Bar-David S, Johnson
PLF, Porco TC, and Sánchez MS. (2006). Modeling the invasion and
spread of contagious disease in heterogeneous populations. In Z Feng, U
Dieckmann, and SA Levin, editors, Disease Evolution: Models, Concepts and
Data Analyses, AMS-DIMACS Series, pages 113-44. American Mathematical
Society, Providence, RI.
- Johnson PLF and Slatkin M. (2006). Inference of population
genetic parameters in metagenomics: a clean look at messy data. Genome
Res, 16:1320-7. [doi]
- Cross PC, Lloyd-Smith JO, Johnson PLF, and Getz WM.
(2005). Duelling timescales of host movement and disease recovery determine
invasion of disease in structured populations. Ecol Lett, 8:587-95.
[doi]
- International Human Genome Sequencing Consortium. (2004). Finishing the
euchromatic sequence of the human genome. Nature, 431:931-45. [doi]
- Bulyk ML, Johnson PLF, and Church GM. (2002). Nucleotides
of transcription factor binding sites exert interdependent effects on the
binding affinities of transcription factors. Nucleic Acids Research,
30:1255-61. [doi]
Software
All software is distributed under the GNU General Public License.
- PIIM: Population Inference In Metagenomics, with recombination (version 2)
This program calculates maximum likelihood estimates of θ=2Nu (where u is the per-site mutation rate) and ρ=2Nc (where c is the per-site rate of initiation of recombination). emt also reproduces the frequency-spectrum functionality from the previous version to estimate R=Nr (where r is the exponential growth rate).
Input data are genome-level population data of variable sample depth and quality (e.g., metagenomic data).
For details on the method, see:
Johnson PLF and Slatkin M. (2009). Inference of microbial recombination rates from metagenomic data. PLoS Genetics, 5:e1000674.
The previous version can be found here.
- adaptivetau
R package for approximating stochastic simulations (continuous-time Markovian processes) that implements the adaptive tau leaping algorithm of Cao et al. (2007), The Journal of Chemical Physics.
Think of differential equations forced to take integer values and allowing for stochastic effects at low numbers. Similar in spirit to GillespieSSA but a bazillion times faster (± a zillion) thanks to being implemented in C instead of pure R.
Download from CRAN.
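To make the "differential equations forced to take integer values" picture concrete, here is a minimal sketch of plain fixed-step tau leaping for an SIR epidemic model. It is an illustration only: it is written in Python rather than R, it does not use or mirror the adaptivetau API, and it uses a fixed step `tau` instead of the adaptive step selection of Cao et al. The model, the function names (`poisson`, `tau_leap_sir`), and the parameter values are all assumptions made for this example.

```python
import math
import random


def poisson(lam, rng):
    """Draw a Poisson(lam) variate.

    Uses Knuth's multiplication method for small means and a rounded
    normal approximation for large means (a simplification for this sketch).
    """
    if lam <= 0:
        return 0
    if lam > 30:
        return max(0, round(rng.gauss(lam, math.sqrt(lam))))
    threshold, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def tau_leap_sir(s, i, r, beta, gamma, tau, t_end, seed=0):
    """Simulate an SIR model with fixed-step tau leaping.

    In each step of length tau, the number of times each transition fires
    is Poisson with mean (rate * tau), so all counts stay integers.
    """
    rng = random.Random(seed)
    t = 0.0
    trajectory = [(t, s, i, r)]
    while t < t_end and i > 0:
        n = s + i + r
        # Clamp to the available individuals; an adaptive scheme would
        # instead shrink tau so that this clamp is rarely needed.
        infections = min(s, poisson(beta * s * i / n * tau, rng))
        recoveries = min(i, poisson(gamma * i * tau, rng))
        s -= infections
        i += infections - recoveries
        r += recoveries
        t += tau
        trajectory.append((t, s, i, r))
    return trajectory


if __name__ == "__main__":
    for row in tau_leap_sir(990, 10, 0, beta=0.5, gamma=0.2, tau=0.5, t_end=10)[:5]:
        print(row)
```

With a handful of infected individuals the epidemic can die out by chance, which a deterministic differential equation never shows. The adaptive algorithm chooses the step size so that leaps stay accurate when some counts are small.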
Useful tools, BibTeX styles, etc.
Sometimes I feel like I spend most of my time shuffling data about and fighting with computers, so I've written many a tool to make my life easier. Perhaps these will be useful to someone else. I use Linux, so most tools will run on Mac without trouble, but Windows could be a headache.
All tools are distributed under the GNU General Public License. Give me a shout if you find a bug or if you find a tool particularly useful. The extent of documentation varies, but everything displays at least a brief usage statement if you run it without parameters.
- FASTA manipulation
- Scripts for manipulating fasta files in descending order of bugfreeness / awesomeness:
- FaIndex.pm -- Perl module that creates an index of sequences in fasta file(s) and uses it to extract subregions. Disk access is via memory mapping, which is extremely fast. Requires the File::Map package from CPAN.
- fa_extract_many -- very quickly extracts regions from fasta files using the above FaIndex.pm module (will look for module in standard directories and in ~/bin).
- fa_wrap -- wrap fasta sequence to specified width
- fa_length -- list sequence ids and lengths
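The index-then-memory-map idea behind FaIndex.pm and fa_extract_many can be sketched as follows. This is not the module's actual code or index format: it is a Python illustration (the real tools are Perl), the function names `build_index` and `extract` are hypothetical, and it assumes every sequence line within a record has the same width except the last.

```python
import mmap


def build_index(path):
    """Map each sequence id to [offset of first base, length, bases per line, bytes per line]."""
    index = {}
    with open(path, "rb") as fh:
        name = None
        offset = 0
        for line in fh:
            if line.startswith(b">"):
                # The id is the first whitespace-delimited word after ">".
                name = line[1:].split()[0].decode()
                index[name] = [offset + len(line), 0, 0, 0]
            elif name is not None:
                entry = index[name]
                bases = len(line.rstrip(b"\r\n"))
                if entry[2] == 0:
                    # Record the line geometry from the first sequence line.
                    entry[2], entry[3] = bases, len(line)
                entry[1] += bases
            offset += len(line)
    return index


def extract(path, index, name, start, end):
    """Return bases [start, end) of sequence `name` (0-based, half-open)."""
    seq_offset, length, line_bases, line_bytes = index[name]
    end = min(end, length)
    if start >= end:
        return ""
    # Translate base coordinates into byte offsets, skipping line terminators.
    first = seq_offset + (start // line_bases) * line_bytes + start % line_bases
    last = seq_offset + ((end - 1) // line_bases) * line_bytes + (end - 1) % line_bases + 1
    with open(path, "rb") as fh, mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        chunk = mm[first:last]
    return chunk.replace(b"\n", b"").replace(b"\r", b"").decode()
```

Because the byte offset of any base follows from simple arithmetic on the index entry, extracting a region touches only the pages of the file that contain it, no matter how large the file is.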
- Improvements to standard bioinformatic tools
- UCSC liftOver is a great tool for mapping coordinates between genome assemblies, but the command line version is ridiculously slow if you have many isolated coordinates. Two scripts help: sortChains performs a one-time, inelegant, slow sort of the over.chain files; you can then use fastLift to perform fast coordinate conversion on the sorted chain file.
- ms patch that:
- outputs the position of segregating sites with higher precision (8 instead of 4 decimal places)
- changes the random number seed to use /dev/random instead of a file. The seed file can be Bad News if you're running in parallel on a shared filesystem.
Apply via patch -p0 < ms.patch from the directory that contains "msdir".
- LDhat patch that:
- adds -oSites and -oLoc command line options to convert (the original hardcodes the filenames "sites.txt" and "locs.txt")
- fixes a 1-byte read off the end of an array in "pairdip.c"
- silences a few compiler warnings
Apply via patch -p0 < LDhat.patch from the directory that contains the "LDhat" directory.
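The seeding issue addressed by the ms patch above can be illustrated with a short sketch. When many parallel jobs read (and rewrite) one seed file on a shared filesystem, two jobs can pick up identical seeds and produce identical simulations; drawing seeds from the operating system's entropy source avoids this. The sketch below is in Python rather than C, the function name `entropy_seeds` is hypothetical, the choice of three 16-bit values is an assumption of this example, and `os.urandom` stands in for reading /dev/random directly.

```python
import os
import struct


def entropy_seeds(count=3):
    """Return `count` 16-bit seeds drawn from the operating system's entropy source."""
    raw = os.urandom(2 * count)
    # Interpret the bytes as little-endian unsigned 16-bit integers.
    return list(struct.unpack(f"<{count}H", raw))


if __name__ == "__main__":
    # Each process gets its own seeds without touching any shared file.
    print(entropy_seeds())
```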
- Flat file manipulation
- FF_Index.pm -- A clever (if I do say so myself) Perl module for indexing flat files for quick data retrieval. Crucially, this is easy to use and creates a separate index file instead of mucking about with the original file.
- groupby -- approximates "group by" functionality of SQL, but takes tab-delimited flat files with one line per record (must already be sorted according to grouping keys).
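The reason groupby needs pre-sorted input is that it can then stream through the file and only ever compare each record with the one before it. A minimal Python sketch of that idea follows; it is not the script's actual implementation or command-line interface, and the function name `group_by` and its parameters are hypothetical.

```python
import io
import itertools


def group_by(lines, key_cols, value_col, aggregate=sum):
    """Yield (key, aggregate(values)) for each run of consecutive rows sharing a key.

    `lines` is any iterable of tab-delimited text lines, one record per line.
    Rows with the same key must be adjacent (i.e. the input must be sorted
    by the grouping columns), because only neighbouring rows are compared.
    """
    rows = (line.rstrip("\n").split("\t") for line in lines)

    def key_of(row):
        return tuple(row[c] for c in key_cols)

    for key, group in itertools.groupby(rows, key=key_of):
        yield key, aggregate(float(row[value_col]) for row in group)


if __name__ == "__main__":
    data = io.StringIO("chr1\tgeneA\t2\nchr1\tgeneA\t3\nchr2\tgeneB\t5\n")
    for key, total in group_by(data, key_cols=[0, 1], value_col=2):
        print("\t".join(key), total)
```

Streaming this way keeps memory use constant regardless of file size. The cost is that unsorted input silently produces one output row per run of equal keys rather than one per distinct key.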
- Queueing scripts
- Condor provides an elegant queueing system for running programs on a cluster of machines (either dedicated compute nodes or temporarily unused desktops). However, the supplied interface makes submitting jobs a pain. Submitting should be as easy as supplying the exact same command line that you would use if executing locally, i.e.:
./my_program -f some_options > my_output
replaced by
qsub './my_program -f some_options > my_output'
I have a suite of scripts that does exactly this for Condor.
- Random
- devEMF is an R package that provides an EMF (enhanced metafile) graphics driver to make producing EMF graphics as easy as EPS/PDF/PNG/etc. EMF is a vector-based format, so it will always look good no matter how much you enlarge it. I wrote this driver out of frustration with the lousy importation of EPS graphics in both LibreOffice and Microsoft Office (both import EMF files seamlessly).
- BibTeX style files (bst) for biology journals
People
Philip Johnson (PI)
Associate professor in the Department of Biology, excited about all the projects described on this page and more that he hasn't found the time to work on. His background is a mix of biophysics, computational biology, theoretical population genetics and mathematical immunology. For the gory details, see his cv.
Flannery McLamb (CBBG grad student)
Co-advised by Elissa Lei at NIH-NIDDK. Applying machine learning to understand how cis-regulatory elements interact with chromatin topology, with the aim of predicting gene expression.
Guillermo Hoffmann Meyer (BEES grad student)
Co-advised by Michael Braun at the Smithsonian National Museum of Natural History. Interested in hybridization and effects of habitat fragmentation.
Lab Alums
- Ridwan Bello (PhD summer 2025)
- Shauna Rasband (PhD summer 2025)
- Arvind Jaya Shankar (MS summer 2025)
- Hao Yiu (PhD spring 2025)
- Thomas Pranzatelli (PhD fall 2024), Postdoc at UCSD
- Wei Xiao (PhD spring 2024), ORISE Fellow at FDA
- Nick Rachmaninoff (PhD fall 2022), Data Scientist at GSK
- JL Weissman (PhD fall 2019), Assistant Professor at Stony Brook University
- Aidan Bissell-Siders (undergrad), 2017-2019
- Vinay Velovolu (undergrad), 2018-2019
- Rohan Laljani (undergrad), 2017-2019
- Brian Liu (undergrad), 2016-2017
Interested in applying quantitative methods to biological problems?
Contact Philip! Potential graduate students should look into applying to the BEES or CBBG concentration areas within the BISI graduate program.
Contact
plfj
+1 301 405 6176
Physical location:
Mailing address:
Department of Biology
University of Maryland
1204 Biology-Psychology Bldg
4094 Campus Drive
College Park, MD 20742
Updated Mar 2025.