Wednesday, 1 March 2017

Extracting antigen-specific TCRs from VDJdb on the command line

My previous lab and I recently published a review on the techniques and possibilities of analysing TCR repertoire data produced from high-throughput sequencing.

By and large I'm very pleased with how it came out (and hope it will prove useful!), apart from a couple of missed references and one very unfortunate mix-up regarding accessibility.

One thing I'm particularly pleased we included (despite the lack of published descriptions as yet) is the pair of manually curated TCR databases that have recently emerged: VDJdb and McPAS-TCR, in which you can find a small (but growing) host of TCRs of known specificity and/or disease association. We thought it was important to get these out there as soon as possible, as this is a rapidly changing field that sorely needs such efforts at standardisation and resource development.

With that in mind, I've been playing around with both of these, and thought I'd share some of the bare bones of the bash code I've been using to pull out sequences related to epitopes I'm interested in. Here is my quick vignette using VDJdb to pull out HIV-reactive TCR sequences - and even then just the fields of the database I'm interested in - using basically just default terminal commands.
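
To give a flavour of the kind of filtering involved, here's a minimal sketch run on a toy table rather than the real database. Everything specific in it is an assumption made for illustration: the file name `toy_vdjdb.tsv`, the helper `extract_by_species`, the column names (`gene`, `cdr3`, `antigen.epitope`, `antigen.species`), the species label `HIV-1` and the placeholder rows. Check the header of whatever VDJdb export you actually download before borrowing any of it.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Build a toy tab-separated table. The column names and every row are
# placeholders invented for this example, not real VDJdb records.
printf '%s\t%s\t%s\t%s\n' \
  gene cdr3 antigen.epitope antigen.species \
  TRB PLACEHOLDER_CDR3_A PLACEHOLDER_EPITOPE_1 HIV-1 \
  TRA PLACEHOLDER_CDR3_B PLACEHOLDER_EPITOPE_2 HIV-1 \
  TRB PLACEHOLDER_CDR3_C PLACEHOLDER_EPITOPE_3 CMV \
  > toy_vdjdb.tsv

# Hypothetical helper: keep rows whose antigen.species field equals $2,
# and print only the gene, cdr3 and antigen.epitope columns.
# Columns are looked up by header name, so their order in the file
# does not matter.
extract_by_species() {
  awk -F'\t' -v OFS='\t' -v sp="$2" '
    NR == 1 {
      # Map each header name to its column index, then emit a new header.
      for (i = 1; i <= NF; i++) col[$i] = i
      print "gene", "cdr3", "antigen.epitope"
      next
    }
    $(col["antigen.species"]) == sp {
      print $(col["gene"]), $(col["cdr3"]), $(col["antigen.epitope"])
    }
  ' "$1"
}

extract_by_species toy_vdjdb.tsv 'HIV-1'
```

Looking columns up by name rather than by number is a deliberate choice here: it means the same snippet keeps working if a later release of a database adds or reorders fields, provided the names themselves stay the same.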