Monday, August 30, 2010

NGS viewers reviewed

I gathered up some of the recent free next generation sequence viewers that were capabale of viewing BAM files - and put each through the motions with a few BAM files and reference sequences of various sizes. While there are some great ideas and several choices to be found along the feature spectrum, I think we are still in the dark ages with this stuff. No viewer has really been able to entirely combine usability with performance and analysis capabilities, let alone extensibility and web connectivity.

tview
My take: tview is the barebones, text-rendered viewer that is included with Samtools. People who favor this as their BAM viewer probably think Vim is too polished. Even the very limited navigation is remarkably unituitive (goto a coordinate requires chr:position even if you have just one chromosome, no errors are displayed if you forget chr).
Most resembles which video game: Oregon Trail
Good standout feature: command line access
Bad feature: text display

BamView
My take: Bamview is a wicked fast simple BAM file viewer. It doesn't have much in the way of features, but for cursory examination of BAM files it is more palatable than tview.
Most resembles which video game: Burgertime
Good standout feature: strand split screen
Bad feature: drag selecting a region turns it red, and umm... that's all it does

GenoViewer
My take: The only NGS viewer endorsed by Speak the Hungarian rapper, unfortunately this recent entrant leaves a lot to be desired in terms of performance with large files. GenoViewer is very hard on the eyes - the indiscriminate use of primary colors looks like a kid somehow vomited up the ball pit at Chuck E. Cheese.
Most resembles which video game: Centipede
Good standout feature: promises that "You will not get lost in the details, and can easily figure out the true meaning behind the data. Guaranteed."
Bad feature: graphics

MagicViewer
My take: MagicViewer has come a long way since its initial release last December. With its interesting pie-chart icon renderings of SNP purity, decent treatment of annotation tracks, and improved performance, MagicViewer might soon be a contender to Tablet in the midweight category. The navigation is workable but takes some getting used to - it's unclear when to use scroll bars vs. arrows
Most resembles which video game: Moon Patrol
Good standout feature: primer design tool
Bad feature: some regions are simply not visible - no reference, no reads

Tablet
My take: Hands down the most attractive of the viewers, Tablet is aimed at fostering a delicate balance between performance, features, and aesthetics. Tablet comes with a suite of read views - Packed, Stacked, and Classic - to suit both young children and elderly scientists alike. GFF feature files can be loaded but they appear to merely serve as position indices.
Most resembles which video game: SimCity
Good standout feature: interface
Bad feature: read insertions not displayed correctly



IGV
My take: The Integrative Genomics Viewer is a serious tool for exploring and analyzing large datasets. In addition to viewing, IGV is designed to allow users to extract the kind of hard publishable data that has typically been the domain of Bio* scripts. Like the true product of an Ivy League education, IGV can appear aloof and arrogant to newcomers. The viewer will let you load BAM files and other annotation tracks that have nothing to do with the reference without comment or guidance, then require an extra unitutive click to actually generate a view. While not the easiest or even the best in performance, in terms of generating real queries on your data from the viewer there is nothing comparable.
Most resembles which video game: Dig Dug
Good standout feature: analysis tools
Bad feature: throws a hissy fit if it cannot connect to home server


gbrowse2
My take: gbrowse2 is the AJAXified protege to the venerable generic genome browser. Samtools integration is a recent addition to this highly extensible platform, which has been used for years to display everything from large genomes to small sequencing projects. Of all the viewers here, GB2 provides the best set of visualization tools, such that virtually any biological information that can be rendered linearly has been done so as gbrowse tracks. The lone true web application, gbrowse genomic positions can be hyperlinked or even snapshot-embedded on the web. The web provides the best platform to share visual genomic data among several users. However, a gbrowse2 installation with BAM tracks can be a massive pain to install, configure, and debug ("landmark chrI not found" is the most popular google search in all of bioinformatics). Novices can expect a minefield of historical gotchas and arcane conventions ("Name=" not "name=" field in gff3, bp_seqfeature_load instead of bp_load_gff for gff3, no validator for conf files), and even experienced users are often baffled by cryptic errors that pop up in server logs.
Most resembles which video game: Sissyfight 2000
Good standout feature: hyperlinks
Bad feature: setup

Wednesday, August 18, 2010

My thoughts on the acquisition of Ion Torrent by Life Technologies

Yesterday it was announced that Ion Torrent, makers of the Ion Personal Genome Machine (PGM) sequencer, would be acquired by Invitrogen ABI Life Technologies for $375M in cash and stock, with the possibility of another $350M if various milestones are met.

Gregory Lucier, chairman and CEO of Life Technologies, said, "We believe Ion Torrent's technology will represent a profound change for the life sciences industry, as fundamental as the one we saw with the introduction of qPCR." That analogy might not sound earth-shattering, but I suspect he is using qPCR as an example because it is one of the few molecular biology (as opposed to biochemical) techniques regularly used in clinical settings.

The PGM is a unique machine because it is the first second generation sequencer (tentative specs: ~3 million 200bp reads at $500 1hr run, $50k machine) that could be conceivably leased by a small clinical testing facility, like the ones fed by those LabCorp boxes you see scattered all over strip malls. Bear in mind even a capillary-based sanger sequencer like the ABI3730xl costs a whopping $375,000, which comes as a shock to those of us who mostly work with next-generation sequencing data. You can see not only why the PGM is really a game changer, but how it might fit into Life Technologies offerings.

Exactly which clinical diagnostics are, or will be, suited for this machine are unclear. With the right foolproof software this could replace a lot of PCR and microarray-based tests. Read lengths and throughput are bound to increase just as they did with Solexa. Future applications like tumor sequencing and 16S rRNA microbiome sequencing have not even entered into medical practice yet.

The key to Life Technology succeeding in these areas will likely come down to non-technical challenges:
  • Getting FDA approval for the PGM as a medical device and for various protocols based on the PGM as approved clinical tests. Life Technology has experience with this process. Ion Torrent clearly does not.
  • Convincing insurance companies and HMOs that these tests are cost-effective diagnostics. When you consider the exorbitant cost of other tests - e.g. several MRIs over a course of chemotherapy - this might not be such hard sell.
  • Developing a business model that will allow small clinics to lease the machine for little or even no cost provided they agree to purchase a minimum amount of consumables. Life Technologies will likely market the PGM to smaller labs who themselves will inevitably face competition from larger dedicated sequencing centers trying to centralize this type of work using bigger machines from Illumina or PacBio (or even LT's SOLiD).

Although these obstacles seem daunting, I have sat through academic departmental meetings where very knowledgeable sequencer salespeople were invited back multiple times only to be jerked around as the faculty hemmed and hawed over whether this was the right $400k machine to buy at the time. That was undoubtedly an expensive and frustrating sales process for Solexa and 454. With the PGM, Life Technologies can pitch this much cheaper machine to an albeit different group of skeptics, clinical lab managers, but one with a clear bottom line in mind.