Guest post by Sandra Smieszek
The creator of the Basic Local Alignment Search Tool (BLAST), an eminent bioinformatics pioneer and scholar, needs little introduction. Stephen Altschul (pictured right) graduated summa cum laude from Harvard University and holds a Ph.D. from MIT, both in mathematics. We all know what BLAST can do for us, yet the real interest lies in how it originated and, even more, in the man behind the scenes. It is my great pleasure to introduce Stephen Altschul, who will tell us not only the story of his algorithms but also that of the explosive growth of bioinformatics over the past decade.
SS BLAST was published in the Journal of Molecular Biology in 1990. Since then it has been cited more than 43,568 times. How does that feel?
Certainly accomplished. It was designed to be faster than FASTA at finding very strong similarities, so it was something of a surprise that it also performed as well as it did at finding weak ones.
SS What influences directed you to your specific area of research? Who influenced your scientific thinking early in your career, and how?
After graduating, I spent a lot of time reading about potential applied mathematical problems in biology. Among the inspirational books I read was a textbook entitled Time Warps, String Edits, and Macromolecules: The Theory and Practice of Sequence Comparison by Sankoff and Kruskal. I also read The Double Helix by James Watson. I travelled a lot to conferences, yet if we speak of individuals who had a particular impact, David J. Lipman, the present director of NCBI, was my great inspiration. That is the route I took from mathematics per se to the world of bioinformatics.
SS What scientific discovery over the past couple of years had a major impact upon you?
The most exciting discovery I was involved with as it unfolded was certainly the characterisation of BRCA1 in the early 1990s. It was a perfect example of applying sequence alignment tools to the discovery of functional motifs, in this case those of BRCA1. I want to credit Peer Bork and Eugene Koonin; together we mapped out the functional motifs. BRCA1 was partitioned into globular and non-globular domains. We noticed a similarity between BRCA1 and the protein designated 53BP1, which had been identified by its ability to bind p53. The other hits included KIAA0170 and RAD9. The probability exceeded 87% that a pattern as strong as the previously noted ‘granin motif’ would be shared by chance between a random sequence as long as BRCA1 and the then-extant motif database, which lends no statistical support to the relevance of that motif. The C-terminus of BRCA1 is now known to contain two 95-residue BRCT domains, which are also found in many other proteins involved in DNA repair and cell cycle regulation. The crystal structure was determined later. It is not only the story of the characterisation of one of the most important tumor suppressor genes in cancer, but also the story of how well-applied statistics can separate true-positive domains of interest from spurious ones.
SS What was the most difficult stage in your career?
I guess the most difficult stage was right after graduation, when I was looking for applied problems to work on, but it did not last long. I ended up working in a ‘lucky field’ – one that has grown rapidly over the past decade.
Certainly beneficial for society; even I sometimes benefit from online lectures and series. Nevertheless, the long-term consequences may be ambiguous.
SS “Reductionism, as a paradigm, is expired, and complexity, as a field, is tired. Data-based mathematical models of complex systems are offering a fresh perspective, rapidly developing into a new discipline: Network Science.” (Barabási, 2012) Do you subscribe to that view?
It’s difficult for me to say ‘no’, although it is not my field of expertise. It certainly sounds interesting at first glance.
SS Who should and will fund future bioinformatics research? What is the interplay between government and private funding, and between commercial and charitable donations?
(With laughter) I might not be the right person to ask, as I have been lucky enough never to have had to apply for grants.
SS What ongoing ethical dilemmas is society facing in the light of present technological advances?
In fact we have a lot of such conversations here at NIH. We are facing a sort of Wild Duck dilemma of whether attaining truth at all costs is the desired destination (I am referring to the conflict between Gregers’ militant idealistic opinions and Relling’s more worldly point of view in Ibsen’s play The Wild Duck). It remains to be seen if the truth will out-compete the ruins that may be caused by abuse of the system.
SS You have monitored the rapid growth and expansion of the field. What do you think is the next big route bioinformatics science will take?
The term ‘bioinformatics science’ currently encompasses a very large range of disciplines, which deal with data as disparate as medical records, literature, sequences, expression patterns, mass spectroscopy, protein interactions, etc. For the part of bioinformatics that has focused on research in molecular biology, a challenge will be to create tools that are reliable and scalable enough to be useful in clinical practice.
SS What advice would you give students starting their bioinformatics careers?
If you come from the fields of computer science, math, or physics, learn as much biology as you can. If you come from any field, learn a lot of statistics.
SS Please provide us with anything that comes to your mind when you hear the following:
- Extremely high throughput … faster algorithms
- Tool you are most proud of … PSI-BLAST, which for the first time made “protein profile” searches accessible to the non-expert.
- Scientific superhero … Mendel and Turing
- Often read … the newspaper
- If you had a billion dollars to fund research or charity, where would it go? … It is not my professional field, but I am most concerned with the degradation of the planet’s environment. Many approaches have been tried to address this increasingly urgent problem, but ground continues to be lost.
Image credit: ISMB