I have always been curious about the fundamentals of how the world works. From a young age, I found it deeply pleasing that there were logical, undeniable truths that governed how objects interacted and how mathematical systems could be solved. That’s why I was naturally attracted to physics, and why I chose to study it when I started my undergraduate career.
For most of my younger years, while school was great (and I was good at it), it was never enough to satisfy me on its own. Studying meant being taught information that someone already knew... but I wanted to learn and prove things that no one knew the answers to. I chose to go to Ohio State University for undergrad for many reasons (location, capacity to study abroad, academic diversity), but the most important was that I wanted to get involved in research.
So when I got there, I followed the advice of some older peers. I aced my first few physics classes and read up on the research conducted by the professors who taught them, one of whom caught my eye. His work applied techniques from statistical physics to solve problems in biological systems. I was immediately intrigued, intimidated, and impressed by this problem statement alone. I loved the idea that you could merge two seemingly disparate fields to develop creative solutions in each. Eager to get started in research, I reached out to this professor, Dr. Ralf Bundschuh, to learn more about his work. Ralf was gracious enough to take me under his wing, along with his collaborator, Dr. Pearlly Yan.
The next semester, in my second year, I was writing my first lines of bioinformatics code. My first project was to develop a computational tool to align sequences of DNA over fusion junctions in the genome. Easy, right? It might have been easier if I had ever written code outside of Codecademy. I spent the entirety of my first semester with the lab learning Python in order to write the tool, and honestly, I struggled. But I was determined to prove myself to the group. After months of sweat, googling, and guidance from fellow lab members, I had the tool working beautifully and I had a lot to show for it. In those four months, I fought through my first bouts of imposter syndrome, I learned the basics of how to write code, and I proved to myself that I could apply critical thinking to come up with creative solutions outside of the classroom. That is also when I caught the bug for programming and began to cultivate my interest in computer science; more on that later.
With the foundations of the tool in place, I began to take on more responsibilities with the lab. The lab works with teams of biologists who want to draw insights about the genetics of given patient or animal cohorts. This involves an interdisciplinary workflow that starts with lab personnel who perform genomic sequencing of tissue samples and ends with people like me, computational scientists, applying statistical techniques to analyze the resulting nucleotide sequences. The biologists often want details about genes that change under given circumstances to learn more about the organism, or about certain diseases like cancer.
I spent time with lab members learning how to run the computational workflows that we had in place to produce these results, and eventually started to develop workflows on my own. I even built one around that fusion tool I mentioned earlier, the one containing my first lines of code. During my time in charge of those workflows, I had the opportunity to work with several different research teams, each studying unique problems in different organisms (human, dog, rat, mouse, etc.). Each study brought its own challenges requiring new tools, statistical approaches, visualizations, and general bash scripting magic to produce results meaningful to the biologists, and I thrived here. Each step of a study was its own problem that needed a creative approach, which I found deeply engaging. And further, though it was nerve-racking, I loved getting to present my findings to researchers on other teams (sometimes on the other side of the country), knowing that my solutions would ultimately aid their work. All of this added up to invaluable experience working on interdisciplinary teams to solve complex problems, such as those in the cancer research space. I also got comfortable with batch processing large data sets, as most of this work was done with computing time at the Ohio Supercomputer Center.
In 2016, Dr. Bundschuh and Dr. Yan encouraged me to step it up a notch and apply for the Pelotonia undergraduate research fellowship. Pelotonia is a grassroots cancer fundraiser in Columbus, OH for which 100% of the proceeds support research at the OSUCCC - James research hospital. For this I wrote a proposal to computationally correct for degradation effects in tissue samples stored using low quality methods, and I was awarded the fellowship. So for the next year I led this project, under my advisors, and got my first experience taking a research project through its entire life cycle -- from proposal to final report. My favorite moment from the project came early on. I was performing an analysis to look for an effect that I was confident would be expressed in our low quality samples (preferential C -> T mutation). However, when I completed the analysis, the effect was, perplexingly, utterly invisible. For a few hours I was dejected. My hypothesis relied heavily on this result... it felt like my fellowship was over before it had even started. But I'm proud of what I did next. Again determined to prove myself as a researcher, I went back to the drawing board and tried to find where I went wrong. I dug through the literature for a few days to interpret this outcome and found a few important insights that led me to devise a tweak to my analysis. I needed to look at mutations of bases in given contexts (i.e. what role neighboring bases played in the mutation). Sure enough, with my new analysis the effect exploded off the page (I found preferential CG -> TG mutation). I'll never forget how satisfying that feeling of discovery was.
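For the curious, the context idea above can be illustrated with a toy example. This is only a minimal sketch, not the analysis code from the project: the function name `c_to_t_rate_by_context`, the two short sequences, and the choice to look only at the single base following each C are all assumptions made for illustration. Real data would come from aligned sequencing reads rather than hand-written strings.

```python
from collections import Counter


def c_to_t_rate_by_context(reference, read):
    """Return the C->T mismatch rate for each two-base context (C plus the next base).

    Both inputs are assumed to be already aligned, equal-length strings.
    """
    if len(reference) != len(read):
        raise ValueError("sequences must be aligned to the same length")

    totals = Counter()   # times each context (e.g. "CG") appears in the reference
    mutated = Counter()  # times the C in that context reads as T

    # Stop one base early so every C has a following neighbor.
    for i in range(len(reference) - 1):
        if reference[i] != "C":
            continue
        context = reference[i:i + 2]
        totals[context] += 1
        if read[i] == "T":
            mutated[context] += 1

    return {context: mutated[context] / totals[context] for context in totals}


# Hypothetical toy sequences: the C->T changes occur only where C is followed by G.
reference = "ACGTCATCGGCCAT"
read = "ATGTCATTGGCCAT"
rates = c_to_t_rate_by_context(reference, read)
for context, rate in sorted(rates.items()):
    print(context, rate)
```

Pooling every C together would dilute the signal in this toy case (2 of 5 sites changed), while splitting by neighbor shows it concentrated entirely in the CG context.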
I wish I could say it was smooth sailing from there -- it would certainly make a better story. But the truth is that the rest of the project was as much of a rollercoaster of roadblocks begetting creative hurdles as the start was. I spent the next few months working with Dr. Bundschuh to devise some crafty software adaptations to correct for the effect, but ultimately, the final benefit of the correction ended up being too small to be useful to the greater scientific community. While this outcome was disappointing, it taught me the most valuable lesson that I learned as a young researcher: no matter how hard you try, how much literature you read, or how many approaches you take, sometimes you are just going to fail. You have to pick yourself up, carefully note what you learned from your failure, and move on.
So I moved on. Remember that fusion tool that I keep mentioning, the one I wrote my first lines of code for? Dr. Bundschuh was keen on seeing us build it out further. So, still under his guidance, I spent some time building a user interface for it so that it could be made available to the public as a web application. I crawled data repositories to find good cases to demonstrate the tool's utility. And finally, I used this to draft a manuscript for the tool, which we sent out for first review in May and resubmitted with responses to reviewer comments in October. I was even invited as one of two undergraduate speakers to present the tool, FuSpot, at the annual Pelotonia research symposium this past month.
So what’s next? As I mentioned, along the road of my research, computer science caught my eye -- and now it has truly enveloped me. I added it as a major to complement my physics degree and became deeply intrigued by AI techniques. These techniques have the potential, if applied in creative ways, to dramatically affect social health and well-being. And the fact is, almost every person in the US now walks around with immense computing power in their pocket, capable of hosting intelligent systems. This was the motivation behind my first computer science research project, which is also my undergraduate thesis.
The idea is to read motion sensor data from smartphones and use it to detect heavy drinking events by training a classifier hosted on the phone. If you are interested, you can read more details on my projects page. To gather the data for the project, I collaborated with Dr. John Clapp and Dr. Danielle Madden from the OSU school of Social Work (both now at USC), who carry out a yearly study on student drinking habits, tracking their alcohol intake. When I approached them with my project, they graciously agreed to let me install an app on their participants' phones that would store motion sensor data. I then worked with my good friend and hacker, Dalton Flannagan (now at Facebook), to develop that app for Android and iOS to store streams of accelerometer data and send them out to our server. To receive the data, we developed a server running InfluxDB that was capable of handling 20 concurrent data streams and was easily queryable, so we could create heads-up displays to monitor the streams through the day of the study. Now I am working with Dr. Kevin Passino and Dr. Arnab Nandi to process the data and build a classifier on it. Currently, I am working with MATLAB's signal processing toolbox to clean and segment the data. After that, I'll be using some interesting results from the literature to guide my feature extraction. When all is said and done, I'll be back here to post the results and a link to where you can download the trained application.
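To give a feel for what "segment the data" and "feature extraction" can mean here, below is a minimal sketch in Python. It is not the MATLAB pipeline used in the project. The window size, step, the choice of mean and standard deviation of acceleration magnitude as features, and the names `magnitude` and `window_features` are all assumptions picked for illustration.

```python
import math
import statistics


def magnitude(sample):
    """Overall acceleration magnitude of one (x, y, z) accelerometer sample."""
    x, y, z = sample
    return math.sqrt(x * x + y * y + z * z)


def window_features(samples, window_size, step):
    """Slide a fixed-size window over the stream and summarize each window.

    Returns one dict per window with the window's start index and the mean
    and population standard deviation of the acceleration magnitude.
    """
    magnitudes = [magnitude(s) for s in samples]
    features = []
    for start in range(0, len(magnitudes) - window_size + 1, step):
        window = magnitudes[start:start + window_size]
        features.append({
            "start": start,
            "mean": statistics.fmean(window),
            "std": statistics.pstdev(window),
        })
    return features


# Hypothetical stream: four "still" samples followed by four "moving" samples.
samples = [(0.0, 0.0, 1.0)] * 4 + [(3.0, 4.0, 0.0)] * 4
feats = window_features(samples, window_size=4, step=2)
for row in feats:
    print(row)
```

Overlapping windows (a step smaller than the window size) are a common choice because a short event that straddles a window boundary still falls fully inside a neighboring window. A classifier would then be trained on rows like these rather than on the raw stream.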
| BMC Medical Genomics | Manuscript (Methylation) | Rejected |
| Thyroid | Manuscript (PTC Fusion) | Accepted (4/12/18) |
| Pelotonia Fellowship Program | Proposal (DNA Degradation) | Awarded |
| Rustbelt RNA Conference | Poster (FuSpot) | Accepted |
| BMC Genomics | Manuscript (FuSpot) | Accepted (1/17/18) |
| AI for Social Health |
| OSU CIS | Undergraduate Thesis (Sobriety Tracker) | Passed; 1st Place Denman Winner |