> >welcome to my personal website!

James Lee

Building Tools for Biological Discovery

About Me

I'm a researcher with expertise in Next-Generation Sequencing and cancer immunology, with a strong track record of developing novel methodologies and leading high-impact projects. I love developing platforms and tools to help researchers understand complex systems and diseases, using single-cell technologies and statistical modeling.

  • Research Translation: Wet-lab experience that informs how I interpret model outputs to prioritize CRISPR targets and drug combinations.
  • ML Architecture Design: Learning to build transformer and graph neural network architectures that denoise and align perturbation data across patient cohorts.
  • Research Software Development: Developing robust, well-documented open-source software to facilitate reproducible analysis.

Featured Projects

Computational Biology

Tools for Biologists

Single-cell analysis tools & End-to-end ML platforms

cPerturb-CMap

Python toolkit for linking single-cell CRISPR screens to large drug-response datasets (Connectivity Map) to suggest drug repurposing strategies.

  • Uses deep learning to compare gene-editing and drug screens across tens of thousands of single cells.

CellJEPA

Investigation of Joint-Embedding Predictive Architectures (JEPA) for single-cell omics, focused on perturbation-response prediction.

  • Benchmark-driven study of when JEPA-style learning helps in single-cell perturbation prediction.
  • Targets latent-state prediction with strong baseline comparisons.
JEPA Single-cell

GPT-cell-annotator

AI assistant that annotates single-cell RNA-seq clusters.

  • Builds prompts from marker genes, QC summaries, and ontology terms so the model sees the same context a human annotator would.
  • Electron/React interface on top of a Python backend that writes annotations directly into your AnnData object.
Python React Electron LLM
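A minimal sketch of the prompt-assembly idea described above, not the project's actual code. The function name `build_annotation_prompt`, its parameters, the prompt layout, and the toy cluster values are all assumptions made for illustration. It only builds the text; calling a model and writing results into an AnnData object are left out.

```python
from __future__ import annotations


def build_annotation_prompt(
    cluster_id: str,
    marker_genes: list[str],
    qc_summary: dict[str, float],
    ontology_terms: list[str],
    max_markers: int = 10,
) -> str:
    """Assemble a plain-text prompt for one cluster (hypothetical layout)."""
    # Keep only the top-ranked markers so the prompt stays short.
    markers = ", ".join(marker_genes[:max_markers])
    # One bullet per QC metric and per candidate ontology term.
    qc_lines = "\n".join(f"- {name}: {value:g}" for name, value in qc_summary.items())
    candidates = "\n".join(f"- {term}" for term in ontology_terms)
    return (
        f"Cluster: {cluster_id}\n"
        f"Top marker genes: {markers}\n"
        f"QC summary:\n{qc_lines}\n"
        f"Candidate cell ontology terms:\n{candidates}\n"
        "Choose the best-matching term and briefly justify it."
    )


# Toy inputs, invented for this example only.
prompt = build_annotation_prompt(
    cluster_id="3",
    marker_genes=["CD3E", "CD8A", "GZMB"],
    qc_summary={"n_cells": 412, "median_genes": 1850, "pct_mito": 4.2},
    ontology_terms=["CD8-positive, alpha-beta T cell", "natural killer cell"],
)
print(prompt)
```

Keeping the prompt builder a pure function of the cluster summary makes it easy to inspect exactly what context the model receives for each cluster.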
Finance & Markets

Biotech/Pharma Finance

Biotech Earnings NLP, Healthcare ETF Research, etc.

Biotech Earnings Call NLP

Pipeline to parse biotech earnings calls, extract catalysts, and surface sentiment/snippets for PM briefings.

  • Ingests call transcripts and segments by speakers/sections.
  • Pulls catalysts (trials, approvals, guidance) with sentiment flags.
FinBERT Python
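The two steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the pipeline itself: it assumes transcripts use `Speaker Name: text` lines, and the names `Segment`, `segment_by_speaker`, `find_catalysts`, and the keyword lists are hypothetical. Sentiment scoring (for example with FinBERT) is deliberately omitted.

```python
from __future__ import annotations

import re
from dataclasses import dataclass

# Assumed transcript convention: a turn starts with "Speaker Name: ...".
SPEAKER_LINE = re.compile(r"^(?P<speaker>[A-Z][\w .'-]{0,60}):\s*(?P<text>.*)$")

# Hypothetical keyword lists; a real pipeline would need something more careful.
CATALYST_KEYWORDS = {
    "trial": ("phase 1", "phase 2", "phase 3", "trial", "readout"),
    "approval": ("approval", "fda", "pdufa"),
    "guidance": ("guidance", "outlook"),
}


@dataclass
class Segment:
    speaker: str
    text: str


def segment_by_speaker(transcript: str) -> list[Segment]:
    """Split a transcript into speaker turns; unlabeled lines extend the last turn."""
    segments: list[Segment] = []
    for raw in transcript.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = SPEAKER_LINE.match(line)
        if match:
            segments.append(Segment(match["speaker"], match["text"]))
        elif segments:
            segments[-1].text = f"{segments[-1].text} {line}".strip()
    return segments


def find_catalysts(segment: Segment) -> list[str]:
    """Return the sorted catalyst categories whose keywords appear in the segment."""
    lowered = segment.text.lower()
    return sorted(
        category
        for category, keywords in CATALYST_KEYWORDS.items()
        if any(keyword in lowered for keyword in keywords)
    )


# Toy transcript, invented for this example only.
transcript = (
    "Operator: Welcome to the call.\n"
    "Jane Doe: Our Phase 3 trial readout is expected in Q4.\n"
    "We are also raising guidance.\n"
)
segments = segment_by_speaker(transcript)
for seg in segments:
    print(seg.speaker, find_catalysts(seg))
```

Note that the speaker pattern would also match an ordinary sentence that begins with a capitalized phrase and a colon, so real transcripts likely need a stricter rule or a known speaker list.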

Terminal.LLM

Streamlit-based mini Bloomberg Terminal that fetches near-live prices and headlines and generates a Morning Debrief using Gemini or OpenAI.

  • Pulls a structured context snapshot (market tape + headlines) via Yahoo Finance (yfinance).
  • Displays retrieved/as-of timestamps so summaries stay grounded to the latest available data.
  • Follow-up chat uses the same provider/model as the debrief (Gemini or OpenAI).
Python LLM API

Health ETF Quant

Regime-aware XBI vs XPH spread plus momentum rotation across healthcare subsectors with clear risk notes and benchmarks.

  • Strategy 1: macro regime classifier gates a biotech vs pharma long/short spread.
  • Strategy 2: cross-sectional momentum with volatility targeting across XLV, XBI, XPH, IHF, IHI.
Python Backtesting ETF
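For readers unfamiliar with Strategy 2, here is a small sketch of cross-sectional momentum ranking followed by volatility targeting. It is a generic textbook-style illustration, not the backtest code: the function names, the lookback, the 252-period annualization, the leverage cap, and the toy prices are all assumptions.

```python
from __future__ import annotations

import math
import statistics


def trailing_return(prices: list[float], lookback: int) -> float:
    """Simple return over the last `lookback` periods."""
    return prices[-1] / prices[-1 - lookback] - 1.0


def pick_momentum_leaders(
    price_history: dict[str, list[float]], lookback: int, top_k: int
) -> list[str]:
    """Rank tickers by trailing return and keep the strongest `top_k`."""
    scores = {t: trailing_return(p, lookback) for t, p in price_history.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


def vol_target_scale(
    period_returns: list[float],
    target_vol: float,
    periods_per_year: int = 252,
    max_leverage: float = 1.0,
) -> float:
    """Exposure multiplier that scales realized volatility toward `target_vol`."""
    realized = statistics.stdev(period_returns) * math.sqrt(periods_per_year)
    if realized == 0:
        return max_leverage
    return min(max_leverage, target_vol / realized)


# Toy prices and returns, invented for this example only.
price_history = {
    "XLV": [100.0, 101.0, 102.0],
    "XBI": [100.0, 105.0, 110.0],
    "XPH": [100.0, 99.0, 98.0],
}
leaders = pick_momentum_leaders(price_history, lookback=2, top_k=2)
scale = vol_target_scale([0.01, -0.01, 0.01, -0.01], target_vol=0.10)
print(leaders, round(scale, 3))
```

Capping the multiplier at `max_leverage` means the sketch only ever scales exposure down in volatile periods; allowing leverage above 1 is a separate risk decision.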

Experience

Journey through cutting-edge research and innovation

August 2024 - Present

Graduate Student Researcher/Research Assistant

New York Genome Center - Sanjana Lab

Investigating synthetic genes in various cell types and diseases.

CRISPR Single-cell Omics Nextflow
January 2023 - July 2024

Research Associate

Scale Biosciences

Played a key role in developing the Quantum Barcoding single-cell RNA assay, which uses combinatorial indexing.

Single-cell Omics Automation Manufacturing QC
January 2023 - October 2023

Research Associate

UC San Diego, Moores Cancer Center, Stupack Lab

Performed scRNA-seq analysis of high-grade serous ovarian cancer, yielding key insights into platinum-based chemotherapy resistance (first-author publication).

Cancer Biology Machine Learning R Visualization
August 2021 - November 2022

Research Associate

UC San Diego, Moores Cancer Center, Chen Lab

Led a project investigating the tumor microenvironment in pancreatic ductal adenocarcinoma, uncovering significant shifts in tumor-infiltrating lymphocytes and macrophage polarization upon SUMOylation inhibition (first-author publication).

Cancer Biology Flow Cytometry Single-cell Omics

Technical Skills

Tools and technologies driving innovation

Programming & Data Science

Python R Bash SQL Machine Learning Deep Learning NLP Representation Learning Neural Networks PyTorch TensorFlow

Bioinformatics & Genomics

Single-cell RNA-seq NGS Analysis CRISPR Screens Multi-omics Bioconductor BLAST

Infrastructure & Tools

AWS HPC Linux/Unix Git Docker Snakemake Nextflow

Laboratory Techniques

NGS Library Prep NGS Assay Development Flow Cytometry Cell Culture Western Blot In Vivo Models

Education

New York University

Master of Science in Bioinformatics and Systems Biology

September 2024 - May 2025 | New York, NY

Bioinformatics Systems Biology Computational Biology

UC San Diego

Bachelor of Science - Double Major

General Biology | Political Science - Data Analytics

January 2019 - July 2021 | La Jolla, CA

Molecular Biology Data Analytics Statistics

Let's Connect

Interested in collaboration or just want to chat about bioinformatics and technology? I'm always open to discussing innovative projects and potential partnerships.

📍 New York, NY 10013

"Advancing precision medicine through computational biology and single-cell genomics"
