Abdurrahman Abul-Basher

Research Fellow
Bioinformatics & Applied Machine Learning Scientist

I am a research fellow at the Lee Laboratory at Harvard Medical School and Boston Children's Hospital. I analyze cell phenotypic transition dynamics from live-cell images and high-dimensional gene/protein expression data (e.g., transcriptomics, proteomics, and spatial transcriptomics) across human tissues. I also have expertise in analyzing large-scale genomics data using cutting-edge machine/deep learning algorithms (e.g., contrastive analysis).

I obtained my PhD in Bioinformatics (an interdisciplinary field) from the University of British Columbia (UBC), where I was advised by Dr. Steven J. Hallam (Hallam Lab). Previously, I obtained my bachelor's in computer science from King AbdulAziz University (KAU) and my master's in information systems security from Concordia University, where I worked with Dr. Benjamin C. M. Fung in the Data Mining and Security (DMaS) Lab to discover topics from chat logs for crime investigation.

Misc: In my spare time, I enjoy reading novels, history, and modern philosophy, and writing poetry.

Research
Education

PhD in Bioinformatics, 2020
The University of British Columbia (UBC)

Master of Applied Science in Information Systems Security, 2011
Concordia University

BSc in Computer Science, 2008
King AbdulAziz University (KAU)

Selected projects
  • PHet
  • pride (coming)
  • mltS (coming)
  • leADS
  • triUMPF
  • pathway2vec
  • mlLGPR
  • reMap
  • prepBioCyc
  • straSplit
Selected fellowships and awards
  • Four Year Fellowships (4YF) ($18,200 per year + tuition fee), The University of British Columbia (UBC), Canada. 2013-2017.
  • Faculty of Science - Graduate Support Initiative (GSI) Fund ($8,500 per year), The University of British Columbia (UBC), Canada. 2013-2017.
  • Power Corporation of Canada Graduate Fellowships ($5,000), Concordia University, Canada. 2009-2010.
  • Concordia Graduate Student Support Program (GSSP) ($15,000 per year), Concordia University, Canada. 2009-2011.
  • First Honor Graduate for graduating with high GPA from King AbdulAziz University, Saudi Arabia. 2008.
News

Teachings

The University of British Columbia, Master of Data Sciences (MDS)

  • Graduate Student Instructor, DSCI 571 Supervised Learning I, 2016
  • Graduate Student Instructor, DSCI 573 Feature and Model Selection, 2017
  • Graduate Student Instructor, DSCI 575 Advanced Machine Learning, 2017

Selected publications

PHet Heterogeneity-Preserving Discriminative Feature Selection for Subtype Discovery
Abdurrahman Abul-Basher, Caleb Hallinan, and Kwonmoo Lee
Under preparation, 2024
code

In this article, we present a novel approach, termed PHet (Preserving Heterogeneity), designed to capture the diversity within each disease condition while maintaining the discrimination of known disease states. Our analysis identified features with significant differences in interquartile range (IQR) between classes, indicating crucial subtype information. Validation using public single-cell RNA-seq and microarray datasets demonstrated PHet's effectiveness in preserving sample heterogeneity while maintaining discrimination of known disease/cell states, surpassing the performance of previous outlier-based methods.
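
The role of the interquartile range in the summary above can be illustrated with a minimal sketch. It is not the PHet implementation: it only assumes that a feature whose IQR differs strongly between two classes is a candidate carrier of subtype information. The function names `iqr` and `delta_iqr` are hypothetical, and the quartile method (inclusive, linear interpolation) is an arbitrary choice made for this example.

```python
from statistics import quantiles


def iqr(values):
    """Interquartile range (Q3 - Q1) of a list of numbers."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1


def delta_iqr(samples, labels):
    """Absolute IQR difference per feature between two classes.

    samples: list of rows, one row per sample and one column per feature.
    labels:  list of class labels, exactly two distinct values.
    Illustrative only; the statistic used by PHet may differ.
    """
    classes = sorted(set(labels))
    if len(classes) != 2:
        raise ValueError("this sketch handles exactly two classes")
    scores = []
    for j in range(len(samples[0])):
        a = [row[j] for row, y in zip(samples, labels) if y == classes[0]]
        b = [row[j] for row, y in zip(samples, labels) if y == classes[1]]
        scores.append(abs(iqr(a) - iqr(b)))
    return scores


# Toy data: feature 0 has the same spread in both classes,
# feature 1 is constant in class 0 but widely spread in class 1.
samples = [[1, 10], [2, 10], [3, 10], [4, 10],
           [1, 0], [2, 10], [3, 20], [4, 30]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
scores = delta_iqr(samples, labels)
print(scores)
```

In this toy input, the second feature scores higher because its spread differs between the two classes, even though a mean-based comparison alone would not necessarily single it out.
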
mltS Leveraging multiple (less-trusted) sources to improve metabolic pathway prediction [TO BE ADDED]
Abdurrahman Abul-Basher, XXXX
Under preparation, 2024
[TO BE ADDED]

This paper presents mltS (multi-label learning based on less-trusted sources), an ensemble-based learning method that estimates the reliability score of each ensemble member given a small reference collection dataset. Using mltS, one can assess the reliability of each model while performing inference in three ways: "meta-predict (mp)", "meta-weight (mw)", and "meta-adaptive (ma)".
leADS: improved metabolic pathway inference based on active dataset subsampling
Abdurrahman Abul-Basher, Aditi N. Nallan, Ryan J. McLaughlin, Julia Anstett, and Steven J. Hallam
Under review, 2024
code

This paper presents leADS (multi-label learning based on active dataset subsampling) that leverages the idea of subsampling examples from a pool of multi-label data to reduce the negative impact of training loss.
CHAP Aggregating statistically correlated metabolic pathways into groups to improve prediction performance
Abdurrahman Abul-Basher and Steven J. Hallam
BIOINFORMATICS (to appear), 2022
code

This paper presents the CHAP (correlated pathway group) package, comprising three hierarchical mixture models that characterize pathways: SOAP (sparse correlated pathway group), SPREAT (distributed sparse correlated pathway group), and CTM (correlated topic model).
reMap Relabeling metabolic pathway data with groups to improve prediction outcomes
Abdurrahman Abul-Basher and Steven J. Hallam
ICCABS (to appear), 2021
code

This paper presents the reMap framework, which maps examples to a different set of labels, characterized as pathway groups, where a group comprises statistically correlated pathways.
triUMPF Metabolic pathway prediction using non-negative matrix factorization with improved precision
Abdurrahman Abul-Basher, Ryan J. McLaughlin, and Steven J. Hallam
Journal of Computational Biology (JCB), 2021
code

This paper presents triUMPF (triple non-negative matrix factorization (NMF) with community detection for metabolic pathway inference) that combines three stages of NMF to capture myriad relationships between enzymes and pathways within a graph network followed by community detection to extract higher order structure based on the clustering of vertices sharing similar statistical properties.
pathway2vec Leveraging heterogeneous network embedding for metabolic pathway prediction
Abdurrahman Abul-Basher and Steven J. Hallam
Bioinformatics, 2020
code

This paper presents pathway2vec, a software package consisting of six representational learning based modules used to automatically generate features for pathway inference.
mlLGPR Metabolic pathway inference using multi-label classification with rich pathway features
Abdurrahman Abul-Basher, Ryan J. McLaughlin, and Steven J. Hallam
PLOS Computational Biology, 2020
code

This paper presents the mlLGPR (multi-label logistic regression for pathway prediction) framework, which uses supervised multi-label classification and rich pathway features to infer metabolic networks at the individual, population, and community levels of organization.
Posters
cmde Machine Learning Approach to Recovering Metabolic Pathways from Metagenomics Sequences
Abdurrahman Abul-Basher and Steven J. Hallam
CMDE, 2016

Webpage design courtesy of Jon Barron.