Problem:

  • Detecting whether a cell is cancerous is vitally important to medical practitioners. The data gathered about the cells of a tumor are often not easily human-readable and have complicated relationships. The question is: how can we use statistical, data-driven methods to determine whether a particular sample cell is cancerous?

Solution:

  • Develop a supervised classification model that predicts how likely a given tumor cell is to be malignant rather than benign.

Methods:

  • A custom random forest implementation and a tree-augmented naive Bayes (TAN) Bayesian network.
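
To illustrate the random forest idea named above, here is a minimal sketch in plain Python. It is not the project's actual code, which is not shown here. Every name (`gini`, `best_split`, `build_tree`, `fit_forest`, `predict_forest`) and every default (25 trees, depth 5, Gini impurity, square-root feature sampling, string class labels) is an assumption made for illustration.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels, feature_ids):
    """Return (score, feature, threshold) with the lowest weighted Gini, or None."""
    best = None
    for f in feature_ids:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue  # a split must send samples to both sides
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best

def majority(labels):
    return Counter(labels).most_common(1)[0][0]

def build_tree(rows, labels, rng, max_depth=5):
    """Grow one tree; internal nodes are (feature, threshold, left, right) tuples."""
    if max_depth == 0 or len(set(labels)) == 1:
        return majority(labels)
    n_total = len(rows[0])
    k = max(1, int(n_total ** 0.5))  # assumed: sqrt(n) features tried per split
    split = best_split(rows, labels, rng.sample(range(n_total), k))
    if split is None:
        return majority(labels)
    _, f, t = split
    left = [i for i, r in enumerate(rows) if r[f] <= t]
    right = [i for i, r in enumerate(rows) if r[f] > t]
    return (
        f, t,
        build_tree([rows[i] for i in left], [labels[i] for i in left], rng, max_depth - 1),
        build_tree([rows[i] for i in right], [labels[i] for i in right], rng, max_depth - 1),
    )

def predict_tree(node, row):
    # Leaves are plain labels (strings here); anything that is a tuple is a split.
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def fit_forest(rows, labels, n_trees=25, max_depth=5, seed=0):
    """Train each tree on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx], [labels[i] for i in idx], rng, max_depth))
    return forest

def predict_forest(forest, row):
    """Majority vote over all trees."""
    return majority([predict_tree(tree, row) for tree in forest])
```

A call such as `predict_forest(fit_forest(rows, labels), new_row)` returns the majority label, where `rows` is a list of numeric feature lists and `labels` holds strings such as "malignant" and "benign". Dividing the vote counts by the number of trees would give a rough likelihood instead of a hard label.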

Frameworks and Platforms:

  • Python, custom modeling code

Outcomes:

  • Developed a cancer classification system that classifies cells as malignant or benign with over 90% accuracy.
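
For reference, accuracy in this sense is usually the fraction of cells whose predicted label matches the true label. The sketch below assumes that definition and string labels. The function name `accuracy` is hypothetical, and how the project's evaluation data was split is not stated here.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)
```

With four cells and one wrong prediction, this returns 0.75.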