Finding Significant Large-Average Submatrices in High Dimensional Data


Event Details
  • Date/Time:
    • Friday February 6, 2009 - Saturday February 7, 2009
      10:00 am - 10:59 am
  • Location: Executive classroom 228 Main
  • Fee(s):
    $0.00
  • Extras:
Contact
Nicoleta Serban
ISyE
404-385-7255
Summaries

Summary Sentence: Finding Significant Large-Average Submatrices in High Dimensional Data

Full Summary: Finding Significant Large-Average Submatrices in High Dimensional Data

TITLE: Finding Significant Large-Average Submatrices in High Dimensional Data

SPEAKER: Dr. Andrew Nobel

ABSTRACT:

Exploratory analysis of high dimensional data often begins with independent clustering of samples and variables, yielding a partition of the data matrix into disjoint row-column blocks (submatrices). Of particular interest in practice are submatrices whose entries are large on average. In conjunction with clinical and functional annotation, large average submatrices are frequently the starting point for subsequent analyses, such as the identification of genetic pathways and new disease subtypes.

This talk describes a simple algorithm, belonging to the general category of biclustering methods, for identifying large average submatrices in high dimensional data. Like other biclustering methods, the algorithm improves on independent clustering of samples and variables in several respects: the submatrices it identifies can overlap and need not cover the entire data matrix (features that better reflect the underlying biology), and the inclusion of samples and variables in a submatrix does not depend on their expression values outside the submatrix. The algorithm seeks to maximize a simple measure of statistical significance, which also provides an objective basis for comparing and selecting among submatrices of different sizes and average intensities. I will discuss the application of the algorithm to a recent gene expression-based cancer study and provide a detailed comparison of its performance with several other biclustering methods, including its application to semi-supervised classification.
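The abstract does not give the algorithm's details, so the following Python is only an illustrative sketch of the general idea, not the speaker's implementation. It assumes a Bonferroni-style significance score for a k-by-l submatrix of an m-by-n matrix under a standard Gaussian null, and a simple alternating search that repeatedly picks the k rows with the largest sums over the current columns and then the l columns with the largest sums over those rows. The function names, the score formula, and the toy matrix are all assumptions made for illustration.

```python
import math


def log_binom(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)


def log_upper_tail(z):
    """log P(Z >= z) for a standard normal Z."""
    p = 0.5 * math.erfc(z / math.sqrt(2.0))
    if p > 0.0:
        return math.log(p)
    # erfc underflowed (z is very large): use the asymptotic tail approximation.
    return -0.5 * z * z - math.log(z * math.sqrt(2.0 * math.pi))


def score(X, rows, cols):
    """Assumed score: -log[C(m,k) * C(n,l) * P(Z >= avg * sqrt(k*l))]."""
    m, n = len(X), len(X[0])
    k, l = len(rows), len(cols)
    avg = sum(X[i][j] for i in rows for j in cols) / (k * l)
    log_p = log_binom(m, k) + log_binom(n, l) + log_upper_tail(avg * math.sqrt(k * l))
    return -log_p


def search_fixed_size(X, k, l, init_cols, max_iter=100):
    """Alternate between the best k rows and the best l columns until stable."""
    m, n = len(X), len(X[0])
    rows, cols = [], sorted(init_cols)
    for _ in range(max_iter):
        row_sums = [sum(X[i][j] for j in cols) for i in range(m)]
        new_rows = sorted(sorted(range(m), key=lambda i: -row_sums[i])[:k])
        col_sums = [sum(X[i][j] for i in new_rows) for j in range(n)]
        new_cols = sorted(sorted(range(n), key=lambda j: -col_sums[j])[:l])
        if new_rows == rows and new_cols == cols:
            break  # neither set changed: a local optimum for this (k, l)
        rows, cols = new_rows, new_cols
    return rows, cols


def make_demo_matrix():
    """Hypothetical 6x5 toy matrix: small background values plus a raised 2x2 block."""
    X = [[((7 * i + 3 * j) % 5 - 2) * 0.1 for j in range(5)] for i in range(6)]
    for i in (1, 4):
        for j in (0, 3):
            X[i][j] += 3.0
    return X
```

A search of this kind only finds a local optimum, so in practice it would be restarted from several initial column sets and repeated over a range of sizes (k, l), with the score used to compare the resulting candidates across sizes.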

Joint work with Andrey Shabalin, Vic Weigman, and Charles Perou.

Additional Information

In Campus Calendar
No
Groups

School of Industrial and Systems Engineering (ISYE)

Invited Audience
No audiences were selected.
Categories
Seminar/Lecture/Colloquium
Keywords
submatrices
Status
  • Created By: Anita Race
  • Workflow Status: Published
  • Created On: Oct 12, 2009 - 4:36pm
  • Last Updated: Oct 7, 2016 - 9:47pm