Ph.D. Defense of Dissertation: Mingxuan Sun

Event Details
  • Date/Time:
    • Monday, August 6, 2012
      11:00 am - 1:00 pm
  • Location: KACB 1315
  • Fee(s):
    N/A
Contact

Mingxuan Sun

Summaries

Summary Sentence: Visualizing and Modeling Partial Incomplete Ranking Data

Ph.D. Defense of Dissertation Announcement

Title: Visualizing and Modeling Partial Incomplete Ranking Data

Mingxuan Sun
College of Computing
Georgia Institute of Technology

Date: Monday, August 6th, 2012
Time: 11:00 AM - 1:00 PM (EDT)
Location: KACB 1315

Committee:

  • Dr. Guy Lebanon (Advisor, College of Computing, Georgia Tech)
  • Dr. Alexander Gray (College of Computing, Georgia Tech)
  • Dr. Charles Isbell (College of Computing, Georgia Tech)
  • Dr. Hongyuan Zha (College of Computing, Georgia Tech)
  • Dr. Kevyn Collins-Thompson (Microsoft Research)


Abstract:
Analyzing ranking data is an essential component of many applications, including web search and recommendation systems. Rankings are difficult to visualize or model because of the computational cost associated with a large number of items. Partial or incomplete rankings introduce further difficulties, since approaches that are suited to typical types of rankings cannot be applied generally to all types. While the analysis of ranking data has a long history in statistics, constructing an efficient framework for analyzing incomplete ranking data (with or without ties) is currently an open problem.

This thesis addresses the problem of scalability in visualizing and modeling partial incomplete rankings. In particular, we propose a distance measure for top-k rankings with the following three properties: (1) it is a metric, (2) it emphasizes top ranks, and (3) it is computationally efficient. Given the distance measure, the data can be projected into a low-dimensional continuous vector space via multidimensional scaling (MDS) for easy visualization. We further propose a non-parametric model for estimating distributions from partial incomplete rankings. For the non-parametric estimator, we use a triangular kernel that is a direct analogue of the Euclidean triangular kernel. The computational difficulties for a large number of items n are reduced using combinatorial properties and generating functions associated with symmetric groups. We show that our estimator is computationally efficient for rankings of arbitrary incompleteness and tie structure. Moreover, we propose an efficient learning algorithm for constructing a preference elicitation system from partial incomplete rankings, which can be used to address the cold-start problem in ranking recommendations.
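
As a rough illustration of the two ideas above (a distance between rankings fed into MDS, and a triangular kernel estimator over rankings), the following sketch may help. It is not the method of the thesis: it assumes full rankings without ties, substitutes the standard Kendall tau distance for the proposed top-k distance, and computes the kernel normalizer by brute-force enumeration, which is feasible only for very small n. The function names and the bandwidth parameter are hypothetical.

```python
from itertools import combinations, permutations

import numpy as np


def kendall_tau(a, b):
    """Count the item pairs that rankings a and b order differently.

    a[i] and b[i] are the ranks given to item i (full rankings, no ties).
    This is a stand-in for the top-k distance proposed in the thesis.
    """
    return sum(
        1
        for i, j in combinations(range(len(a)), 2)
        if (a[i] - a[j]) * (b[i] - b[j]) < 0
    )


def triangular_kernel_estimate(query, samples, bandwidth):
    """Kernel estimate at `query` with K(d) proportional to max(0, 1 - d / bandwidth).

    Assumption: the normalizer is computed by enumerating all n! rankings.
    It is the same for every kernel center because Kendall tau is invariant
    under relabeling of the items.
    """
    n = len(query)
    identity = tuple(range(n))
    normalizer = sum(
        max(0.0, 1.0 - kendall_tau(identity, p) / bandwidth)
        for p in permutations(range(n))
    )
    weights = [max(0.0, 1.0 - kendall_tau(query, s) / bandwidth) for s in samples]
    return sum(weights) / (normalizer * len(samples))


def classical_mds(dist, dims=2):
    """Embed points in `dims` dimensions from a pairwise distance matrix.

    Classical (Torgerson) MDS: double-center the squared distances and keep
    the leading eigenvectors. Negative eigenvalues, which can occur because
    Kendall tau is not a Euclidean distance, are clipped to zero.
    """
    d2 = np.asarray(dist, dtype=float) ** 2
    n = d2.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ d2 @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    top = np.argsort(eigvals)[::-1][:dims]
    return eigvecs[:, top] * np.sqrt(np.clip(eigvals[top], 0.0, None))


# Toy data: three observed rankings of three items.
observed = [(0, 1, 2), (0, 2, 1), (1, 0, 2)]
all_rankings = list(permutations(range(3)))

# Estimated probability of every possible ranking.
estimates = {
    r: triangular_kernel_estimate(r, observed, bandwidth=2) for r in all_rankings
}

# Two-dimensional coordinates for plotting all six rankings.
dist_matrix = [[kendall_tau(p, q) for q in all_rankings] for p in all_rankings]
coords = classical_mds(dist_matrix, dims=2)
```

The estimates sum to one over all rankings because each kernel is normalized individually. Enumerating n! rankings quickly becomes infeasible as n grows, which is the kind of cost the abstract says is avoided through combinatorial properties and generating functions.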

The proposed approaches are evaluated in experiments with real search engine and movie recommendation data. The visualization of top-k rankings is highly effective for summarizing and analyzing search engine dissimilarities, query manipulation diversities, and query intent ambiguities. The partial ranking estimator is successfully applied to collaborative filtering recommendation systems, where conditional probability estimation is naturally suited to tasks such as rank prediction and association rule discovery. The preference elicitation system constructed by our learning algorithm achieves high prediction accuracy with low user interaction time for cold-start recommendations.

Additional Information

In Campus Calendar
No
Groups

College of Computing, School of Computational Science and Engineering

Status
  • Created By: Jupiter
  • Workflow Status: Published
  • Created On: Jul 25, 2012 - 4:45am
  • Last Updated: Oct 7, 2016 - 9:59pm