

Olfa Nasraoui

Associate Professor and Endowed Chair of E-Commerce
Dept. of Computer Science and Computer Engineering
University of Louisville
Louisville, KY 40292
USA

phone: (502) 852-0191
e-mail (remove spaces): olfa . nasraoui @ louisville . edu
home page: http://webmining.spd.louisville.edu/

 

 



Knowledge Discovery & Web Mining Lab

Mission & Scope

We conduct research to advance the state of the art in Knowledge Discovery in Data sets (KDD), with an emphasis on Data Mining, and in particular Web Mining and Stream Data Mining. Our daily activities consist of learning, investigating, designing, implementing, testing, and evaluating efficient algorithms and techniques that solve challenging problems in support of a variety of applications, such as:

  • Web analytics & Web personalization for e-commerce and information retrieval,
  • Mining evolving data streams, with an emphasis on evolving Web clickstreams,
  • Scalable and/or personalized information retrieval in data such as text and astronomical data sets.

News


PhD student Artur Abdulin is part of the winning Data Mining team that beat more than 700 other participants by designing the most accurate Newsstand Sell-Through Prediction Model

Dr. Olfa Nasraoui was selected among the "Top 9" Faculty Favorites for 2010 at the University of Louisville

Esin Saka's team wins the Yahoo! Sponsored Search Contest during her internship at Yahoo!

 

Garrett Ridge Receives Honorable Mention in the Computing Research Association's Outstanding Undergraduate Research Award competition for 2010

Nurcan Durak wins third place in the ACM Student Research Competition at the 2010 Grace Hopper Conference

 

Esin Saka wins the best graduate poster at the Kentucky Women in Computing Conference

Esin Saka receives Google WSDM and WAW Conference Grant and Travel Award - 2 recipients worldwide

Fabio González was a Fulbright Visiting Scholar in our lab.

 

 

Undergraduate students Garrett Ridge and Mihir Kotwal present their research at Posters at the Capitol

Background

Recently, the world has witnessed an explosion of electronically stored data. Most organizations rely on huge databases that contain a wealth of information which, unfortunately, is not fully exploited. In fact, the increasing size of most data repositories is making access to useful information more and more difficult. Hence, the saying that "The mechanical production of data has created the need for a mechanical consumption of data" is not in vain. Data mining (DM) comprises the set of intelligent tools that can be used to extract useful or interesting information, such as patterns, associations, changes, anomalies, and significant structures, from large amounts of data stored in various information repositories.

Data mining inherits a legacy from diverse disciplines, including:

  • Machine Learning & Artificial Intelligence
  • Pattern Recognition
  • Statistics
  • Database Systems
  • Information Retrieval

Recently, several particularly challenging areas of research in data mining have emerged, including:

  • Web Mining: mining web data (semi-structured to unstructured)
  • Text Mining: mining text data such as in Web pages and e-mails
  • Stream Data Mining: mining data that arrives in huge quantities under extremely stringent memory constraints, making it necessary to process the data in a single sequential pass (e.g., clickstream data)
  • Mining Evolving Data Streams: mining data that not only arrives in huge quantities under harsh computational and space constraints, but that can also change unexpectedly
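
To make the one-pass, limited-memory constraint of Stream Data Mining concrete, here is a minimal sketch. It uses the textbook Misra-Gries frequent-items summary, chosen purely as an illustration and not as a description of the lab's own algorithms. The function name, the parameter `k`, and the sample click data are all hypothetical.

```python
from typing import Dict, Hashable, Iterable


def misra_gries(stream: Iterable[Hashable], k: int) -> Dict[Hashable, int]:
    """One-pass frequent-item summary that keeps at most k - 1 counters."""
    if k < 2:
        raise ValueError("k must be at least 2")
    counters: Dict[Hashable, int] = {}
    for item in stream:  # each item is read exactly once, in arrival order
        if item in counters:
            counters[item] += 1
        elif len(counters) < k - 1:
            counters[item] = 1
        else:
            # No free counter: decrement every counter and drop the zeros.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


# Hypothetical clickstream: the requested page of each click, in arrival order.
clicks = ["/home", "/cart", "/home", "/search", "/home", "/cart", "/home"]
summary = misra_gries(iter(clicks), k=3)
print(summary)
```

Memory use depends only on `k`, not on the length of the stream, and each stored count underestimates the true frequency by at most n/k for a stream of n items. The sketch does not address evolving streams, where older clicks would typically also need to be discounted or forgotten.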

What we do

In our lab, we conduct research in all these challenging areas, which are often intertwined rather than separate. For example, Web Mining often involves data of different types: Web Usage data, as found in Web logs that record user navigation or clicks on a website; Structure data, as in the hyperlinks between Web pages; and Text data, as in the content of Web pages. Text Mining is therefore a special subset of Web Mining.

Also, contrary to most assumptions, Web data on most busy websites is highly dynamic. In particular, Web usage data possesses all the challenging characteristics of massive and evolving data streams, with one added challenge: it is of much higher dimensionality and is very sparse.
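
As a rough illustration of what high-dimensional and sparse means here, the sketch below stores each user session as the set of pages it visited rather than as a full vector with one entry per page on the site. This representation, the function name, and the sample sessions are assumptions made for the example, not a description of the lab's own data format.

```python
import math
from typing import Set


def cosine_similarity(a: Set[str], b: Set[str]) -> float:
    """Cosine similarity between two binary session vectors stored as sets."""
    if not a or not b:
        return 0.0
    # For 0/1 vectors the dot product is the number of shared pages,
    # and each vector's norm is the square root of its page count.
    return len(a & b) / math.sqrt(len(a) * len(b))


# Hypothetical sessions: a site may have thousands of pages (dimensions),
# but each session touches only a handful of them.
session_a = {"/home", "/courses", "/people"}
session_b = {"/home", "/people", "/research", "/publications"}
sim = cosine_similarity(session_a, session_b)
print(round(sim, 3))
```

Storing only the visited pages keeps the cost of comparing two sessions proportional to their lengths rather than to the total number of pages on the website.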


Computer Science & Engineering Distinguished Lecture Series

A DECADE OF MINING THE WEB

A Special Issue in the Journal of Data Mining and Knowledge Discovery (DMKD 2010)

Call for Papers

 

This website is in compliance with ADA Section 508 and W3C guidelines 1.0.