Showing posts from November, 2010

Data Streams and VFML

We live in a technological world crowded with information. Almost every device we can think of produces data, usually as a flow or stream of information in more or less real time. In this setting, classical knowledge discovery mechanisms (like our beloved C4.5, a decision tree learner developed by Quinlan) are unable to extract a correct model of the situation, because they expect the whole training set to be available up front. But what is so special about flows of data? In the words of Gama and Rodrigues: a data stream is an ordered sequence of instances that can be read only once or a small number of times using limited computing and storage capabilities. These sources of data are characterized by being open-ended, flowing at high speed, and generated by non-stationary distributions in dynamic environments. So, to properly handle this kind of data, the learning algorithm has to learn online and process massive amounts of data, which increases the challenges to be faced. Let's hold one's breath w