Data mining and analysis of massive data sets

The emphasis is on practical work, in which participants apply basic data mining algorithms to real cases. Methods of data storage and data mining algorithms are also presented.


  • What is data mining
    • Statistical modelling, machine learning and programming approaches to modelling, summarization
    • The statistical limitations of data mining
    • Bonferroni principle and other restrictions
  • MapReduce
    • Google's MapReduce architecture, distributed file systems, and a simple MapReduce example: word count
  • The importance of words in documents
    • Frequency of words, measurement of importance
  • Using Hadoop
    • Implementing the word-count example
    • Extending the word-count algorithm to search for a keyword, using Wikipedia as the example
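
The word-count example named in the outline can be sketched in plain Python, without Hadoop, to show the three stages of a MapReduce job: map, shuffle and reduce. This is only an illustrative sketch, not the workshop's own material; the function names (map_phase, shuffle, reduce_phase, word_count) and the sample documents are assumptions made up for this example, and a real Hadoop job would distribute these stages across machines.

```python
import re
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    for word in re.findall(r"[a-z']+", document.lower()):
        yield word, 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by key.

    In a real MapReduce system the framework does this between map and reduce.
    """
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(word, counts):
    """Reduce step: sum the partial counts collected for one word."""
    return word, sum(counts)


def word_count(documents):
    """Run map, shuffle and reduce in sequence on a list of documents."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    grouped = shuffle(pairs)
    return dict(reduce_phase(word, counts) for word, counts in grouped.items())


# Hypothetical sample input, standing in for real documents.
docs = ["the quick brown fox", "the lazy dog", "The fox"]
counts = word_count(docs)
print(counts)
```

Searching for a keyword, as in the last outline item, would under the same assumptions only change the map step: it would emit a pair only when the word matches the keyword.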

Target group:

  • Designers of IoT and Big Data services
  • R&D specialists


The purpose of the workshop is to learn the basics of data mining, including its limitations and problems, to use the MapReduce system, and to solve real problems using Hadoop.
Status: Closing date exceeded
Duration: 1 day
Tutor: Matej Kren
Location: Fakulteta za elektrotehniko
Tržaška 25
1000 Ljubljana