Machine Learning Prague 2016

April 23–24, 2016, Prague

Conference on machine learning in practice

About the conference

Machines can learn. Incredibly fast. Faster than you. They are getting smarter and smarter every day. They are already changing your world, your business and your life. The artificial intelligence revolution is here. Come and learn how to turn this threat into your biggest opportunity.

This is not another academic conference. Our goal is to foster discussion between machine learning practitioners and all people who are interested in applications of modern trends in artificial intelligence. You can look forward to inspiring people, algorithms, data, applications, workshops and a lot of fun during both days as well as at two great parties.

Conference speakers

Xuedong Huang

Dr. Xuedong Huang is the chief scientist of speech R&D and the key person behind Microsoft's spoken language processing technologies. As the head of Microsoft's spoken language initiatives for more than a decade, he provided technical, engineering and business leadership to bring speech recognition to the mass market.

Xuedong Huang (USA)

Chief Scientist of Speech R&D, Microsoft

Tomas Mikolov

Tomas is a research scientist at Facebook AI Research. He is well known as the author of the word2vec algorithm.

Tomas Mikolov (USA)

Research Scientist, Facebook

Katerina Veselovska

Katerina combines academic and industry careers. She is a senior researcher at Charles University and a text analytics lead at Ataccama. She is an expert on sentiment analysis, exploring the possibilities of automatically processing emotions both at the theoretical level and in real-life applications.

Katerina Veselovska (CZ)

Text Analytics Lead, Ataccama

Chris Brew

Chris Brew is a Senior Research Scientist with Thomson Reuters. He specializes in Statistical Natural Language Processing and in Computational Semantics.

Chris Brew (UK)

Senior Research Scientist, Thomson Reuters

Marek Rosa

Marek is the CEO, CTO and founder of two Prague-based companies: Keen Software House, an independent video game development studio, and GoodAI, a general artificial intelligence R&D company.

Marek Rosa (CZ)

CEO & CTO & founder, GoodAI

Jiri Materna

Jiri is a passionate researcher and software engineer focusing mostly on machine learning and NLP. He works as the head of the research department at Seznam.cz.

Jiri Materna (CZ)

Head of Research, Seznam.cz

Adam Ashenfelter

Adam is a co-founder and Chief Data Engineer at BigML. He has an MS from Oregon State University focused on machine learning.

Adam Ashenfelter (USA)

Chief Data Engineer & co-founder, BigML

Jan Sedivy

After 18 years in the industry (IBM and Google), Jan has returned to CTU to share his experience with students.

Jan Sedivy (CZ)

Engineer, CTU

Michael Levin

Michael currently works as Chief Data Scientist at Yandex Data Factory. He graduated from Moscow State University, Department of Mathematics and Mechanics. He served as Ads Quality Tech Lead at Yandex in 2010–2014 and as Traffic Jams Infrastructure Tech Lead at Yandex in 2009–2010.

Michael Levin (RUS)

Chief Data Scientist, Yandex Data Factory

Libor Morkovsky

Libor Morkovsky has been working for Avast since 2006. With a background in bioinformatics, he is trying to take the best of both worlds and make it work in malware analysis.

Libor Morkovsky (CZ)

Senior Researcher, Avast

Matthew Tudor

Matt has been working at the intersection of high-throughput biology and informatics for over eleven years. He has been with MSD for eight years.

Matthew Tudor (USA)

Principal Scientist, MSD

Jonas Seiler

Jonas studied Cognitive Science and Computational Neuroscience in Osnabrueck and Berlin respectively. He joined Plista during his master's studies and now leads the machine learning team there.

Jonas Seiler (DE)

Machine Learning Team Lead, Plista

Workshop speakers

Poul Petersen

Poul Petersen, Chief Infrastructure Officer, joined BigML in 2011 to build its infrastructure.

Poul Petersen (USA)

Chief Infrastructure Officer, BigML

Lukas Vrabel

Lukas works as a research team lead at Seznam.cz. His team works on a range of machine learning tasks, from text and web page analysis to image recognition.

Lukas Vrabel (CZ)

Research Team Lead, Seznam.cz

Vit Listik

Vit is an enthusiastic developer interested in machine learning and web technologies. He is currently working on spam and phishing detection on the Email team at Seznam.cz.

Vit Listik (CZ)

Python/C++ developer, Seznam.cz

Tomas Cicha

Tomas is a C++/Python developer and researcher in the search engine department of Seznam.cz. He is the main author of RC-Rank, a machine learning tool used extensively at Seznam.cz.

Tomas Cicha (CZ)

Python/C++ developer, Seznam.cz

Jiri Hana

Jiri has worked in industry and academia and has taught linguistics and computer science in Europe and the U.S. He finds the intersection between the exacting nature of computers and the fuzziness of language fascinating. He is a senior researcher at Charles University and a co-founder of Geneea, a text analytics startup.

Jiri Hana (CZ)

Co-founder, Geneea

Radoslav Klic

Radoslav is a senior NLP developer at Geneea. He has a master's degree in Computational Linguistics from Charles University. Before Geneea and its start-up culture, he experienced corporate life in big companies like Seznam and Amazon.

Radoslav Klic (CZ)

NLP Software Engineer, Geneea

Jan Pospisil

Jan is a senior technology evangelist, software architect and developer at Microsoft. His passions are IoT gadgets, big data, new technologies (even bleeding-edge ones), robotics and automation.

Jan Pospisil (CZ)

Senior Technology Evangelist, Microsoft


 April 22, 2016

18.30 – ???

Warm-up party

The party will be held at Kavarna museum, Vinohradska 1, 120 00 Praha 1.

 Main Conference

 April 23, 2016



8.50 – 9.00

Welcome to ML Prague 2016

9.00 – 10.00

Keynote speech: Deep Learning and Intelligent Applications

With recent advances in deep learning, big data, and computing infrastructure, we are able to engage and converse with intelligent machines more effectively. For example, computers will realistically reach human parity in recognizing speech in the next few years. The computing industry’s progress in speech and vision is apparent in Microsoft products and services, including Cortana, Skype Translator, and Project Oxford cloud services. Dr. Xuedong Huang will use these examples not only to illustrate our society’s historical collective efforts, but also to enumerate the remaining challenges to reaching human parity, an accomplishment that will have a profound impact on society.

Xuedong Huang, Microsoft (USA)
10.00 – 10.30

Inventing General Artificial Intelligence: A Vision and Methodology

Marek Rosa, CEO/CTO of a Prague-based general AI development company, will introduce his methodological approach to general purpose artificial intelligence (AGI) research. He will highlight a key concept he is working to develop with his team: School for AI, a set of training environments where AI agents will learn in a gradual and guided way. To perform successfully in these environments, an agent needs to demonstrate a set of abilities. The analysis of these abilities leads to a specific list of functional requirements, which will then be implemented as a unified AGI architecture.

School for AI will be released as an open source tool where AI researchers and machine learning practitioners can benchmark their AI agents.

Marek Rosa, GoodAI (CZ)
10.30 – 11.00

Intelligent Personal Assistants

Apple Siri, Google Now, and Microsoft Cortana are examples of an emerging class of web applications known as Intelligent Personal Assistants. They combine many of the latest and still-developing AI technologies, such as speech recognition, question answering, information retrieval, knowledge management and IoT. The presentation will discuss the basic architecture, challenges and use cases of future intelligent assistants.

Jan Sedivy, CTU (CZ)
11.00 – 11.30

Coffee break

11.30 – 12.00

Artificial Intelligence in Creative Writing

Have you ever thought about artificial intelligence smart enough to write novels or poetry? Meet the future. I will describe to you a machine learning model which is able to write an entire book or assist you in writing your own fictional literature.

Jiri Materna, Seznam.cz (CZ)
12.00 – 12.30

Learning Representations for Drug Discovery

Utilizing gene expression measurements, it is possible to characterize drugs and drug candidates for their on- and off-target activities, predict treatments for new indications, and highlight potential safety concerns. Improvements in technology now permit this approach to be applied on a qualitatively larger scale, which in turn requires new analytics. We find that applying methods inspired by the deep learning community can improve the interpretability, fidelity, and predictivity of pharmaceutical genomics.

Matthew Tudor, MSD (USA)
12.30 – 13.00

Online Hyperparameter Tuning in Non-Stationary Environments

Fitting hyperparameters online usually requires an extensive parallel search in the parameter space, which is often very demanding in time and computational resources. Comparing parameter settings sequentially is often not valid because of non-stationarity. Here, we present a method using global Bayesian optimization with Gaussian Processes to model certain kinds of non-stationarity.

Jonas Seiler, Plista (DE)
13.00 – 14.30


14.30 – 15.00

ML Approaches to Sentiment Analysis

Computational linguistics is currently seeing a massive increase in interest in sentiment analysis, as it is a powerful means of mining public opinion. However, is it even possible to distinguish between good and evil using statistical methods? How would you treat irony, idioms, innovative vulgarisms and other inherent elements of natural language? Are emotions language-independent? I will share my thoughts and opinions on these issues, describing the state-of-the-art approaches employed in sentiment analysis.

Katerina Veselovska, Ataccama (CZ)
15.00 – 15.30

Distributed Representations for NLP

Distributed representations are currently a hot topic both in academic research and in industrial applied machine learning. I will talk about some breakthroughs from recent years, and will try to explain what distributed representations are about and why there is so much excitement around techniques such as word2vec.

Tomas Mikolov, Facebook (USA)
15.30 – 16.00

TR Discover: A Natural Language Interface for Exploring Linked Datasets

Keywords are the dominant technology for providing non-technical users with access to linked data. This is problematic, because keywords cannot express all the necessary details of user intent. Non-technical users cannot be expected to use an expressive query language such as SQL, but still need to access the data.

They can, however, use English. Our system, called TR Discover, maps from a fragment of English into an intermediate first-order logic representation, which is in turn mapped into SPARQL or SQL. It has been tested both on a dataset relevant to drug research and on the publicly available QALD-4 dataset. The system incorporates a tailored autosuggest mechanism and a back end that delivers task-appropriate analytics.

There will be a demo of a publicly visible version of the system.

Chris Brew, Thomson Reuters (UK)
16.00 – 16.30

Coffee break

16.30 – 17.00

Finding the Oddballs

Anomaly detection is a handy tool for a machine learning practitioner's toolbox. It is useful not only for particular domains like fraud detection or intrusion detection, but also for more general tasks, such as data cleaning or early warning of concept drift. We'll review a few approaches to anomaly detection, and then dig a little deeper into an ensemble technique called isolation forests.

Adam Ashenfelter, BigML (USA)
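
For readers unfamiliar with isolation forests, the idea can be sketched in a few lines of plain Python. This is only an illustration of the general technique, not the speaker's or BigML's implementation: the function names, the parameters (n_trees, sample_size) and the synthetic data are all assumptions made for the example. Each tree splits a random subsample on random features and thresholds; points that end up alone after only a few splits receive a score closer to 1.

```python
import math
import random


def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    harmonic = math.log(n - 1) + 0.5772156649  # approximation of the harmonic number
    return 2.0 * harmonic - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Recursively partition points with random axis-aligned splits."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:  # nothing to split on along this dimension
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return {
        "dim": dim,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(tree, point, depth=0):
    """Depth at which the point reaches a leaf, adjusted for the leaf's size."""
    if "size" in tree:
        return depth + c_factor(tree["size"])
    branch = "left" if point[tree["dim"]] < tree["split"] else "right"
    return path_length(tree[branch], point, depth + 1)


def anomaly_scores(data, n_trees=100, sample_size=64, seed=0):
    """Score every point in (0, 1); assumes data holds at least two points."""
    rng = random.Random(seed)
    sample_size = min(sample_size, len(data))
    max_depth = math.ceil(math.log2(sample_size))
    trees = [
        build_tree(rng.sample(data, sample_size), 0, max_depth, rng)
        for _ in range(n_trees)
    ]
    norm = c_factor(sample_size)
    scores = []
    for point in data:
        mean_depth = sum(path_length(t, point) for t in trees) / n_trees
        scores.append(2.0 ** (-mean_depth / norm))
    return scores


# Synthetic example: 200 points around the origin plus one planted oddball.
gen = random.Random(42)
data = [(gen.gauss(0, 1), gen.gauss(0, 1)) for _ in range(200)]
data.append((8.0, 8.0))
scores = anomaly_scores(data)
print("score of the planted point:", round(scores[-1], 3))
```

In this setup the planted point should score well above a typical point, because random splits separate it from the rest after very few cuts.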
17.00 – 17.30

Recognizing Malware

Today’s malware ecosystem produces hundreds of thousands of distinct samples per day. To leverage similarities between the samples for automated classification, we built a distributed database engine relying on GPUs. We use instance-based classification, which gives us online learning and accountability for every decision. With query times of a fraction of a second even when using a compound distance function, this system is able to classify incoming samples in real time. Samples classified as malware are directly used to generate rules that identify similar samples on our customers' machines.

Libor Morkovsky, Avast (CZ)
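
A toy sketch may help show why instance-based classification gives both online learning and accountability. Everything below is an assumption made for illustration: the feature names ("imports", "size"), the weighting inside compound_distance and the InstanceStore class are hypothetical, and nothing here reflects Avast's GPU-based engine. The sketch is a plain nearest-neighbour vote that also returns the stored samples behind each decision.

```python
from collections import Counter


def jaccard_distance(a, b):
    """1 - |a & b| / |a | b| for two sets; 0.0 when both are empty."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def compound_distance(x, y, weight=0.7):
    """Hypothetical mix of a set distance and a relative size difference."""
    set_part = jaccard_distance(x["imports"], y["imports"])
    biggest = max(x["size"], y["size"])
    size_part = abs(x["size"] - y["size"]) / biggest if biggest else 0.0
    return weight * set_part + (1.0 - weight) * size_part


class InstanceStore:
    """Keeps every labelled sample; classification is a k-nearest-neighbour vote."""

    def __init__(self):
        self.samples = []  # list of (features, label) pairs

    def add(self, features, label):
        """Online learning: a new labelled sample is usable immediately."""
        self.samples.append((features, label))

    def classify(self, features, k=3):
        """Return the majority label and the neighbours that justify it."""
        ranked = sorted(
            self.samples, key=lambda s: compound_distance(features, s[0])
        )
        neighbours = ranked[:k]
        votes = Counter(label for _, label in neighbours)
        return votes.most_common(1)[0][0], neighbours


store = InstanceStore()
store.add({"imports": {"CreateRemoteThread", "WriteProcessMemory", "VirtualAllocEx"},
           "size": 48_000}, "malware")
store.add({"imports": {"CreateRemoteThread", "WriteProcessMemory"},
           "size": 51_000}, "malware")
store.add({"imports": {"CreateFileW", "ReadFile"}, "size": 900_000}, "clean")
store.add({"imports": {"CreateFileW", "ReadFile", "MessageBoxW"},
           "size": 1_200_000}, "clean")

query = {"imports": {"CreateRemoteThread", "VirtualAllocEx", "WriteProcessMemory"},
         "size": 50_000}
label, evidence = store.classify(query)
print(label)  # the three nearest stored samples are returned in `evidence`
```

Because the model is just the stored samples, adding a sample needs no retraining, and every verdict can be traced back to concrete neighbours.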
17.30 – 18.00

MatrixNet Applications at Yandex

MatrixNet is a proprietary machine learning tool developed by Yandex and used widely throughout the company. It has different learning modes, such as ranking, regression and classification. The algorithm is based on gradient boosting over decision trees, but its implementation has a lot of heuristics that give strong defense against overfitting. We start with some details of the algorithm itself, then discuss its applications in web search, ad click prediction, churn prediction and other areas.

Michael Levin, Yandex Data Factory (RUS)
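
MatrixNet itself is proprietary and its heuristics are not described here, so the following is only a generic sketch of gradient boosting over decision trees, the family of methods the abstract refers to. It assumes squared-error regression on a single numeric feature with depth-1 trees (stumps); the function names and the parameters n_rounds and rate are illustrative choices, not anything taken from Yandex.

```python
def fit_stump(xs, residuals):
    """Pick the threshold on a 1-D feature that minimises squared error.

    Assumes xs contains at least two distinct values.
    """
    best = None
    for threshold in sorted(set(xs))[1:]:
        left = [r for x, r in zip(xs, residuals) if x < threshold]
        right = [r for x, r in zip(xs, residuals) if x >= threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]


def fit_boosting(xs, ys, n_rounds=50, rate=0.1):
    """Fit an additive model of stumps, each trained on the current residuals."""
    base = sum(ys) / len(ys)
    preds = [base] * len(xs)
    stumps = []
    for _ in range(n_rounds):
        # For squared loss the negative gradient is simply the residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        threshold, left_mean, right_mean = fit_stump(xs, residuals)
        stumps.append((threshold, left_mean, right_mean))
        preds = [
            p + rate * (left_mean if x < threshold else right_mean)
            for x, p in zip(xs, preds)
        ]
    return base, rate, stumps


def predict(model, x):
    """Sum the base value and the shrunken contribution of every stump."""
    base, rate, stumps = model
    total = base
    for threshold, left_mean, right_mean in stumps:
        total += rate * (left_mean if x < threshold else right_mean)
    return total


# Toy data: a step function that a single split can describe.
xs = [i / 10 for i in range(40)]
ys = [1.0 if x < 2.0 else 3.0 for x in xs]
model = fit_boosting(xs, ys)
print(round(predict(model, 0.5), 2), round(predict(model, 3.5), 2))
```

The small learning rate (shrinkage) is one common, simple guard against overfitting: each tree corrects only a fraction of the remaining error.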
18.00 – 18.05


18.30 – ???



 April 24, 2016

Room 1

Room 2

09.00 – 10.30

Machine Learning Solutions at Seznam.cz (1)

In the first part of the workshop we will describe the principles of RC-Rank, a multipurpose machine learning tool developed at Seznam.cz. Next, we will present a short introduction to deep convolutional neural networks and show how to apply deep learning approaches to image analysis. Finally, we will focus on personalized spam recognition.

Tomas Cicha & Lukas Vrabel & Vit Listik, Seznam.cz (CZ)

Azure Machine Learning in practice (1)

Come and try out in practice how to create, train and then integrate a machine learning model into your own applications. At the workshop, we will go through the entire process from scratch; no previous experience is needed. The workshop follows the Azure ML lecture, and there are no prerequisites except a computer with a web browser and an Internet connection.

Jan Pospisil, Microsoft (CZ)
10.30 – 11.00

Coffee break

11.00 – 12.30

Machine Learning Solutions at Seznam.cz (2)

Second part of the workshop.

Tomas Cicha & Lukas Vrabel & Vit Listik, Seznam.cz (CZ)

Azure Machine Learning in practice (2)

Second part of the workshop.

Jan Pospisil, Microsoft (CZ)
12.30 – 14.00


14.00 – 15.30

Romeo and the European Commission (1)

The workshop will provide a practical introduction to text analytics. We will analyze freely available datasets such as transcripts of EU parliament sessions or Shakespeare's works, extracting the names of people, locations, and companies, as well as tags expressing the theme, sentiment, etc. We will discuss the NLP theory, the necessary code and the linguistic and ML context.

Jiri Hana & Radoslav Klic, Geneea (CZ)

Building Predictive Apps with Machine Learning APIs (1)

With ever-increasing data, Machine Learning is becoming the only way to get analytics done, making it possible to glean insights from vast amounts of data. But when starting a project, it is easy to ignore a critical fact: the value of data is also time-sensitive – it expires! In order to get the highest value from data, Machine Learning needs to be applied in a rapid and repeatable way, so you can go from data to insight quickly. A Machine Learning API makes this possible.

In this workshop, Poul Petersen, CIO of BigML, will give an overview of BigML's Machine Learning API and then show real-world examples of predictive applications that can be built using Python and node.js. Several tools that have been built on top of BigML's API will be demonstrated, including loan risk assessment, real estate arbitrage, and the world's first voice-controlled predictive assistant.

Poul Petersen, BigML (USA)
15.30 – 16.00

Coffee break

16.00 – 17.30

Romeo and the European Commission (2)

Second part of the workshop.

Jiri Hana & Radoslav Klic, Geneea (CZ)

Building Predictive Apps with Machine Learning APIs (2)

Second part of the workshop.

Poul Petersen, BigML (USA)



Staropramen Brewery
Nadrazni 43/84
150 00 Praha 5


Riverview Building
Svornosti 3321/2
150 00 Praha 5

More information

Social program


Early bird

  • First 50 tickets
  • Conference only: 80.00 EUR
  • Conference + workshops: 120.00 EUR
  • sold out


Standard

  • Registration before April 20, 2016
  • Conference only: 95.00 EUR
  • Conference + workshops: 145.00 EUR
  • sold out


Late

  • Registration from April 20, 2016
  • Conference only: 130.00 EUR
  • Conference + workshops: 190.00 EUR
  • sold out


Workshops are already sold out but we still have some tickets for the main conference.



If you have any questions about Machine Learning Prague, please e-mail us:


Jiri Materna:

Sarka Strossova:

Copyright © 2016 Machine Learning Prague s.r.o.

Blanicka 589/5, Vinohrady, 120 00 Praha 2
ICO: 04744144

Terms and conditions