Machine Learning Prague 2016

conference speakers

Xuedong Huang (USA)

Chief Scientist of Speech R&D, Microsoft

Tomas Mikolov

Tomas is a research scientist at Facebook AI Research. He is well known as the author of the word2vec algorithm.

Chris Brew (UK)

Senior Research Scientist, Thomson Reuters

Marek Rosa

Marek is the CEO, CTO and founder of two Prague-based companies: Keen Software House, an independent video game development studio, and GoodAI, a general artificial intelligence R&D company.

Michael Levin (RUS)

Chief Data Scientist, Yandex Data Factory

Libor Morkovsky

Libor Morkovsky has been working for Avast since 2006. With a background in bioinformatics, he’s trying to take the best of both worlds and make it work in malware analysis.

workshop speakers

Poul Petersen

Poul Petersen, CIO, joined BigML in 2011 to build the infrastructure for BigML.

Schedule

April 22, 2016

18.30 – ???

Warm-up party

The party will be held at Kavarna museum, Vinohradska 1, 120 00 Praha 1.

Main Conference

April 23, 2016

08.00

Registration

8.50 – 9.00

Welcome to ML Prague 2016

9.00 – 10.00

Keynote speech: Deep Learning and Intelligent Applications

With recent advances in deep learning, big data, and computing infrastructure, we are able to engage and converse with intelligent machines more effectively. For example, computers will realistically reach human parity in recognizing speech in the next few years. The computing industry’s progress in speech and vision is apparent in Microsoft products and services, including Cortana, Skype Translator, and Project Oxford cloud services. Dr. Xuedong Huang will use these examples not only to illustrate our society’s historical collective efforts, but also to enumerate the remaining challenges to reaching human parity, an accomplishment that will have a profound impact on society.

Xuedong Huang, Microsoft (USA)

10.00 – 10.30

Inventing General Artificial Intelligence: A Vision and Methodology

Marek Rosa, CEO/CTO of a Prague-based general AI development company, will introduce his methodological approach to general purpose artificial intelligence (AGI) research. He will highlight a key concept he is working to develop with his team: School for AI, a set of training environments where AI agents will learn in a gradual and guided way. To perform successfully in these environments, an agent needs to demonstrate a set of abilities. The analysis of these abilities leads to a specific list of functional requirements, which will then be implemented as a unified AGI architecture.

School for AI will be released as an open source tool where AI researchers and machine learning practitioners can benchmark their AI agents.

Marek Rosa, GoodAI (CZ)

10.30 – 11.00

Intelligent Personal Assistants

Apple Siri, Google Now, and Microsoft Cortana belong to an emerging class of web applications known as Intelligent Personal Assistants. They combine many recent and still-developing AI technologies, such as speech recognition, question answering, information retrieval, knowledge management, and IoT. The presentation will discuss the basic architecture, challenges, and use cases of future intelligent assistants.

Jan Sedivy, CTU (CZ)

11.00 – 11.30

Coffee break

11.30 – 12.00

Artificial Intelligence in Creative Writing

Have you ever thought about artificial intelligence smart enough to write novels or poetry? Meet the future. I will describe to you a machine learning model which is able to write an entire book or assist you in writing your own fictional literature.

Jiri Materna, Seznam.cz (CZ)

12.00 – 12.30

Learning Representations for Drug Discovery

Utilizing gene expression measurements, it is possible to characterize drugs and drug candidates for their on- and off-target activities, predict treatments for new indications, and highlight potential safety concerns. Improvements in technology now permit this approach to be applied on a qualitatively larger scale, which in turn requires new analytics. We find that applying methods inspired by the deep learning community can improve the interpretability, fidelity, and predictivity of pharmaceutical genomics.

Matthew Tudor, MSD (USA)

12.30 – 13.00

Online Hyperparameter Tuning in Non-Stationary Environments

Fitting hyperparameters online usually requires an extensive parallel search in the parameter space, which is often very demanding in time and computational resources. Comparing parameter settings sequentially is often not valid because of non-stationarity. Here, we present a method that uses global Bayesian optimization with Gaussian processes to model certain kinds of non-stationarity.

Jonas Seiler, Plista (DE)

13.00 – 14.30

Lunch

14.30 – 15.00

ML Approaches to Sentiment Analysis

Computational linguistics is currently witnessing a massive increase in interest in sentiment analysis, as it is a powerful means of mining public opinion. However, is it even possible to distinguish between good and evil using statistical methods? How would you treat irony, idioms, innovative vulgarisms and other inherent elements of natural language? Are emotions language-independent? I will share my thoughts and opinions on these issues and describe the state-of-the-art approaches employed in sentiment analysis.

Katerina Veselovska, Ataccama (CZ)

15.00 – 15.30

Distributed Representations for NLP

Distributed representations are currently a hot topic both in academic research and in industrial applied machine learning. I will talk about some breakthroughs from recent years, and will try to explain what distributed representations are about and why there is so much excitement around techniques such as word2vec.

Tomas Mikolov, Facebook (USA)

15.30 – 16.00

TR Discover: A Natural Language Interface for Exploring Linked Datasets

Keywords are the dominant technology for providing non-technical users with access to linked data. This is problematic, because keywords cannot express all the necessary details of user intent. Non-technical users cannot be expected to use an expressive query language such as SQL, but still need to access the data.

They can, however, use English. Our system, called TR Discover, maps from a fragment of English into an intermediate first-order logic representation, which is in turn mapped into SPARQL or SQL. It has been tested both on a dataset relevant to drug research and on the publicly available QALD-4 dataset. The system incorporates a tailored autosuggest mechanism and a back end that delivers task-appropriate analytics.

There will be a demo of a publicly visible version of the system.

Chris Brew, Thomson Reuters (UK)

16.00 – 16.30

Coffee break

16.30 – 17.00

Finding the Oddballs

Anomaly detection is a handy tool for a machine learning practitioner's toolbox. Not only is it useful for particular domains like fraud detection or intrusion detection, it is also useful for more general tasks, such as data cleaning or early warning of concept drift. We'll review a few approaches to anomaly detection, and then dig a little deeper into an ensemble technique called isolation forests.

Adam Ashenfelter, BigML (USA)
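For readers unfamiliar with the isolation forest technique named in the abstract above, here is a minimal, dependency-free sketch of the general idea. It is not taken from the talk or from BigML's implementation; the function names, the sample size of 64, and the synthetic data are all assumptions made for illustration. Points that are separated from the rest by only a few random splits get a score closer to 1.

```python
import math
import random


def average_path_length(n):
    """Expected path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    harmonic = math.log(n - 1) + 0.5772156649  # approximation of the harmonic number
    return 2.0 * harmonic - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Recursively split on a random feature at a random threshold."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    feature = rng.randrange(len(points[0]))
    low = min(p[feature] for p in points)
    high = max(p[feature] for p in points)
    if low == high:  # simplification: stop if the chosen feature is constant
        return {"size": len(points)}
    threshold = rng.uniform(low, high)
    left = [p for p in points if p[feature] < threshold]
    right = [p for p in points if p[feature] >= threshold]
    return {
        "feature": feature,
        "threshold": threshold,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(tree, point, depth=0):
    """Depth at which the point lands, plus a correction for unsplit leaves."""
    if "feature" not in tree:
        return depth + average_path_length(tree["size"])
    branch = "left" if point[tree["feature"]] < tree["threshold"] else "right"
    return path_length(tree[branch], point, depth + 1)


def fit_forest(points, n_trees=100, sample_size=64, seed=0):
    """Build n_trees trees, each on a random subsample (needs at least 2 points)."""
    rng = random.Random(seed)
    sample_size = min(sample_size, len(points))
    max_depth = math.ceil(math.log2(sample_size))
    trees = [
        build_tree(rng.sample(points, sample_size), 0, max_depth, rng)
        for _ in range(n_trees)
    ]
    return trees, sample_size


def anomaly_score(forest, point):
    """Score in (0, 1]; shorter average paths give higher scores."""
    trees, sample_size = forest
    mean_depth = sum(path_length(t, point) for t in trees) / len(trees)
    return 2.0 ** (-mean_depth / average_path_length(sample_size))


# Hypothetical data: a 2-D Gaussian cluster, scored against a far-away point.
data_rng = random.Random(42)
data = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(200)]
forest = fit_forest(data)
outlier_score = anomaly_score(forest, (8.0, 8.0))
inlier_score = anomaly_score(forest, (0.0, 0.0))
print(outlier_score > inlier_score)
```

Each tree sees only a small random subsample, which keeps training cheap and is part of why the method works as an ensemble.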

17.00 – 17.30

Recognizing Malware

Today’s malware ecosystem produces hundreds of thousands of distinct samples per day. To leverage similarities between the samples for automated classification, we built a distributed database engine relying on GPUs. We use instance-based classification, which gives us online learning and accountability for every decision. With query times of a fraction of a second, even when using a compound distance function, this system is able to classify the incoming samples in real time. Samples classified as malware are directly used to generate rules that identify similar samples on our customers' machines.

Libor Morkovsky, Avast (CZ)
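The instance-based classification mentioned in the abstract above can be illustrated with a small nearest-neighbour sketch. This is not Avast's system: the feature names (`size_kb`, `imports`), the distance weights, and the sample data are hypothetical, and there is no GPU or distributed storage here. It only shows why such a classifier learns online (a new sample is just appended) and why each decision is accountable (the neighbours that voted are returned).

```python
from collections import Counter


def jaccard_distance(a, b):
    """1 minus the overlap ratio of two sets; 0.0 when both are empty."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def compound_distance(x, y, size_weight=0.3, imports_weight=0.7):
    """Weighted mix of a relative size gap and a set distance (weights are assumed)."""
    size_gap = abs(x["size_kb"] - y["size_kb"]) / max(x["size_kb"], y["size_kb"], 1)
    return size_weight * size_gap + imports_weight * jaccard_distance(
        x["imports"], y["imports"]
    )


class InstanceClassifier:
    """k-nearest-neighbour classifier that stores every labelled sample."""

    def __init__(self, k=3):
        self.k = k
        self.samples = []

    def add(self, features, label):
        # Online learning: no retraining step, the sample is simply stored.
        self.samples.append((features, label))

    def classify(self, features):
        # Assumes at least one stored sample.
        ranked = sorted(
            self.samples, key=lambda s: compound_distance(features, s[0])
        )
        neighbours = ranked[: self.k]
        votes = Counter(label for _, label in neighbours)
        # Returning the neighbours makes the decision traceable.
        return votes.most_common(1)[0][0], neighbours


# Hypothetical labelled samples.
clf = InstanceClassifier(k=3)
clf.add({"size_kb": 120, "imports": {"CreateRemoteThread", "WriteProcessMemory", "VirtualAllocEx"}}, "malware")
clf.add({"size_kb": 130, "imports": {"CreateRemoteThread", "WriteProcessMemory"}}, "malware")
clf.add({"size_kb": 900, "imports": {"CreateWindowExW", "GetMessageW"}}, "clean")
clf.add({"size_kb": 850, "imports": {"CreateWindowExW", "ShowWindow"}}, "clean")

query = {"size_kb": 125, "imports": {"CreateRemoteThread", "WriteProcessMemory", "VirtualAllocEx"}}
label, evidence = clf.classify(query)
print(label, len(evidence))
```

The linear scan over all stored samples is the part a production system would have to accelerate.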

17.30 – 18.00

MatrixNet Applications at Yandex

MatrixNet is a proprietary machine learning tool developed by Yandex and used widely throughout the company. It has different learning modes, such as ranking, regression and classification. The algorithm is based on gradient boosting over decision trees, but its implementation includes many heuristics that provide a strong defense against overfitting. We will start with some details of the algorithm itself, then discuss its applications in web search, ad click prediction, churn prediction and other areas.

Michael Levin, Yandex Data Factory (RUS)
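MatrixNet itself is proprietary, so the following is only a generic sketch of gradient boosting over decision trees, the family of methods the abstract above refers to. It uses depth-1 trees (stumps), squared-error loss, and one-dimensional inputs; the function names, learning rate, and toy data are assumptions, and none of MatrixNet's heuristics are represented.

```python
def fit_stump(xs, residuals):
    """Find the one-threshold split with the lowest squared error.

    Assumes xs contains at least two distinct values.
    """
    best = None
    for threshold in sorted(set(xs))[1:]:
        left = [r for x, r in zip(xs, residuals) if x < threshold]
        right = [r for x, r in zip(xs, residuals) if x >= threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = sum((r - left_mean) ** 2 for r in left)
        error += sum((r - right_mean) ** 2 for r in right)
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]


def fit_boosting(xs, ys, n_rounds=50, learning_rate=0.1):
    """Fit each new stump to the residuals of the current ensemble."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(xs)
    stumps = []
    for _ in range(n_rounds):
        # For squared error, the negative gradient is simply the residual.
        residuals = [y - p for y, p in zip(ys, predictions)]
        threshold, left_value, right_value = fit_stump(xs, residuals)
        stumps.append((threshold, left_value, right_value))
        predictions = [
            p + learning_rate * (left_value if x < threshold else right_value)
            for x, p in zip(xs, predictions)
        ]
    return base, stumps, learning_rate


def predict(model, x):
    """Sum the base value and the shrunken contribution of every stump."""
    base, stumps, learning_rate = model
    return base + learning_rate * sum(
        left if x < threshold else right for threshold, left, right in stumps
    )


# Toy data: a step function that a single split can describe.
xs = [float(i) for i in range(10)]
ys = [0.0] * 5 + [10.0] * 5
model = fit_boosting(xs, ys)
print(round(predict(model, 2.0), 2), round(predict(model, 8.0), 2))
```

The small learning rate means each stump corrects only a fraction of the remaining error, which is one common guard against overfitting in boosting.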

18.00 – 18.05

Closing

18.30 – ???

Afterparty

Workshops

April 24, 2016

Room 1

Room 2

09.00 – 10.30

Machine Learning Solutions at Seznam.cz (1)

In the first part of the workshop we will describe principles of RC-Rank, a multipurpose machine learning tool developed at Seznam.cz. Next, we will present a short introduction to deep convolutional neural networks and show how to apply deep learning approaches to image analysis. Finally, we will focus on personalized spam recognition.

Tomas Cicha & Lukas Vrabel & Vit Listik, Seznam.cz (CZ)

Azure Machine Learning in practice (1)

Come and try out in practice how to create, train and then integrate a machine learning model into your own applications. At the workshop, we will go through the entire process from scratch, with no previous experience needed. The workshop follows the lecture Azure ML, and there are no prerequisites except a computer with a web browser and an Internet connection.

Jan Pospisil, Microsoft (CZ)

10.30 – 11.00

Coffee break

11.00 – 12.30

Machine Learning Solutions at Seznam.cz (2)

Second part of the workshop.

Tomas Cicha & Lukas Vrabel & Vit Listik, Seznam.cz (CZ)

Azure Machine Learning in practice (2)

Second part of the workshop.

Jan Pospisil, Microsoft (CZ)

12.30 – 14.00

Lunch

14.00 – 15.30

Romeo and the European Commission (1)

The workshop will provide a practical introduction to text analytics. We will analyze freely available datasets, such as transcripts of EU Parliament sessions or Shakespeare's works, extracting the names of people, locations, and companies, tags expressing the theme, sentiment, etc. We will discuss the NLP theory, the necessary code, and the linguistic and ML context.

Jiri Hana & Radoslav Klic, Geneea (CZ)

Building Predictive Apps with Machine Learning APIs (1)

With ever increasing data, Machine Learning is becoming the only way to get analytics done, making it possible to glean insights from vast amounts of data. But when starting a project, it is easy to ignore a critical fact: the value of data is also time sensitive – it expires! In order to get the highest value from data, Machine Learning needs to be applied in a rapid and repeatable way, so you can go from data to insight quickly. A Machine Learning API makes this possible.

In this workshop, Poul Petersen, CIO of BigML, will give an overview of BigML's Machine Learning API and then show real-world examples of predictive applications that can be built using Python and node.js. Several tools that have been built on top of BigML's API will be demonstrated, including loan risk assessment, real estate arbitrage, and the world's first voice-controlled predictive assistant.

Poul Petersen, BigML (USA)

15.30 – 16.00

Coffee break

16.00 – 17.30

Romeo and the European Commission (2)

Second part of the workshop.

Jiri Hana & Radoslav Klic, Geneea (CZ)

Building Predictive Apps with Machine Learning APIs (2)

Second part of the workshop.

Poul Petersen, BigML (USA)

April 23–24, 2016, Prague

Conference on machine learning in practice

About the conference

conference speakers

Xuedong Huang

Tomas Mikolov

Katerina Veselovska

Chris Brew

Marek Rosa

Jiri Materna

Adam Ashenfelter

Jan Sedivy

Michael Levin

Libor Morkovsky

Matthew Tudor

Jonas Seiler

workshop speakers

Poul Petersen

Lukas Vrabel

Vit Listik

Tomas Cicha

Jiri Hana

Radoslav Klic

Jan Pospisil

