Fundamentals of Machine Learning
2019-02-11 09:00 to 2019-02-12 17:00 (Europe/Tallinn)
Trainer: Rafal Lukawiecki
The course focuses on the newest technologies of Microsoft Machine Learning Server and SQL Server 2017. By popular demand, the second part (3 days) of this course teaches programming in R; however, most of the course is also applicable to Python programmers, as the key libraries are the same.
This course has two parts: a 2-day Part A: Fundamentals of Machine Learning, followed by a 3-day Part B: Immersion into Machine Learning in R, SQL Server 2017, and Microsoft ML Server. The first part introduces the most important concepts and tools, while the second part teaches you R and how to use it for machine learning on the Microsoft platform.
Because of Rafal's 10+ years of real-world machine learning experience.
You will learn all the concepts and tools that you need to know from a great teacher who has trained almost 500 data scientists worldwide and a highly respected presenter capable of holding your attention, but, above all, from a practitioner of machine learning. Rafal Lukawiecki has been delivering ML, data mining, and data science projects for customers in the retail, banking, entertainment, healthcare, manufacturing, education, and government sectors for over ten years. Because of that, you will learn:
• how to avoid common pitfalls,
• how to get ahead of your competition by working faster,
• what is really useful and practical,
• what is more theoretical but still important,
• what hype you should be wary of.
You will be able to ask any questions related to your industry and you will get relevant, pragmatic, no-nonsense answers, helping you get ahead with your own projects.
Learn from Rafal, who has done it all, not from those who just teach it: this is why it is called Practical Machine Learning.
We begin with a thorough introduction of all of the key concepts, terminology, components, and tools. Topics include:
• Machine learning vs. data mining vs. artificial intelligence
• Tool landscape: open source R vs. Microsoft R, Python, SQL Server, ML Server, Azure ML
There are hundreds of machine learning algorithms, yet they belong to just a dozen groups, of which 5 are in very common use. We will introduce those algorithm classes and discuss some of the most often used examples in each class, while explaining which technology tools (Azure ML, SQL, or R) provide their most convenient implementation. You will also learn how to find more algorithms on the Internet and how to figure out if they are any good for real use. Topics include:
• What do algorithms do?
• Algorithm classes in R, Python, ML Server, Azure ML, and SSAS Data Mining
• Supervised vs. unsupervised learning
• Similarity Matching
Machine learning requires you to prepare your data into a rather unique, flat, denormalised format. While features (inputs) are always necessary, and you may need to engineer thousands of them, we do not need labels (predictive outputs) in all cases. Topics include:
• Cases, observations, signatures
• Inputs and outputs, features, labels, regressors, independent and dependent variables, factors
• Data formats, discretization/quantizing vs. continuous
• Indicator columns
• Feature engineering
• Azure ML data preparation and manipulation modules
• Moving data around and its storage, SQL vs. NoSQL, files, data lakes, BLOBs, and Hadoop
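As a rough illustration of the flat, denormalised format described above, the following sketch turns a categorical column into indicator columns and discretizes a continuous one. The records, column names, and age bands are invented for this example and are not taken from the course materials.

```python
# Hypothetical customer cases; all names and values are invented for illustration.
cases = [
    {"customer_id": 1, "age": 23, "segment": "retail"},
    {"customer_id": 2, "age": 47, "segment": "banking"},
    {"customer_id": 3, "age": 65, "segment": "retail"},
]


def discretize_age(age):
    """Map a continuous age onto one of three illustrative bands."""
    if age < 30:
        return "under_30"
    if age < 60:
        return "30_to_59"
    return "60_plus"


def add_indicators(case, column, categories):
    """Return a copy of the case with one 0/1 indicator column per category."""
    flat = {key: value for key, value in case.items() if key != column}
    for category in categories:
        flat[f"{column}_{category}"] = 1 if case[column] == category else 0
    return flat


# Collect the categories seen in the data, then build one flat row per case.
segments = sorted({case["segment"] for case in cases})
flat_cases = []
for case in cases:
    row = add_indicators(case, "segment", segments)
    row["age_band"] = discretize_age(case["age"])
    flat_cases.append(row)

for row in flat_cases:
    print(row)
```

Each resulting row is a single flat case: the original `segment` column is replaced by `segment_banking` and `segment_retail` indicators, and `age_band` holds the discretized age alongside the continuous value.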
The process consists of problem formulation, data preparation, modelling, validation, and deployment, carried out in an iterative fashion. You will briefly learn about the CRISP-DM industry-standard approach, but the key subject of this module is how to apply the scientific method of reasoning to solve real-world business problems with machine learning and statistics. Notably, you will learn how to start projects by expressing needs as hypotheses, and how to test them. Topics include:
• Stating business questions in data science terms
• Hypothesis testing and experiments
• Student's t-test
• Pearson chi-squared test
• Iterative hypothesis refinement
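To make the two tests named above more concrete, here is a minimal sketch that computes only the test statistics and degrees of freedom, using the Python standard library. The sample values and the contingency table are invented for illustration; in practice a statistics package would also supply the p-values.

```python
from math import sqrt
from statistics import mean, variance


def student_t(sample_a, sample_b):
    """Two-sample Student's t statistic with pooled variance (assumes equal variances)."""
    n_a, n_b = len(sample_a), len(sample_b)
    dof = n_a + n_b - 2
    pooled = ((n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)) / dof
    t = (mean(sample_a) - mean(sample_b)) / sqrt(pooled * (1 / n_a + 1 / n_b))
    return t, dof


def chi_squared(table):
    """Pearson's chi-squared statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Hypothetical question: do two store layouts differ in average basket value?
layout_a = [20, 22, 19, 24, 25]
layout_b = [28, 27, 30, 26, 29]
t_stat, t_dof = student_t(layout_a, layout_b)

# Hypothetical question: is buying (yes/no) independent of campaign (A/B)?
purchases = [[30, 70], [50, 50]]
chi_stat, chi_dof = chi_squared(purchases)

print(f"t = {t_stat:.3f} with {t_dof} degrees of freedom")
print(f"chi-squared = {chi_stat:.3f} with {chi_dof} degree(s) of freedom")
```

A large absolute statistic relative to its degrees of freedom is evidence against the null hypothesis of "no difference" or "no association"; the hypothesis would then be refined and tested again.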
At the heart of every project we build machine learning models! The process is simple and it follows a well-trodden path. In this module you will build your first decision tree and get it ready for validation in the next module. Topics include:
• Connecting to data
• Splitting data to create a holdout
• Training a decision tree
• Scoring the holdout
• Plotting accuracy
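The steps listed above can be sketched in plain Python. This is a toy illustration rather than the course's R or Azure ML workflow: the data set is synthetic and deliberately easy to separate, the feature meanings and labels are invented, and the tree is a small Gini-based implementation written for this example. Plotting is left out; the sketch only prints the holdout accuracy.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())


def best_split(rows, labels):
    """Return (impurity, feature, threshold) with the lowest weighted Gini, or None."""
    best = None
    for f in range(len(rows[0])):
        values = sorted({row[f] for row in rows})
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            left = [y for row, y in zip(rows, labels) if row[f] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[f] > threshold]
            impurity = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or impurity < best[0]:
                best = (impurity, f, threshold)
    return best


def train_tree(rows, labels, max_depth=3):
    """Recursively grow a tree; a leaf is simply the majority class label."""
    majority = Counter(labels).most_common(1)[0][0]
    if max_depth == 0 or len(set(labels)) == 1:
        return majority
    split = best_split(rows, labels)
    if split is None:  # no feature has more than one distinct value
        return majority
    _, f, threshold = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= threshold]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > threshold]
    return {
        "feature": f,
        "threshold": threshold,
        "left": train_tree([r for r, _ in left], [y for _, y in left], max_depth - 1),
        "right": train_tree([r for r, _ in right], [y for _, y in right], max_depth - 1),
    }


def predict(tree, row):
    """Walk from the root to a leaf and return its class label."""
    while isinstance(tree, dict):
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree


# "Connecting to data": a synthetic stand-in. Feature 0 is a hypothetical
# support-call count that drives the label; feature 1 is pure noise.
rng = random.Random(42)
cases = []
for _ in range(30):
    cases.append(([rng.choice([1, 2, 3]), rng.randint(0, 9)], "stays"))
    cases.append(([rng.choice([7, 8, 9]), rng.randint(0, 9)], "churns"))

# Splitting data to create a holdout: shuffle, then keep 30% aside.
rng.shuffle(cases)
cut = len(cases) * 7 // 10
train, holdout = cases[:cut], cases[cut:]

# Training a decision tree on the training cases only.
tree = train_tree([r for r, _ in train], [y for _, y in train])

# Scoring the holdout and measuring accuracy.
predictions = [predict(tree, r) for r, _ in holdout]
accuracy = sum(p == y for p, (_, y) in zip(predictions, holdout)) / len(holdout)
print(f"Holdout accuracy: {accuracy:.2f}")
```

Because the two classes never overlap on feature 0, the tree separates them with a single split; real data is rarely this clean, which is why the holdout is scored separately from the training cases.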
The most important aspect of any data science, artificial intelligence, and machine learning project is the iterative validation and improvement of the models. Without validation, your models cannot be reliably used. There are several tests of model validity, most importantly those that check accuracy and reliability. Topics include:
• Testing accuracy
• False positives vs. false negatives
• Classification (confusion) matrix
• Precision and recall
• Balancing precision with recall vs. business goals and constraints
• Introduction to lift charts and ROC curves
• Testing reliability
• Testing usefulness
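As a small worked example of the accuracy measures listed above, the sketch below builds a classification (confusion) matrix for a binary model and derives accuracy, precision, and recall from it. The actual and predicted labels are invented for illustration.

```python
def confusion_matrix(actual, predicted, positive=1):
    """Count true/false positives and negatives for a binary classifier."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["tp" if a == positive else "fp"] += 1
        else:
            counts["fn" if a == positive else "tn"] += 1
    return counts


# Hypothetical holdout: 1 = customer churned, 0 = customer stayed.
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]

cm = confusion_matrix(actual, predicted)

# Accuracy: share of all cases predicted correctly.
accuracy = (cm["tp"] + cm["tn"]) / len(actual)
# Precision: of the cases flagged positive, how many really were positive.
precision = cm["tp"] / (cm["tp"] + cm["fp"])
# Recall: of the truly positive cases, how many the model found.
recall = cm["tp"] / (cm["tp"] + cm["fn"])

print(cm)
print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
```

Here the model is right about 70% of the cases overall, yet it finds only half of the true positives. Whether the one false positive or the two false negatives matter more depends on the business goals and constraints.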
50% lectures, 30% demos, 20% tutorials.
You are encouraged to follow the demos on your machine, and you will be challenged to find answers to 3 larger problems during the tutorials. While they are a hands-on part of the course, if you prefer not to practice, you are welcome to use that time for additional Q&A, or to analyse your own data. We will provide you with all the necessary data sets, and we will explain what free or evaluation edition software needs to be installed to follow the course on your own laptop. In some training centers we are able to provide pre-built machines which you can use instead of your own—please inquire. You will need an Azure account (even a free one) during the course. You can copy course experiments and data into your workspace for learning and for future reference after the course.
Analysts, budding data scientists, data scientists, database and BI developers, programmers, power users, DBAs, predictive modellers, forecasters, consultants.
If you have attended a prior course on Machine Learning, like Rafal's week-long class Practical Data Science that was offered in 2015–2017, and if you are versed in model validity, accuracy, and reliability, consider attending the 3-day course. Ask yourself these questions: can I explain the difference between cross-validation and hold-out testing, do I know which business metrics correspond to precision and which to recall, is model accuracy more important than reliability, and how does a boosted decision tree work? If in doubt, please attend both the 2-day course and the 3-day course.
As Data Scientist at Project Botticelli Ltd, Rafal focuses on making advanced analytics and artificial intelligence easy and useful for his clients.
He can help you find valuable, meaningful patterns and statistically valid correlations using data mining and machine learning in data sets both big and small. Rafal is also known for his work in business intelligence, data protection, enterprise architecture, and solution delivery. While the majority of his clients come from consumer and corporate finance, entertainment, healthcare, IT, retail, and the public sector, Rafal has worked in almost all industries.
He has been a popular speaker at major IT conferences since 1998, and he had the honour of sharing keynote platforms with Bill Gates and Neil Armstrong. A natural educator, he explains complex concepts in simple terms in his enjoyably energetic style.
Rafal was born in Poland. He left it in 1990 to study computing in the United Kingdom, where he earned a BEng in Computing Science, followed by an MSc in Foundations of Advanced Information Technology, at Imperial College, University of London. His studies were sponsored by Oxford Computer Group Ltd, where he later worked as a developer, trainer, and consultant. Since 2000 he has worked for Project Botticelli Ltd.
Outside of IT, Rafal spends a quarter of every year finding abstractions in natural landscapes, expressing them through traditional, black-and-white, large-format film photography, making silver-gelatin prints by hand (see rafal.net). You can also follow Rafal on Twitter, or connect with him on LinkedIn.