Lecturers |
MSc, Sandro Cilurzo; MSc, Arthur Habicht |
Course type |
Lecture |
Code |
HS241081 |
Semester |
Autumn semester 2024 |
Responsible department |
Economics |
Level of study |
Master |
Dates |
Thu, 19.09.2024, 16:15 - 19:00, ZOOM
Thu, 03.10.2024, 16:15 - 19:00, ZOOM
Thu, 17.10.2024, 16:15 - 19:00, 4.B54
Thu, 31.10.2024, 16:15 - 19:00, 4.B54
Thu, 14.11.2024, 16:15 - 19:00, 4.B54
Thu, 28.11.2024, 16:15 - 19:00, 4.B54
Thu, 12.12.2024, 16:15 - 19:00, 4.B54 |
Scope |
2 semester hours per week |
Frequency |
every two weeks |
Content |
The field of data science has experienced a renaissance due to innovations in algorithms and widespread availability of affordable storage and compute capabilities. As a consequence, the growing, global stream of data has emerged as a significant economic factor.
Nonetheless, many companies struggle to make use of their data. A significant reason for this is a lack of experience in organizing data and software as well as managing a data science team in a collaborative setting.
This course picks up where most data science courses end. It addresses the technical and organizational challenges that typically accompany operating data-driven software products in “production”.
In this context, the course aims to provide solutions for the aforementioned challenges. This includes toolkits and architectures that:
- render the management of data science projects more efficient
- allow for versioning of data, software and runtime environments, in order to ensure reproducibility of data-driven systems
- improve collaboration and knowledge transfer among members of a larger data science team
- facilitate the deployment of data-driven products |
Learning objectives |
- An understanding of the greater complexity of data-driven software compared to “traditional” software
- A firm grasp of the typical life cycle of machine learning projects in industry
- An overview of existing toolkits that address the challenges of data-driven products
- Working knowledge of a subset of those toolkits, covering different areas such as:
  - code versioning (e.g. Git)
  - data versioning (e.g. DVC)
  - runtime versioning (e.g. Docker)
  - testing frameworks
  - experiment and knowledge management (Weights & Biases, MLflow, DVC)
  - production environments for machine learning models
- The ability to create a workflow for the development of complex data science products |
Prerequisites |
- Experience with Python or R scripts
- Experience in training machine learning models (e.g. linear regression)
- Basic experience with the command line (Unix and Windows) |
Language |
English |
Enrollment limit |
max. 25 participants |
Registration |
To attend the course / exercise, registration via the e-learning platform OLAT is required. Registration is open from 2 to 27 September 2024. Students are responsible for checking whether the course can be credited toward their own study program. Direct link to OLAT course: https://lms.uzh.ch/url/RepositoryEntry/17577443356 |
Examination |
***IMPORTANT*** To earn credits or take the examination, registration via the Uni Portal within the examination registration period is MANDATORY. Further information on registration: www.unilu.ch/wf/pruefungen |
Form of assessment / Credits |
Written paper / Project report / 6 Credits |
Notes |
Project work |
Auditors |
By arrangement |
Contact |
sandro.cilurzo@sedimentum.com
arthur.habicht@sedimentum.com |
Number of registrations |
8 of a maximum of 25 |
Literature |
The Hundred-Page Machine Learning Book (Andriy Burkov) |