
Data Science Toolkits and Architectures


Lecturers Sandro Cilurzo, MSc; Arthur Habicht, MSc
Course type Lecture
Code HS241081
Semester Autumn semester 2024
Responsible department Economics
Study level Master
Dates Thu, 19.09.2024, 16:15 - 19:00, ZOOM
Thu, 03.10.2024, 16:15 - 19:00, ZOOM
Thu, 17.10.2024, 16:15 - 19:00, 4.B54
Thu, 31.10.2024, 16:15 - 19:00, 4.B54
Thu, 14.11.2024, 16:15 - 19:00, 4.B54
Thu, 28.11.2024, 16:15 - 19:00, 4.B54
Thu, 12.12.2024, 16:15 - 19:00, 4.B54
Workload 2 semester hours per week
Frequency Every two weeks
Content

The field of data science has experienced a renaissance thanks to innovations in algorithms and the widespread availability of affordable storage and compute capabilities. As a consequence, the growing global stream of data has emerged as a significant economic factor.

Nonetheless, many companies struggle to make use of their data. A significant reason for this is a lack of experience in organizing data and software as well as managing a data science team in a collaborative setting.

This course picks up where most data science courses end. It addresses the technical and organizational challenges that typically accompany operating data-driven software products in “production”.

In this context, the course aims to provide solutions for the aforementioned challenges. This includes toolkits and architectures that:

- render the management of data science projects more efficient
- allow for versioning of data, software and runtime environments, in order to ensure reproducibility of data-driven systems
- improve collaboration and knowledge transfer among members of a larger data science team
- facilitate the deployment of data-driven products

Learning objectives - An understanding of the greater complexity of data-driven software compared to “traditional” software
- A firm grasp of the typical life cycle of machine learning projects in industry
- An overview of existing toolkits that address the challenges of data-driven products
- Knowledge of a subset of those toolkits, covering different areas such as:
   - code versioning (e.g. Git)
   - data versioning (e.g. DVC)
   - runtime versioning (e.g. Docker)
   - testing frameworks
   - experiment and knowledge management (e.g. Weights & Biases, MLflow, DVC)
   - production environments for machine learning models
- Students are expected to be able to create a workflow for the development of complex data science products
Prerequisites - Experience with Python or R scripts
- Experience in training machine learning models (e.g. linear regression)
- Initial experience with the command line (Unix and Windows)
Language English
Limit max. 25 participants
Registration

To attend the course / exercise, registration via the e-learning platform OLAT is required. Registration is open from 2 to 27 September 2024. Students are responsible for checking whether the course can be credited toward their own degree programme.

Direct link to OLAT course: https://lms.uzh.ch/url/RepositoryEntry/17577443356

Examination ***IMPORTANT*** To earn credits or to take the examination, registration via the Uni Portal within the examination registration period is MANDATORY. Further information on registration: www.unilu.ch/wf/pruefungen
Assessment format / Credits Written paper / Project report / 6 Credits
Notes Project work
Auditors By arrangement
Contact sandro.cilurzo@sedimentum.com
arthur.habicht@sedimentum.com
Number of registrations 8 of a maximum of 25
Literature

The Hundred-Page Machine Learning Book (Andriy Burkov)