
Data Mining in R


Lecturer Dr. Andrea De Angelis
Course type Master's seminar
Code FS231580
Semester Spring Semester 2023
Department Political Science
Level of study Master
Dates Fri, 10.03.2023, 09:15 - 17:00, 4.B54 (Schedule 1)
Sat, 11.03.2023, 09:15 - 17:00, 4.B54 (Schedule 1)
Fri, 31.03.2023, 09:15 - 17:00, 4.B02 (Dates)
Sat, 01.04.2023, 09:15 - 17:00, 4.B02 (Dates)
Scope 2 semester hours per week
Format Block course
Content

Data analysis increasingly involves mining data from the Internet and handling big datasets. However, students often lack the knowledge and experience required to take full advantage of the data opportunities offered by the Internet and social media. This course guides students through their first steps in data mining. It offers case studies and exercises in a friendly class environment. Students will learn by doing how to collect and handle web data in their future work. The course covers the primary skills required to access web data confidently.

 

The course is structured in three blocks:

1. An introductory block covers the essential knowledge for working with big data (basics of R programming, developing reproducible code, reporting in automated notebooks, version control with Git and GitHub, secondary datasets for social science research, and MySQL).

2. A data access block focuses on web scraping and related tools (introduction to regular expressions, HTML, and XML and JSON data structures).

3. A third block introduces more advanced data access concepts, such as API interaction, and allows students to practice in live coding sessions in class.

 

For more details, see the syllabus and the OLAT page of the course: https://lms.uzh.ch/auth/RepositoryEntry/17312448525/CourseNode/78100224418873


Learning objectives By the end of the course, active participants will:
1. gain proficiency in data analysis, learning to analyze data efficiently and reproducibly. [Data analysis]
2. understand and critically reassess data-related issues arising in applied research problems with big data. [Data literacy]
3. learn how to develop and debug complex code throughout the data analysis cycle (mining, tidying, analyzing, reporting). [Programming and statistical skills]
4. develop feasible big data research designs. [Research and analytical skills]
5. handle unstructured text and carry out the tidying process to obtain structured data. [Text mining]
Prerequisites Enrolment limit: priority for LUMACSS students. If too many students from other disciplines register, a draw will decide who may remain in the course.
Language English
Enrolment limit Priority for LUMACSS students
Registration Master's students
Examination No exam
Assessment / Credits Active participation, essay (graded) / 4 credits
Notes Enrolment limit: priority for LUMACSS students. If too many students from other disciplines register, a draw will decide who may remain in the course.
Auditors By arrangement
Contact andrea.deangelis@unilu.ch
Literature

· QSS: Imai, K. (2017). Quantitative Social Science: An Introduction. Princeton: Princeton University Press.

· R4DS: Wickham, H., and Grolemund, G. (2017). R for Data Science. O’Reilly Media. The book is also freely available online: https://r4ds.had.co.nz/

· ADCR: Munzert et al. (2014). Automated Data Collection with R: A Practical Guide to Web Scraping and Text Mining. London: Wiley & Sons.