Course Description
The course introduces students to the fundamentals of data and to the standards, technologies, and methods for organizing, managing, curating, preserving, and using data. It discusses broader issues relating to data management, ethics, quality control, and publication of data. The course provides applied examples of data collection, processing, transformation, management, and analysis, as well as a hands-on introduction to the emerging field of data science. Students will explore key concepts related to data science, including applied statistics, information visualization, text mining, and machine learning. R, the open-source statistical analysis and visualization system, will be used throughout the course. R is regarded by many as the most popular choice among data analysts worldwide, and proficiency with it is considered a valuable and marketable skill for data scientists.
Credit(s)
3.0
Professor of Record
Jasmina Tacheva
Audience
Undergraduate students.
Learning Objectives
After taking this course, students will be expected to understand:
- Essential concepts and characteristics of data
- The purpose of scripting for data management using R and RStudio
- Principles and practices in data screening, cleaning, linking, and visualization
- The importance of clear communication of results to decision-makers
- The key ethical challenges associated with applications of data science in a variety of contexts
After taking this course, students will be able to:
- Identify a problem and the data needed for addressing the problem.
- Perform basic computational scripting using R and other optional tools.
- Transform data through processing, linking, aggregation, summarization, and searching.
- Organize and manage data at various stages of a project life cycle.
- Determine appropriate techniques for analyzing data.
Course Syllabi
IST 387 LECTURE Spring 2021 Syllabus - Jasmina Tacheva
IST 387 LAB Spring 2021 Syllabus - Chris Dunham
IST 387 Fall 2021 Syllabus - Stephen Wallace