Our love of data doesn't stop with Love Data Week. Join us at one of our many data workshops throughout the semester.
The Catalyst (Lib 199)
Join us for a 1-hour seminar focused on creating effective and appealing data visualizations. You'll learn how design choices can make or break a data viz and how simple changes can lead to big improvements. We'll cover topics such as text and fonts, color, layout, and labels.
The Catalyst (Lib 199)
Pie, doughnut, lollipop, waffle, and bee swarms? Oh my!
These are just some of the charts you can make, but should you? In this workshop we’ll cover the basics of how to pick a chart type to match your data and discuss some of the essential choices you’ll have to make to ensure it conveys your message - and data - clearly.
The Catalyst (Lib 199)
The Linux command line, also known as the terminal or shell, is a powerful tool that lets you control your computer directly with text commands. It might seem intimidating or strange at first, but it offers a lot of flexibility and efficiency once you have learned the basics. High performance computing (HPC) systems, which are used for complex calculations and data analysis, are also often controlled through a Unix command line. This workshop will get you started on the basics, such as navigating through folders and viewing files. You will need a Windows, Mac, or Linux computer to participate.
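As a rough illustration of what "navigating through folders and viewing files" means, here is a minimal Python sketch that mirrors a few common shell commands. The workshop itself works in the shell, not Python, and the folder and file names (`data`, `notes.txt`) are made up for this example.

```python
from pathlib import Path
import tempfile

# Work in a throwaway folder so the sketch does not touch real files.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)

    # Roughly what `mkdir data` does in the shell.
    (root / "data").mkdir()

    # Create a small text file to look at (hypothetical name and contents).
    notes = root / "data" / "notes.txt"
    notes.write_text("first line\nsecond line\n")

    # Roughly what `ls` shows for the top-level folder.
    listing = sorted(p.name for p in root.iterdir())

    # Roughly what `cat data/notes.txt` prints.
    contents = notes.read_text()

    # Roughly what `head -n 1 data/notes.txt` prints.
    first_line = contents.splitlines()[0]

print(listing)
print(first_line)
```

The shell versions are shorter to type, which is part of the efficiency the description mentions.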
The Catalyst (Lib 199)
Git is a version control system: it tracks and manages changes to your files over time and allows you to include explanations for each change. This enables you to investigate changes and correct problems without losing code or data. Git works best with plain text data (such as txt, csv, tsv files) and most code files. This workshop will introduce you to Git (how it works, its benefits, and how to get started) through learning and practicing the basics. It is for anyone who works with plain text data and code, and it requires no prior knowledge of Git. You will need a Windows, Mac, or Linux computer to participate.
The Catalyst (Lib 199)
Office hours and follow-up question session for the Unix and Git workshops hosted on the previous two Tuesdays. Bring your questions and anything you want to learn more about. You don't have to stay the entire time.
Online
Do you want your research to be read by as many people as possible? Do you want to comply with federal requirements for making your scholarship and data publicly available? The library has you covered! In this workshop, we'll cover how ISU's open-access repositories--DataShare and the Digital Repository--can help ISU faculty, students, and staff share and preserve your paper and data publications. You'll learn how to add publications to these systems and about benefits like tracking statistics, stable links, and author profiles.
The Catalyst (Lib 199)
In preparation for data analysis or training a machine learning model, do you ever find yourself spending endless hours fixing typos and inconsistent formats in your dataset? Are you frustrated with repeating the same cleaning steps on multiple similar datasets? This workshop will showcase the basics of OpenRefine, a free and powerful tool for exploring, transforming, and cleaning your data that is faster than Excel and easier than Python. You will also learn to document your process so you can effortlessly apply the same steps to similar data. Bring your own messy data to work on after the guided introduction. This is a beginner-level workshop, and no prior experience is needed. You will need to bring a Windows, Mac, or Linux computer to participate.
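To make the idea of "documenting your process so you can apply the same steps to similar data" concrete, here is a small Python sketch. It is not OpenRefine and not part of the workshop materials. The column names, the sample rows, and the `CANONICAL` lookup table are all hypothetical.

```python
import csv
import io

# Hypothetical lookup of inconsistent spellings -> one canonical value.
CANONICAL = {"ia": "Iowa", "iowa": "Iowa"}


def clean_value(value: str) -> str:
    """Trim the ends and collapse repeated whitespace inside a cell."""
    return " ".join(value.split())


def normalize_state(value: str) -> str:
    """Map known variants to the canonical spelling; leave others alone."""
    return CANONICAL.get(value.lower(), value)


def clean_rows(rows):
    """Apply the same recorded cleaning steps to every row."""
    cleaned = []
    for row in rows:
        row = {key: clean_value(val) for key, val in row.items()}
        row["state"] = normalize_state(row["state"])
        cleaned.append(row)
    return cleaned


# A tiny messy dataset (made up for illustration).
raw = "name,state\n  Ada  Lovelace ,IA\nGrace Hopper,iowa \n"
rows = list(csv.DictReader(io.StringIO(raw)))
cleaned = clean_rows(rows)
print(cleaned)
```

Because the steps live in one function, the same `clean_rows` call can be reused on another file with the same columns. This is the kind of repeatability the workshop description refers to.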
Zoom Online
Spreadsheet software can be a research partner--or a research barrier. This workshop introduces ways to structure your research data so it's both computer and human friendly, practices for documenting your work, and functions for simple data processing and quality control. You can’t escape spreadsheets, but you can escape spreadsheet hell! This beginner-level Zoom workshop covers how to organize and document data in a spreadsheet program such as Excel. Includes some hands-on practice.
The Catalyst (Lib 199)
Data engineering is the foundation of data science and lays the groundwork for analysis and modeling. In order for organizations to extract knowledge and insights from structured and unstructured data, fast access to accurate and complete datasets is critical. Working with massive amounts of data from disparate sources requires complex infrastructure and expertise. Minor inefficiencies can result in major costs, both in terms of time and money, when scaled across millions to trillions of data points. Each session is roughly divided into 3 hours of active teaching and 1 hour of extra Q&A.
The concepts below will be spread over the two sessions:
Upon successful completion of the assessment, you’ll receive an NVIDIA certificate.
The Catalyst (Lib 199)
An ounce of prevention is worth a pound of cure, especially when it comes to your valuable files. In the first 30 minutes of this workshop, we'll review 5 data management practices: backups, organization & naming, documentation, security, and futureproofing. The second half is optional time for you to stay around for discussion and questions. This workshop is for anyone who wants to improve their data management.