R101 - Learning R together (Lunch and Learn)
This page provides information related to the first (2020) season of the “Lunch and Learn R” seminars organized by Dmitry Gorodnichy with his colleagues in the Gov’t of Canada. In an interactive, hands-on way, they show how to learn and master R - one of the most popular tools for data analysis and visualization - using open source resources. The sessions are 45 minutes long and run every Friday, with the objective of developing something useful for the community by the end of 8-10 sessions, e.g., a Web App that visualizes some of the complex and valuable data that is available in abundance at open.canada.ca.
For the second (2021) season, please see this page.
No programming experience or data science background is required. No software installation is needed either: all coding is done in https://rstudio.cloud. It’s free, no subscription is required, and it is well supported by the community.
- Dial-in instructions: will be posted here.
- GCCollab page (restricted access): here
- YouTube Channel: “Lunch and Learn R - together”
- Code that we write: in /r101 folder.
- Ideas for self-learning: in Resources
- Other: Data-sets and R projects about Ottawa and Canada
Season 1 (Summer 2020)
Topic: “Building COVID Web App from scratch”
Sub-topic: “Doing Data Science with R: Computer Science way”
Dates: April - June 2020
Summary:
In this first season of R101 training, we showed how to build AI and Data Science tools from scratch using R in RStudio. You can see the final result here!
Session 4: Fri 2020/05/15 (Catch-up session) - video
A special session for those who missed the first two sessions. Topics:
- Recap of Learning resources (rstudio.com & datacamp.com)
- Assignment of homework
- What these R101 sessions are about
- they are not meant to replace your main learning resources, but to help you get started
- to provide a Computer Science (aka Computing Science) perspective and tricks for Data Science projects. Think of Google as an example of a Computer Science approach to a Data Science problem.
- to socialize, build connections, and contribute to further development of community projects, e.g. iTrack Covid
- From Zero to Runnable code (automated report) - in one sprint of 30 min
- Assuming I know nothing… and starting from this R Ottawa - R101 page
- Just three tabs need to be opened: https://rstudio.cloud, /r101 and www.google.com
Session 3: Wed 2020/05/13 - video
The third session talks about the difference between Computing Science (aka Computer Science) and Data Science, and about what we are learning here. We learn how to program in R the way Computer Scientists program, and how to build tools (or algorithms) that find answers to our data-related questions, such as “Where will Covid be the worst tomorrow?”. That is Computer Science, as opposed to Data Science, where data is visualized or processed but no tools are developed, and it is still a human who needs to find the answer.
A better name for this course, instead of R101, would be CMPUT 101 - Introduction to Computing, which is the name of the course that Dmitry taught at the University of Alberta (back in 1999 we used Matlab rather than R).
Topics covered:
1- Refresher on how to start: from knowing nothing (see also first session)
2- Analysing the situation in Quebec (recap of last session's findings) and in the US / New York (this session's focus)
3- More about data.table
4- Functions
5- Simple but complete Covid App tool example (`01-report.Rmd`) - no interactivity
R: (`01-read.R`)
- Working with `data.table`: `dt[i, j, by]`:
- conditional viewing, value assignment
- show for state New York, city New York, for last date
- show all with more than 100,000 confirmed
- merging data-sets - using `merge` and `dtGeo[dtUS]`
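The `dt[i, j, by]` operations listed above can be sketched as follows. This is a minimal illustration, not the contents of `01-read.R`: the tables `dtUS` and `dtGeo` are tiny hypothetical stand-ins with invented values, and the column names (`state`, `city`, `date`, `confirmed`, `lat`, `lng`) are assumptions made for the example.

```r
library(data.table)

# Hypothetical toy data standing in for the US table (all values are invented)
dtUS <- data.table(
  state     = c("New York", "New York", "New York", "California"),
  city      = c("New York", "New York", "Albany", "Los Angeles"),
  date      = as.Date(c("2020-05-11", "2020-05-12", "2020-05-12", "2020-05-12")),
  confirmed = c(180000, 185000, 1500, 33000)
)

# Hypothetical lookup table with coordinates per city
dtGeo <- data.table(
  state = c("New York", "New York", "California"),
  city  = c("New York", "Albany", "Los Angeles"),
  lat   = c(40.7, 42.7, 34.1),
  lng   = c(-74.0, -73.8, -118.2)
)

# i: conditional viewing - state New York, city New York, last date in the table
dtNYC <- dtUS[state == "New York" & city == "New York" & date == max(date)]

# i: all rows with more than 100,000 confirmed
dtBig <- dtUS[confirmed > 100000]

# j with := : value assignment by reference (adds a column in place)
dtUS[, confirmedK := confirmed / 1000]

# j with by: maximum confirmed count per state
dtMax <- dtUS[, .(maxConfirmed = max(confirmed)), by = state]

# Merging, way 1: merge() on the shared columns
dtMerged <- merge(dtUS, dtGeo, by = c("state", "city"))

# Merging, way 2: keyed join X[Y] - for each row of dtUS, look up the row in dtGeo
setkey(dtGeo, state, city)
setkey(dtUS, state, city)
dtJoined <- dtGeo[dtUS]
```

Both merging styles give one row per row of `dtUS` here, because every city has exactly one match in `dtGeo`. With unmatched rows they differ: `merge()` drops them by default, while `dtGeo[dtUS]` keeps them with `NA` coordinates.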
R Markdown: (`01-report.Rmd`)
- report the results from `01-read.R`
- Showing how to run https://rmarkdown.rstudio.com/lesson-12.html in rstudio.cloud
Session 2: Wed 2020/05/06 - video
In our second session, we continue from where we left off: we will open the .csv file from JHU and analyze it in a number of ways. The R script that we started creating last time in rstudio.cloud is copied to the /r101 folder: 01-read.R
Topics covered:
- Overview of the new purpose and functionality of iTrack Covid App (v0.0.5 Canadian Edition - “Should I go or should I stay”). We’ll now be adding the same functionality to US data - together with you.
- General coding process & mental framework: Running our first line - Trouble-shooting - Organizing code
- Getting help: all the knowledge you need is with you already: built-in Help and www.Stackoverflow.com
- General process of getting to know your “stranger” (data) and making something nice out of it:
- ways to view it, and print it, to manipulate so it is easier to work with
- removing unneeded columns, `melt`ing data, renaming columns
- Your next best friend: `library(magrittr)`
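As a rough sketch of the data-cleaning steps listed above (removing columns, melting, renaming, piping): the table `dtWide` below is a hypothetical two-row imitation of a wide time-series file with one column per date. Its column names and values are assumptions for illustration only, not the actual JHU data.

```r
library(data.table)
library(magrittr)

# Hypothetical wide table: one column per date (values are invented)
dtWide <- data.table(
  UID            = c(1, 2),
  Province_State = c("New York", "Quebec"),
  `5/4/20`       = c(100, 50),
  `5/5/20`       = c(120, 60)
)

# Remove a column we do not need (by reference)
dtWide[, UID := NULL]

# melt: turn the date columns into rows (wide -> long)
dtLong <- melt(
  dtWide,
  id.vars       = "Province_State",
  variable.name = "date",
  value.name    = "confirmed"
)

# Rename a column, then convert the date labels to real Date values
setnames(dtLong, "Province_State", "state")
dtLong[, date := as.Date(as.character(date), format = "%m/%d/%y")]

# magrittr pipe: read the steps left to right instead of nesting calls
dtLong[state == "Quebec"] %>% nrow() %>% print()
```

The long format has one row per state and date, which is usually easier to filter, group and plot than one column per date.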
Session 1: Wed 2020/04/29 - video / transcript
This is our first Live recorded session…
Topics covered:
- your first steps to start learning R: go to www.rstudio.com, and follow Resources - Education - For Learners - Tutorials, ending up in https://rstudio.cloud and finding tutorials there: Learn - Primers - The Basics - Visualization Basics
- your first steps to start programming in R: New Project, New File - R Script, first executed line (to read a .csv file using `fread()`), first error (`could not find function`), first installed package (`library(data.table)`)
- your first R-powered document and App: New File - R Markdown
- your first tricks and take-aways
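The first programming steps above (one line that reads a .csv, the first error, the first package) might look like the sketch below. To keep it self-contained, it writes a tiny hypothetical .csv to a temporary file instead of downloading the real one; the file contents are invented.

```r
# install.packages("data.table")  # run once if the package is not installed yet

# Calling fread() before attaching the package stops with an error such as:
#   Error in fread(csvFile) : could not find function "fread"
# Attaching the package that provides the function fixes it:
library(data.table)

# Hypothetical stand-in for the downloaded file: write a tiny .csv first
csvFile <- tempfile(fileext = ".csv")
writeLines(c("state,confirmed", "New York,185000", "Quebec,38000"), csvFile)

# The "first executed line": read the .csv into a data.table
dt <- fread(csvFile)

# First look at the data
print(dt)
str(dt)
```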
Take-aways:
- Keep all useful libraries and functions in one place: `source("000-common.R")`
- `library(data.table)` is your best new friend
- Run line by line with (CTRL+ENTER)
- Make use of the Table of contents to build a structure for your code that is easy to navigate: `# 1.1 Merging data ----`
- Comment out unneeded code in R Script with `#` (CTRL+SHIFT+C)
- Comment out in chunks using `if (F) { # 0. General libraries and functions ----`
- Think and code in chunks
- Use R Markdown to organize your ideas and results (iTrack Covid App is just an R Markdown)
- Comment out unneeded text in R Markdown with `<!-- -->` (CTRL+SHIFT+C)
- Two ways of educating yourself: 1) follow many tutorials (some built into rstudio.cloud), 2) start building your own data science tool and seek answers as you go!
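The code-organization take-aways above can be combined in one short script, sketched here under some assumptions. The helper `growthPercent()` and the numbers are hypothetical. In the sessions the shared helpers are kept in a separate file loaded with `source()`, but the helper is defined inline here so that the example runs on its own.

```r
# 0. General libraries and functions ----
# Comments ending in four dashes appear in the RStudio document outline.
# growthPercent() is a hypothetical helper, defined inline for this example.
growthPercent <- function(today, yesterday) {
  round(100 * (today - yesterday) / yesterday, 1)
}

# 1. Reading data ----
confirmed <- c(100, 120, 150)  # invented numbers

# 1.1 Exploring data ----
if (FALSE) {
  # Everything inside this block is skipped when the whole script is run,
  # but each line can still be executed by hand with CTRL+ENTER
  print(summary(confirmed))
  plot(confirmed)
}

# 2. Results ----
# Day-over-day growth: each value compared with the previous one
growth <- growthPercent(confirmed[-1], confirmed[-length(confirmed)])
print(growth)
```

The sketch spells out `FALSE` rather than `F`, because `F` is an ordinary variable that can be reassigned, while `FALSE` is a reserved word.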