
Lecture 3

Git and GitHub

Ivan Rudik

AEM 7130

Software and stuff

Necessary things to do:

  • Install Git

  • Create an account on GitHub

  • Install GitHub Desktop if you want a GUI for Git

  • Accept invite to the AEM 7130 classroom repository on GitHub

Quick note

A decent chunk of these slides is inspired by Grant McDermott's data science course

If you're interested in more data science-y material (similar to Ariel's class) and less numerical/structural material, check out his course

Why bother with this newfangled Git stuff?

The classic date-your-file-name method is not good

When did you make changes? Who made them?
How do you undo only some changes from one update to the next?

If you've ever had a disaster managing code changes (you will), Git can help

Git is the smart way to handle code

What is Git?

Git is a distributed version control system for tracking changes in source code during software development. It is designed for coordinating work among programmers, but it can be used to track changes in any set of files. Its goals include speed, data integrity, and support for distributed, non-linear workflows.

Git is the smart way to handle code

Okay, so what?

Git combines a bunch of very useful features:

  • Remote storage of code on a host like GitHub/GitLab/Bitbucket/etc, just like Dropbox

  • Tracking of changes to files in a very clean way

  • Easy ways to test out experimental changes (e.g. new specifications, additional model states) and not have them mess with your main code

  • Built for versioning code like R, Julia, LaTeX, etc

Git histories in GitHub Desktop

Some apps can give you a pretty visual of the history of changes to your code (shell can too, but not as nice)

GitHub

Git ≠ GitHub

GitHub hosts a bunch of online services we want when using Git

  • Allows for people to suggest code changes to existing code

  • It's the main location for non-base Julia packages (and tons of other stuff) to be stored and developed

  • It has services that I used to set up this class, etc

The differences

Git is the infrastructure for versioning and merging files

GitHub provides an online service to coordinate working with Git repositories, and adds some additional features for managing projects

GitHub stores the project in the cloud and allows for task management, creation of groups, etc

Why Git and GitHub?

Selfish reasons

The private benefits of having well-versioned code in case you need to go back to previous stages

Your directories will be super clean

It is MUCH easier to collaborate on projects

Why Git and GitHub?

Semi-altruistic reasons

The external benefits of open science, collaboration, etc

These external benefits also generate some downstream private reputational benefits (must be confident in your code to make it public) and can improve future social efficiency (commitment device to post future code)

My code for everything I've ever published is on my GitHub (I'll look real shady if I don't post code in the future)

Ideally yours will be too

Git basics

Everything on Git is stored in something called a repository or repo for short

This is the directory for a project

  • Local: a directory with a .git subdirectory that stores the history of changes to the repository
  • Remote: a website, e.g. see the GitHub repo for the Optim package in Julia
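
A minimal sketch of the local half of this, assuming a hypothetical scratch directory called my-project. Running git init is what creates the .git subdirectory:

```shell
# Hypothetical scratch location; any empty directory would do.
repo="$(mktemp -d)/my-project"
mkdir -p "$repo"
cd "$repo"

git init -q      # turn the directory into a local Git repository
ls -a            # the listing now includes the .git subdirectory
```

Deleting the .git subdirectory would throw away the history but leave your files alone, so treat it as off limits.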

Creating a new repo on GitHub

Let's create a new repo

This is pretty easy from the GitHub website: just click the green New button on your GitHub home page

Creating a new repo on GitHub

Next steps:

  1. Choose a name
  2. Choose a description
  3. Choose whether the repo is public or private
  4. Choose whether you want to add a README.md (yes), or a .gitignore or a LICENSE.md file (more next slide)

Git basics

Repos come with some common files in them

  • .gitignore: lists files/directories/extensions that Git shouldn't track (raw data, restricted data, those weird LaTeX auxiliary files); adding one is usually a good idea
  • README.md: a Markdown file that serves as the welcome content on the repo's GitHub page; you should generally initialize a repo with one of these
  • LICENSE.md: describes the license agreement for the repository
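
To make the .gitignore idea concrete, here is a small sketch. The folder data/raw/ and the files paper.tex, paper.aux, and survey.csv are made-up names for illustration, not files from the class repo:

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q

# Hypothetical ignore rules: a raw-data folder and LaTeX auxiliary files.
printf '%s\n' 'data/raw/' '*.aux' '*.log' > .gitignore

mkdir -p data/raw
touch data/raw/survey.csv paper.tex paper.aux

git status --short                                  # .gitignore and paper.tex show up as untracked; ignored files do not
git check-ignore data/raw/survey.csv paper.aux      # prints the paths that match an ignore rule
```

Each line in .gitignore is a pattern: a trailing slash matches a directory, and * is a wildcard.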

Creating a new repo on GitHub

You can find the repo at https://github.com/irudik/example-repo-7130

How do I get a repo on GitHub onto my computer?

Clone

To get the repository on your local machine you need to clone the repo. You can do this in a few ways from the repo site

Key thing: this will link your local repository to the remote, so you'll be able to update your local copy when the remote changes

Cloning

  1. If you want to use the GitHub Desktop app instead of the command line, click on "Open in Desktop"
  2. From the command line, run: git clone https://github.com/irudik/example-repo-7130.git
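
The link between the clone and its source can be inspected directly. This sketch avoids the network by assuming a bare repository on the local disk (remote.git, a stand-in for GitHub) and a clone directory named my-clone; both names are invented for the example:

```shell
work="$(mktemp -d)"
cd "$work"

# Stand-in for GitHub: a bare repository on the local disk.
git init -q --bare remote.git

# Cloning copies the repository and records where it came from under the name origin.
git clone -q remote.git my-clone
cd my-clone
git remote -v        # shows the fetch and push locations for origin
```

With a real GitHub URL in place of remote.git, the same git remote -v call would show that URL.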

Cloning

You're done! Now create and clone your own repository, initialized with a README.md, and follow along.

The flow of Git

Workspace: the actual files on your computer
Repository: your saved local history of changes to the files in the repository
Remote: The remote repository on GitHub that allows for sharing across collaborators

Using Git

There are only a few basic Git operations you need to know for versioning solo economics research efficiently

Add/Stage: This adds files to the index (the staging area); in other words, it takes a snapshot of the changes you want saved in your next commit to your local repository (i.e. on your computer)

  • git add -A: stages all changes (new, modified, and deleted files) in the index

Commit: This records the staged changes in your local repository

  • git commit -m "Updated some files": commits the changes added to the index, with the commit message in quotation marks
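
Here is a sketch of the add and commit steps in a throwaway repository. The temporary directory and the name/email identity are placeholders, set only so the commit can be recorded:

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example Student"          # placeholder identity for this sketch
git config user.email "student@example.com"

echo "# Hello World!" > README.md

git status --short       # "?? README.md": the file exists but is not staged yet
git add -A               # stage every change in the working directory
git status --short       # "A  README.md": the change is now in the index
git commit -q -m "Updated some files"
git log --oneline        # one line per commit: short hash and message
```

Running git status between steps is a cheap way to see which stage a change is in.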

Using Git

Push: This sends your committed changes to the remote repository (i.e. GitHub)

  • git push origin master: pushes the commits on your local master branch to the master branch on your remote, which is typically named origin (you can often omit origin master; if your default branch is named main, as on newer GitHub repos, use main instead)

Pull: This takes changes on the remote and integrates them into the local repository (technically two operations are going on: fetch and merge)

  • git pull origin master: integrates the changes on the master branch of your remote origin into your local repo (again, you can often omit origin master)
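
The push and pull round trip can be tried without GitHub. This sketch assumes a bare repository on disk as the remote and two clones, laptop and office, standing in for two computers; all of these names are invented for the example:

```shell
work="$(mktemp -d)"
cd "$work"

# Stand-in for GitHub: a bare repository whose default branch is master.
git init -q --bare remote.git
git -C remote.git symbolic-ref HEAD refs/heads/master

# First computer: commit and push.
git clone -q remote.git laptop
cd laptop
git config user.name "Example Student"          # placeholder identity
git config user.email "student@example.com"
echo "# Hello World!" > README.md
git add -A
git commit -q -m "First README.md edit."
git branch -M master                            # make sure the branch is named master, as on these slides
git push -q origin master

# Second computer: clone now, pull later.
cd "$work"
git clone -q remote.git office

# The laptop pushes another commit...
cd "$work/laptop"
echo "More text" >> README.md
git add -A
git commit -q -m "Second README.md edit."
git push -q origin master

# ...and the office copy catches up with a pull (fetch + merge).
cd "$work/office"
git pull -q origin master
git log --oneline                               # lists both commits
```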

Using Git

In your own repository do the following using either shell or GitHub Desktop:

  1. Open README.md in some text editor and insert the following code: # Hello World!
  2. Save README.md
  3. Add the changes to README.md to the index
  4. Commit the changes to your local repo with the message: "First README.md edit."
  5. Push the changes to your remote

Did the changes show up on your repo's GitHub page?

Using Git: branching

Some more (but not very) advanced operations relate to branching

Branching creates different, but parallel, versions of your code

e.g. If you want to test out a new feature of your model but don't want to contaminate your master branch, create a new branch and add the feature there

If it works out, you can bring the changes back into master

If it doesn't, just delete it

Using Git: branching

Branch: This lists, creates, and deletes branches of your repository

  • git branch: lists all local branches
  • git branch -a: lists all branches, both local and remote-tracking (use git branch -r for remote branches only)
  • git branch solar-panels: creates a new branch called solar-panels
  • git branch -d solar-panels: deletes the local solar-panels branch
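
A quick sketch of these commands in a throwaway repository (the temporary directory and identity are placeholders). Note that creating a branch does not switch you to it:

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example Student"          # placeholder identity
git config user.email "student@example.com"
echo "# Hello World!" > README.md
git add -A
git commit -q -m "Initial commit"
git branch -M master                            # make sure the branch is named master

git branch solar-panels          # create the branch; you stay on master
git branch                       # lists master and solar-panels, with * marking the current branch
git branch -d solar-panels       # delete it again; -d only deletes branches with no unmerged commits
git branch                       # only master is left
```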

Using Git: branching

Checkout: This switches you between different commits or branches

  • git checkout solar-panels Switches you to branch solar-panels
  • git checkout -b wind-turbines Creates a new branch called wind-turbines and checks it out
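
As a sketch, again in a throwaway repository with a placeholder identity, switching back and forth looks like this. git rev-parse --abbrev-ref HEAD is used here only to print the name of the branch you are on:

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example Student"          # placeholder identity
git config user.email "student@example.com"
echo "# Hello World!" > README.md
git add -A
git commit -q -m "Initial commit"
git branch -M master                            # make sure the branch is named master

git checkout -q -b wind-turbines     # create the branch and switch to it
git rev-parse --abbrev-ref HEAD      # prints wind-turbines
git checkout -q master               # switch back
git rev-parse --abbrev-ref HEAD      # prints master
```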

Using Git: branching

Merge: This merges two separate histories together (e.g. merges a separate branch back into master)

  • git checkout master
    git merge wind-turbines
    Checks out master and then merges wind-turbines into it

This brings the changes made on wind-turbines since it branched off back into the master branch
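
A sketch of the full branch-and-merge cycle, assuming a throwaway repository and a made-up file wind.md that exists only on the feature branch until the merge:

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example Student"          # placeholder identity
git config user.email "student@example.com"
echo "# Hello World!" > README.md
git add -A
git commit -q -m "Initial commit"
git branch -M master                            # make sure the branch is named master

# Do the experimental work on its own branch.
git checkout -q -b wind-turbines
echo "Wind model notes" > wind.md               # hypothetical new feature
git add -A
git commit -q -m "Add wind-turbines feature"

# Back on master the new file is absent until the merge.
git checkout -q master
ls                                   # README.md only
git merge -q wind-turbines
ls                                   # README.md and wind.md
```

If the experiment had not worked out, skipping the merge and deleting the branch would leave master untouched.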

Using Git

In your own repository do the following using either shell or GitHub Desktop:

  1. Create and check out a new branch called test-branch
  2. Edit README.md and add the following code: ## your_name_here
  3. Save README.md
  4. Add the changes to README.md to the index
  5. Commit the changes to your local repo with the message: "Test change to README.md."
  6. Merge the changes back into the master branch
  7. Push the changes to your remote

Did the changes show up on your repo's GitHub page?

Teaming up

Find a partner for this next piece:

One of you invite the other to collaborate on the project (GitHub page → Settings → Manage access → Invite a collaborator)

Teaming up

If you were the one being invited, accept the invite, and clone the repo to your local machine

Now do the following:

  1. Each of you edit the # Hello World! line to say something else, different from what your partner writes
  2. Commit the changes to your local repo
  3. Have the repo creator push their changes
  4. Have the collaborator push their changes

Can't push changes when you aren't up to date

Shell

It turns out that the second person can't push their local changes to the remote

The second person is pushing their history of changes

But the remote is already one commit ahead because of the first person, so the second person's changes can't be pushed
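
The rejected push can be reproduced locally. This sketch assumes a bare repository on disk in place of GitHub and two clones named creator and collaborator; the names, identities, and README contents are invented for the example:

```shell
work="$(mktemp -d)"
cd "$work"
git init -q --bare remote.git                           # stand-in for GitHub
git -C remote.git symbolic-ref HEAD refs/heads/master

# The repo creator makes the first commit and pushes it.
git clone -q remote.git creator
cd creator
git config user.name "Creator"                          # placeholder identity
git config user.email "creator@example.com"
echo "# Hello World!" > README.md
git add -A
git commit -q -m "First README.md edit."
git branch -M master
git push -q origin master

# The collaborator clones the repo.
cd "$work"
git clone -q remote.git collaborator
git -C collaborator config user.name "Collaborator"     # placeholder identity
git -C collaborator config user.email "collaborator@example.com"

# Both edit the same line and commit locally; the creator pushes first.
cd "$work/creator"
echo "# nascar_and_leaded" > README.md
git commit -q -am "Creator edit"
git push -q origin master

cd "$work/collaborator"
echo "# nascar_and_unleaded" > README.md
git commit -q -am "Collaborator edit"

# The remote now has a commit the collaborator lacks, so this push is refused.
if git push -q origin master 2>/dev/null; then
  push_result="accepted"
else
  push_result="rejected"
fi
echo "$push_result"
```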

Update by pulling after you commit local changes

You need to pull the remote changes first, but then you get the following message:

And we got a merge conflict in README.md

This means there were differences between the remote and your local that conflicted

Merge conflicts

Sometimes there will be conflicts between two separate histories

  • e.g. if you and your collaborator edited the same chunk of code separately on your local repos

When you try to merge these histories (e.g. by pulling the remote changes into your local repo), Git will report a merge conflict

When you get a merge conflict, the conflicted part of the code in your file will look like:

<<<<<<< HEAD
# nascar_and_unleaded <-- my local version
=======
# nascar_and_leaded <-- the remote version
>>>>>>> 03c774b0e9baff0230855822a11e6ed24a0aa6b2

Merge conflicts

<<<<<<< HEAD
# nascar_and_unleaded <-- my local version
=======
# nascar_and_leaded <-- the remote version
>>>>>>> 03c774b0e9baff0230855822a11e6ed24a0aa6b2

<<<<<<< HEAD indicates the start of the conflicted code
======= separates the two different conflicting histories
>>>>>>> lots of numbers and letters indicates the end of the conflicted code and the hash (don't worry about it) for the specific commit

Fixing the merge conflict

Merge conflicts can be fixed by directly editing the file, then doing an add of the conflicted file, a commit, and then a push to the remote

Fixed!
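
The whole cycle (conflict, edit, add, commit, push) can be rehearsed locally. This sketch assumes a bare repository on disk in place of GitHub and two clones named creator and collaborator; those names and the identities are placeholders. It passes --no-rebase to git pull because recent versions of Git may otherwise stop and ask how to reconcile the two histories:

```shell
work="$(mktemp -d)"
cd "$work"
git init -q --bare remote.git                           # stand-in for GitHub
git -C remote.git symbolic-ref HEAD refs/heads/master

git clone -q remote.git creator
cd creator
git config user.name "Creator"                          # placeholder identity
git config user.email "creator@example.com"
echo "# Hello World!" > README.md
git add -A
git commit -q -m "First README.md edit."
git branch -M master
git push -q origin master

cd "$work"
git clone -q remote.git collaborator
git -C collaborator config user.name "Collaborator"     # placeholder identity
git -C collaborator config user.email "collaborator@example.com"

# Both change the same line; the creator's version reaches the remote first.
cd "$work/creator"
echo "# nascar_and_leaded" > README.md
git commit -q -am "Creator edit"
git push -q origin master

cd "$work/collaborator"
echo "# nascar_and_unleaded" > README.md
git commit -q -am "Collaborator edit"

# Pulling merges the remote history and stops at the conflict.
git pull --no-rebase origin master || true
cat README.md                                           # shows the <<<<<<<, =======, >>>>>>> markers

# Resolve: edit the file down to the version you want, then add, commit, push.
echo "# nascar_and_unleaded" > README.md
git add README.md
git commit -q -m "Resolve merge conflict in README.md"
git push -q origin master
```

Here the collaborator keeps their own line; keeping the remote line, or writing something new that combines both, works the same way.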

Git help pages are excellent, and so is StackExchange

$ git help add

Managing tasks and workflow

GitHub is also very useful for task management in solo or group projects using issues and pull requests

Issues: task management for you and your collaborators; they should be able to completely replace email

Let's look at the issues for the Optim package in Julia

Issues

The issues tab reports a list of 56 open issues (286 closed, meaning the task or problem has been solved)

Each issue has its own title

Let's check out the issue about the Double64 type

Issues

This issue was opened because someone found an error with the package: it doesn't seem to work correctly with a certain type of variable, Double64

Someone else has responded with some feedback

Issues

From the issues tab, click the green New issue button, which takes you here

You can:

  • create a title
  • add some text for the body of the issue
  • select people to assign the issue to
  • add some labels

Issues

The issue keeps track of the history of everything that's happened to it

Issues

You can reference people with @ which brings up a dropdown menu of all collaborators on the project

Issues

You can also reference other issues if they're related by using # which brings up a dropdown of all issues for your repository

Issues

Issues can also be referenced in your commits to your project by adding #issue_number_here to the commit message
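
For example, a commit that addresses a hypothetical issue number 12 might be recorded like this. The issue number, file name, and identity are made up for the sketch:

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example Student"          # placeholder identity
git config user.email "student@example.com"

echo "Notes on the fix" > notes.md              # hypothetical change
git add -A
git commit -q -m "Handle Double64 inputs, see #12"
git log -1 --format=%s                          # prints the commit's subject line
```

The link to the issue is created by GitHub once the commit is pushed; locally the #12 is just text in the message.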

Issues

Then those commits show up in your issue so you have a history of what code changes have been made.

Issues

If you click on the commit, it takes you to the git diff which shows you any changes to files made in that commit

Windows users!!!!!

Do the following:

  • Open up a command prompt or Git Bash (recommend Bash from here on out)
  • Run the following commands:
    git config --global core.eol lf
    git config --global core.autocrlf false

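Windows and Unix mark the end of each line with different characters (CRLF vs. LF), and Git's default handling of line endings differs across the two. These settings tell Git not to convert line endings automatically and to use LF for files it treats as text, so files don't show spurious changes when collaborators are on different operating systems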
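
To check that the settings took effect, you can read them back with git config --get. This sketch sets them for a single throwaway repository (no --global) so that it does not touch your real configuration; that choice is an assumption of the sketch, not what the slide asks you to run:

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q

# Same settings as above, scoped to this one repository.
git config core.eol lf
git config core.autocrlf false

git config --get core.eol           # prints lf
git config --get core.autocrlf      # prints false
```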

Next up:

Optimization: root-finding and maximization/minimization
