Necessary things to do:
Install Git
Create an account on GitHub
Install GitHub Desktop if you want a GUI for Git
Accept invite to the AEM 7130 classroom repository on GitHub
A decent chunk of these slides is inspired by Grant McDermott's data science course
If you're interested in more data science-y (similar to Ariel's class) and less numerical/structural material, check his material out
The classic date-your-file-name method is not good
When did you make changes? Who made them?
How do you undo only some changes from one update to the next?
If you've ever had a disaster managing code changes (you will), Git can help
Git is a distributed version control system for tracking changes in source code during software development. It is designed for coordinating work among programmers, but it can be used to track changes in any set of files. Its goals include speed, data integrity, and support for distributed, non-linear workflows.
Git combines a bunch of very useful features:
Remote storage of code on a host like GitHub/GitLab/Bitbucket/etc, just like Dropbox
Tracking of changes to files in a very clean way
Easy ways to test out experimental changes (e.g. new specifications, additional model states) and not have them mess with your main code
Built for versioning code like R, Julia, LaTeX, etc
Some apps can give you a pretty visual of the history of changes to your code (shell can too, but not as nice)
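The shell view of history mentioned above can be sketched as follows. This is a minimal example, assuming Git is installed; the throwaway directory, the file name params.txt, its contents, and the placeholder identity are all made up for illustration.

```shell
set -e
repo="$(mktemp -d)"                       # throwaway directory for the demo
cd "$repo"
git init -q
git config user.name "Example User"       # placeholder identity, demo only
git config user.email "user@example.com"

echo "beta = 0.95" > params.txt           # hypothetical parameter file
git add params.txt
git commit -q -m "Add discount factor"

echo "beta = 0.99" > params.txt
git commit -q -am "Raise discount factor"

# One line per commit, newest first, with an ASCII graph of the history
git log --oneline --graph
# Summary of which files the latest commit changed
git show --stat HEAD
```

Each commit records who changed what and when, which answers the questions the dated-file-name approach cannot.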
Allows for people to suggest code changes to existing code
It's the main location for non-base Julia packages (and tons of other stuff) to be stored and developed
It has services that I used to set up this class, etc
Git is the infrastructure for versioning and merging files
GitHub provides an online service to coordinate working with Git repositories, and adds some additional features for managing projects
GitHub stores the project on the cloud, allows for task management, creation of groups, etc
The private benefits of having well-versioned code in case you need to go back to previous stages
Your directories will be super clean
It is MUCH easier to collaborate on projects
The external benefits of open science, collaboration, etc
These external benefits also generate some downstream private reputational benefits (must be confident in your code to make it public) and can improve future social efficiency (commitment device to post future code)
My code for everything I've ever published is on my GitHub (I'll look real shady if I don't post code in the future)
Ideally yours will be too
Everything on Git is stored in something called a repository or repo for short
This is the directory for a project
It contains a .git subdirectory that stores the history of changes to the repository
Creating a new repo is pretty easy from the GitHub website: just click the green New button on the launch page
Next steps:
Choose whether to add a README.md (yes), a .gitignore, or a LICENSE.md file (more below)
Repos come with some common files in them
.gitignore: lists files/directories/extensions that Git shouldn't track (raw data, restricted data, those weird LaTeX files); this is usually a good idea
README.md: a Markdown file that is basically the welcome content on the repo's GitHub website; you should generally initialize a repo with one of these
LICENSE.md: describes the license agreement for the repository
You can find the repo at https://github.com/irudik/example-repo-7130
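As a rough illustration of how a .gitignore behaves, here is a minimal sketch in a throwaway repository. The patterns and file names (data/raw/, paper.aux, paper.tex) are hypothetical and are not taken from the example repo.

```shell
set -e
repo="$(mktemp -d)"                       # throwaway directory for the demo
cd "$repo"
git init -q

# Hypothetical patterns: a raw-data folder and LaTeX build files
cat > .gitignore <<'EOF'
data/raw/
*.aux
*.log
EOF

mkdir -p data/raw
echo "id,value" > data/raw/survey.csv     # made-up raw data file
echo "aux" > paper.aux                    # made-up LaTeX build file
echo "draft" > paper.tex                  # made-up source file

# Ignored files are not listed as untracked; .gitignore and paper.tex are
git status --short
# check-ignore prints the given paths that match an ignore pattern
git check-ignore data/raw/survey.csv paper.aux
```

The .gitignore file itself is tracked like any other file, so collaborators share the same ignore rules.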
To get the repository on your local machine you need to clone the repo; you can do this in a few ways from the repo site
Key thing: this links your local repository to the remote, so you can update your local copy when the remote changes
git clone https://github.com/irudik/example-repo-7130.git
You're done! Now create and clone your own repository, initialized with a README.md, and follow along.
Workspace: the actual files on your computer
Repository: your saved local history of changes to the files in the repository
Remote: The remote repository on GitHub that allows for sharing across collaborators
There are only a few basic Git operations you need to know for versioning solo economics research efficiently
Add/Stage: This adds files to the index; in other words, it takes a snapshot of the changes you want updated/saved in your local repository (i.e. your computer)
git add -A
Adds all files to the index
Commit: This records the changes to your local repository
git commit -m "Updated some files"
Commits the changes added to the index with the commit message in quotations
Push: This sends the changes to the remote repository (i.e. GitHub)
git push origin master
Pushes changes on your local repo to a branch called master on your remote, typically named origin (can often omit origin master)
Pull: This takes changes on the remote and integrates them with the local repository (technically two operations are going on: fetch and merge)
git pull origin master
Integrates the changes on the master branch of your remote origin into your local repo (again, can often omit origin master)
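The four operations above can be tried end to end without touching GitHub. The sketch below assumes a bare repository on the local disk standing in for the remote; the directory names remote.git and local and the placeholder identity are made up for the demo.

```shell
set -e
work="$(mktemp -d)"                       # throwaway directory for the demo
cd "$work"

# Stand-in for GitHub: a bare repository on the local disk (demo assumption)
git init -q --bare remote.git
git --git-dir=remote.git symbolic-ref HEAD refs/heads/master

git init -q local
cd local
git symbolic-ref HEAD refs/heads/master   # name the first branch master, as in the slides
git config user.name "Example User"       # placeholder identity, demo only
git config user.email "user@example.com"
git remote add origin "$work/remote.git"

echo "# Hello World!" > README.md
git add -A                                # stage: snapshot the changes in the index
git commit -q -m "Updated some files"     # commit: record the snapshot locally
git push -q origin master                 # push: send local commits to the remote
git pull -q origin master                 # pull: fetch + merge (nothing new to merge here)

git log --oneline                         # the single commit now exists locally and remotely
```

With a real GitHub repo the only difference is that origin points at the GitHub URL, which git clone sets up for you.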
In your own repository do the following using either shell or GitHub Desktop:
Open README.md in some text editor and insert the following code: # Hello World!
Save README.md
Add README.md to the index
Commit the change and push it to the remote
Did the changes show up on your repo's GitHub page?
Some more (but not very) advanced operations relate to branching
Branching creates different, but parallel, versions of your code
e.g. if you want to test out a new feature of your model but don't want to contaminate your master branch, create a new branch and add the feature there
If it works out, you can bring the changes back into master
If it doesn't, just delete it
Branch: This creates/lists/deletes branches of your repository
git branch
Lists all local branches
git branch -a
Lists all local and remote branches (use git branch -r for remote branches only)
git branch solar-panels
Creates a new branch called solar-panels
git branch -d solar-panels
Deletes the local solar-panels branch
Checkout: This switches you between different commits or branches
git checkout solar-panels
Switches you to branch solar-panels
git checkout -b wind-turbines
Creates a new branch called wind-turbines and checks it out
Merge: This merges two separate histories together (e.g. merges a separate branch back into the master)
git checkout master
git merge wind-turbines
Switches to master and then merges wind-turbines back into the master
This brings the changes made on wind-turbines since it branched off back into the master branch
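The branch, checkout, and merge commands above fit together roughly as follows. This is a sketch in a throwaway repository; the file model.txt, its contents, and the placeholder identity are invented for the demo.

```shell
set -e
repo="$(mktemp -d)"                       # throwaway directory for the demo
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the first branch master, as in the slides
git config user.name "Example User"       # placeholder identity, demo only
git config user.email "user@example.com"

echo "model: baseline" > model.txt        # hypothetical model file
git add model.txt
git commit -q -m "Baseline model"

git checkout -q -b wind-turbines          # create the branch and switch to it
echo "feature: wind turbines" >> model.txt
git commit -q -am "Add wind turbines"

git checkout -q master                    # master still contains only the baseline
git merge -q wind-turbines                # bring the feature into master
git branch -d wind-turbines               # the merged branch can now be deleted
git branch                                # lists the remaining local branches
```

If the experiment had not worked out, skipping the merge and deleting the branch with git branch -D wind-turbines would leave master untouched.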
In your own repository do the following using either shell or GitHub Desktop:
Create and check out a new branch called test-branch
Open README.md and add the following code: ## your_name_here
Save README.md
Add README.md to the index
Commit the change
Merge the change back into the master branch and push
Did the changes show up on your repo's GitHub page?
Find a partner for this next piece:
One of you invite the other to collaborate on the project (GitHub page → Settings → Manage access → invite a collaborator)
If you were the one being invited, accept the invite and clone the repo to your local machine
Now do the following:
Each of you change the # Hello World! line of code to be something else, and different from each other
Each of you add, commit, and push your change, one after the other
It turns out that the second person can't push their local changes to the remote
The second person is pushing their history of changes
But the remote is already one commit ahead because of the first person, so the second person's changes can't be pushed
You need to pull the remote changes first, but when you do, you get a merge conflict in README.md
This means there were differences between the remote and your local that conflicted
Sometimes there will be conflicts between two separate histories
When you try to merge these histories, for example by pulling the remote changes into your local repo, Git will throw a merge conflict
When you get a merge conflict, the conflicted part of the code in your file will look like:
<<<<<<< HEAD
# nascar_and_unleaded <-- my local version
=======
# nascar_and_leaded <-- the remote version
>>>>>>> 03c774b0e9baff0230855822a11e6ed24a0aa6b2
<<<<<<< HEAD indicates the start of the conflicted code
======= separates the two different conflicting histories
>>>>>>> lots of numbers and letters indicates the end of the conflicted code and the hash (don't worry about it) for the specific commit
Merge conflicts can be fixed by directly editing the file, then doing an add of the conflicted file, a commit, and then a push to the remote
Fixed!
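The whole conflict-and-fix cycle can be reproduced locally. The sketch below assumes a bare repository on disk in place of GitHub and two made-up collaborators, alice and bob; the choice to keep Bob's version when resolving is arbitrary.

```shell
set -e
work="$(mktemp -d)"                       # throwaway directory for the demo
cd "$work"
git init -q --bare remote.git             # local stand-in for GitHub (demo assumption)
git --git-dir=remote.git symbolic-ref HEAD refs/heads/master

# First collaborator (hypothetical "alice") creates the project and pushes it
git init -q alice
cd alice
git symbolic-ref HEAD refs/heads/master
git config user.name "Alice"; git config user.email "alice@example.com"
git remote add origin "$work/remote.git"
echo "# Hello World!" > README.md
git add README.md
git commit -q -m "Initial commit"
git push -q origin master

# Second collaborator (hypothetical "bob") clones, then both edit the same line
cd "$work"
git clone -q remote.git bob
cd bob
git config user.name "Bob"; git config user.email "bob@example.com"
echo "# nascar_and_unleaded" > README.md
git commit -q -am "Bob's title"

cd "$work/alice"
echo "# nascar_and_leaded" > README.md
git commit -q -am "Alice's title"
git push -q origin master                 # Alice pushes first, so the remote is one commit ahead

# Bob's pull stops with a conflict; "|| true" keeps the script going
cd "$work/bob"
git pull -q --no-rebase origin master || true
cat README.md                             # shows the conflict markers

# Resolve by editing the file, then add, commit, and push
echo "# nascar_and_unleaded" > README.md
git add README.md
git commit -q -m "Resolve merge conflict in README.md"
git push -q origin master
```

In practice you would edit the file by hand and decide which lines to keep rather than overwrite it with echo.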
$ git help add
GitHub is also very useful for task management in solo or group projects using issues and pull requests
Issues: task management for you and your collaborators, should be able to completely replace email
Let's look at the issues for the Optim package in Julia
The issues tab reports a list of 56 open issues (286 closed, meaning the task or problem has been solved)
Each issue has its own title
Let's check out the issue about the Double64 type
The issue was opened because one person found an error where the package doesn't seem to work correctly with a certain type of variable, Double64
Someone else has responded with some feedback
From the issues tab, click the green new issue button which takes you here
You can:
The issue keeps track of the history of everything that's happened to it
You can reference people with @, which brings up a dropdown menu of all collaborators on the project
You can also reference other issues if they're related by using #, which brings up a dropdown of all issues for your repository
Issues can also be referenced in your commits to your project by adding #issue_number_here to the commit message
Then those commits show up in your issue so you have a history of what code changes have been made.
If you click on the commit, it takes you to the git diff, which shows you any changes to files made in that commit
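Referencing an issue from a commit message looks roughly like this. It is a sketch in a throwaway repository: the issue number #12, the file solver.txt, and the placeholder identity are made up, and the link to the issue only appears on GitHub once the commit is pushed to a repo that has such an issue.

```shell
set -e
repo="$(mktemp -d)"                       # throwaway directory for the demo
cd "$repo"
git init -q
git config user.name "Example User"       # placeholder identity, demo only
git config user.email "user@example.com"

echo "tolerance = 1e-8" > solver.txt      # hypothetical settings file
git add solver.txt
# "#12" is a made-up issue number used only for illustration
git commit -q -m "Tighten solver tolerance, see #12"

git log -1 --format=%s                    # print the subject line of the latest commit
git show --stat HEAD                      # list the files changed by that commit
```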
git config --global core.eol lf
git config --global core.autocrlf false
Git compares files line by line, and the characters that end each line differ between Windows (CRLF) and Unix (LF) by default; these settings tell Git to use LF line endings and not to convert line endings automatically