#+TITLE: Drawing Git Graphs with Graphviz and Org-Mode #+AUTHOR: Correl Roush #+STARTUP: indent inlineimages showall hideblocks #+OPTIONS: toc:nil num:nil #+PROPERTY: header-args :exports both :results silent #+KEYWORDS: emacs org-mode git graphviz #+begin_src emacs-lisp :exports results :results silent (defun vector-image (name) (let ((basename (concat (file-name-base buffer-file-name) "-" name))) (cond ((eq org-export-current-backend 'latex) (concat basename ".eps")) (t (concat basename ".svg"))))) #+end_src #+name: inline-image #+begin_src emacs-lisp :exports none :var name="example" (if (org-export-derived-backend-p org-export-current-backend 'html) (concat "#+BEGIN_HTML\n" (let ((image-file (vector-image name))) (with-temp-buffer (insert-file-contents image-file) (buffer-string))) "#+END_HTML\n") (concat "[[file:" (vector-image name) "]]\n")) #+end_src #+BEGIN_HTML #+END_HTML Digging through Derek Feichtinger's [[https://github.com/dfeich/org-babel-examples][org-babel examples]] (which I came across via [[http://irreal.org/blog/?p%3D4162][irreal.org]]), I found he had some great examples of displaying git-style graphs using graphviz. I thought it'd be a fun exercise to generate my own graphs based on his graphviz source using elisp, and point it at actual git repos. * Getting Started I started out with the goal of building a simple graph showing a mainline branch and a topic branch forked from it and eventually merged back in. Using Derek's example as a template, I described 5 commits on a master branch, plus two on a topic branch. #+begin_src dot :file (vector-image "graph-example") digraph G { rankdir="LR"; bgcolor="transparent"; node[width=0.15, height=0.15, shape=point]; edge[weight=2, arrowhead=none]; node[group=master]; 1 -> 2 -> 3 -> 4 -> 5; node[group=branch]; 2 -> 6 -> 7 -> 4; } #+end_src The resulting image looks like this: #+CALL: inline-image(name="graph-example") :results raw replace #+RESULTS: [[file:2015-07-12-git-graphs-graph-example.svg]] ** Designing the Data Structure The first thing I needed to do was describe my data structure. Leaning on my experiences reading and working through [[https://www.google.com/url?sa%3Dt&rct%3Dj&q%3D&esrc%3Ds&source%3Dweb&cd%3D1&cad%3Drja&uact%3D8&ved%3D0CB8QFjAA&url%3Dhttps%253A%252F%252Fmitpress.mit.edu%252Fsicp%252F&ei%3DlH6gVau5OIGR-AG8j7yACQ&usg%3DAFQjCNHTCXQK7qN-kYibdy_MqRBWxlr8og&sig2%3DLu9WIhyuTJS92e8hxne0Aw&bvm%3Dbv.97653015,d.cWw][SICP]], I got to work building a constructor function, and several accessors. I decided to represent each node on a graph with an id, a list of parent ids, and a group which will correspond to the branch on the graph the commit belongs to. #+begin_src emacs-lisp (defun git-graph/make-node (id &optional parents group) (list id parents group)) (defun git-graph/node-id (node) (nth 0 node)) (defun git-graph/node-parents (node) (nth 1 node)) (defun git-graph/node-group (node) (nth 2 node)) #+end_src ** Converting the structure to Graphviz Now that I had my data structures sorted out, it was time to step through them and generate the graphviz source that'd give me the nice-looking graphs I was after. The graph is constructed using the example above as a template. The nodes are defined first, followed by the edges between them. #+name: git-graph/to-graphviz #+begin_src emacs-lisp (defun git-graph/to-graphviz (id nodes) (string-join (list (concat "digraph " id " {") "bgcolor=\"transparent\";" "rankdir=\"LR\";" "node[width=0.15,height=0.15,shape=point,fontsize=8.0];" "edge[weight=2,arrowhead=none];" (string-join (-map #'git-graph/to-graphviz-node nodes) "\n") (string-join (-uniq (-flatten (-map #'git-graph/to-graphviz-edges nodes))) "\n") "}") "\n")) #+end_src For the sake of readability, I'll format the output: #+name: git-graph/to-graphviz #+begin_src emacs-lisp (defun git-graph/to-graphviz-pretty (id nodes) (with-temp-buffer (graphviz-dot-mode) (insert (git-graph/to-graphviz id nodes)) (indent-region (point-min) (point-max)) (buffer-string))) #+end_src Each node is built, setting its group attribute when applicable. #+begin_src emacs-lisp (defun git-graph/to-graphviz-node (node) (let ((node-id (git-graph/to-graphviz-node-id (git-graph/node-id node)))) (concat node-id (--if-let (git-graph/node-group node) (concat "[group=\"" it "\"]")) ";"))) #+end_src Graphviz node identifiers are quoted to avoid running into issues with spaces or other special characters. #+name: git-graph/to-graphviz-nodes #+begin_src emacs-lisp (defun git-graph/to-graphviz-node-id (id) (format "\"%s\"" id)) #+end_src For each node, an edge is built connecting the node to each of its parents. #+name: git-graph/to-graphviz-edges #+begin_src emacs-lisp (defun git-graph/to-graphviz-edges (node) (let ((node-id (git-graph/node-id node)) (parents (git-graph/node-parents node))) (-map (lambda (parent) (git-graph/to-graphviz-edge node-id parent)) parents))) (defun git-graph/to-graphviz-edge (from to) (concat (git-graph/to-graphviz-node-id to) " -> " (git-graph/to-graphviz-node-id from) ";")) #+end_src With that done, the simple graph above could be generated with the following code: #+name: git-example #+begin_src emacs-lisp :exports code :results silent (git-graph/to-graphviz-pretty "example" (list (git-graph/make-node 1 nil "master") (git-graph/make-node 2 '(1) "master") (git-graph/make-node 3 '(2) "master") (git-graph/make-node 4 '(3 7) "master") (git-graph/make-node 5 '(4) "master") (git-graph/make-node 6 '(2) "branch") (git-graph/make-node 7 '(6) "branch"))) #+end_src Which generates the following graphviz source: #+begin_src dot :noweb yes :file (vector-image "generated-git-example") <> #+end_src The generated image matches the example exactly: #+CALL: inline-image(name="generated-git-example") :results raw replace #+RESULTS: [[file:2015-07-10-git-graphs-graph-generated-example.svg]] * Adding Labels The next thing my graph needed was a way of labeling nodes. Rather than trying to figure out some way of attaching a separate label to a node, I decided to simply draw a labeled node as a box with text. #+begin_src dot :file (vector-image "graph-labels") digraph G { rankdir="LR"; bgcolor="transparent"; node[width=0.15, height=0.15, shape=point,fontsize=8.0]; edge[weight=2, arrowhead=none]; node[group=main]; 1 -> 2 -> 3 -> 4 -> 5; 5[shape=box,label=master]; node[group=branch1]; 2 -> 6 -> 7 -> 4; 7[shape=box,label=branch]; } #+end_src #+CALL: inline-image(name="graph-labels") :results raw replace #+RESULTS: [[file:2015-07-10-git-graphs-graph-labels.svg]] ** Updating the Data Structure I updated my data structure to support an optional label applied to a node. I opted to store it in an associative list alongside the group. #+name: git-graph/structure #+begin_src emacs-lisp (defun git-graph/make-node (id &optional parents options) (list id parents options)) (defun git-graph/node-id (node) (nth 0 node)) (defun git-graph/node-parents (node) (nth 1 node)) (defun git-graph/node-group (node) (cdr (assoc 'group (nth 2 node)))) (defun git-graph/node-label (node) (cdr (assoc 'label (nth 2 node)))) #+end_src ** Updating the Graphviz node generation The next step was updating the Graphviz generation functions to handle the new data structure, and set the shape and label attributes of labeled nodes. #+name: git-graph/to-graphviz-nodes #+begin_src emacs-lisp (defun git-graph/to-graphviz-node (node) (let ((node-id (git-graph/to-graphviz-node-id (git-graph/node-id node)))) (concat node-id (git-graph/to-graphviz-node--attributes node) ";"))) (defun git-graph/to-graphviz-node--attributes (node) (let ((attributes (git-graph/to-graphviz-node--compute-attributes node))) (and attributes (concat "[" (mapconcat (lambda (pair) (format "%s=\"%s\"" (car pair) (cdr pair))) attributes ", ") "]")))) (defun git-graph/to-graphviz-node--compute-attributes (node) (-filter #'identity (append (and (git-graph/node-group node) (list (cons 'group (git-graph/node-group node)))) (and (git-graph/node-label node) (list (cons 'shape 'box) (cons 'label (git-graph/node-label node))))))) #+end_src I could then label the tips of each branch: #+name: graph-example-labels #+begin_src emacs-lisp :exports code :results silent (git-graph/to-graphviz-pretty "labeled" (list (git-graph/make-node 1 nil '((group . "master"))) (git-graph/make-node 2 '(1) '((group . "master"))) (git-graph/make-node 3 '(2) '((group . "master"))) (git-graph/make-node 4 '(3 7) '((group . "master"))) (git-graph/make-node 5 '(4) '((group . "master") (label . "master"))) (git-graph/make-node 6 '(2) '((group . "branch"))) (git-graph/make-node 7 '(6) '((group . "branch") (label . "branch"))))) #+end_src #+begin_src dot :file (vector-image "graph-labels-generated") :noweb yes <> #+end_src #+CALL: inline-image(name="graph-labels-generated") :results raw replace #+RESULTS: [[file:2015-07-10-git-graphs-graph-labels-generated.svg]] * Automatic Grouping Using Leaf Nodes Manually assigning groups to each node is tedious, and easy to accidentally get wrong. Also, with the goal to graph git repositories, I was going to have to figure out groupings automatically anyway. To do this, it made sense to traverse the nodes in [[https://en.wikipedia.org/wiki/Topological_sorting][topological order]]. Repeating the example above, #+begin_src dot :file (vector-image "graph-topo") digraph G { rankdir="LR"; bgcolor="transparent"; node[width=0.15, height=0.15, shape=circle]; edge[weight=2, arrowhead=none]; node[group=main]; 1 -> 2 -> 3 -> 4 -> 5; node[group=branch1]; 2 -> 6 -> 7 -> 4; } #+end_src #+CALL: inline-image(name="graph-topo") :results raw replace #+RESULTS: [[file:2015-07-10-git-graphs-graph-topo.svg]] These nodes can be represented (right to left) in topological order as either ~5, 4, 3, 7, 6, 2, 1~ or ~5, 4, 7, 6, 3, 2, 1~. Having no further children, ~5~ is a leaf node, and can be used as a group. All first parents of ~5~ can therefore be considered to be in group ~5~. ~7~ is a second parent to ~4~, and so should be used as the group for all of its parents not present in group ~5~. #+name: git-graph/group-topo #+begin_src emacs-lisp (defun git-graph/group-topo (nodelist) (reverse (car (-reduce-from (lambda (acc node) (let* ((grouped-nodes (car acc)) (group-stack (cdr acc)) (node-id (git-graph/node-id node)) (group-from-stack (--if-let (assoc node-id group-stack) (cdr it))) (group (or group-from-stack node-id)) (parents (git-graph/node-parents node)) (first-parent (first parents))) (if group-from-stack (pop group-stack)) (if (and first-parent (not (assoc first-parent group-stack))) (push (cons first-parent group) group-stack)) (cons (cons (git-graph/make-node node-id parents `((group . ,group) (label . ,(git-graph/node-label node)))) grouped-nodes) group-stack))) nil nodelist)))) #+end_src While iterating through the node list, I maintained a stack of pairs built from the first parent of the current node, and the current group. To determine the group, the head of the stack is checked to see if it contains a group for the current node id. If it does, that group is used and it is popped off the stack, otherwise the current node id is used. The following table illustrates how the stack is used to store and assign group relationships as the process iterates through the node list: #+caption: Progressing through the nodes | Node | Parents | Group Stack | Group | |------+---------+-----------------+-------| | 5 | (4) | (4 . 5) | 5 | | 4 | (3 7) | (3 . 5) | 5 | | 3 | (2) | (2 . 5) | 5 | | 7 | (6) | (6 . 7) (2 . 5) | 7 | | 6 | (2) | (2 . 5) | 7 | | 2 | (1) | (1 . 5) | 5 | | 1 | | | 5 | ** Graph without automatic grouping #+name: graph-no-auto-grouping #+begin_src emacs-lisp :exports code :results silent (git-graph/to-graphviz-pretty "nogroups" (list (git-graph/make-node 5 '(4) '((label . master))) (git-graph/make-node 4 '(3 7)) (git-graph/make-node 3 '(2)) (git-graph/make-node 7 '(6) '((label . develop))) (git-graph/make-node 6 '(2)) (git-graph/make-node 2 '(1)) (git-graph/make-node 1 nil))) #+end_src #+begin_src dot :noweb yes :file (vector-image "graph-no-auto-grouping") <> #+end_src #+CALL: inline-image(name="graph-no-auto-grouping") :results raw replace #+RESULTS: [[file:2015-07-10-git-graphs-graph-no-auto-grouping.svg]] ** Graph with automatic grouping #+name: graph-with-auto-grouping #+begin_src emacs-lisp :exports code :results silent (git-graph/to-graphviz-pretty "autogroups" (git-graph/group-topo (list (git-graph/make-node 5 '(4) '((label . master))) (git-graph/make-node 4 '(3 7)) (git-graph/make-node 3 '(2)) (git-graph/make-node 7 '(6) '((label . develop))) (git-graph/make-node 6 '(2)) (git-graph/make-node 2 '(1)) (git-graph/make-node 1 nil)))) #+end_src #+begin_src dot :noweb yes :file (vector-image "graph-with-auto-grouping") <> #+end_src #+CALL: inline-image(name="graph-with-auto-grouping") :results raw replace #+RESULTS: [[file:2015-07-10-git-graphs-graph-with-auto-grouping.svg]] * Graphing a Git Repository Satisfied that I had all the necessary tools to start graphing real git repositories, I created an example repository to test against. ** Creating a Sample Repository Using the following script, I created a sample repository to test against. I performed the following actions: - Forked a develop branch from master. - Forked a feature branch from develop, with two commits. - Added another commit to develop. - Forked a second feature branch from develop, with two commits. - Merged the second feature branch to develop. - Merged develop to master and tagged it. #+begin_src sh :exports results :results silent rm -rf /tmp/test.git #+end_src #+begin_src sh :exports both :results silent mkdir /tmp/test.git cd /tmp/test.git git init touch README git add README git commit -m 'initial' git commit --allow-empty -m 'first' git checkout -b develop git commit --allow-empty -m 'second' git checkout -b feature-1 git commit --allow-empty -m 'feature 1' git commit --allow-empty -m 'feature 1 again' git checkout develop git commit --allow-empty -m 'third' git checkout -b feature-2 git commit --allow-empty -m 'feature 2' git commit --allow-empty -m 'feature 2 again' git checkout develop git merge --no-ff feature-2 git checkout master git merge --no-ff develop git tag -a 1.0 #+end_src ** Generating a Graph From a Git Branch The first order of business was to have a way to call out to git and return the results: #+name: git-graph/from-git #+begin_src emacs-lisp (defun git-graph/git-execute (repo-url command &rest args) (with-temp-buffer (shell-command (format "git -C \"%s\" %s" repo-url (string-join (cons command args) " ")) t) (buffer-string))) #+end_src Next, I needed to get the list of commits for a branch in topological order, with a list of parent commits for each. It turns out git provides exactly that via its =rev-list= command. #+name: git-graph/from-git #+begin_src emacs-lisp (defun git-graph/git-rev-list (repo-url head) (-map (lambda (line) (split-string line)) (split-string (git-graph/git-execute repo-url "rev-list" "--topo-order" "--parents" head) "\n" t))) #+end_src I also wanted to label branch heads wherever possible. To do this, I looked up the revision name from git, discarding it if it was relative to some other named commit. #+name: git-graph/from-git #+begin_src emacs-lisp (defun git-graph/git-label (repo-url rev) (let ((name (string-trim (git-graph/git-execute repo-url "name-rev" "--name-only" rev)))) (unless (s-contains? "~" name) name))) #+end_src Generating the graph for a single branch was as simple as iterating over each commit and creating a node for it. #+begin_src emacs-lisp (defun git-graph/git-graph-head (repo-url head) (git-graph/group-topo (-map (lambda (rev-with-parents) (let* ((rev (car rev-with-parents)) (parents (cdr rev-with-parents)) (label (git-graph/git-label repo-url rev))) (git-graph/make-node rev parents `((label . ,label))))) (git-graph/git-rev-list repo-url head)))) #+end_src Here's the result of graphing the =master= branch: #+name: graph-git-branch #+begin_src emacs-lisp (git-graph/to-graphviz-pretty "git" (git-graph/git-graph-head "/tmp/test.git" "master")) #+end_src #+begin_src dot :file (vector-image "git-graph-branch") :noweb yes <> #+end_src #+CALL: inline-image(name="git-graph-branch") :exports results :results raw replace #+RESULTS: [[file:2015-07-10-git-graphs-git-graph-branch.svg]] ** Graphing Multiple Branches To graph multiple branches, I needed a function for combining histories. To do so, I simply append any nodes I don't already know about in the first history from the second. #+name: git-graph/adder #+begin_src emacs-lisp (defun git-graph/+ (a b) (append a (-remove (lambda (node) (assoc (git-graph/node-id node) a)) b))) #+end_src From there, all that remained was to accumulate the branch histories and output the complete graph: #+name: git-graph/from-git #+begin_src emacs-lisp (defun git-graph/git-load (repo-url heads) (-reduce #'git-graph/+ (-map (lambda (head) (git-graph/git-graph-head repo-url head)) heads))) #+end_src And here's the example repository, graphed in full: #+name: graph-git-repo #+begin_src emacs-lisp (git-graph/to-graphviz-pretty "git" (git-graph/git-load "/tmp/test.git" '("master" "feature-1"))) #+end_src #+begin_src dot :file (vector-image "git-graph-repo") :noweb yes <> #+end_src #+CALL: inline-image(name="git-graph-repo") :results raw replace #+RESULTS: [[file:2015-07-10-git-graphs-git-graph-repo.svg]] * Things I may add in the future ** Limiting Commits to Graph Running this against repos with any substantial history can make the graph unwieldy. It'd be a good idea to abstract out the commit list fetching, and modify it to support different ways of limiting the history to display. Ideas would include: - Specifying commit ranges - Stopping at a common ancestor to all graphed branches (e.g., using =git-merge-base=). - Other git commit limiting options, like searches, showing only merge or non-merge commits, etc. ** Collapsing History Another means of reducing the size of the resulting graph would be to collapse unimportant sections of it. It should be possible to collapse a section of the graph, showing a count of skipped nodes. The difficult part would be determining what parts aren't worth drawing. Something like this would be handy, though, for concisely graphing the state of multiple ongoing development branches (say, to get a picture of what's been going on since the last release, and what's still incomplete). #+begin_src dot :file (vector-image "git-graph-long") digraph G { rankdir="LR"; bgcolor="transparent"; node[width=0.15,height=0.15,shape=point]; edge[weight=2,arrowhead=none]; node[group=main]; 1 -> 2 -> 3 -> 4 -> 5; node[group=branch]; 2 -> 6 -> 7 -> 8 -> 9 -> 10 -> 4; } #+end_src #+CALL: inline-image(name="git-graph-long") :results raw replace #+caption: A graph with multiple nodes on a branch. #+RESULTS: [[file:2015-07-12-git-graphs-git-graph-long.svg]] #+begin_src dot :file (vector-image "git-graph-collapsed") digraph G { rankdir="LR"; bgcolor="transparent"; node[width=0.15,height=0.15,shape=point]; edge[weight=2,arrowhead=none]; node[group=main]; 1 -> 2 -> 3 -> 4 -> 5; node[group=branch]; 2 -> 6; 6 -> 10[style=dashed,label="+3"]; 10 -> 4; } #+end_src #+CALL: inline-image(name="git-graph-collapsed") :results raw replace #+caption: The same graph, collapsed. #+RESULTS: [[file:2015-07-12-git-graphs-git-graph-collapsed.svg]] ** Clean up and optimize the code a bit Some parts of this (particularly, the grouping) are probably pretty inefficient. If this turns out to actually be useful, I may take another crack at it. * Final Code In case anyone would like to use this code for anything, or maybe just pick it apart and play around with it, all the Emacs Lisp code in this post is collected into a single file below: #+begin_src emacs-lisp :noweb yes :exports code :tangle "../files/git-graph.el" ;;; git-graph.el --- Generate git-style graphs using graphviz ;; Copyright (c) 2015 Correl Roush ;;; License: ;; This program is free software; you can redistribute it and/or modify ;; it under the terms of the GNU General Public License as published by ;; the Free Software Foundation; either version 3, or (at your option) ;; any later version. ;; ;; This program is distributed in the hope that it will be useful, ;; but WITHOUT ANY WARRANTY; without even the implied warranty of ;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the ;; GNU General Public License for more details. ;; ;; You should have received a copy of the GNU General Public License ;; along with GNU Emacs; see the file COPYING. If not, write to the ;; Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, ;; Boston, MA 02110-1301, USA. ;;; Commentary: ;;; Code: (require 'dash) <> <> <> <> <> <> (provide 'git-graph) ;;; git-graph.el ends here #+end_src Download: [[file:/files/git-graph.el]]