Write your git - Part 6: References

In our previous chapter, we implemented Git’s commit functionality, which allowed us to create snapshots of our repository at specific points in time. These commits form the backbone of our version control system, storing both file content and metadata.

However, working directly with commit hashes is cumbersome. Who wants to type checkout 67e0119fc7d73c11f5e3c7d3fe51015bf6804503 instead of checkout main? This is where Git’s references system comes in.

References provide human-friendly names that point to specific commits. They allow us to use names like “main” or “feature-branch” instead of long SHA-1 hashes. This system is essential for day-to-day Git operations, particularly for branching and navigating commit history.

In this chapter, we’ll implement Git’s references system, which will enable us to:

  • Create and manage branches
  • Track the current branch with HEAD
  • Switch between branches
  • List all available branches
  • Delete branches when they’re no longer needed

By the end of this chapter, we’ll have a functional reference system that makes our Git implementation practical for everyday use.

What are References in Git?

In Git, references (or “refs”) are simply pointers to commits. They provide a layer of abstraction that makes Git more user-friendly by giving human-readable names to specific points in the commit history.

There are several types of references in Git:

  1. Branches (stored in refs/heads/): Point to the latest commit in a particular development line
  2. Remote branches (stored in refs/remotes/): Track branches from remote repositories
  3. Tags (stored in refs/tags/): Mark specific commits, typically for releases
  4. HEAD: A special reference that points to the current branch or commit

For our implementation, we’ll focus on local branches and the HEAD reference, which are the most fundamental for basic Git functionality.

How References are Stored

Git stores references in a remarkably simple way. In its basic form, a reference is just a text file containing the SHA-1 hash of a commit. For example, the main branch is stored in the file refs/heads/main inside the repository’s Git directory (config.GitDirName in our implementation) and might contain:

67e0119fc7d73c11f5e3c7d3fe51015bf6804503

The special HEAD reference is different. It’s typically a symbolic reference that points to another reference rather than directly to a commit. It’s stored in the HEAD file at the top of the Git directory (config.GitDirName/HEAD) and might contain:

ref: refs/heads/main

This indicates that HEAD currently points to the main branch. When you switch branches, Git updates this file to point to the new branch.

In some cases, like when you check out a specific commit instead of a branch (creating a “detached HEAD” state), the HEAD file can directly contain a commit hash.
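
To make the two storage forms concrete, here is a minimal sketch that follows HEAD through a symbolic reference until it reaches a commit hash. The names resolveRef and demoResolve, the gitDir parameter, and the hop limit of 5 are assumptions made for this illustration; they are not part of the refs package we build in this chapter.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// resolveRef follows symbolic references ("ref: <name>") starting at name
// until it finds content that is not symbolic, and returns that content.
// gitDir is the repository's metadata directory (hypothetical parameter).
func resolveRef(gitDir, name string) (string, error) {
	// Cap the number of hops so a cycle of symbolic refs cannot loop forever.
	for hops := 0; hops < 5; hops++ {
		data, err := os.ReadFile(filepath.Join(gitDir, filepath.FromSlash(name)))
		if err != nil {
			return "", err
		}
		text := strings.TrimSpace(string(data))
		if !strings.HasPrefix(text, "ref: ") {
			return text, nil // direct reference: the content is the hash
		}
		name = strings.TrimPrefix(text, "ref: ")
	}
	return "", errors.New("too many levels of symbolic references")
}

// demoResolve builds a throwaway directory layout with HEAD pointing at
// refs/heads/main and resolves HEAD to the commit hash stored there.
func demoResolve() (string, error) {
	gitDir, err := os.MkdirTemp("", "refs-demo")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(gitDir)

	heads := filepath.Join(gitDir, "refs", "heads")
	if err := os.MkdirAll(heads, 0o755); err != nil {
		return "", err
	}
	hash := "67e0119fc7d73c11f5e3c7d3fe51015bf6804503\n"
	if err := os.WriteFile(filepath.Join(heads, "main"), []byte(hash), 0o644); err != nil {
		return "", err
	}
	head := "ref: refs/heads/main\n"
	if err := os.WriteFile(filepath.Join(gitDir, "HEAD"), []byte(head), 0o644); err != nil {
		return "", err
	}
	return resolveRef(gitDir, "HEAD")
}

func main() {
	hash, err := demoResolve()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(hash)
}
```

With a detached HEAD the loop returns on its first pass, because the HEAD file already holds a hash.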

The References System in Action

To better understand how references work in Git, let’s look at an example:

  1. You create a new repository and make your first commit

    • Git creates a main branch pointing to this commit
    • HEAD points to main
  2. You create a new branch called feature

    • Git creates a new reference refs/heads/feature pointing to the current commit
    • HEAD continues to point to main
  3. You switch to the feature branch

    • Git updates HEAD to point to feature (ref: refs/heads/feature)
    • Your working directory is updated to match
  4. You make changes and commit them

    • The feature reference is updated to point to the new commit
    • HEAD still points to feature
    • The main reference remains unchanged
  5. You switch back to main

    • Git updates HEAD to point to main again
    • Your working directory is updated to match

This system allows Git to track different lines of development while keeping track of where you currently are.
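
The walkthrough above can be sketched with a toy in-memory model, where branches are map entries and HEAD is just the name of the current branch. The repo type, its methods, and the commit IDs c1 and c2 are invented for this illustration and deliberately skip validation.

```go
package main

import "fmt"

// repo is a toy model: branch name -> commit ID, plus the current branch.
type repo struct {
	branches map[string]string
	head     string // name of the branch HEAD points to
}

// commit advances only the branch that HEAD points to.
func (r *repo) commit(id string) { r.branches[r.head] = id }

// branch creates a new branch at the current commit without moving HEAD.
func (r *repo) branch(name string) { r.branches[name] = r.branches[r.head] }

// checkout moves HEAD to another branch.
func (r *repo) checkout(name string) { r.head = name }

// walkthrough replays the five steps from the example.
func walkthrough() *repo {
	r := &repo{branches: map[string]string{}, head: "main"}
	r.commit("c1")        // 1. first commit on main
	r.branch("feature")   // 2. feature starts at c1, HEAD stays on main
	r.checkout("feature") // 3. HEAD now points to feature
	r.commit("c2")        // 4. only feature moves
	r.checkout("main")    // 5. back to main
	return r
}

func main() {
	r := walkthrough()
	fmt.Println(r.head, r.branches["main"], r.branches["feature"]) // main c1 c2
}
```

The key point the model captures is that a commit moves exactly one reference, the one HEAD names, and leaves every other branch untouched.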

Project Structure

Let’s update our project structure to include the references implementation:

gitgo/
├── go.mod
└── internal/
    ├── blob/                # From part 2
    │   ├── blob.go
    │   └── blob_test.go
    ├── config/              # From part 1
    │   └── config.go
    ├── repository/          # From part 1
    │   ├── repository.go
    │   └── repository_test.go
    ├── staging/             # From part 3
    │   ├── staging.go
    │   └── staging_test.go
    ├── tree/                # From part 4
    │   ├── tree.go
    │   └── tree_test.go
    ├── commit/              # From part 5
    │   ├── commit.go
    │   └── commit_test.go
    └── refs/                # NEW DIRECTORY
        ├── refs.go
        └── refs_test.go

Tests First

As we’ve done throughout this series, we’ll start by writing tests to define the expected behavior of our references implementation. This test-driven approach ensures our implementation meets all requirements.

// internal/refs/refs_test.go
package refs

import (
	"os"
	"path/filepath"
	"testing"
	
	"github.com/HalilFocic/gitgo/internal/config"
)

func TestReferences(t *testing.T) {
	t.Run("1.1: Read HEAD reference", func(t *testing.T) {
		cwd, err := os.Getwd()
		if err != nil {
			t.Fatalf("Failed to get working directory: %v", err)
		}

		testDir := filepath.Join(cwd, "testdata")
		os.RemoveAll(testDir)
		os.MkdirAll(filepath.Join(testDir, config.GitDirName), 0755)
		defer os.RemoveAll(testDir)

		headPath := filepath.Join(testDir, config.GitDirName, "HEAD")
		err = os.WriteFile(headPath, []byte("ref: refs/heads/main\n"), 0644)
		if err != nil {
			t.Fatalf("Failed to create HEAD file: %v", err)
		}

		ref, err := ReadRef(testDir, "HEAD")
		if err != nil {
			t.Fatalf("Failed to read HEAD: %v", err)
		}

		if ref.Type != RefTypeSymbolic {
			t.Error("Expected HEAD to be symbolic reference")
		}
		if ref.Target != "refs/heads/main" {
			t.Errorf("Wrong target: got %s, want refs/heads/main", ref.Target)
		}
	})

	t.Run("1.2: Read branch reference", func(t *testing.T) {
		cwd, err := os.Getwd()
		if err != nil {
			t.Fatalf("Failed to get working directory: %v", err)
		}

		testDir := filepath.Join(cwd, "testdata")
		os.RemoveAll(testDir)
		os.MkdirAll(filepath.Join(testDir, config.GitDirName, "refs", "heads"), 0755)
		defer os.RemoveAll(testDir)

		commitHash := "1234567890123456789012345678901234567890"
		branchPath := filepath.Join(testDir, config.GitDirName, "refs", "heads", "main")
		err = os.WriteFile(branchPath, []byte(commitHash), 0644)
		if err != nil {
			t.Fatalf("Failed to create branch file: %v", err)
		}

		ref, err := ReadRef(testDir, "refs/heads/main")
		if err != nil {
			t.Fatalf("Failed to read branch: %v", err)
		}

		if ref.Type != RefTypeCommit {
			t.Error("Expected branch to be commit reference")
		}
		if ref.Target != commitHash {
			t.Errorf("Wrong target: got %s, want %s", ref.Target, commitHash)
		}
	})

	t.Run("1.3: Invalid reference", func(t *testing.T) {
		cwd, _ := os.Getwd()
		testDir := filepath.Join(cwd, "testdata")
		os.RemoveAll(testDir)
		os.MkdirAll(testDir, 0755)
		defer os.RemoveAll(testDir)

		_, err := ReadRef(testDir, "nonexistent")
		if err == nil {
			t.Error("Expected error for nonexistent reference")
		}
	})
	t.Run("2.1: Create and delete branch", func(t *testing.T) {
		cwd, _ := os.Getwd()
		testDir := filepath.Join(cwd, "testdata")
		os.RemoveAll(testDir)
		os.MkdirAll(filepath.Join(testDir, config.GitDirName, "refs", "heads"), 0755)
		defer os.RemoveAll(testDir)

		err := WriteHead(testDir, "refs/heads/main", true)
		if err != nil {
			t.Fatalf("Failed to write HEAD: %v", err)
		}

		commitHash := "1234567890123456789012345678901234567890"

		err = CreateBranch(testDir, "dev", commitHash)
		if err != nil {
			t.Fatalf("Failed to create branch: %v", err)
		}

		ref, err := ReadRef(testDir, "refs/heads/dev")
		if err != nil {
			t.Fatalf("Failed to read created branch: %v", err)
		}
		if ref.Target != commitHash {
			t.Errorf("Branch points to wrong commit: got %s, want %s", ref.Target, commitHash)
		}

		err = CreateBranch(testDir, "dev", commitHash)
		if err == nil {
			t.Error("Expected error when creating duplicate branch")
		}

		err = DeleteBranch(testDir, "dev")
		if err != nil {
			t.Fatalf("Failed to delete branch: %v", err)
		}

		_, err = ReadRef(testDir, "refs/heads/dev")
		if err == nil {
			t.Error("Branch still exists after deletion")
		}
	})

	t.Run("2.2: Cannot delete current branch", func(t *testing.T) {
		cwd, _ := os.Getwd()
		testDir := filepath.Join(cwd, "testdata")
		os.RemoveAll(testDir)
		os.MkdirAll(filepath.Join(testDir, config.GitDirName, "refs", "heads"), 0755)
		defer os.RemoveAll(testDir)

		commitHash := "1234567890123456789012345678901234567890"

		err := CreateBranch(testDir, "main", commitHash)
		if err != nil {
			t.Fatalf("Failed to create main branch: %v", err)
		}
		err = WriteHead(testDir, "refs/heads/main", true)
		if err != nil {
			t.Fatalf("Failed to write HEAD: %v", err)
		}

		err = DeleteBranch(testDir, "main")
		if err == nil {
			t.Error("Should not be able to delete current branch")
		}
	})

	t.Run("2.3: Branch operations with detached HEAD", func(t *testing.T) {
		cwd, _ := os.Getwd()
		testDir := filepath.Join(cwd, "testdata")
		os.RemoveAll(testDir)
		os.MkdirAll(filepath.Join(testDir, config.GitDirName, "refs", "heads"), 0755)
		defer os.RemoveAll(testDir)

		commitHash := "1234567890123456789012345678901234567890"

		err := WriteHead(testDir, commitHash, false)
		if err != nil {
			t.Fatalf("Failed to create detached HEAD: %v", err)
		}

		err = CreateBranch(testDir, "feature", commitHash)
		if err != nil {
			t.Fatalf("Failed to create branch in detached HEAD: %v", err)
		}

		err = DeleteBranch(testDir, "feature")
		if err != nil {
			t.Fatalf("Failed to delete branch in detached HEAD: %v", err)
		}
	})

	t.Run("2.4: Invalid branch names", func(t *testing.T) {
		cwd, _ := os.Getwd()
		testDir := filepath.Join(cwd, "testdata")
		os.RemoveAll(testDir)
		os.MkdirAll(filepath.Join(testDir, config.GitDirName, "refs", "heads"), 0755)
		defer os.RemoveAll(testDir)

		commitHash := "1234567890123456789012345678901234567890"
		invalidNames := []string{
			"",
			"branch/with/slash",
			".",
			"..",
		}

		for _, name := range invalidNames {
			err := CreateBranch(testDir, name, commitHash)
			if err == nil {
				t.Errorf("Expected error for invalid branch name: %q", name)
			}
		}
	})
}

Our tests cover several critical scenarios:

  1. Reading References (1.1 - 1.3):

    • Reading the HEAD symbolic reference
    • Reading a branch reference that points directly to a commit
    • Handling invalid/nonexistent references
  2. Branch Management (2.1 - 2.4):

    • Creating and deleting branches
    • Preventing deletion of the current branch
    • Handling branch operations with a detached HEAD
    • Validating branch names (rejecting empty names, names containing slashes, and the special names “.” and “..”)

These tests pin down the behavior of reading references and of creating and deleting branches. Note that ListBranches is not exercised by this suite.

Implementation Overview

Now that we have our tests, let’s implement the references functionality. Here’s the skeleton of our refs.go file; the function bodies are empty placeholders that we’ll fill in over the following sections:

// internal/refs/refs.go
package refs

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
	
	"github.com/HalilFocic/gitgo/internal/config"
)

const (
	RefTypeCommit = iota
	RefTypeSymbolic
)

const (
	HeadFile = "HEAD"
	RefsDir  = "refs"
	HeadsDir = "refs/heads"
)

type Reference struct {
	Name     string
	Type     int
	Target   string
	rootPath string
}

func ReadRef(rootPath, name string) (Reference, error) {}
func ReadHead(rootPath string) (Reference, error) {}
func UpdateRef(rootPath, name, target string, isSymbolic bool) error {}
func WriteHead(rootPath, target string, isSymbol bool) error {}
func CreateBranch(rootPath, name, commitHash string) error {}
func DeleteBranch(rootPath, name string) error {}
func ListBranches(rootPath string) ([]string, error) {}

Our key components are:

  1. Constants: Define reference types and important paths
  2. Reference struct: Represents a Git reference
  3. Functions:
    • ReadRef: Read any reference
    • ReadHead: Read the special HEAD reference
    • UpdateRef: Update any reference
    • WriteHead: Update the HEAD reference
    • CreateBranch: Create a new branch
    • DeleteBranch: Delete an existing branch
    • ListBranches: List all branches

Let’s implement each of these functions one by one.

Implementing the ReadRef Function

The ReadRef function is responsible for reading a reference from disk and determining its type. Let’s implement it:

func ReadRef(rootPath, name string) (Reference, error) {
	refPath := filepath.Join(rootPath, config.GitDirName, name)

	content, err := os.ReadFile(refPath)
	if err != nil {
		return Reference{}, fmt.Errorf("failed to read reference %s: %v", name, err)
	}
	ref := Reference{
		Name:     name,
		rootPath: rootPath,
	}

	text := strings.TrimSpace(string(content))
	if strings.HasPrefix(text, "ref: ") {
		ref.Type = RefTypeSymbolic
		ref.Target = strings.TrimPrefix(text, "ref: ")
	} else {
		ref.Type = RefTypeCommit
		ref.Target = text
	}
	return ref, nil
}

Let’s break down this implementation:

  1. Path Construction:

    refPath := filepath.Join(rootPath, config.GitDirName, name)
    • Builds the full path to the reference file
    • Can handle both direct references like HEAD and nested ones like refs/heads/main
  2. File Reading:

    content, err := os.ReadFile(refPath)
    if err != nil {
        return Reference{}, fmt.Errorf("failed to read reference %s: %v", name, err)
    }
    • Reads the reference file content
    • Returns an error if the file doesn’t exist or can’t be read
  3. Reference Creation:

    ref := Reference{
        Name:     name,
        rootPath: rootPath,
    }
    • Creates a new Reference with the provided name
    • Stores the root path for potential future operations
  4. Reference Type Determination:

    text := strings.TrimSpace(string(content))
    if strings.HasPrefix(text, "ref: ") {
        ref.Type = RefTypeSymbolic
        ref.Target = strings.TrimPrefix(text, "ref: ")
    } else {
        ref.Type = RefTypeCommit
        ref.Target = text
    }
    • Trims whitespace from the content
    • Checks if it starts with “ref: ” (indicating a symbolic reference)
    • Sets the type and target accordingly
    • For symbolic refs, the target is another reference path
    • For commit refs, the target is a commit hash

Implementing the ReadHead Function

The ReadHead function is a convenience wrapper around ReadRef specifically for reading the HEAD reference:

func ReadHead(rootPath string) (Reference, error) {
	return ReadRef(rootPath, HeadFile)
}

This simple implementation:

  1. Calls ReadRef with the HEAD file constant
  2. Returns the result directly
  3. Makes the code more readable where HEAD is specifically needed

Implementing the UpdateRef Function

The UpdateRef function is responsible for writing a reference to disk, handling both symbolic and direct references:

func UpdateRef(rootPath, name, target string, isSymbolic bool) error {
	fullPath := filepath.Join(rootPath, config.GitDirName, name)

	var content string
	if isSymbolic {
		content = "ref: " + target + "\n"
	} else {
		content = target + "\n"
	}
	if err := os.MkdirAll(filepath.Dir(fullPath), 0755); err != nil {
		return fmt.Errorf("failed to create directories for %s: %v", name, err)
	}
	if err := os.WriteFile(fullPath, []byte(content), 0644); err != nil {
		return fmt.Errorf("failed to write reference %s: %v", name, err)
	}
	return nil
}

Let’s analyze this implementation:

  1. Path Construction:

    fullPath := filepath.Join(rootPath, config.GitDirName, name)
    • Builds the full path to the reference file using the configured Git directory name
    • Works for any reference within the repository
  2. Content Formatting:

    var content string
    if isSymbolic {
        content = "ref: " + target + "\n"
    } else {
        content = target + "\n"
    }
    • Formats the content based on whether it’s symbolic or direct
    • Symbolic references get the “ref: ” prefix
    • Direct references contain just the commit hash
    • Both include a trailing newline for Git compatibility
  3. Directory Creation:

    if err := os.MkdirAll(filepath.Dir(fullPath), 0755); err != nil {
        return fmt.Errorf("failed to create directories for %s: %v", name, err)
    }
    • Ensures the directory structure exists
    • Creates any missing parent directories
    • Needed when the parent directory (for example refs/heads/) does not exist yet
  4. File Writing:

    if err := os.WriteFile(fullPath, []byte(content), 0644); err != nil {
        return fmt.Errorf("failed to write reference %s: %v", name, err)
    }
    • Writes the content to the file
    • Uses mode 0644, so the owner can read and write the file and everyone else can only read it
    • Returns an error if the write fails
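
One limitation of writing the file in place is that a crash in the middle of the write can leave a truncated reference behind. A common alternative, shown here only as a sketch and not as what UpdateRef does, is to write to a temporary file in the same directory and rename it over the target. The helper names writeFileAtomic and demoAtomicWrite are assumptions for this example, and the all-or-nothing behavior of the rename is a property of POSIX file systems.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// writeFileAtomic writes data to a temporary file next to path and then
// renames it into place, so readers see either the old or the new content.
func writeFileAtomic(path string, data []byte, perm os.FileMode) error {
	tmp, err := os.CreateTemp(filepath.Dir(path), ".tmp-ref-*")
	if err != nil {
		return err
	}
	tmpName := tmp.Name()
	// Best-effort cleanup. After a successful rename the temporary name
	// no longer exists, so this call simply fails and is ignored.
	defer os.Remove(tmpName)

	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Chmod(perm); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmpName, path)
}

// demoAtomicWrite writes a file twice and returns its final content.
func demoAtomicWrite() (string, error) {
	dir, err := os.MkdirTemp("", "atomic-demo")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "main")
	if err := writeFileAtomic(path, []byte("old\n"), 0o644); err != nil {
		return "", err
	}
	if err := writeFileAtomic(path, []byte("new\n"), 0o644); err != nil {
		return "", err
	}
	data, err := os.ReadFile(path)
	return string(data), err
}

func main() {
	content, err := demoAtomicWrite()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(content)
}
```

For a learning project the direct os.WriteFile call is perfectly adequate; the sketch only shows where a more robust implementation would differ.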

Implementing the WriteHead Function

Similar to ReadHead, the WriteHead function is a wrapper around UpdateRef for the HEAD reference:

func WriteHead(rootPath, target string, isSymbol bool) error {
	return UpdateRef(rootPath, HeadFile, target, isSymbol)
}

This implementation:

  1. Calls UpdateRef with the HEAD file constant
  2. Passes through the target and the isSymbol flag (UpdateRef’s isSymbolic parameter)
  3. Makes code more readable when specifically updating HEAD

Implementing the CreateBranch Function

The CreateBranch function creates a new branch pointing to a specific commit:

func CreateBranch(rootPath, name, commitHash string) error {
	if strings.Contains(name, "/") {
		return fmt.Errorf("branch name cannot contain slashes")
	}
	if name == "" {
		return fmt.Errorf("branch name cannot be empty")
	}
	// "." and ".." would resolve to refs/heads itself or to its parent.
	if name == "." || name == ".." {
		return fmt.Errorf("invalid branch name %q", name)
	}

	branchRef := filepath.Join("refs", "heads", name)
	if _, err := ReadRef(rootPath, branchRef); err == nil {
		return fmt.Errorf("branch %s already exists", name)
	}
	return UpdateRef(rootPath, branchRef, commitHash, false)
}

Let’s break down this implementation:

  1. Branch Name Validation:

    if strings.Contains(name, "/") {
        return fmt.Errorf("branch name cannot contain slashes")
    }
    if name == "" {
        return fmt.Errorf("branch name cannot be empty")
    }
    if name == "." || name == ".." {
        return fmt.Errorf("invalid branch name %q", name)
    }
    • Prevents slashes in branch names (which would create subdirectories)
    • Ensures the branch name is not empty
    • Rejects “.” and “..”, which would resolve to refs/heads/ itself or to its parent directory
    • Together, these checks keep a branch name from escaping refs/heads/ (for example via ../../HEAD) and overwriting other files in the Git directory
  2. Construct Branch Reference Path:

    branchRef := filepath.Join("refs", "heads", name)
    • Creates the standard path for a branch reference
    • Follows Git’s convention of storing branches in refs/heads/
  3. Check for Existing Branch:

    if _, err := ReadRef(rootPath, branchRef); err == nil {
        return fmt.Errorf("branch %s already exists", name)
    }
    • Attempts to read the branch reference
    • If successful (no error), branch already exists
    • Prevents accidentally overwriting existing branches
  4. Create the Branch:

    return UpdateRef(rootPath, branchRef, commitHash, false)
    • Uses UpdateRef to write the branch reference
    • Sets isSymbolic to false since branches point directly to commits
    • The commit hash becomes the target of the reference

Implementing the DeleteBranch Function

The DeleteBranch function removes a branch reference, but only if it’s not the current branch:

func DeleteBranch(rootPath, name string) error {
	// Ref names always use forward slashes, so build the name by hand:
	// filepath.Join would produce backslashes on Windows, and the
	// comparison with HEAD's target below would never match.
	branchRef := HeadsDir + "/" + name
	_, err := ReadRef(rootPath, branchRef)
	if err != nil {
		return fmt.Errorf("branch %s does not exist", name)
	}
	head, err := ReadHead(rootPath)
	if err != nil {
		return fmt.Errorf("failed to read head: %v", err)
	}

	if head.Type == RefTypeSymbolic && head.Target == branchRef {
		return fmt.Errorf("cannot delete current branch %s", name)
	}
	branchPath := filepath.Join(rootPath, config.GitDirName, branchRef)
	if err := os.Remove(branchPath); err != nil {
		return fmt.Errorf("failed to delete branch %s: %v", name, err)
	}
	return nil
}

Let’s analyze this implementation:

  1. Construct Branch Reference Path:

    branchRef := HeadsDir + "/" + name
    • Builds the reference name refs/heads/<name>
    • Uses a literal forward slash rather than filepath.Join, because this value is later compared with HEAD’s target, which always uses forward slashes
  2. Check Branch Exists:

    _, err := ReadRef(rootPath, branchRef)
    if err != nil {
        return fmt.Errorf("branch %s does not exist", name)
    }
    • Attempts to read the branch reference
    • Returns an error if the branch doesn’t exist
    • Prevents attempts to delete non-existent branches
  3. Check if Current Branch:

    head, err := ReadHead(rootPath)
    if err != nil {
        return fmt.Errorf("failed to read head: %v", err)
    }
    
    if head.Type == RefTypeSymbolic && head.Target == branchRef {
        return fmt.Errorf("cannot delete current branch %s", name)
    }
    • Reads the HEAD reference
    • Checks if HEAD is a symbolic reference pointing to this branch
    • Prevents deletion of the current branch, which would leave HEAD dangling
  4. Delete the Branch:

    branchPath := filepath.Join(rootPath, config.GitDirName, branchRef)
    if err := os.Remove(branchPath); err != nil {
        return fmt.Errorf("failed to delete branch %s: %v", name, err)
    }
    • Constructs the full path to the branch file using the configured Git directory name
    • Removes the file
    • Returns an error if deletion fails
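
The current-branch check relies on interpreting the content of HEAD. As a standalone sketch of that interpretation, the hypothetical helper below (currentBranch and headsPrefix are names invented here) takes the raw text of a HEAD file and reports which branch is checked out, if any:

```go
package main

import (
	"fmt"
	"strings"
)

const headsPrefix = "refs/heads/"

// currentBranch inspects the raw content of a HEAD file. It returns the
// branch name and true when HEAD is a symbolic ref under refs/heads/, or
// "" and false when HEAD is detached or points somewhere else.
func currentBranch(headContent string) (string, bool) {
	text := strings.TrimSpace(headContent)
	if !strings.HasPrefix(text, "ref: ") {
		return "", false // detached HEAD: the content is a commit hash
	}
	target := strings.TrimPrefix(text, "ref: ")
	if !strings.HasPrefix(target, headsPrefix) {
		return "", false
	}
	return strings.TrimPrefix(target, headsPrefix), true
}

func main() {
	name, ok := currentBranch("ref: refs/heads/main\n")
	fmt.Println(name, ok) // main true

	name, ok = currentBranch("1234567890123456789012345678901234567890\n")
	fmt.Printf("%q %v\n", name, ok) // "" false
}
```

With a detached HEAD there is no current branch, which is why test 2.3 expects that any branch can be deleted in that state.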

Implementing the ListBranches Function

Finally, the ListBranches function retrieves all branches in the repository:

func ListBranches(rootPath string) ([]string, error) {
	headsDir := filepath.Join(rootPath, config.GitDirName, "refs", "heads")

	files, err := os.ReadDir(headsDir)
	if err != nil {
		return nil, fmt.Errorf("failed to read refs directory: %v", err)
	}

	var branches []string
	for _, file := range files {
		if !file.IsDir() {
			branches = append(branches, file.Name())
		}
	}

	return branches, nil
}

Let’s break down this implementation:

  1. Find the Branches Directory:

    headsDir := filepath.Join(rootPath, config.GitDirName, "refs", "heads")
    • Constructs the path to the directory containing branch references using the configured Git directory name
    • This follows Git’s standard location for branches
  2. Read the Directory Contents:

    files, err := os.ReadDir(headsDir)
    if err != nil {
        return nil, fmt.Errorf("failed to read refs directory: %v", err)
    }
    • Reads all files and directories in the heads directory
    • Returns an error if the directory can’t be read
  3. Extract Branch Names:

    var branches []string
    for _, file := range files {
        if !file.IsDir() {
            branches = append(branches, file.Name())
        }
    }
    • Iterates through all directory entries
    • Skips subdirectories and keeps every other entry
    • Adds each file name to the branches list
    • File names correspond directly to branch names
  4. Return the List:

    return branches, nil
    • Returns the list of branch names
    • Returns a nil slice (length 0) if no branches exist; a missing refs/heads directory is reported as an error instead
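
A caller will usually want to display this list the way git branch does, with the current branch marked. The sketch below shows one way to do that; formatBranches is a hypothetical helper, and the “* ” marker is chosen to mimic Git’s output.

```go
package main

import (
	"fmt"
	"sort"
)

// formatBranches returns one line per branch, sorted by name, with "* "
// in front of the current branch and two spaces in front of the others.
func formatBranches(branches []string, current string) []string {
	sorted := append([]string(nil), branches...) // copy; leave the input untouched
	sort.Strings(sorted)

	lines := make([]string, 0, len(sorted))
	for _, b := range sorted {
		marker := "  "
		if b == current {
			marker = "* "
		}
		lines = append(lines, marker+b)
	}
	return lines
}

func main() {
	for _, line := range formatBranches([]string{"main", "dev", "feature"}, "main") {
		fmt.Println(line)
	}
}
```

Passing an empty string as current, for example when HEAD is detached, leaves every branch unmarked.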

Testing Our Implementation

Now that we’ve implemented all the components of our references system, let’s test it to ensure everything works as expected:

go test ./internal/refs

If all tests pass, congratulations! You’ve successfully implemented Git’s references system. We now have all the functionality in place for the final chapter.

Summary

In this chapter, we’ve implemented Git’s references system, which provides human-friendly names for commits and enables branch management. Here’s what we’ve accomplished:

  1. Reference Reading and Writing: We’ve implemented functions to read and write both symbolic references (like HEAD) and direct references (like branches).

  2. Branch Management: Our implementation supports creating, listing, and deleting branches, with proper validation and safety checks.

  3. HEAD Tracking: We can track and update the HEAD reference, which is essential for knowing the current state of the repository.

  4. Error Handling: Our implementation includes comprehensive error handling for cases like missing references, invalid branch names, and attempts to delete the current branch.

These components work together to create a references system that makes our Git implementation practical for everyday use. Instead of working with cryptic SHA-1 hashes, users can now interact with friendly branch names.

What’s Next?

In the next chapter, we’ll implement command-line functionality for our Git clone, building on top of all the components we’ve created so far.

We’ll create commands for:

  1. init: Initialize a new repository
  2. add: Add files to the staging area
  3. commit: Create a new commit
  4. branch: List, create, and delete branches
  5. checkout: Switch between branches
  6. log: View commit history

These commands will provide a user-friendly interface to our Git implementation, making it usable for actual version control tasks.

We’ll also tie together all the components we’ve built throughout this series, showing how they work together to create a functional version control system.

The command-line interface is the final piece that turns our implementation from a collection of libraries into a usable tool, providing a familiar experience for Git users.