BT

Facilitating the Spread of Knowledge and Innovation in Professional Software Development

Write for InfoQ

Topics

Choose your language

InfoQ Homepage Articles Modeling Complex Data Structure in Golang Using Pointers, References, and Reverse Indices

Modeling Complex Data Structure in Golang Using Pointers, References, and Reverse Indices

Key Takeaways

  • Go pointers and references are a fundamental feature in the Go language. Using them correctly can help you build efficient systems.
  • Go’s explicit memory management via pointers and references gives developers direct control over memory usage, enabling optimized and predictable performance in memory-intensive applications.
  • Writing complex data structures in Go can help developers better understand the principles of pointers and references.
  • Choosing the right programming language to solve your problem is an important step to accomplish your goals.
  • Graphs can help you build granular permission systems.

Understanding pointers and references in Go can be challenging for many developers, especially those new to low-level programming languages. As someone from such a language (Python), I also found the learning curve hard. A project I worked on recently, where I had to utilize pointers to implement our product, helped me understand them better.

In this article, I’ll walk you through a program that leverages Go pointers and references to solve a real-world problem. Learning from an actual use case is a method that has always helped me more than abstract explanations, and I hope you will experience the same when reading this article.

Introduction to Pointers and References in Go

Developers transitioning to Go from languages without explicit pointers and references often find this concept challenging. Indeed, languages like Python or JavaScript handle memory management automatically, abstracting them away.

Before we dive into the hands-on part of this story, let’s have a quick walk-through of the syntax of pointers and references in Go.

Basic Syntax

In Go, a pointer is a type that stores a memory address, allowing direct access and manipulation of the data located at that address. The syntax for declaring a pointer uses the * mark. For example, var myPointer *int declares a variable whose type is a pointer to an integer.

Getting a reference represents the action of reading the memory address of a variable. The outcome of this action is a pointer.

Getting references is accomplished using the & operator. For example, var myPointer *int = &myInt declares a variable whose type is a pointer to an integer and initializes it with the reference (memory address) of the variable myInt. In this example, changing the value of myInt with myInt = 2 would also change the underlying value of myPointer. The reason is that myPointer doesn’t store the actual value, but the reference to the myInt variable. Taking the above example, trying to check the equality of myPointer and myInt will result in false because myPointer doesn’t hold the actual integer but only the memory reference.

If you find it hard to understand this level of complexity, it can get worse. Let’s consider an example that you can find on many websites that teach Go - how to get the concrete value from a pointer. To achieve that, we can use the * operator. For example, *myPointer == myInt retrieves the actual concrete value that myPointer references and checks its equality against myInt variable. This can also alter the value stored at that memory address; for example, *myPointer = 2 will change the concrete value to 2. This change will also affect the variable myInt.

It’s essential to understand the difference between using the * operator in the variable declaration, e.g., var myPointer *int, and using it to get the concrete value, e.g., var concrete int = *myPointer. When you use it in the variable declaration or as a function’s return type, the * operator refers to the type of a pointer (reference) used for type checking and compilation. When you use it to get the concrete value of a variable, you refer to the referenced value of a variable the program currently holds in memory.

Code example

Another way to deal with the complexity of understanding pointers and references is to write code demonstrating their behavior.
To illustrate this, let’s create a basic program based on the previous examples:

package main

import "fmt"

func main() {
    var myInt int = 1
    var myPointer *int = &myInt // myPointer holds the memory address of myInt
    fmt.Println("Value of myInt: ", myInt)
    fmt.Println("Address of myInt:", &myInt)
    fmt.Println("Value of myPointer ( pointer ): ", myPointer)
    fmt.Println("Value at address of myPointer ( concrete ): ", *myPointer)
    
    myInt = 2
    fmt.Println("Value of myInt: ", myInt) // This will print 2
    fmt.Println("Value of myPointer ( pointer ): ", myPointer) // This will also print 2 because myPointer references myInt

    *myPointer = 1 // myPointer is dereferenced to change the value at the address it holds
    fmt.Println("New Value of myInt: ", myInt)
    fmt.Println("Address of myInt:", &myInt)
    fmt.Println("Value of myPointer ( pointer ): ", myPointer)
    fmt.Println("Value at address of myPointer ( concrete ): ", *myPointer)
}

The code above will print the following output to the console:

Value of myInt:  1
Address of myInt: 0x140000b2008
Value of myPointer ( pointer ):  0x140000b2008
Value at address of myPointer ( concrete ):  1
New Value of myInt:  2
Address of myInt: 0x140000b2008
Value of myPointer ( pointer ):  0x140000b2008
Value at address of myPointer ( concrete ):  2

Our Use Case: Graph-Based Permissions

What helped me the most in learning new concepts was implementing features that use them extensively. In our present case, at the company I worked with, there was a need for a permissions system that would allow users to run queries based on relationships among resources in a third-party system. This led me to implement a graph-based permission system using pointers extensively.

What are Graph-Based Permissions?

The most intuitive way to explain graph-based permissions is through Google Drive, whose permission system is documented here. The model Google Drive uses to store, retrieve, and edit documents has several special permissions requirements, including:

  • Object Hierarchy: A Google Drive account can contain many folders and files. Folders can also contain other files and folders.
  • Direct File access: A user can get a role assignment directly on a file, and this role gives them a limited set of permissions for the particular file. For example, a single user assigned as a viewer to a file can view it, while another user assigned as an editor will be able to view and edit this file.
  • Folder-level access: A user can be granted access to a folder; this grant will also propagate recursively to all files and folders it contains.
  • Cross-application access: If you have ever tried to share a Drive document on Gmail, you are probably familiar with the pop-up asking you to grant permission to several email recipients. This functionality requires understanding permission levels across different apps; for example, Gmail is aware of the Google Drive permission system, which makes it possible to show the pop-up and provide a better UX.

The model above also ensures that the permission model is both scalable and flexible, and as we will see later on, the requirements mentioned are a perfect fit for a graph.

Out-of-the-box solutions

Several tools and databases help use graphs for permissions, such as Graph DB, Google Zanzibar implementations based on Graph DB or SpiceDB. In our case, these solutions weren’t fit for us as we had a strict requirement to mix conditional rules with the graph permissions model. We had to extend our Open Policy Agent (OPA) capabilities to implement such a permission model. Using a policy engine such as OPA, we could mix a rule-based permission model with a graph-based permission model.

Graphs in Go

When looking for a way to implement our permission model using a policy engine, we found that OPA supports Go plugins, allowing us to write a Go plugin to carry through graph-based operations on top of the permission data stored in the OPA.

When designing our solution and deciding how we wanted to build our Go plugin, we realized that Go pointers and references could help us create an efficient graph, reducing the number of operations required to build it and the overall memory used by the plugin.

Graphs, Pointers, and References

Writing complex, tailored-made data structures often involves a deep understanding of the problem and the capabilities of the language. The following section will walk you through our experience creating such a data structure using Go.

Graph Building Blocks

Every programming language offers a set of basic data structures, such as arrays, lists, and maps, which serve as the building blocks for handling and manipulating data. However, when dealing with more complex scenarios, these basic structures aren’t always enough. In such cases, developers must use different data structures tailored to their specific needs, such as graphs, trees, stacks, etc.

There are many types of graphs, all of which are structured from two main elements:

  • Node: a point or vertex that can store data and connect to other nodes via edges. We implemented this using a struct with a field of type map to reference the edges to the connecting nodes.
  • Edge: a connection or link between two nodes, representing a relationship or pathway between them. In some graphs, an edge also has a name. We implemented that as a map whose key is the name of the edge and the value is the referenced node.

The Most Classic Interview Question

I was thrilled to finally implement a task typically regarded as a classic interview question. I never thought I would encounter such a task in a real-world scenario. In our use case, we had to build a graph data structure with subjects, relations, and objects. To do this, we used many basic data structures in Go, such as structs, slices, maps, and strings.

This approach allowed me to design a highly efficient and flexible system tailored to my use case. In the following examples, I will demonstrate that concepts often considered purely theoretical in a classic interview question can have practical and impactful implementations inside real-life production applications.

Challenges when performing Graph Lookups

One of the main challenges with graphs is implementing "graph queries", especially when you want to track the path of permissions the user is granted. In many ways, a graph is similar to a tree, but the main difference is that a graph doesn’t have a root like a tree and might contain loops. This makes lookup queries much more complex, with the main complexity being looking up the graph efficiently while avoiding infinite recursion in case any loop is found.

What are Graph Lookups or Reverse Indices?

A very common question in a graph is which nodes are connected to a given node x. Answering this question requires us to include reverse indices from node x to every other node x is connected to. For instance, say we have a Folder:rnd node connected to node File:architecture.pdf via a relation called parent. Storing its children in Folder:rnd is not sufficient because, if we ask, "Who is the parent of File:architecture.pdf ?" we would have to iterate through all the nodes in the graph. One way to solve this is using reverse links, which are also called "reverse indices", and it is a commonly used technique in graph databases and graph-based systems.

Our Unique Graph Needs

Our graph data structure has many requirements:

  • Reverse indices - Perform complex graph lookups 
  • Minimal Memory Footprint - Avoid large memory footprints to minimize resource consumption 
  • Performance Aware - Create efficient queries on the graph to achieve a highly performant permission system

Using Pointers and References for Our Needs

In a naive implementation of such a custom-made graph structure, I would write several graph utility functions to help me meet the abovementioned requirements. In the following code, I will show you how I used pointers and references to build the graph data structure efficiently, perform quick lookups, and consume as little memory as possible.

1. Let’s start by implementing our two fundamental Node and Edge data structures: 

// Node Implementation
// is a node in a graph
type Node struct {
    // data is the actual data that this node represents
    data string
    // subjectRelationships are the edges / relationships this node is related to as the subject
    // the key in the map is the edge / relation key that the node is related by
    // and the value is the actual node
    subjectRelationships Edges
    // objectRelationships are the edges / relationships this node is related to as the object
    // the key in the map is the edge / relation key that the node is related by
    // and the value is the actual node
    objectRelationships Edges
}

// Edge Implementation
// is a map of edge / relation keys to slice node references
type Edges map[string]LinkedNodes
type LinkedNodes []*Node

2. Continuing with the structs’ implementations, let’s create constructors and utility functions that will help us going forward: 

// Node functions
// NewNode creates a new node in the graph without any edges / relationships
func NewNode(data string) *Node {
    node := &Node{
        data:                 data,
        subjectRelationships: NewEdges(),
        objectRelationships:  NewEdges(),
    }

    return node
}

func (n *Node) String() string {
    return n.data
}

func (n *Node) AddSubjectRelationship(relation string, node *Node) {
    n.subjectRelationships.AddEdge(relation, node)
    // We also need to add the relationship to the other node as an object relationship
    // so that we can traverse the graph in both directions
    node.objectRelationships.AddEdge(relation, n)
}

func (n *Node) AddObjectRelationship(relation string, node *Node) {
    n.objectRelationships.AddEdge(relation, node)
    // We also need to add the relationship to the other node as an object relationship
    // so that we can traverse the graph in both directions
    node.subjectRelationships.AddEdge(relation, n)
}

func (n *Node) Merge(node *Node) {
    for relation, nodes := range node.subjectRelationships {
        // iterate on all the subject edges
        for _, subjectNode := range nodes {
            // iterate on all the nodes and add them
            n.AddSubjectRelationship(relation, subjectNode)
        }
    }
    for relation, nodes := range node.objectRelationships {
        // iterate on all the object edges
        for _, objectNode := range nodes {
            // iterate on all the object nodes and add them
            n.AddObjectRelationship(relation, objectNode)
        }
    }
    // change the node reference to the one it was merged into
    *node = *n
}

// Edge Implementation
// NewRelationships creates a new edges map
func NewEdges() Edges {
    return make(Edges)
}

func NewLinkedNodes() LinkedNodes {
    return make(LinkedNodes, 0)
}

func NewLinkedNodesWithNode(node *Node) LinkedNodes {
    newLinkedNodes := NewLinkedNodes()
    return append(newLinkedNodes, node)
}

func (r *LinkedNodes) AddNode(node *Node) {
    // Use append to add a new node to the slice
    // Since r is a pointer, it modifies the original slice directly
    *r = append(*r, node)
}

// AddRelationship adds a relationship to the linkedNodes slice
func (r Edges) AddEdge(edge string, node *Node) {
    linkedNodes, isFound := r[edge]
    if !isFound {
        // if we didn't find this edge we create a new one
        // in the map with the current node
        linkedNodes = NewLinkedNodesWithNode(node)
    } else {
        // if we found, it means we already have a node related by an edge
        // with this name so we simply add this node to the slice of
        // nodes related with this edge
        linkedNodes.AddNode(node)
    }
    // finally update the linked nodes for this edge
    r[edge] = linkedNodes
}

3. After we have defined the fundamental elements of our graph, let’s implement the actual Graph data structure.

 // Graph is a map of all the nodes in the graph represented by
// their string representation
type Graph map[string]*Node

// NewGraphPointer creates a new graph pointer
func NewGraph() Graph {
    return make(Graph)
}

// AddNode adds a node to the graph pointer
func (g Graph) AddNode(node *Node) {
    existingNode, exists := g[node.String()]
    if exists {
        // Merge the relationships
        existingNode.Merge(node)
    } else {
        g[node.String()] = node
    }
}

func (g Graph) IsNodeInGraph(node *Node) bool {
    return g.isNodeInGraphString(node.String())
}

func (g Graph) isNodeInGraphString(node string) bool {
    _, exists := g[node]
    return exists
}

func (g Graph) GetNodeByString(nodeString string) *Node {
    return g[nodeString]
}

func (g Graph) NewNodeInGraph(data string) *Node {
    if node := g.GetNodeByString(data); node != nil {
        return node
    }
    node := NewNode(data)
    g.AddNode(node)
    return node
}

4.    Once we finished defining our structs and constructors, the last step was to write the functions to query the graph. In our example, we have two main use cases:

  1. Which node is linked to this node (as an object)?
  2. Which node is linked from this node (as a subject)?

We can implement both of them very easily thanks to the way we structured our data:

// lookUpNode is a private function to help iterate recursively all the nodes
// by a specific direction subject / object
// this recursive can also be done concurrently for optimizations
func (g Graph) lookUpNode(node string, subject bool, visitedNodes map[string]*Node) error {
    if !g.isNodeInGraphString(node) {
        return  fmt.Errorf("node not found")
    }
    var directEdges Edges
    if subject {
        // iterate over the subject edges / relationships 
        directEdges = g[node].subjectRelationships
    } else {
        // iterate over the object edges / relationships 
        directEdges = g[node].objectRelationships
    }
    for _,linkedNodes := range directEdges {
        // iterate all the edges names
        for _, linkedNodeRef := range linkedNodes {
            // iterate all the nodes "by edge name"
            if _, isVisited := visitedNodes[linkedNodeRef.String()]; !isVisited {
                // if we haven't visited this node in the recursive yet
                // we want to add it to the visitedNodes map
                // this will be the same map because maps are passed by ref
                visitedNodes[linkedNodeRef.String()] = linkedNodeRef
                // after adding this node we can continue in the recursive to iterate this node
                if err := g.lookUpNode(linkedNodeRef.String(), subject, visitedNodes); err != nil {
                    return err
                }
            }
        }
    }
    return nil
}

func (g Graph) LookUpNodeAsSubject(node string) (map[string]*Node, error) {
    // create a map of the visited nodes in the recursive
    visitedNodes := make(map[string]*Node)
    // lookup the node as subject
    if err := g.lookUpNode(node, true, visitedNodes); err != nil {
        return nil, err
    }
    return visitedNodes, nil
}

func (g Graph) LookUpNodeAsObject(node string) (map[string]*Node, error) {
    // create a map of the visited nodes in the recursive
    visitedNodes := make(map[string]*Node)
    // lookup the node as subject
    if err := g.lookUpNode(node, false, visitedNodes); err != nil {
        return nil, err
    }
    return visitedNodes, nil
}

As you can see in these code examples, using pointers and references not only helped me to create a memory-efficient and cleaner codebase but also provided a more reliable way to trust the performance of the graph. Let’s dive into the benefits it brings to our graph permissions system.

Graph Implementations with and Without Pointers

As you can see, in our implementation, we reused the pointers and references a lot; it helped us save many lines of code and, more importantly, optimize performance and memory consumption.

Two quick examples of the optimizations it helped us with are:

  • Memory Optimization - Helped us maintain only references when storing bidirectional indices. Without access to memory pointers, this would end with bloated memory usage.
  • Cascading Effect - When deleting a node, we had to delete every reference to it; having the stored reverse indices combined with memory pointers, we ensured there wasn’t any reference to the deleted object very easily.

Besides those and many other issues that might come up without direct memory access, the code to perform such a thing wouldn’t look as clean as it is when using pointers.

Conclusion

When faced with a problem or complex task, assessing the project requirements and considering the programming language that best suits your needs is essential. Many developers today embrace a language-agnostic approach, valuing the ability to choose the correct language for the task as a critical skill.

This article highlighted the practical use of Go’s pointers and references in developing an efficient, scalable graph-based permissions system. Utilizing these low-level constructs streamlined our code and boosted system performance. This case illustrates the importance of understanding your programming tools and making informed language choices.

About the Author

Rate this Article

Adoption
Style

BT