Wrap-up

Hello, all.  Yes, I’m still sick.  Yes, I totally fsck’d over my Linux partition.  No, I don’t currently have Rust installed.  No, I have not yet made a new pull request with my recent and totally awesome changes.  I’ve just started a new job, so I’ve been busy!  The insertion method and underlying helper methods now work, and I feel that’s a great accomplishment for someone who started from square one and essentially had to start over again suddenly at the end.  I am proud of myself for creating something relevant that, even if I don’t finish it, someone can!

Posted in Uncategorized | Tagged , , , , | Leave a comment

The Work Week!

Hello!

Yes. I know. I said I’d post every day. However, I’ve been sick in a strange and debilitating way for about three weeks, and I have been feeling rather run down lately.  I figured that I’d post a quick update today, though, to talk about what I’ve done recently, especially since my meeting with colleague Niko today was particularly interesting!

Niko and I sat down after lunch and discussed my data structure.  He suggested an unexpectedly slick and elegant way of representing a B-tree, and I have more or less started from absolute scratch.  This felt bad initially–like, what happened to all the progress back-and-forth that I’ve been making since December?  However, I think I have substantially and suddenly enhanced my understanding of How Rust Works (and How Rust Should Look), and I am anticipating a great deal of progress to follow very soon.  I regret to say that I have not yet pushed my changes to my GitHub repo, but!  I hope by the end of the week I will have something new.

Posted in Uncategorized | Leave a comment

It’s almost over! D:”

Here’s an update on my project with Mozilla Rust:

I’ve spent the past several weeks experimenting with the concepts of references and named lifetime parameters in Rust.  This, I believe, is my key to avoiding the .clone() operator.  Borrowed pointers are one way Rust ensures memory safety.  The compiler will only accept a borrowed pointer to memory if it can guarantee that the memory being borrowed will not be changed in any way throughout the duration of the borrow.  This protects the user from such nuisances as dangling pointers, which you might encounter in C or C++.  In my code, I need to return these references (another name for borrowed pointers), which complicates the issue of working with them.  In order to continue to assure memory safety, I use named lifetimes to state explicitly to the compiler the span for which the reference will be valid. This ends up working because, typically (and this is the case in my code as well) one only returns references if they are derived from the function’s parameters.  I am continuing to enhance my understanding of these concepts as I work with them!  So perhaps someday I can give an even better explanation–maybe I’ll even write some documentation!

I am heading across the country soon to conclude my internship in the company of my collaborators.  I will hope to have daily (?!) blog posts discussing my role at the work week and what I learned there!  I’m starting to reflect on my experience with this internship and I am really taken aback by how much I learned in the process.  I am now familiar on a deep level with a language I barely knew at the start.  Even though, it looks like, at this rate, my library will not be finished by the end of the internship, I have left a well-designed framework for whoever steps up to the challenge of creating a b-tree in Rust next…which might still be me!

Posted in Uncategorized | Tagged , , , , , | Leave a comment

A brief update on the tree

It’s been a long journey with the “add method” (now called insert), and it’s still not over.  My pull request for the original implementation of insert was finally accepted, but now it’s time to improve upon it.  There was some backlash while I was trying to get my pull request to go through.  Some developers took issue with the idea of a first attempt that uses clone for support, so I decided to take the time to improve the current implementation rather than add new features, but it has not been easy.  I talked with my mentor about the problems I was having and he told me that my project was pushing the bounds of what is currently possible in Rust–that’s a great feeling!  I’m glad to be working on such a challenging problem, and I’m also glad that when I have questions, sometimes they even manage to stump my mentor.

The next attempt on the insert method will use named lifetime pointers, which are an interesting and uncommon feature of a programming language.  Once I understand them a little better, I will share my knowledge and post an explanation of named lifetime pointers.

Time to experiment with named lifetimes!  Expect another update soon.

Posted in Uncategorized | Leave a comment

How the insertion method works, never mind the .clone()s

After a major setback, I got to work on an add method for my B-tree that would use Rust’s clone() functionality extensively to get around the problems I was having with borrowing and lifetimes.  Of course, this is not a permanent solution, but rather a way to get ideas down on the screen more immediately.

Here’s how it works:

Let’s first examine the top level–

    pub fn add(mut self, k: K, v: V) -> BTree<K, V> {
        let (a, b) = self.root.clone().add(k, v, self.upper_bound.clone()); //Each lower level returns a tuple containing a Branch
        if b {                                                                                             //and a boolean to confirm that a change was made.
            self.root = Node::new_branch(a.clone().elts, a.clone().rightmost_child);
        }
        self
    }

We first call the add method on the root, which is a Node.  If the result of this addition comes back positive (it returns “true”, which is stored in the “b” part of the tuple), we reassign the root to the Branch that results from adding.  The root is the only part of the BTree that doesn’t have to adhere to a lower bound for this reason, as we only return a Branch with one actual element (and a left and right child) every time we return from adding.

Let’s see what happens at the Node level:

    fn add(mut self, k: K, v: V, ub: uint) -> (Branch<K, V>, bool) {
        match self {
            LeafNode(leaf) => leaf.add(k, v, ub),
            BranchNode(branch) => branch.add(k, v, ub)
        }
    }

A Node is an enum, which means it could have several variations on its type.  Here, it is either a LeafNode or a BranchNode.  Each contain either a Leaf or a Branch, which holds an owned vector of elements (much like a resizeable array), and maybe (in the case of a Branch), a far-right child–a Node–which has values greater than all elements in the vector.  This part of the add method tells you which way to go next, depending on the type of Node you’re dealing with.

Let’s first check out the Branch’s add method, as it is generally the next stop when you’re working with a well-populated B-tree full of elements.

fn add(mut self, k: K, v: V, ub: uint) -> (Branch<K, V>, bool) {
        let mut new_branch = Branch::new_empty(ub.clone()); 
        let mut outcome = false;
        let mut index = -1;
        for i in range(0, self.elts.len()) {
            let order = k.cmp(&self.elts[i].key);
            match order {
                Less => {
                    let new_outcome = self.clone().elts[i].left.add(k.clone(), 
                                                                    v.clone(), 
                                                                    ub.clone());
                    new_branch = new_outcome.clone().n0();
                    outcome = new_outcome.n1();
                    index = i;
                    break;
                }
                Equal => {
                    return (new_branch, outcome);
                }
                Greater => {
                    continue;
                }
            }
        }
        if index == -1 {
            let new_outcome = self.clone().rightmost_child.add(k, v, ub);
            new_branch = new_outcome.clone().n0();
            outcome = new_outcome.n1();
        }

———-

As you might guess, this is only about half the code for the Branch’s add method.  We’ll talk about the other half later, after the call to Leaf!  We can’t add at a Branch, so we use this first part of the method to determine which part of the tree to visit next.  We could end up on a Branch or a Leaf, but in either case, we’re making a recursive call to Node::add.  Here, we look through the branches and determine whether to stop and integrate the new element (designated by k and v, a soon-to-be key-value pair) at the next level down in the tree.  If the element is larger than all of the elements in the vector, we keep track of this with the “index” variable and use this to determine if we should pursue the rightmost_child as a location for the new element.

So this might continue on for some time until we find a Leaf!  Let’s check out the full code for Leaf::add.

    fn add(mut self, k: K, v: V, ub: uint) -> (Branch<K, V>, bool) {
        let new_branch = Branch::new_empty(ub.clone());
        let to_add = LeafElt::new(k, v);
        let mut index = self.elts.clone().len();
        for i in range(0, self.elts.clone().len()) {
            let order = to_add.key.cmp(&self.elts[i].key);
            match order {
                Less => {
                    self.elts.insert(i, to_add.clone());
                    index = i;
                    break;
                }
                Equal => {
                    return (new_branch, false);
                }
                Greater => {
                    continue;
                }
            }
        }
        if index == self.elts.clone().len() {
            self.elts.insert(index, to_add.clone());
        }
        if self.elts.len() > ub {
            let midpoint = self.elts.remove(ub / 2);
            let (left_leaf, right_leaf) = self.elts.partition(|le| le.key.cmp(&midpoint.key) == Greater);
            let branch_return = Branch::new(~[BranchElt::new(midpoint.clone().key, 
                                                             midpoint.clone().value,
                                                             ~Node::new_leaf(left_leaf))],
                                            ~Node::new_leaf(right_leaf));
            return (branch_return, true);
        }
        (new_branch, false)
    }

This starts out much like the Branch’s add method where we check to see where in the Node we should go from here.  Instead of making a recursive call, though, it simply smooshes the new LeafElt into the vector of elements, which is called elts.  From there we check and see if the Leaf has exceeded the size of the upper bound.  If so, we pop out the middle element of the vector, create a new Branch with a left child and a right child comprised of either half of the vector, and we return that so the level that called the leaf can see it.

Now let’s return to the Branch.

if outcome {
            let new_elt = new_branch.clone().elts[0];
            for i in range(0, self.elts.len()) {
                let order = new_elt.key.cmp(&self.elts[i].key);
                match order {
                    Less => {
                        continue;
                    }
                    Equal => {
                        return (Branch::new_empty(ub.clone()), false);
                    }
                    Greater => {
                        self.elts.insert(i + 1, new_elt);
                        if i + 2 >= self.elts.len() {
                            self.rightmost_child = new_branch.clone().rightmost_child;
                        }
                        else {
                            self.elts[i + 2].left = new_branch.clone().rightmost_child;
                        }
                        break;
                    }
                }
            }
            if self.elts.len() > ub {
                let midpoint = self.elts.remove(ub / 2);
                let (new_left, new_right) = self.clone().elts.partition(|le| 
                                                                midpoint.key.cmp(&le.key) == Greater);
                new_branch = Branch::new(~[BranchElt::new(midpoint.clone().key,
                                                          midpoint.clone().value,
                                                          ~Node::new_branch(new_left,
                                                                           midpoint.clone().left))],
                                           ~Node::new_branch(new_right, self.clone().rightmost_child));
                return (new_branch, true);
                                         
            }
        }
        (new_branch, outcome)
    }

Here, we ask ourselves if there was any “outcome” from adding: did we actually add a new element into the tree, and do we now need to resize?  If we do need to resize, we will have returned a Branch containing a single element, as you have seen at the Leaf level.  We now search the Branch to find where in the Branch this new element would go, and perform a similar maneuver as with the Leaf: if the vector elts is too large, pop out the middle and return it.  This process continues until either we return false (and then we rocket up the tree back to the BTree level, where we return self), or until we have a new root, which we reassign to the root container in the topmost level, BTree.

Pretty simple, right?  I hope to be able to take out all the clone()s, which ensure that the code passes the borrow-checker–Rust as a language is committed to memory safety, so it doesn’t let you do anything that might be potentially silly with respect to memory.  This means that once an object is used in another application, it can’t be used again.  For example, if I were to call Node::add on self.root, that would be the last time I could use it in that function, unless I use clone(), which makes a new, deep copy of the object, or I am very tricky in ways I have not yet discovered.  There is a whole mess of commented-out code that you don’t see here that represents my efforts to perform this method without using clone and while appeasing the compiler.  Turns out, some of Rust’s handy builtins don’t work exactly the way I thought they would, which is kind of a bummer.  I decided to stop and switch gears for the moment, and above, you see the results.  Pretty nifty!

Posted in Uncategorized | Tagged , , , , , | Leave a comment

Now for the fun stuff: the add method for btree.rs

It’s about time I started on something exciting!  It’s time to implement the add method in my btree library.

Here’s how it works:

Find the appropriate leaf node in which to add the new key-value pair by searching down the tree by keys.  If the key is already found, then no change occurs.

Insert the key into the leaf node.  If the leaf node is larger than the designated upper bound (I often use 4 elements as the upper bound, but the user can specify this in my implementation [a feature which may or may not disappear by the end]), pop out the middle element, turn it into a branch with its leaves on either side being the two remaining halves of the leaf node, and integrate that element into the branch above.

Keep going until you reach the root!

This is a preliminary post to keep me on track as I get my bearings about this add method.  Once it’s completely implemented, expect a detailed code analysis that discusses, in particular, the intricacies of ownership and borrowing in Rust, which were major struggles for me at first.  I still have much to learn, but I hope I can give an adequate explanation of Rust’s idiosyncrasies, which make it extremely safe and reliable w/r/t memory and pointers.

Stay tuned!

Posted in Uncategorized | Tagged , , , , | Leave a comment

pull request pull request pull request!

I looked into some of the resources people gave me for git, and I felt like I had figured things out!  However, that wasn’t enough to stop me from making a mysterious error of unnatural causes.  I did my best to fix it, but my mentor and I had to sit down and step-by-step go through some git commands to figure out what had happened.  Everything seemed to fall into place when we reset –soft back to the newest commit before mine, but we still don’t know why git refused our other attempts to set things right.

But the fact that we had set my commit history to rights (that is, to one with a minimal amount of commits and no history of randomly deleted and resurrected files) meant that I could have my pull request approved!  I got it approved by my mentor, and the Rust build-bot Bors tried to merge it in that night.  Unfortunately, I was asleep when it failed, so I awoke to a list of error messages from Bors.  I wasn’t sure what was up, but everyone assured me that this happens from time to time, even with a successful build.  We retried the merge, and Bors co-operated.

I always get really excited when one of my pull requests gets merged.  It means some of my code made it into a big project!  It feels so cool to contribute.  I know I can’t yet count myself among the Rust greats–you know, those people who submit four new PRs a day and write the big, important features of the language–but it’s awesome to be back in my corner, writing my own little library from the ground up.  It’s nice to see my name on the authors list, get tipped in bitcoins for my work, and tell all my friends that I’m contributing to a major open-source project.

Posted in Uncategorized | Tagged , , , , , | Leave a comment