[haiku-development] Re: Git/Hg: some speed tests

  • From: "Ingo Weinhold" <ingo_weinhold@xxxxxx>
  • To: haiku-development@xxxxxxxxxxxxx
  • Date: Wed, 04 May 2011 20:39:33 +0200

On Wed, 4 May 2011 17:56:33 +0100 Sean Healy <jalopeura@xxxxxxxxxxx> wrote:
> On 04-May-11 16:35, Ingo Weinhold wrote:
> 
> >> And finally, does hg support updating the local repository without
> >> (temporarily) throw away my local changes?
> >> As far as I understand, that takes 3-4 commands with git which I find
> >> very annoying (git stash, git rebase, git stash apply[, git stash
> >> clear]) -- or is that only relevant when working with SVN as a source
> >> repository?
> >
> > You might want to rethink your workflow. Before even starting to work
> > on something a local branch should be created. And before playing with
> > remote repositories local changes should be committed anyway.
> > Uncommitted local changes are so svn. :-)
> 
> Wait, are you saying that the DEFAULT on both git and hg is to overwrite 
> local changes when updating from another branch? And that to get around 
> that, you have to take several other steps? (That is, either committing 
> your changes to your local branch or stashing them; in each case, at 
> least two commands, IIRC.)

I'm not saying that. I'm pretty sure that with git, switching branches locally 
just keeps uncommitted changes on top, and others have already written that 
this is also what happens when pulling from a remote repository. If I 
understood Axel correctly, the issue he described pertains to git-svn.
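
To illustrate the branch-switching behaviour, here is a minimal sketch using a
throwaway repository. The file and branch names (notes.txt, experiment) are
made up for the example. Note that git carries the uncommitted change along
only when it does not conflict with differences between the two branches;
otherwise it refuses to switch.

```shell
#!/bin/sh
set -e

# Throwaway repository in a temporary directory (all names are hypothetical).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.invalid"

echo "first version" > notes.txt
git add notes.txt
git commit -q -m "Initial commit"

# Make a local change, but do not commit it.
echo "work in progress" >> notes.txt

# Create a new branch and switch to it; the uncommitted edit travels along.
git checkout -q -b experiment

git status --short      # notes.txt is still listed as modified
tail -n 1 notes.txt     # the uncommitted line is still there
```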

> So a tool that theoretically provides more flexibility in workflow is 
> dictating a particular workflow? That seems really wrong-headed to me. 
> The workflow that it's dictating (forcing users to commit changes that 
> may not be ready to be committed) also seems wrong-headed.

I wouldn't say the tool dictates a particular workflow, but there are certain 
workflows that work better than others (well, that applies to any tool really). 
You can use an svn-like workflow with git and hg (basically only requiring an 
additional push after a commit), but there are better workflows. Using 
fine-grained local branches gives you nice features almost for free.

> It is my understanding that local branches are available to anyone who 
> knows the address of the machine they're stored on. Is this true, or is 
> there a way to mark your branches private so that nobody else can pull 
> them? In that case, requiring a commit of a local branch isn't quite as 
> wrong-headed, because it doesn't allow the world to get at changes that 
> aren't ready.

I don't know about marking branches private, but I think you're overlooking 
that you usually won't make your local repository publicly available. The local 
repository is a .git or .hg directory in your working copy. The repository 
that others can read would be located on some server (your own or one provided 
by GitHub, Gitorious, ...). Your local branches remain local until you decide 
to push them to your public repository. You can freely decide which changes you 
consider good enough for public consumption and which you don't want anyone to 
see.
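
A minimal sketch of that separation, assuming a bare repository on the local
disk as a stand-in for the public server copy. The remote name "public" and the
branch name "private-idea" are invented for the example.

```shell
#!/bin/sh
set -e

base=$(mktemp -d)

# A bare repository standing in for the public copy on some server.
git init -q --bare "$base/public.git"

# The local repository with its working copy.
git init -q "$base/work"
cd "$base/work"
git config user.name "Example"
git config user.email "example@example.invalid"
git remote add public "$base/public.git"

echo "stable" > file.txt
git add file.txt
git commit -q -m "Ready for others"

# Name of the initial branch ("master" or "main", depending on configuration).
main=$(git symbolic-ref --short HEAD)

# Work that is not meant for anyone else yet.
git checkout -q -b private-idea
echo "half-baked" >> file.txt
git commit -q -a -m "Not ready yet"

# Publish only the main branch.
git push -q public "$main"

# The public repository lists a single branch; private-idea stays local.
git ls-remote --heads public
```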

Generally I'd recommend not to be afraid and to develop as openly as possible. 
Even if something doesn't really work yet or needs cleanup others might find it 
interesting. They might even help you by pointing out issues or possible 
improvements.

> Of course, it also requires you to save everything twice - once in the 
> OS, once by committing to the local branch, which still seems like a 
> waste of time. If nobody can get at it anyway, why should I commit it 
> before I'm ready to let other people have access? If people can get at 
> it, why should I commit it before I'm ready to let other people have
> access?

You don't have to, but, even though you'd have to issue an additional command 
every now and then, using branches makes your life easier. With a large and 
diverse code base like Haiku's, obvious opportunities for using branches arise 
when you work on different components. But even when you're working only on a 
single component it is helpful to create a branch for each new (sub)task. So, 
if you have an implementation idea, first create a branch and start working in 
it. When you're done and happy with it, merge it back.(*) When it turns out to 
be a dead end, scrap it. When you get stuck and first need to work on something 
else before you know how best to continue, or if you want to try an alternative, 
go back and create a new branch. If you get interrupted -- be it because you 
want to test someone else's changes (related or not), because you want to work 
on something else (e.g. fixing an unrelated bug you noticed), or whatever -- 
leave the branch alone and create a new one for your new task, so you can 
simply continue later on just where you left off.
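
The interruption case described above can be sketched with git as follows.
This is only an illustration in a throwaway repository; the branch names
(new-idea, bugfix) and file names are hypothetical, and the work on each
branch is committed before switching away.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.invalid"

echo "base" > component.txt
git add component.txt
git commit -q -m "Initial state"
main=$(git symbolic-ref --short HEAD)   # "master" or "main"

# Start a task on its own branch.
git checkout -q -b new-idea
echo "idea, part 1" > idea.txt
git add idea.txt
git commit -q -m "Start implementing the idea"

# Interrupted: fix an unrelated bug on a separate branch off the main line.
git checkout -q "$main"
git checkout -q -b bugfix
echo "fixed" > fix.txt
git add fix.txt
git commit -q -m "Fix unrelated bug"

# The bug fix is done: merge it back and delete the branch.
git checkout -q "$main"
git merge -q bugfix
git branch -q -d bugfix

# Resume the original task just where it was left off.
git checkout -q new-idea
ls      # shows component.txt and idea.txt, but not fix.txt
```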

This fine-grained task-based branching quite nicely organizes your tasks at the 
SCM level and allows you to easily mix and match (i.e. merge :-)) your tasks as 
you see fit.

From my experience with an svn(-like) workflow I always end up with the 
following phenomena:

* I have a bunch of *.diff files in my working directory, each of which 
constitutes an experimental change that either isn't quite ready yet or 
didn't work out, but still might be interesting for future reference. On the 
machine I'm writing this on there are currently 10 of those.

* When interrupting work on something bigger with work on something else, 
unless the new task affects a completely unrelated component, there's a good 
chance that the changes will mix in a way that I can't commit the changes of 
the finished second task (because they affect common files). This leads to 
commits being postponed and joined unnecessarily.

"Preemptive" branching solves both issues. The first because the changes will 
at least be in the repository (and I could even decide to push the branches to 
my public repository) and the second because tasks are separated by design.

> I'm still trying to wrap my head around two conflicting ideas coming out 
> of git (I don't know if they're also valid for hg or not):
> 
> 1) It takes up less storage space
> 2) You should create a local branch for everything you work on (because 
> otherwise you'll lose uncommitted changes when you update)
>
> Doesn't point 2 invalidate point 1? If I'm working on four different 
> things, I now have four branches on my hard drive. Even if each 
> individual branch takes up half the space, I'm still using twice the
> space.
> 
> Or do additional branches not take up as much space as the local master 
> branch created by cloning?

Branches are cheap. The VCS doesn't copy the whole source tree. Even svn 
doesn't do that. I believe when merging svn actually stores the merged changes 
twice. AFAIK DVCSs are change set oriented and even a merged change set is 
still only stored once. The overhead for branches is really small.
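
For git specifically, a quick way to see this is to count the stored objects
before and after creating a few branches. A new branch is just another name
for an existing commit, so nothing is copied. The sketch below uses a
throwaway repository; the branch names are hypothetical.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.invalid"

echo "content" > file.txt
git add file.txt
git commit -q -m "Initial commit"

# Number of loose objects stored in the repository so far.
before=$(git count-objects | cut -d' ' -f1)

# Create three branches for three different tasks.
git branch task-a
git branch task-b
git branch task-c

after=$(git count-objects | cut -d' ' -f1)
echo "objects before: $before, after: $after"

# All four names resolve to the same commit ID.
git rev-parse HEAD task-a task-b task-c
```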

CU, Ingo

(*) Just to be clear on the granularity of branches: When you're working on 
something, you probably don't want to branch every few minutes. With git the 
overhead is three or four commands (creating the branch, switching back, 
merging, and optionally deleting the branch). With aliases that could be 
reduced to one command for branching, one for the rest, but that would probably 
still get old pretty quickly. I guess in the end it's personal preference, but 
in my experience tasks often have a natural granularity ranging from half an 
hour to several hours. For shorter tasks it might not be worth bothering with a 
branch; longer ones can usually be split.
