Like any good software developer, I use a source control system daily. But I’ve fallen behind the times. The latest source control paradigm out there is something called a Distributed Version Control System (DVCS). The two main DVCSs are Git and Mercurial. GitHub, which hosts Git projects, seems to be getting written up weekly in technology and business publications. I’m playing catch-up, but now I understand what the big deal is. The PLM world needs to take notice. We need Distributed PLM systems.
Here’s why.
What is Distributed Version Control?
Before we talk about PLM software, we need to talk about software source control. Just hang in there if you’re not a developer yourself. I’ll get back to PLM in a bit.
I Was Blind…
I had heard of Git a few years ago, but the point of it had eluded me. Something about everyone having a full copy of the repository. Say wha…? Whatever. I’ll just stick with Subversion, thank you. After all, that’s a modern system. It’s so much better than CVS, or so I hear.
I started to realize that I had missed the boat when I attended the 2012 Global Day of Code Retreat (which, by the way, was an awesome event; I highly recommend it). Sitting elbow to elbow with some really sharp professional programmers, I kept hearing, “Git this…,” and, “Git that….” In fact, the organizers of the event had recommended that everyone have a Git repository on their laptops for the retreat.
The next clue was when I installed Aptana Studio on my laptop to work on a little Python project of mine. Guess what? It came preconfigured to work with a Git repository. So I set one up for myself to work with, got a free GitHub account, and used Git for the first time.
But I still didn’t get what the big deal was.
…But Now I See
Recently I was checking out Joel Spolsky’s blog, Joel On Software. If you’re a software developer, you need to be following Joel’s work. Even if you haven’t heard of him, you’re probably familiar with some of his work. Among (many) other things, he’s one of the cofounders of Stack Overflow. And if you’re someone who employs software developers, you really need to be reading Joel. In particular, go read what he has to say about desk chairs and private offices. Please. Do it now, I’ll wait. I have to go refill my coffee anyhow.
Okay, everyone back now? Cool. Let’s get back on track.
So Joel has a recent post about how he’s come to realize that distributed version control is superior to centralized version control.
In order to explain to the rest of us why he’s become a DVCS convert, Joel put together a tutorial on Mercurial with a special Re-Education section for those of us familiar with Subversion.
Joel on Subversion
Here’s what Joel has to say about Subversion:
Now, here’s how Subversion works:
- When you check new code in, everybody else gets it.
Since all new code that you write has bugs, you have a choice.
- You can check in buggy code and drive everyone else crazy, or
- You can avoid checking it in until it’s fully debugged.
Subversion always gives you this horrible dilemma. Either the repository is full of bugs because it includes new code that was just written, or new code that was just written is not in the repository.
…
Subversion team members often go days or weeks without checking anything in… All this fear about checkins means people write code for weeks and weeks without the benefit of version control…Why have version control if you can’t use it?
Centralized Version Control, Illustrated
Here’s how Joel illustrates life with a centralized Subversion Repository:
[Image: developers’ working copies synchronizing with a single Subversion server]
Everyone has a local working copy of the code base which they periodically synchronize with the master version on the server. Or not.
Joel on Mercurial
Now we start to get to what I had missed regarding Distributed Version Control Systems. True, every user has a local repository. But there’s still a central repository. Users check work into their local repository while they’re developing, and then merge their changes into the central repository.
Distributed Version Control, Illustrated
It looks like this:
[Image: each developer’s local repository pushing to and pulling from a central repository]
I’ll let Joel explain what this means.
So you can commit your code to your private repository, and get all the benefit of version control, whenever you like. Every time you reach a logical point where your code is a little bit better, you can commit it.
Once it’s solid, and you’re willing to let other people use your new code, you push your changes from your repository to a central repository that everyone else pulls from, and they finally see your code. When it’s ready.
Mercurial separates the act of committing new code from the act of inflicting it on everybody else.
And that means that you can commit (hg com) without anyone else getting your changes. When you’ve got a bunch of changes that you like that are stable and all is well, you push them (hg push) to the main repository.
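To make the split between committing and pushing concrete, here is a minimal sketch in C. It is a toy model under my own assumptions, not how Mercurial is implemented: the `Repo` and `LocalRepo` structs and the `commit` and `push` functions are hypothetical names invented for illustration, and a changeset is reduced to an integer id.

```c
#include <stddef.h>

#define MAX_CHANGESETS 64

/* Hypothetical toy model: a repository is an ordered list of changeset ids. */
typedef struct {
    int    ids[MAX_CHANGESETS];
    size_t count;
} Repo;

/* A developer's private repository, plus a marker for what has been shared. */
typedef struct {
    Repo   repo;    /* private history, visible only to its owner */
    size_t pushed;  /* how many entries of repo.ids are already in central */
} LocalRepo;

/* Append one changeset id; returns 0 on success, -1 if the repo is full. */
int repo_add(Repo *r, int id)
{
    if (r->count >= MAX_CHANGESETS)
        return -1;
    r->ids[r->count++] = id;
    return 0;
}

/* Commit: record a change in the private repository only. */
int commit(LocalRepo *local, int id)
{
    return repo_add(&local->repo, id);
}

/* Push: publish every local changeset that central has not seen yet.
   Returns the number of changesets published. */
size_t push(LocalRepo *local, Repo *central)
{
    size_t published = 0;
    while (local->pushed < local->repo.count) {
        if (repo_add(central, local->repo.ids[local->pushed]) != 0)
            break;
        local->pushed++;
        published++;
    }
    return published;
}
```

In this sketch, calling `commit` any number of times leaves `central` untouched; only `push` changes what everyone else can see. A centralized system, by contrast, behaves as if every `commit` were immediately followed by a `push`.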
The Problem with Centralized PLM
That our current PLM systems follow the centralized data model shouldn’t be surprising or controversial. That’s just how it is. The question is, why is that a problem? After all, software development is completely different from designing airplanes and automobiles, right?
No.
Our PLM users are facing the same d*** problems that software developers face.
Worse than that, not only are current PLM systems not as good as a Git or a Mercurial, they’re not even as good as Subversion.
So what’s wrong with centralized PLM? Recall the primary problem with Subversion that Joel highlighted: “When you check new code in, everybody else gets it.”
Since most of us use PLM to manage CAD data, let’s look at how that plays out for CAD.
Option A: Check in bad designs
I hope that it’s uncontroversial that designs aren’t perfect before they’re finished, if then! If you’re an NX user in a Teamcenter environment, every time you save your work you’re checking in a new change to the central repository. Congratulations, you’ve just polluted the system with your junk (sheesh, that sounds dirty). Oh sure, we have statuses and workflows and revision rules to make sure that other users don’t see your junk unless they want to (that doesn’t sound any better), but that stuff is hard to understand. Just last week someone made the comment that, “I’ve run into very few engineering Organizations that understand precise/imprecise and Revision Rules.” In fact, my post on understanding revision rules is one of the most popular posts on this site.
Option B: Avoid check in
The other option is to avoid checking in your work until you’re sure it’s ready. While this isn’t an option for NX, most of Teamcenter’s CAD integrations allow this behavior. Typically, CAD integrations copy files from Teamcenter down to a local working directory from which the CAD application works with the files. The central Teamcenter repository is not updated until the user manually checks in their work… which could be days, if not weeks, later.
So, what exactly is the benefit to the user of using a PLM system?
The Promise of Distributed PLM
Do you see now that we have the same problems with PLM software that Joel was describing with centralized source control systems? So let’s imagine that we’re living in a future world where we have a distributed PLM system. And robot butlers and flying cars. Not that they’re relevant, but they would be so damn cool.
I am not talking about Classic or Global Multisite here. In order to get close to what I mean by Distributed PLM, every single user would have to have at least one personal instance of TC that was multi-sited back to the central site. That may be theoretically possible, but it would be a very heavyweight and cumbersome implementation. I suspect that a more usable implementation would maintain only the delta between what a user had checked into his or her own private repo and the central repository.
So imagine that you’re a CAD user and, in addition to the central repository that you’re used to, you have a private repository. Now when you save your NX model or check in your ProE model, you’re checking into your own personal repository. The main repository knows nothing of your work until you push your changes to it. We’re not putting unfinished work out where other users can find it, but we still have the benefits of version control.
Let’s noodle what that means. For starters, revision rules become a lot less important.
#ifdef vs. Revision Rules
While running down the shortcomings of Subversion, Joel brought up the topic of branching and merging (which I’ll get to shortly myself) and how it doesn’t work very well in Subversion.
[A]lmost every Subversion team told me…they swore off branches. And now what they do is this: each new feature is in a big #ifdef block. So they can work in one single trunk, while customers never see the new code until it’s debugged, and frankly, that’s ridiculous.
Keeping stable and dev code separate is precisely what source code control is supposed to let you do.
Good lord, what an ugly way to write code.
    #if TC_VERSION < 8
    int foobar(tag_t rev)
    {
        // implementation for TcEngineering
        ...
    }
    #elif TC_VERSION < 9
    int foobar(tag_t rev)
    {
        // implementation for TC 8.x
        ...
    }
    #elif TC_VERSION < 10
    int foobar(tag_t rev)
    {
        // implementation for TC 9.x
        ...
    }
    #else
    int foobar(tag_t rev)
    {
        // implementation for TC 10+
        ...
    }
    #endif
Egads. Thank God we don’t have to deal with that mess in Teamcenter, right?
Wrong.
We do the same exact thing. We just use revision rules instead of #ifdef.
Don’t believe me? Pretend that foobar was an item instead of a function.
- Foobar
- Foobar/01 (Frozen)
- Foobar/02 (Frozen)
- Foobar/-.01 (Manufacturing Preview)
- Foobar/A (Released)
- Foobar/B (Released)
- Foobar/C (Unstatused, owner=Scott)
- Foobar/D (Unstatused, owner=Joel)
Tell me that this isn’t basically how we select which revision to load in an assembly.
    #if RevisionRule == "Precise"
        LOAD(foobar/01)
    #elif RevisionRule == "Latest Frozen"
        LOAD(foobar/02)
    #elif RevisionRule == "Latest Manufacturing Preview"
        LOAD(foobar/-.01)
    #elif RevisionRule == "Latest Released"
        LOAD(foobar/B)
    #elif RevisionRule == "Latest Working, current user is owner"
        LOAD(foobar/C)
    #elif RevisionRule == "Latest Working"
        LOAD(foobar/D)
    #endif
Holy crap, we have done the same thing that the Subversion users ended up doing. We’ve put everything into the “trunk” of the central repository and then we have a bunch of complicated rules which none of the users really understand in order to figure out which version of the model we should be seeing at any given time.
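To show that the selection really is that mechanical, here is a rough sketch in plain C of what those rules boil down to. This is not Teamcenter’s API or its actual algorithm; the `Revision` struct, the `Status` enum, and the `resolve` function are hypothetical names I made up, and I’m assuming that “latest” simply means the last matching entry in a list ordered oldest to newest. The “Precise” rule is left out because it pins a specific revision rather than searching for one.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical statuses, mirroring the Foobar example. */
typedef enum { FROZEN, MFG_PREVIEW, RELEASED, UNSTATUSED } Status;

typedef struct {
    const char *id;      /* revision id, e.g. "02" or "B" */
    Status      status;
    const char *owner;   /* NULL where the example names no owner */
} Revision;

/* The Foobar item, listed oldest to newest. */
const Revision foobar[] = {
    { "01",   FROZEN,      NULL    },
    { "02",   FROZEN,      NULL    },
    { "-.01", MFG_PREVIEW, NULL    },
    { "A",    RELEASED,    NULL    },
    { "B",    RELEASED,    NULL    },
    { "C",    UNSTATUSED,  "Scott" },
    { "D",    UNSTATUSED,  "Joel"  },
};
const size_t foobar_count = sizeof foobar / sizeof foobar[0];

/* Return the id of the newest revision with the wanted status.
   If owner is non-NULL, the revision must also belong to that owner.
   Returns NULL when nothing matches. */
const char *resolve(const Revision *revs, size_t n, Status wanted,
                    const char *owner)
{
    for (size_t i = n; i-- > 0; ) {
        if (revs[i].status != wanted)
            continue;
        if (owner != NULL &&
            (revs[i].owner == NULL || strcmp(revs[i].owner, owner) != 0))
            continue;
        return revs[i].id;
    }
    return NULL;
}
```

Under these assumptions, `resolve(foobar, foobar_count, RELEASED, NULL)` picks B, while `resolve(foobar, foobar_count, UNSTATUSED, "Scott")` picks C. Every revision ever created sits in the same list, and the rule is the only thing deciding what each user sees.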
And this brings me to the other point I wanted to make about what’s missing from PLM. The Subversion users ended up with a crappy #ifdef code implementation because branching and merging in Subversion doesn’t work very well.
We ended up with a complicated set of release statuses and revision rules because we never had the opportunity to branch our designs. Teamcenter just doesn’t support it. I hear that Windchill now offers a branching capability that they adopted from PTC’s older IntraLINK product. If any other PLM systems support branching, I’d love to hear more about it.
Branching and Merging
Now we get to why I said earlier that what we have now in PLM software isn’t even as good as what Subversion users have. Despite its problems, Subversion does have the ability to create an independent code branch for development and then merge that back into the trunk. Teamcenter forces us to just put all of our changes directly into the trunk.
Let’s return to our future world of robot butlers, flying cars, and Distributed PLM. And let’s stipulate that in this world we can branch our designs. If I want to propose a change, I don’t create a new revision of the model; I create an independent branch of the design that only I can see. When I look at my branch I see the same things that everyone else sees except for the things I’m changing. But no one else sees it unless I share my branch with them. My branch could change a single model, or it could change an entire assembly. I do my work in that branch. When I want to submit the proposal for review, I share my branch. Only if it’s approved do I merge my updates back into the central “trunk” of the repository, making them available for all. If my proposal is rejected, I just… do nothing. My branch can sit there forever for all I care. It’s not hurting anybody. But if the Powers That Be finally realize that my proposal was right, then it’s there, ready to be revived. Think about how much cleaner that is than having everything that’s ever been attempted, accepted, and rejected living forever under the central item.
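Here is one way that kind of design branch could be sketched in C. To be clear, no PLM system I know of works this way; this is just my guess at the simplest possible model. The `DesignSet` struct and the `design_put`, `design_lookup`, `branch_view`, and `branch_merge` functions are all hypothetical, and a design is reduced to a text label standing in for the real model data.

```c
#include <stddef.h>
#include <string.h>

#define MAX_ITEMS 16

/* One item and the design data currently associated with it. */
typedef struct {
    const char *item;
    const char *design;   /* a label standing in for the CAD model */
} Entry;

/* Used both for the trunk and for a branch. A branch holds only the
   items its owner has changed, not a full copy of the trunk. */
typedef struct {
    Entry  entries[MAX_ITEMS];
    size_t count;
} DesignSet;

/* Return the design stored for an item, or NULL if the set lacks it. */
const char *design_lookup(const DesignSet *set, const char *item)
{
    for (size_t i = 0; i < set->count; i++)
        if (strcmp(set->entries[i].item, item) == 0)
            return set->entries[i].design;
    return NULL;
}

/* Add or overwrite an item; returns 0 on success, -1 if the set is full. */
int design_put(DesignSet *set, const char *item, const char *design)
{
    for (size_t i = 0; i < set->count; i++) {
        if (strcmp(set->entries[i].item, item) == 0) {
            set->entries[i].design = design;
            return 0;
        }
    }
    if (set->count >= MAX_ITEMS)
        return -1;
    set->entries[set->count].item = item;
    set->entries[set->count].design = design;
    set->count++;
    return 0;
}

/* What the branch owner sees: their own changes first, the trunk for
   everything else. */
const char *branch_view(const DesignSet *branch, const DesignSet *trunk,
                        const char *item)
{
    const char *mine = design_lookup(branch, item);
    return mine != NULL ? mine : design_lookup(trunk, item);
}

/* Merge an approved branch by copying each changed item into the trunk. */
int branch_merge(const DesignSet *branch, DesignSet *trunk)
{
    for (size_t i = 0; i < branch->count; i++)
        if (design_put(trunk, branch->entries[i].item,
                       branch->entries[i].design) != 0)
            return -1;
    return 0;
}
```

Until `branch_merge` is called, `design_lookup` on the trunk keeps returning the old design, so nobody else is affected by the proposal. A rejected branch is just a `DesignSet` that never gets merged. This sketch ignores conflicts entirely (two branches changing the same item), which is exactly where a real implementation would get hard.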
I won’t get into why Joel says that branching and merging is better under Mercurial than under Subversion, but it is interesting. (Briefly: Subversion tracks versions, Mercurial tracks changes.)
This is a Big Deal
If you haven’t figured it out by now, I think this is a big deal. We tend to think of PLM and Source Control as being separate worlds, but they’re really dealing with very similar problems. But while source control systems have been evolving, the central core of how PLM works seems to have stagnated a decade or more ago. I imagine that PLM vendors, always looking for a new feature to sell to a new customer (or use to retain an existing one), aren’t spending a lot of time rethinking the fundamental model of version control they’re built upon. Look! Shiny object!
It’s time PLM starts to adopt some of the capabilities source control systems are providing. This won’t be an incremental improvement along the lines of, “We’ve redesigned the interface to reduce the number of mouse clicks a typical user makes in a day by 5%!”
No, this will be huge.
In closing, Joel Spolsky compares Subversion and Mercurial by saying,
If you are using Subversion, stop it. Just stop. Subversion = Leeches. Mercurial and Git = Antibiotics. We have better technology now.
Not only are our PLM systems not yet on the level of antibiotics; without support for branching and merging, they’re not even on the level of leeches. I’m not sure what quack medicine was considered state-of-the-art before leeches came into vogue, but that’s about where we’re at. Goat sacrifice maybe. And we’re the goats.
I’m really hoping we’ll see Distributed PLM in the future. As a Teamcenter guy, I hope Teamcenter implements it first. If not, Windchill or Aras or one of the others that I can’t think of right now might just use this to gain a market advantage — and more power to them if they do.
What do you think?
So what do you all think? Am I onto something here? Or do you figure that I must be on something? I don’t pretend to think that this wouldn’t be hard to implement. But I think it would be worth it.
I’m sure there are problems I’ve overlooked. I’m also sure there are ways to leverage branching, merging, and local repositories that I haven’t considered. Please share both in the comments below.
Lastly, if you liked this post, your +1s, likes, and shares help to get the word out to the rest of the world and will be very much appreciated. Thank you!
The post Git, and why We Need Distributed PLM appeared first on The PLM Dojo.