Code management issues in big projects

In my experience, refactoring code in big projects is painful. It is repetitive and tedious, and it takes a long time to finish. The result can also be disappointing, because refactoring does not directly create value and may introduce new bugs. Still, as projects grow bigger and bigger, refactoring becomes unavoidable.

In this article, I will discuss code management issues in big projects and introduce a few tools and concepts that help with them.

Code analysis

I have worked on a project with more than 300,000 lines of C++ code that had some code management issues. For example, some old code in the project still used the singleton pattern, even though we had since introduced a factory that was supposed to create all the service classes, which makes the code easier to split and reuse. One day, the project leader decided to refactor the code base. Because the code base was so large, four people worked on the refactoring at the same time, and it still took weeks to finish. The project had code like this:

std::string filePath = ProjectService::GetInstance().DownloadProjectFile(projectID);
ViewerService::GetInstance().OpenFile(filePath);

If the code base is small, there is no need to refactor this piece of code. However, once the code base has grown to 300,000 lines with hundreds of services, it makes sense to split some services into their own libraries. With the singleton pattern, it is impossible to separate the code this way, because GetInstance is a static member function of each service and must exist in the main executable.
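To make the contrast concrete, here is a minimal sketch of what the factory-based style could look like for the same call site. It is not the project's actual code: the names IProjectService, IViewerService, ServiceFactory, and OpenProject, the stand-in implementations, the int type of projectID, and the bool return value of OpenFile are all assumptions made for illustration.

```cpp
#include <iostream>
#include <memory>
#include <string>

// Hypothetical interfaces: callers depend only on these abstractions,
// so the concrete services could be built into separate libraries.
struct IProjectService {
    virtual ~IProjectService() = default;
    virtual std::string DownloadProjectFile(int projectID) = 0;
};

struct IViewerService {
    virtual ~IViewerService() = default;
    virtual bool OpenFile(const std::string& filePath) = 0;
};

// Stand-in implementations, for illustration only.
class LocalProjectService : public IProjectService {
public:
    std::string DownloadProjectFile(int projectID) override {
        // Pretend the file was downloaded and return its local path.
        return "/tmp/project_" + std::to_string(projectID) + ".dat";
    }
};

class ConsoleViewerService : public IViewerService {
public:
    bool OpenFile(const std::string& filePath) override {
        std::cout << "opening " << filePath << '\n';
        return !filePath.empty();
    }
};

// Hypothetical factory: the only place that names the concrete types.
class ServiceFactory {
public:
    std::unique_ptr<IProjectService> CreateProjectService() const {
        return std::make_unique<LocalProjectService>();
    }
    std::unique_ptr<IViewerService> CreateViewerService() const {
        return std::make_unique<ConsoleViewerService>();
    }
};

// The singleton call site, rewritten to obtain services from the factory.
bool OpenProject(const ServiceFactory& factory, int projectID) {
    auto project = factory.CreateProjectService();
    auto viewer = factory.CreateViewerService();
    const std::string filePath = project->DownloadProjectFile(projectID);
    return viewer->OpenFile(filePath);
}
```

In this sketch the call site no longer mentions ProjectService or ViewerService directly, so only the factory needs to change if an implementation moves into another library. Turning hundreds of GetInstance call sites into this shape by hand is exactly the repetitive work described above.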

This raises a question: how are we supposed to refactor code like this when there is so much of it?

In my opinion, repetitive and boring jobs should always be handled by machines, so that human mistakes are avoided. Clang can be used to solve problems like this one.

Clang is not just a compiler but also a library that provides APIs for generating and working with abstract syntax trees (ASTs). An AST represents the code in a structured form with meaningful detail, such as types and source locations. For example, the code above may be translated into a tree like the following:

| `-CompoundStmt 0x11902e0 <line:20:1, line:24:1>
|   |-DeclStmt 0x1190158 <line:21:5, col:88>
|   | `-VarDecl 0x118fef0 <col:5, col:87> col:17 used filePath 'std::string':'std::__cxx11::basic_string<char>' cinit destroyed
|   |   `-ExprWithCleanups 0x1190140 <col:28, col:87> 'std::string':'std::__cxx11::basic_string<char>'