Updated with <strong>Symfony 4.2</strong>.
Splitting the Symplify monorepo takes quite a long time: exactly 420 s for its 8 packages.
Could we go 200 % faster by switching the processes from serial to parallel?
This is our code now. Each process waits for the previous one: one finishes, then the next starts.
foreach ($splitConfiguration as $directory => $repository) {
    $process = new Process(['git', 'subsplit', $directory . ':' . $repository]);
    $process->run();
    // here the process is finished
    if (! $process->isSuccessful()) {
        throw new PackageToRepositorySplitException($process->getErrorOutput());
    }
    // report exactly what happened, so it's easier to know result and debug
    $symfonyStyle->success(sprintf(
        'Split from "%s" to "%s" is done',
        $directory,
        $repository
    ));
}
We tried spatie/async, which has a very nice README at first sight and probably works very well for simple functions. But it turned out to be rather magic: it passes the service as a serialized string to the CLI, which deserializes it and runs it in its own process. It also caused other process commands to fail on the success message. It is probably good enough for Laravel, but not for my SOLID standards.
We could go with amp or reactphp, but wouldn't that be overkill?
There is also a faster way, like splitsh/lite, but we aim for a PHP + Git combination so that PHP developers can extend the code.
Luckily, Symfony Process already allows running standalone processes that don't wait on each other.
Picking the right tool is important, since it vendor-locks our code to a package, but let's step back a little.
What exactly is our goal?
$runningProcesses = [];
foreach ($splitConfiguration as $directory => $repository) {
    $process = new Process(['git', 'subsplit', $directory . ':' . $repository]);
    // start() doesn't wait until the process is finished, as opposed to run()
    $process->start();
    // store the process for later, so we can check whether it's finished
    $runningProcesses[] = $process;
}
This foreach starts all processes in parallel, without knowing whether they've finished or not.
Don't burn your CPU by running too many processes at once; limit the concurrency. In our case it's only 8 processes, so we'll survive.
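If you have many more than 8 packages, limiting the concurrency could look roughly like the sketch below. It is only an illustration under assumptions: the function name `runWithConcurrencyLimit` and the parameters `$maxParallel` and `$pollMicroseconds` are hypothetical and not part of Symplify or Symfony. The helper only relies on `start()` and `isRunning()`, which Symfony Process provides, and expects processes that have been created but not started yet.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: starts the given processes, keeping at most
 * $maxParallel of them running at the same time.
 *
 * Each item only needs start() and isRunning(); an instance of
 * Symfony\Component\Process\Process offers both.
 *
 * @param object[] $processes processes that were created but not started
 */
function runWithConcurrencyLimit(array $processes, int $maxParallel, int $pollMicroseconds = 100000): void
{
    if ($maxParallel < 1) {
        throw new InvalidArgumentException('$maxParallel must be at least 1');
    }

    $pending = array_values($processes);
    $running = [];

    while ($pending !== [] || $running !== []) {
        // drop finished processes to free their slots
        foreach ($running as $i => $process) {
            if (! $process->isRunning()) {
                unset($running[$i]);
            }
        }

        // fill the free slots with processes that are still waiting
        while ($pending !== [] && count($running) < $maxParallel) {
            $process = array_shift($pending);
            $process->start();
            $running[] = $process;
        }

        // pause before polling again, but only if something is still running
        if ($running !== []) {
            usleep($pollMicroseconds);
        }
    }
}
```

To use it, you would build the `Process` objects in the foreach without calling `start()`, and then pass the whole array in, e.g. `runWithConcurrencyLimit($processes, 4)`. When the function returns, all processes have finished, so the result of each one can be checked afterwards.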
while (count($runningProcesses)) {
    foreach ($runningProcesses as $i => $runningProcess) {
        // specific process is finished, so we remove it
        if (! $runningProcess->isRunning()) {
            unset($runningProcesses[$i]);
        }
    }
    // check every second
    sleep(1);
}
// here we know that all are finished
$symfonyStyle->success('Split was successful');
But how useful is this message compared to the previous one?
$symfonyStyle->success(sprintf(
    'Split from "%s" to "%s" is done',
    $directory,
    $repository
));
And what if any of the processes failed?
$runningProcesses = [];
+$allProcessInfos = [];
foreach ($splitConfiguration as $directory => $repository) {
    $process = new Process(['git', 'subsplit', $directory . ':' . $repository]);
    $process->start();
    $runningProcesses[] = $process;
+    // value object with types would be better and faster here, this is just an example
+    $allProcessInfos[] = [
+        'process' => $process,
+        'directory' => $directory,
+        'repository' => $repository
+    ];
}
So the final reporting would look like this:
foreach ($allProcessInfos as $processInfo) {
    /** @var Process $process */
    $process = $processInfo['process'];
    if (! $process->isSuccessful()) {
        throw new PackageToRepositorySplitException($process->getErrorOutput());
    }
    $symfonyStyle->success(sprintf(
        'Push of "%s" directory to "%s" repository was successful',
        $processInfo['directory'],
        $processInfo['repository']
    ));
}
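The comment in the collecting loop says a typed value object would be better than a plain array. A minimal sketch of what that might look like follows; the class name `SplitProcessInfo` and its getters are hypothetical, and the process is typed only as `object` so the example runs without any package installed. In real code you would likely type it as Symfony's `Process`.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical value object that replaces the
 * ['process' => ..., 'directory' => ..., 'repository' => ...] array.
 */
final class SplitProcessInfo
{
    /** @var object the started process, e.g. a Symfony\Component\Process\Process */
    private $process;

    /** @var string */
    private $directory;

    /** @var string */
    private $repository;

    public function __construct(object $process, string $directory, string $repository)
    {
        $this->process = $process;
        $this->directory = $directory;
        $this->repository = $repository;
    }

    public function getProcess(): object
    {
        return $this->process;
    }

    public function getDirectory(): string
    {
        return $this->directory;
    }

    public function getRepository(): string
    {
        return $this->repository;
    }
}
```

With this in place, the collecting loop would do `$allProcessInfos[] = new SplitProcessInfo($process, $directory, $repository);` and the reporting loop would call the getters instead of reading array keys, so a typo in a key name turns into an error right away.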
Symplify has 8 packages to build at the moment. Running the split commands asynchronously brought an amazing improvement!
See pull-request #620
That's it!
pyHpa ansyc rusn!