refactor: move some c code to go #4309

Open
wants to merge 5 commits into
base: main

lifubang
Member

@lifubang lifubang commented Jun 5, 2024

As mentioned in #3951, we want to move the C code to Go, and there is a lot of hard work to do.

This PR does the first step: it moves all of the stage-0 C code and some of the stage-2 C code to Go, because that code is not related to namespaces and can be implemented in Go.

This refactor brings one benefit: it removes one process clone when starting, running, or creating a container. However, because the stage-1 C code is hard to move to Go, it adds complexity to libct/nsenter; for example, it's hard to write unit tests for nsenter.

More suggestions are welcome.

@lifubang lifubang force-pushed the refactor-c-to-go branch 4 times, most recently from bbab957 to c534efe Compare June 5, 2024 11:19
@lifubang lifubang force-pushed the refactor-c-to-go branch from c534efe to 62b3c18 Compare June 6, 2024 04:39
@lifubang lifubang force-pushed the refactor-c-to-go branch 2 times, most recently from 169a838 to eac3328 Compare June 8, 2024 01:11
@AkihiroSuda
Member

Needs rebase

@AkihiroSuda AkihiroSuda added the kind/refactor refactoring label Sep 4, 2024
@AkihiroSuda
Member

Maybe this should be postponed to v1.3, to avoid having regression in v1.2?

@lifubang lifubang added this to the 1.3.0 milestone Sep 5, 2024
@lifubang lifubang force-pushed the refactor-c-to-go branch 2 times, most recently from 6072f83 to b70c4af Compare September 30, 2024 06:08
@lifubang
Member Author

> Needs rebase

Done.
More reviews are welcome before we merge.
@opencontainers/runc-maintainers

I'll rebase again once #4312 is merged.

@rata
Member

rata commented Sep 30, 2024

@lifubang I like the idea of the PR, but I'd prefer to merge this just after the 1.2.0 final release. I'd like to focus on bug-fixing now, to do the final release.

This is not trivial at all and has several semantic changes. There might be follow-up PRs to fix edge cases, and I really prefer not to put all of that on the hot path to the 1.2.0 release. I think doing a 1.3.0 release a few months after 1.2.0 is completely fine (indeed, the original plan for 1.2 was to be a few months after 1.1). If we have this PR (and probably a few of the others that are around too), a new release makes sense to me.

What do you think?

@lifubang
Member Author

> What do you think?

Agreed.
We are indeed focusing on bug fixes for 1.2.0.
I think this PR needs a more careful review. If we have some free time, we can spend it on this.

@rata
Member

rata commented Sep 30, 2024

I'll focus on this after the 1.2 release. Thanks for working on this! :)

@rata
Member

rata commented Jan 21, 2025

@lifubang are you aiming to merge this for 1.3? There seem to be a lot of conflicts. I haven't tried to review it yet; let me know and I can try. It seems like a complicated PR, though, so it will take some time :)

Contributor

@kolyshkin kolyshkin left a comment

Left a few more nits. In general this is a nice development which needs to be finished.

Note the first commit from this PR is already merged (as commit 171c414, PR #4312). This, I guess, is the reason for most of the conflicts.

@lifubang are you going to keep working on that?

@lifubang
Member Author

> @lifubang are you going to keep working on that?

Yes, I'll work on it after my children's winter vacation ends next week.

@lifubang lifubang marked this pull request as draft February 11, 2025 01:30
@lifubang lifubang force-pushed the refactor-c-to-go branch 4 times, most recently from 2b88c03 to 0806ed6 Compare February 20, 2025 07:38
@lifubang lifubang marked this pull request as ready for review February 20, 2025 09:16
@lifubang
Member Author

@rata @kolyshkin @cyphar @AkihiroSuda PTAL

@lifubang
Member Author

I don't want to rebase the branch, because I want to test it on ubuntu-20.04 (it was removed in #4634).

Member

@rata rata left a comment

@lifubang thanks for working on this! It's really a great and needed improvement.

I'll try to take a closer look later. Do you think it might be possible to split this into more commits? Instead of moving all of stage-0 and most of stage-2 at once, can we have one commit moving the plumbing to Go plus some basic thing, another moving the writing of the userns mappings, etc.?

If this is not too complicated to do (it might be), I think it can help code review a lot, especially for finding bugs, as that is simpler when looking at smaller chunks that just need to do the same in Go as in C.

bail("failed to close sync_child_pipe[0] fd");

/* For debugging. */
prctl(PR_SET_NAME, (unsigned long)"runc:[2:INIT]", 0, 0, 0);
Member

I'd keep this. The init stage is way simpler now, but things can still go wrong (especially since we are making big changes), and it is useful to have this IMHO.

Member Author

👍 Makes sense.

Member

Just curious, is there any non-async-signal-safe function remaining here, as defined in man signal-safety, after the fork/clone?

@rata
Member

rata commented Feb 20, 2025

@lifubang can you rebase and revert the removal of ubuntu here? We can just remove that commit before merging. But that way we will review what is going to be merged (not sure if the conflicts will make a big change in this PR or not).

@rata rata modified the milestones: 1.3.0, 1.3.0-rc.1 Feb 20, 2025
@rata
Member

rata commented Feb 20, 2025

I moved this to the 1.3.0-rc.1 milestone. Let's aim to have it by then 🤞 (I think rc.1 didn't exist yet when this was added to 1.3.0). I guess we all agree, but if anyone disagrees please let me know :)

@rata
Member

rata commented Feb 20, 2025

At some point it might be interesting to see if this has a perf implication too (we want it anyway, but it might be better in that regard too :)). @kolyshkin added some functions to test an exec. I don't remember if they will cover this well, but it is worth taking a look.

As we want to move some code from C to Go, we should implement it
in Go first, for example:
UpdateSetgroups, TryMappingTool, UpdateUidmap, UpdateGidmap,
UpdateTimeNsOffsets, and UpdateOomScoreAdj.

Signed-off-by: lifubang <[email protected]>
Signed-off-by: lifubang <[email protected]>
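
For readers unfamiliar with what helpers such as UpdateUidmap and UpdateGidmap have to produce, here is a minimal sketch of the text format the kernel accepts in /proc/<pid>/uid_map and /proc/<pid>/gid_map: one "ID-inside-namespace ID-outside-namespace length" triple per line. The `IDMap` type and the `formatIDMap` function are hypothetical names chosen for illustration; the actual signatures in this PR are not shown in the conversation and may well differ.

```go
package main

import (
	"fmt"
	"strings"
)

// IDMap is a hypothetical stand-in for one ID mapping entry
// (not the type used by the PR).
type IDMap struct {
	ContainerID int64 // first ID inside the user namespace
	HostID      int64 // first ID outside the user namespace
	Size        int64 // number of consecutive IDs mapped
}

// formatIDMap renders mappings in the uid_map/gid_map file format:
// three whitespace-separated numbers per line, one line per mapping.
func formatIDMap(mappings []IDMap) string {
	var b strings.Builder
	for _, m := range mappings {
		fmt.Fprintf(&b, "%d %d %d\n", m.ContainerID, m.HostID, m.Size)
	}
	return b.String()
}

func main() {
	// Example mapping values are made up for the demonstration.
	mappings := []IDMap{
		{ContainerID: 0, HostID: 1000, Size: 1},
		{ContainerID: 1, HostID: 100000, Size: 65536},
	}
	fmt.Print(formatIDMap(mappings))
}
```

The resulting string would then be written in a single write to the map file of the target process; whether that write is done directly or through a mapping tool such as newuidmap depends on privileges, which this sketch does not cover.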
Member

@rata rata left a comment

@lifubang You didn't say it's ready for review again, but tests are green and you did split the commits. Although it seems commits 4 and 5 have the same title, so you might still be working on them?

I wrote some comments before realizing you might still be working on it. I'll leave them in case they are useful. Please do ask for review again once it is ready :)

return os.NewSyscallError("setgid", err)
}

if !config.Config.RootlessEUID && requiresRootOrMappingTool(config.Config.GIDMappings) {
Member

Why is requiresRootOrMappingTool() equivalent to config.is_setgroup?

_ = c.syncSockChild.Close()
_ = c.logPipeChild.Close()
}

func (c *processComm) closeParent() {
_ = c.initSockParent.Close()
_ = c.stage1SockChild.Close()
Member

typo here? Do we want to close the parent here, I guess?


// Ignore the error in case the child has already been reaped for any reason
_, _ = firstChildProcess.Wait()
Member

Why is this removed?

Comment on lines -135 to -136
/* XXX: This is ugly. */
static int syncfd = -1;
Member

This is an example of something that could be a different commit, IMHO. It can be one of the last ones, if that helps to avoid breaking compilation/tests. You could split this out into another commit and just mention that you move to function X now that YY happens and we don't need it here anymore.

If I want to review in detail, it's hard when logically/semantically unrelated changes are mixed together. I go and check whether this is sound, but checking that can take several steps, and then I have to backtrack, come back to this point, and continue reviewing. It's way more likely I'll miss stuff if I review this way.

On the other hand, if this were a different commit and you called out in the commit message why this is possible now, it would be very easy to review.

var buf [bufSize]byte
native := nl.NativeEndian()
native.PutUint32(buf[:], uint32(msg))
if _, err := unix.Write(int(f.Fd()), buf[:]); err != nil {
Member

I guess we need to handle the case where it returns EINTR now that this is Go. This applies to all functions that can return EINTR and use the unix or syscall package.

@@ -58,7 +58,7 @@ void write_log(int level, const char *format, ...)
if (stage == NULL)
goto out;
} else {
-ret = asprintf(&stage, "nsexec-%d", current_stage);
+ret = asprintf(&stage, "nsexec-%d", current_stage + 1);
Member

Why is this?

@rata
Copy link
Member

rata commented Feb 21, 2025

This is a big, major change, so it would be great if more eyes could have a look. @cyphar @AkihiroSuda, could you maybe have a look?

@cyphar
Member

cyphar commented Feb 25, 2025

I've taken a look at this a few times and I'm not sure how I feel about it. Yes, it removes C code but I'm not really sure whether or not it's an improvement overall -- I'll need to read through it a few more times...

I had a few crazy ideas for removing all of the C code (such as putting the runc init into PTRACE_TRACEME mode and then setting up things before executing the Go init) which could theoretically remove a lot of the stuff in Go (once Go supports joining namespaces) but I suspect there will be bits we won't be able to move without doing CRIU-style parasitic code injection.

I'm going to move the milestone to 1.4.0 since it seems unlikely we'll be able to get it in before -rc1, and ultimately this is an internal cleanup rather than a key feature.

@cyphar cyphar modified the milestones: 1.3.0-rc.1, 1.4.0-rc.1 Feb 25, 2025
@lifubang
Member Author

> I've taken a look at this a few times and I'm not sure how I feel about it. Yes, it removes C code but I'm not really sure whether or not it's an improvement overall -- I'll need to read through it a few more times...

Haha, I can almost see your puzzled face. While I was refactoring this code, I often wanted to say: "Though it lacks substance, it still brings joy".
As I wrote in the PR description:
"This PR does the first step: it moves all of the stage-0 C code and some of the stage-2 C code to Go, because that code is not related to namespaces and can be implemented in Go."

So my thinking is to reduce the number of functions in nsexec.c, which may help with the next refactoring steps.

> I'm going to move the milestone to 1.4.0 since it seems unlikely we'll be able to get it in before -rc1, and ultimately this is an internal cleanup rather than a key feature.

I agree.

@rata
Member

rata commented Feb 25, 2025

For me, one very interesting thing is removing the async-signal-safety issues we currently have.

@cyphar just curious, what makes you doubt? Is it more like a gut feeling and not yet clear? I'm of course fine moving to 1.4, not asking due to that, but out of curiosity :)

Labels
kind/refactor refactoring

7 participants