PDEP-0002 Build System Overhaul #47988
pls add the most important point here
scipy and numpy have already adopted meson
we need a really really good reason to use cmake over meson
so list any reasons
I have that already in the Meson section. Do you want that moved somewhere else? |
IMO, it would be good to include your opinion on which build system to use in this proposal.
I am biased by using it with Arrow, but I find CMake relatively simple to use. Compared to Meson there is way more literature available. You can see this in the SO tag comparison but also on sites like O'Reilly where there are books like |
(Many would argue that the large number of third-party books for cmake exists because the official cmake docs are lacking, particularly given the focus on describing "modern" cmake, whereas Meson tends to be fairly aggressive about simply raising deprecation warnings if you use things that are no longer recommended. Admittedly, as a core committer for Meson, I'm somewhat biased... take my buildsystem comparisons with several large grains of salt.) There's at least one book, which the lead Meson developer wrote for sale: https://meson-manual.com/ It ends up being free these days: https://nibblestew.blogspot.com/2021/12/this-year-receive-gift-of-free-meson.html (I never did end up reading it, though. It was of course never a priority for me because, having hacked extensively on Meson, I acquired the same information myself.) |
One of the big differences that people usually point out between meson and cmake is that cmake allows third-party modules, and user defined functions. Meson doesn't (avoiding Turing completeness and recommending that support for something be added directly to Meson itself). @rgommers commented on this in the blog post about moving SciPy to Meson:
Obviously YMMV, and this is in fact a dealbreaker for some people. In return, Meson provides real object types (and type safety) for primitives (strings, integers, booleans, dictionaries and arrays) and build targets, and a module system that supports various broadly useful things -- including the python module that directly handles much of what a build system for python projects would want to do anyway. |
FWIW I also think either will represent a good upgrade over setuptools. I'm not sure we need to decide on one versus the other as part of this PDEP so much as agree to move away from setuptools. At some point I think we will have implementations of both to compare |
At this point, I think both POCs have reached the point at which they can reliably compile pandas correctly. We might want to benchmark cmake vs meson and note it in the PDEP.
I don't think consistency with numpy/scipy is a good reason here. IIUC, numpy and scipy have far more complex builds (e.g. linking with openblas, and scipy does some linking thing with npymath and co.). The biggest drawback/advantage of meson is that it's still not very mature when it comes to building Python packages (esp. with C/Cython extensions). On the plus side, this means meson is better able to accommodate our needs in a build system (e.g. native Cython support), and the meson developers have been really receptive to feedback (thanks a bunch @eli-schwartz). Unfortunately, the downside is that meson, and especially the glue for meson/PEP 517 frontends, is still somewhat buggy, and documentation is sparse (I've been mostly reverse engineering scipy's meson files). |
True as of right now, but given that we just have a first release of SciPy out that defaults to
For the Meson parts, or for something like "best practices for PyData packages with Cython/C/C++", it'd be great to hear if you have concrete ideas of what is most important to document, or where to do so. I'd be happy to work on this. |
I'm not sure performance is the most important thing since building isn't a constant thing for our development process, but in any case here are some observed timings from my laptop. This is running Ubuntu 22.04 LTS with an i7-1255U, set to compile a debug build. Configure times are not shown because they are pretty small for both systems.
# setuptools baseline
time python setup.py build_ext -j8 --inplace --with-debugging-symbols
real 0m36.054s
user 3m28.768s
sys 0m6.246s
# CMake Default generator
cmake . -DCMAKE_BUILD_TYPE=Debug
time cmake --build . --parallel
real 1m49.447s
user 13m51.311s
sys 0m18.391s
# CMake Ninja generator
cmake . -DCMAKE_BUILD_TYPE=Debug -G Ninja
time cmake --build . --parallel
real 1m46.156s
user 14m50.138s
sys 0m19.017s
# Meson
meson setup builddir
cd builddir
# You might not want to do this. See below for alternative.
meson configure '-Dpython.install_env=auto'
time meson compile
real 1m43.980s
user 13m28.714s
sys 0m18.287s
Surprised setuptools did so well, but this ultimately could vary a lot depending on platform and degree of parallelization. CMake / Meson will likely be pretty similar |
Those are some interesting numbers... What's the value of nproc on that system? Was setup.py cleaned before doing the build? |
nproc is 12. I must have had something else going on with my laptop yesterday. When I run today I get these numbers:
# setuptools baseline
time python setup.py build_ext -j12 --inplace --with-debugging-symbols
real 1m31.846s
user 13m7.494s
sys 0m16.192s
# CMake Default generator
cmake . -DCMAKE_BUILD_TYPE=Debug
time cmake --build . --parallel
real 0m23.556s
user 3m17.058s
sys 0m9.060s
# CMake Ninja generator
cmake . -DCMAKE_BUILD_TYPE=Debug -G Ninja
time cmake --build . --parallel
real 0m22.514s
user 3m30.462s
sys 0m8.625s
# Meson
meson setup builddir
cd builddir
# You might not want to do this. See below for alternative.
meson configure '-Dpython.install_env=auto'
time meson compile
real 1m34.171s
user 12m19.070s
sys 0m17.601s
If someone else wants to try this from another machine, that would be helpful. I made sure to |
Now cmake is the one that shot down to 3.5 minutes of user time... weird. |
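One rough way to read the numbers above is to divide user CPU time by wall-clock (`real`) time, which approximates how many cores were kept busy on average. The sketch below does that for the second set of timings reported in this thread; the helper names `to_seconds` and `effective_parallelism` are made up for illustration, and the ratio ignores `sys` time, so treat it as an estimate only.

```python
def to_seconds(stamp: str) -> float:
    """Convert a `time`-style stamp such as '3m28.768s' to seconds."""
    minutes, _, seconds = stamp.rstrip("s").partition("m")
    return int(minutes) * 60 + float(seconds)


def effective_parallelism(real: str, user: str) -> float:
    """User CPU time divided by wall time: average number of busy cores."""
    return to_seconds(user) / to_seconds(real)


# (real, user) pairs copied from the second run posted in this thread.
runs = {
    "setuptools -j12": ("1m31.846s", "13m7.494s"),
    "CMake default generator": ("0m23.556s", "3m17.058s"),
    "CMake Ninja generator": ("0m22.514s", "3m30.462s"),
    "Meson": ("1m34.171s", "12m19.070s"),
}

for name, (real, user) in runs.items():
    ratio = effective_parallelism(real, user)
    print(f"{name}: about {ratio:.1f} cores busy on average")
```

If the ratios come out similar across tools while the user times differ several-fold, that would suggest the runs differed in the total amount of work done rather than in how well they were parallelized.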
Not very scientific. Both are going to run circles around setuptools when it comes to sdist installs, since those are not parallelized |
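Since single runs like the ones in this thread can swing a lot between days, a slightly more repeatable comparison would run each build several times from a clean state and look at the spread. The following is only a sketch of such a harness: `time_command` is a hypothetical helper, and the build and cleanup commands you pass in (for example the `meson compile` or `cmake --build . --parallel` invocations quoted earlier) are up to you.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, repeats=3, cleanup=None, cwd=None):
    """Run `cmd` `repeats` times and return the wall-clock durations in seconds.

    `cleanup`, if given, is a command run before each timed run so that
    every measurement starts from the same state.
    """
    durations = []
    for _ in range(repeats):
        if cleanup is not None:
            subprocess.run(cleanup, check=True, cwd=cwd)
        start = time.perf_counter()
        subprocess.run(cmd, check=True, cwd=cwd)
        durations.append(time.perf_counter() - start)
    return durations


if __name__ == "__main__":
    # Placeholder command so the sketch runs anywhere; swap in a real build.
    samples = time_command([sys.executable, "-c", "pass"], repeats=3)
    print(f"median {statistics.median(samples):.3f}s, "
          f"min {min(samples):.3f}s, max {max(samples):.3f}s")
```

Reporting the median together with min and max makes it easier to spot an outlier run, such as one taken while the machine was busy with something else.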
Thanks a lot for putting this together @WillAyd. I personally find Meson config way simpler and more readable, and I see more advantages in using the same system as numpy than the same as Arrow. How do we move forward with this? Correct me if I'm wrong, but this doesn't seem like a PDEP intended to be merged, as there is no action proposed, just the discussion (which is clearly very useful). What do you think if we start with a poll with 3 options, setuptools, cmake and meson (we can add two different options for cmake if you prefer), and depending on the result we decide how to make the final decision? I'd make the poll open in pandas-dev, but not anonymous, so we can have more opinions but still know the preferences of core devs, people who implemented or maintain the build in other projects... |
I think we can move forward with Meson. Sounds like the preferred tool |
@datapythonista I think we are good on the decision. Does this need to be merged or just closed? |
In a way it seems that if we'd like to merge and publish this, the PDEP should be a bit more specific about what is being approved and the technical details of the implementation. This has clearly been very useful for deciding whether to move away from the setuptools build and for hosting the discussion on people's preferences. But if we end up with a long list of PDEPs, I'm not sure it's worth having one more PDEP in the list mainly to show the advantages of cmake and meson. So, I personally don't have a strong preference. If it were my decision I'd probably just close it and link to it when implementing Meson, but I'm happy with whatever you prefer. |
Sounds good. I think we can just close then. I'm assuming @lithomas1 will continue working on the Meson implementation, so we can always reopen and add those if deemed worthwhile |
Hello, your email has been received, thank you! Wishing you all the best!
|
Cmake POC: #47380
Meson POC: lithomas1#19