Skip to content

Conversation

WillAyd
Copy link
Contributor

@WillAyd WillAyd commented Sep 16, 2025

closes #228

@wgtmac

include(IcebergBuildUtils)
include(IcebergSanitizer)
include(IcebergThirdpartyToolchain)
include(GenerateExportHeader)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Meson does not have this feature. Rather than generate something in the build directory, I created the export header directly in the source and modeled the structure of it after similar visibility.h files in Apache Arrow

Copy link
Contributor Author

@WillAyd WillAyd Sep 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It also appears that this was being used incorrectly, as shared library builds on Windows are broken on main (they aren't tested in the CMake CI)


#include "iceberg/json_internal.h"
#include "iceberg/test/test_config.h"
#include "test_config.h"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason why this needed to change is that there is an inconsistency between the source directory layout and the build directory layout when it comes to the CMake configuration. The test_config.h file is located in the test directory in the source tree, but CMake installs it to iceberg/test in the build directory.

Meson does not allow you to modify the structure of the build directory like that; generally where you place the meson.build file in the source tree will reflect where files from that are output into the build directory.

I think the ideal approach would be to move the tests to iceberg/test so that the source, build (and potentially install) trees all have the same layout. If we want to do that however, I think that's best to a separate PR

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea - it would require moving the tests directory, but as long as we are OK with that I can go ahead and submit a PR

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See #247

run: |
call "C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Auxiliary\Build\vcvarsall.bat" x64
bash -c "ci/scripts/build_example.sh $(pwd)/example"
ubuntu-meson:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy to try and create a matrix of these jobs as required. There weren't any matrix jobs today so I didn't want to introduce that here. Can do as a follow up or here if desired

run: |
python3 -m pip install --upgrade pip
python3 -m pip install meson ninja
- name: Build Iceberg
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These CI jobs just invoke Meson directly, as I think the steps are pretty straightforward. However, some of the other Apache projects choose to still put this all into the ci/build_xxx.sh script - happy to migrate there if preferable

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we provide ci/build_xxx.sh, it is much easier for developers to verify it locally, right?

)
meson.override_dependency('iceberg', iceberg_dep)
pkg = import('pkgconfig')
pkg.generate(iceberg_lib)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current CMake package I think produces a FindIceberg.cmake file, while this would produce iceberg.pc.

So there is some inconsistency there that we should probably fix. We can either capitalize here or change the cmake config name; for now I've kept this lowercase because that is standard with pkg-config files, and the other Apache libraries use all lowercase as well

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you mean IcebergConfig.cmake and IcebergTarget.cmake, right? We can switch to iceberg-config.cmake and iceberg-target.cmake if this makes sense.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea I think that makes more sense for the entire ecosystem

@WillAyd WillAyd force-pushed the add-meson branch 2 times, most recently from 4d1e5fe to 510790f Compare September 16, 2025 15:22
@WillAyd
Copy link
Contributor Author

WillAyd commented Sep 16, 2025

FWIW I did not bother to implement the ICEBERG_BUILD_BUNDLE option in Meson. My initial (possibly flawed) reason is that its not necessary to do that, but I admittedly didn't look too deep into all the machinery that option is providing.

@WillAyd WillAyd force-pushed the add-meson branch 4 times, most recently from 121f8a7 to 79b2c06 Compare September 16, 2025 20:10
@WillAyd
Copy link
Contributor Author

WillAyd commented Sep 17, 2025

I see that Roaring was just integrated into main. It looks like the Meson wrapdb entry for that is out of date, so I'll submit a new one upstream and rebase here

@wgtmac
Copy link
Member

wgtmac commented Sep 24, 2025

Just FYI that #236 will introduce a new library libiceberg_rest and add cpr as a new dependency.

I will try to learn Meson in order to review this PR and maintain it hereafter.

@WillAyd
Copy link
Contributor Author

WillAyd commented Sep 24, 2025

Thanks for the heads up! I'll get that included on this branch with the next push

@WillAyd
Copy link
Contributor Author

WillAyd commented Sep 24, 2025

I will try to learn Meson in order to review this PR and maintain it hereafter.

I am here to help - ping me with any questions :-)

@WillAyd
Copy link
Contributor Author

WillAyd commented Sep 25, 2025

The croaring updates should be upstream in Meson soon (see mesonbuild/wrapdb#2407) - will fix things up and continue after that becomes available

- title: AMD64 Ubuntu 24.04
runs-on: ubuntu-24.04
- title: AMD64 Windows 2022
runs-on: windows-2022
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we consider windows-2025?

# specific language governing permissions and limitations
# under the License.

install_headers(['in_memory_catalog.h'], subdir: 'iceberg/catalog')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any syntax like glob so we don't need to bother with manual addition?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Meson does not support globbing, I think mainly because it's build system is Ninja, which also does not support globbing. There's some more reasoning documented in https://mesonbuild.com/FAQ.html#why-cant-i-specify-target-files-with-a-wildcard

Typically the best approach for this would be to have your headers in a separate directory from source files, and then meson could install the entire directory contents. Otherwise you will want to explicitly list them

OUTPUTS
ICEBERG_LIBRARIES)

foreach(LIB_TARGET ${ICEBERG_LIBRARIES})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we embed this into add_iceberg_lib in the IcebergBuildUtils.cmake?

Copy link
Contributor Author

@WillAyd WillAyd Sep 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure let's try - I modeled this after the Arrow codebase, which may have some reasons for not embedding this into their add_lib function, but perhaps that is some legacy cruft

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any finding?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah sorry - I made the changes to IcebergBuildUtils.cmake but forgot to remove these lines

target_compile_definitions(${LIB_TARGET} PRIVATE ICEBERG_EXPORTING)
endforeach()

if(ICEBERG_BUILD_STATIC)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same for this

# specific language governing permissions and limitations
# under the License.

option('tests', type: 'feature', description: 'Build tests', value: 'enabled')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to add other CMake options?

Copy link
Contributor Author

@WillAyd WillAyd Sep 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Options like ICEBERG_BUILD_STATIC and ICEBERG_BUILD_SHARED won't be here - Meson has built-in support for generating shared, static, or both libraries via the default_library argument. So for example, if you wanted to build all static libraries, you would run:

meson setup ... --default_library=static

ICEBERG_ENABLE_ASAN / ICEBERG_ENABLE_UBSAN are similar in that Meson has a built-in command like argument for sanitizers. To enable both of those, you would do:

meson setup ... -Db_sanitize=address,undefined

(you can also supply other sanitizers like fuzzing directly to the CLI)

ICEBERG_INSTALL_LIBDIR / ICEBERG_INSTALL_BINDIR / ICEBERG_INSTALL_INCLUDEDIR have corresponding CLI arguments of --libdir / --bindir / --includedir. I haven't looked closely at what ICEBERG_INSTALL_DOCDIR is but I assume that maps to Meson's --datadir

ICEBERG_BUILD_REST I will add when I get this PR up to speed with recent changes. If there's any other option let me know

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps adding these as comments to this file so impatient developers do not need to do research by themselves?

),
}

foreach test_name, sources : iceberg_tests
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm just a little bit worried that it may add more maintenance overhead to contributors who don't know Meson before. In addition, people may forget to update this file when adding a new test case to the CMakeLists.txt. Any best practice for this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It depends on how much we want to support Meson. Arrow, which doesn't officially support it, has an "Extra" CI job for Meson that only runs when tagged. It's not a critical CI job.

Smaller projects like arrow-adbc and nanoarrow have just included Meson as part of the main CI. Given they are smaller projects it hasn't been a huge issue, and Meson arguably makes it much easier to set up valgrind, sanitizer, and coverage jobs than CMake.

I'd suggest starting with the adbc/nanoarrow approach given this config is pretty small at the moment and theoretically not that hard to maintain, but either approach is workable

)
meson.override_dependency('iceberg', iceberg_dep)
pkg = import('pkgconfig')
pkg.generate(iceberg_lib)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you mean IcebergConfig.cmake and IcebergTarget.cmake, right? We can switch to iceberg-config.cmake and iceberg-target.cmake if this makes sense.


iceberg_deps = [nanoarrow_dep, nlohmann_json_dep, spdlog_dep, zlib_dep]

iceberg_lib = library(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any plan to add iceberg_bundle_lib?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the point of the bundle lib just to make it easier for downstream projects to not have to resolve the third party dependencies themselves?

If so, I don't think it's worth replicating - Mesons wrap system makes that a far more trivial effort.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I mean this specific library: https://github.com/apache/iceberg-cpp/blob/main/src/iceberg/CMakeLists.txt#L164. Perhaps the name is misleading. It provides builtin support for Parquet, FileSystem and Avro implementations.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah OK I see - sure we can add that!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On closer look we can add this, but I would advise waiting for the following two things first:

  1. Apache Arrow 22.0 release, which will support exposing the parquet dependency (see GH-46410: [C++] Add parquet options to Meson configuration arrow#46647)
  2. Avro to make its way into the wrapdb, which @stephanlachnit is working on in Add Apache Avro wrap mesonbuild/wrapdb#2270

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Create an issue and add a comment to follow up with this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See #256

@WillAyd WillAyd force-pushed the add-meson branch 5 times, most recently from 925d1a9 to d3a3b56 Compare September 30, 2025 16:15
meson.build Outdated

install_data(
['LICENSE', 'NOTICE'],
install_dir: get_option('datadir') / 'doc/Iceberg',
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is following the lead of CMake, but its potentially something else we want to change? Not sure about Windows, but at least on ubuntu 24.04 I don't see anything in /usr/share/doc that starts with a capital letter

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. Let's fix it in this PR?

Copy link
Member

@wgtmac wgtmac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @WillAyd for keeping it updated! I've left some comments. I think this PR need a rebase to add those new files. Let's get it in first.

PUBLIC "$<BUILD_INTERFACE:iceberg_sanitizer_flags>")

string(TOUPPER ${LIB_NAME} VISIBILITY_NAME)
target_compile_definitions(${LIB_NAME}_static PUBLIC ${VISIBILITY_NAME}_STATIC)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
target_compile_definitions(${LIB_NAME}_static PUBLIC ${VISIBILITY_NAME}_STATIC)
target_compile_definitions(${LIB_NAME}_static PUBLIC ${VISIBILITY_NAME}_STATIC)

meson.build Outdated

install_data(
['LICENSE', 'NOTICE'],
install_dir: get_option('datadir') / 'doc/Iceberg',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. Let's fix it in this PR?

# specific language governing permissions and limitations
# under the License.

option('tests', type: 'feature', description: 'Build tests', value: 'enabled')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps adding these as comments to this file so impatient developers do not need to do research by themselves?

OUTPUTS
ICEBERG_LIBRARIES)

foreach(LIB_TARGET ${ICEBERG_LIBRARIES})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any finding?

${ICEBERG_BUNDLE_SHARED_INSTALL_INTERFACE_LIBS})
${ICEBERG_BUNDLE_SHARED_INSTALL_INTERFACE_LIBS}
OUTPUTS
ICEBERG_BUNDLE_LIBRARIES)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need same logic to add _EXPORTING and _STATIC to iceberg_bundle.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be handled by the add_iceberg_lib function now

endif()

include("${CMAKE_CURRENT_LIST_DIR}/IcebergTargets.cmake")
include("${CMAKE_CURRENT_LIST_DIR}/icebergTargets.cmake")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about renaming it to iceberg-targets.cmake and rename the current file to iceberg-config.cmake.in to be consistent?


iceberg_deps = [nanoarrow_dep, nlohmann_json_dep, spdlog_dep, zlib_dep]

iceberg_lib = library(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Create an issue and add a comment to follow up with this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add Meson Build System Support

2 participants