Skip to content

Conversation

@NRHelmi
Copy link
Contributor

@NRHelmi NRHelmi commented Aug 9, 2022

This PR adds support to:

  • Metadata protobuf
relations {
  relation_id {
    arguments {
      tag: CONSTANT_TYPE
      constant_type {
        rel_type {
          tag: PRIMITIVE_TYPE
          primitive_type: SYMBOL
        }
        value {
          arguments {
            tag: SYMBOL
            string_val: "output"
          }
        }
      }
    }
    arguments {
      tag: PRIMITIVE_TYPE
      primitive_type: INT_64
    }
  }
  file_name: "0.arrow"
}
  • Structured v2 response
{'transaction': {'id': '3d1588b0-042a-f244-76ce-c4a5e38b76d3', 'response_format_version': '2.0.3', 'state': 'COMPLETED'}, 'metadata': relations {
  relation_id {
    arguments {
      tag: CONSTANT_TYPE
      constant_type {
        rel_type {
          tag: PRIMITIVE_TYPE
          primitive_type: SYMBOL
        }
        value {
          arguments {
            tag: SYMBOL
            string_val: "output"
          }
        }
      }
    }
    arguments {
      tag: PRIMITIVE_TYPE
      primitive_type: INT_64
    }
  }
  file_name: "0.arrow"
}
, 'results': [{'relationId': '/:output/Int64', 'table': pyarrow.Table
v1: int64 not null
----
v1: [[1]]}], 'problems': []}
  • v2 show results
/:output/Int64/Int64
(1, 200)
(2, 300)

/:output/String
('hi',)

/:output/Int64
(1,)
(2,)
(3,)

  • transaction async integration tests added
  • CI github workflow
  • flake8 linter

@NRHelmi NRHelmi marked this pull request as ready for review August 10, 2022 23:26
@denisgursky
Copy link
Contributor

Hm, string_val: "output" is comes out as a string?

@denisgursky
Copy link
Contributor

Does it make sense to include metadata as a part of results? Basically the same thing as in RelationalAI/rai-sdk-javascript#44

@NRHelmi
Copy link
Contributor Author

NRHelmi commented Aug 11, 2022

Hm, string_val: "output" is comes out as a string?

I believe it's printed correctly because no json serialization performed here

@NHDaly
Copy link
Member

NHDaly commented Aug 11, 2022

Hm, string_val: "output" is comes out as a string?

are we using python2 or python3? The spec says that bytes should become a "bytes" type in Python3: bytes (Python 3)
https://developers.google.com/protocol-buffers/docs/proto3#scalar

@NRHelmi
Copy link
Contributor Author

NRHelmi commented Aug 12, 2022

Hm, string_val: "output" is comes out as a string?

are we using python2 or python3? The spec says that bytes should become a "bytes" type in Python3: bytes (Python 3) https://developers.google.com/protocol-buffers/docs/proto3#scalar

The SDK uses python 3, python 2 was deprecated on January 1, 2020 😢

@@ -0,0 +1,45 @@
# -*- coding: utf-8 -*-
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a little surprised that this file is coming out as _pb2 when the proto file has syntax = "proto3" in it 🤔 any idea why?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I was also surprised the first time I generated the proto files but it seems like syntax = "proto3" has nothing to do with _pb2
grpc/grpc#15444 (comment)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the link, makes sense!

railib/api.py Outdated
txn.name = name.group(1).decode()
filename = re.match(b'.*filename="(.+?)"', disposition)
if not(filename is None):
txn.filename = name.group(1).decode()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bummer that requests-toolbelt doesn't parse this for us, but this seems like a good workaround…

Maybe we could call this variable txn_file or something instead of txn?

@NRHelmi
Copy link
Contributor Author

NRHelmi commented Aug 15, 2022

Does it make sense to include metadata as a part of results? Basically the same thing as in RelationalAI/rai-sdk-javascript#44

@denisgursky since I did also some code refactoring in this PR I think it would be better to address this change in a separate PR (probably when working on ResultTable for this SDK) for the sake of simplicity 🙏

@NRHelmi
Copy link
Contributor Author

NRHelmi commented Aug 19, 2022

if no other updates are required, I'll merge by the end of the day thank you 🙏

@larf311
Copy link

larf311 commented Aug 19, 2022

In the future, please separate out code formatting changes into their own PR. It makes it next to impossible to review the actual changes.

@NRHelmi
Copy link
Contributor Author

NRHelmi commented Aug 19, 2022

In the future, please separate out code formatting changes into their own PR. It makes it next to impossible to review the actual changes.

Yeah sorry for that 🙏 , didn't notice that this PR is getting really complicated to review, maybe a quick guideline on how to review this PR could help:

  • For protobuf metadata:
    • message_pb2.py and schema_pb2.py protobuf generated code are under railib/pb.
    • please check _parse_metadata_proto function under api.py: this function deserializes the byte array received into MetadataInfo object.
    • get_transaction_metadata function under api.py returns the metadata proto content (we removed the old get_transaction_metadata, similar behavior but the new one sends in headers {"Accept": "application/x-protobuf"}.
    • TransactionAsyncResponse class under api.py represents the v2 response object {transaction: ..., metadata:..., results: ..., problems: ...}
  • Multipart parts are converted into TransactionAsyncFile object {name: ..., filename: ..., content_type: ..., content: ...} contents. (please notice that
  • Integration tests under tests/integration.py contains a test for transaction, metadata proto, problems and arrow results
  • CI github workflow is under .github/workflows/build.yaml: should install deps, lint using flake8 and run integration tests

Please ignore the other changes since they are mainly related to code formatting
Thank you 🙏

@larf311
Copy link

larf311 commented Aug 19, 2022

Installing from source and trying to run the examples gives me:

➜  rai-sdk-python git:(hnr-metadata-proto) python3 ./examples/get_transaction_metadata.py cf686a5e-7b8f-54c8-ff2b-d8ebdb68618d
Traceback (most recent call last):
  File "/Users/trevorpaddock/Dev/rai-sdk-python/./examples/get_transaction_metadata.py", line 19, in <module>
    from railib import api, config, show
  File "<frozen zipimport>", line 259, in load_module
  File "/usr/local/lib/python3.9/site-packages/rai_sdk-0.6.4-py3.9.egg/railib/api.py", line 24, in <module>
ModuleNotFoundError: No module named 'requests_toolbelt

I need to run:

python3 -m pip install -r requirements.txt

Then everything works. Do we need to add that to the README or do you need to update setup.py?

@NRHelmi
Copy link
Contributor Author

NRHelmi commented Aug 21, 2022

Thanks @larf311 just updated the README file, it was missing the dependencies installation

pip install -r requirements.txt

@NRHelmi NRHelmi requested a review from bradlo August 22, 2022 15:09
@NRHelmi
Copy link
Contributor Author

NRHelmi commented Sep 1, 2022

If no other updates are required, I'll merge this PR by the end of the day
thank you 🙏

@NRHelmi NRHelmi merged commit 4241815 into main Sep 2, 2022
@NRHelmi NRHelmi deleted the hnr-metadata-proto branch September 2, 2022 02:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants