ssaamm commented Sep 27, 2016

The Python 2 API and the Python 3 API differ unnecessarily, which makes it harder to write code that supports both major Python versions. Additionally, many of the Python 3 function names are non-Pythonic.

This contribution is my original work, and I license the work to the project under the project's open source license.

Tests pass:

/Users/sam/dev/avro/lang/py3/env/lib/python3.5/site-packages/setuptools/dist.py:294: UserWarning: The version specified ('1.9.0-SNAPSHOT') is an invalid version, this may not work as expected with newer versions of setuptools, pip, and PyPI. Please see PEP 440 for more details.
  "details." % self.metadata.version
testCsv (avro.tests.test_script.TestCat) ... ok
testCsvHeader (avro.tests.test_script.TestCat) ... ok
testFields (avro.tests.test_script.TestCat) ... ok
testFiles (avro.tests.test_script.TestCat) ... ok
testFilter (avro.tests.test_script.TestCat) ... ok
testHelp (avro.tests.test_script.TestCat) ... ok
testJsonPretty (avro.tests.test_script.TestCat) ... ok
testPrint (avro.tests.test_script.TestCat) ... ok
testPrintSchema (avro.tests.test_script.TestCat) ... ok
testSkip (avro.tests.test_script.TestCat) ... ok
testVersion (avro.tests.test_script.TestCat) ... ok
testAppend (avro.tests.test_datafile.TestDataFile) ... ok
testContextManager (avro.tests.test_datafile.TestDataFile) ... ok
testMetadata (avro.tests.test_datafile.TestDataFile) ... ok
testRoundTrip (avro.tests.test_datafile.TestDataFile) ... ok
testInterop (avro.tests.test_datafile_interop.TestDataFileInterop) ... ok
testSymbolsInOrder (avro.tests.test_enum.TestEnum) ... ok
testSymbolsInReverseOrder (avro.tests.test_enum.TestEnum) ... ok
testBinaryIntEncoding (avro.tests.test_io.TestIO) ... ok
testBinaryLongEncoding (avro.tests.test_io.TestIO) ... ok
testDefaultValue (avro.tests.test_io.TestIO) ... ok
testFieldOrder (avro.tests.test_io.TestIO) ... ok
testNoDefaultValue (avro.tests.test_io.TestIO) ... ok
testProjection (avro.tests.test_io.TestIO) ... ok
testRoundTrip (avro.tests.test_io.TestIO) ... ok
testSchemaPromotion (avro.tests.test_io.TestIO) ... ok
testSkipInt (avro.tests.test_io.TestIO) ... ok
testSkipLong (avro.tests.test_io.TestIO) ... ok
testTypeException (avro.tests.test_io.TestIO) ... ok
testUnknownSymbol (avro.tests.test_io.TestIO) ... ok
testValidate (avro.tests.test_io.TestIO) ... ok
testEchoService (avro.tests.test_ipc.TestIPC)
Tests client-side of the Echo service. ... 2016-09-27 18:14:00,348 INFO test_ipc.py:118 : Echo RPC Server listening on 127.0.0.1:57632
2016-09-27 18:14:00,348 INFO test_ipc.py:119 : RPC socket: <socket.socket fd=4, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=0, laddr=('127.0.0.1', 57632)>
2016-09-27 18:14:00,349 INFO ipc.py:179 : Sending handshake request: {'serverHash': b'\xcc9\x86\xfb\xb6\xad\x8a\ni\xf9X\xbb\x06\xc6\x15\xb2', 'clientHash': b'\xcc9\x86\xfb\xb6\xad\x8a\ni\xf9X\xbb\x06\xc6\x15\xb2'}
2016-09-27 18:14:00,350 INFO ipc.py:204 : writing request: {'ping': {'text': 'hello ping', 'timestamp': 31415}}
2016-09-27 18:14:00,350 INFO ipc.py:654 : Serialized request: b'\xcc9\x86\xfb\xb6\xad\x8a\ni\xf9X\xbb\x06\xc6\x15\xb2\x00\xcc9\x86\xfb\xb6\xad\x8a\ni\xf9X\xbb\x06\xc6\x15\xb2\x00\x00\x08ping\xee\xea\x03\x14hello ping'
2016-09-27 18:14:00,351 INFO ipc.py:412 : Processing handshake request: {'serverHash': b'\xcc9\x86\xfb\xb6\xad\x8a\ni\xf9X\xbb\x06\xc6\x15\xb2', 'clientProtocol': None, 'meta': None, 'clientHash': b'\xcc9\x86\xfb\xb6\xad\x8a\ni\xf9X\xbb\x06\xc6\x15\xb2'}
2016-09-27 18:14:00,351 INFO ipc.py:441 : Handshake response: {'match': 'BOTH'}
2016-09-27 18:14:00,351 INFO ipc.py:375 : Processing request: {'ping': {'text': 'hello ping', 'timestamp': 31415}}
2016-09-27 18:14:00,351 INFO test_ipc.py:90 : Message: {"response": {"type": "record", "namespace": "org.apache.avro.ipc.echo", "name": "Pong", "fields": [{"type": "long", "name": "timestamp", "default": -1}, {"type": "org.apache.avro.ipc.echo.Ping", "name": "ping"}]}, "errors": [], "request": [{"type": {"type": "record", "namespace": "org.apache.avro.ipc.echo", "name": "Ping", "fields": [{"type": "long", "name": "timestamp", "default": -1}, {"type": "string", "name": "text", "default": ""}]}, "name": "ping"}]}
2016-09-27 18:14:00,351 INFO test_ipc.py:91 : Request: {'ping': {'text': 'hello ping', 'timestamp': 31415}}
2016-09-27 18:14:00,352 INFO ipc.py:656 : Serialized response: b'\x00\x00\x00\x00\x00\x00\xbe\x90\xe9\xde\xedU\xee\xea\x03\x14hello ping'
127.0.0.1 - - [27/Sep/2016 18:14:00] "POST / HTTP/1.1" 200 -
2016-09-27 18:14:00,352 INFO ipc.py:665 : Response sent
2016-09-27 18:14:00,353 INFO ipc.py:219 : Processing handshake response: {'serverHash': None, 'match': 'BOTH', 'meta': None, 'serverProtocol': None}
2016-09-27 18:14:00,353 INFO test_ipc.py:143 : Received echo response: {'timestamp': 1475018040351, 'ping': {'text': 'hello ping', 'timestamp': 31415}}
2016-09-27 18:14:00,353 INFO ipc.py:179 : Sending handshake request: {'serverHash': b'\xcc9\x86\xfb\xb6\xad\x8a\ni\xf9X\xbb\x06\xc6\x15\xb2', 'clientHash': b'\xcc9\x86\xfb\xb6\xad\x8a\ni\xf9X\xbb\x06\xc6\x15\xb2'}
2016-09-27 18:14:00,353 INFO ipc.py:204 : writing request: {'ping': {'text': 'hello again', 'timestamp': 123456}}
2016-09-27 18:14:00,354 INFO ipc.py:654 : Serialized request: b'\xcc9\x86\xfb\xb6\xad\x8a\ni\xf9X\xbb\x06\xc6\x15\xb2\x00\xcc9\x86\xfb\xb6\xad\x8a\ni\xf9X\xbb\x06\xc6\x15\xb2\x00\x00\x08ping\x80\x89\x0f\x16hello again'
2016-09-27 18:14:00,354 INFO ipc.py:412 : Processing handshake request: {'serverHash': b'\xcc9\x86\xfb\xb6\xad\x8a\ni\xf9X\xbb\x06\xc6\x15\xb2', 'clientProtocol': None, 'meta': None, 'clientHash': b'\xcc9\x86\xfb\xb6\xad\x8a\ni\xf9X\xbb\x06\xc6\x15\xb2'}
2016-09-27 18:14:00,355 INFO ipc.py:441 : Handshake response: {'match': 'BOTH'}
2016-09-27 18:14:00,355 INFO ipc.py:375 : Processing request: {'ping': {'text': 'hello again', 'timestamp': 123456}}
2016-09-27 18:14:00,355 INFO test_ipc.py:90 : Message: {"response": {"type": "record", "namespace": "org.apache.avro.ipc.echo", "name": "Pong", "fields": [{"type": "long", "name": "timestamp", "default": -1}, {"type": "org.apache.avro.ipc.echo.Ping", "name": "ping"}]}, "errors": [], "request": [{"type": {"type": "record", "namespace": "org.apache.avro.ipc.echo", "name": "Ping", "fields": [{"type": "long", "name": "timestamp", "default": -1}, {"type": "string", "name": "text", "default": ""}]}, "name": "ping"}]}
2016-09-27 18:14:00,355 INFO test_ipc.py:91 : Request: {'ping': {'text': 'hello again', 'timestamp': 123456}}
2016-09-27 18:14:00,355 INFO ipc.py:656 : Serialized response: b'\x00\x00\x00\x00\x00\x00\xc6\x90\xe9\xde\xedU\x80\x89\x0f\x16hello again'
127.0.0.1 - - [27/Sep/2016 18:14:00] "POST / HTTP/1.1" 200 -
2016-09-27 18:14:00,355 INFO ipc.py:665 : Response sent
2016-09-27 18:14:00,356 INFO ipc.py:219 : Processing handshake response: {'serverHash': None, 'match': 'BOTH', 'meta': None, 'serverProtocol': None}
2016-09-27 18:14:00,356 INFO test_ipc.py:149 : Received echo response: {'timestamp': 1475018040355, 'ping': {'text': 'hello again', 'timestamp': 123456}}
ok
testEquivalenceAfterRoundTrip (avro.tests.test_protocol.TestProtocol) ... ok
testInnerNamespaceNotRendered (avro.tests.test_protocol.TestProtocol) ... ok
testInnerNamespaceSet (avro.tests.test_protocol.TestProtocol) ... ok
testParse (avro.tests.test_protocol.TestProtocol) ... ok
testValidCastToStringAfterParse (avro.tests.test_protocol.TestProtocol) ... ok
testCorrectRecursiveExtraction (avro.tests.test_schema.TestSchema) ... ok
testDocAttributes (avro.tests.test_schema.TestSchema) ... ok
testEquivalenceAfterRoundTrip (avro.tests.test_schema.TestSchema) ... ok
testFullname (avro.tests.test_schema.TestSchema)
The fullname is determined in one of the following ways: ... ok
testOtherAttributes (avro.tests.test_schema.TestSchema) ... ok
testParse (avro.tests.test_schema.TestSchema) ... ok
testValidCastToStringAfterParse (avro.tests.test_schema.TestSchema) ... ok
testMultiFile (avro.tests.test_script.TestWrite) ... ok
testOutfile (avro.tests.test_script.TestWrite) ... ok
testStdin (avro.tests.test_script.TestWrite) ... ok
testVersion (avro.tests.test_script.TestWrite) ... ok
testWriteCsv (avro.tests.test_script.TestWrite) ... ok
testWriteJson (avro.tests.test_script.TestWrite) ... ok

----------------------------------------------------------------------
Ran 50 tests in 2.305s

OK

Contributor

spacharya left a comment

Looks good to me.
Just one nit

return Names(names=self._names, default_namespace=namespace)

- def GetName(self, name, namespace=None):
+ def _get_name(self, name, namespace=None):
Contributor

This might break existing usages of GetName.
I'm not sure we want to make the function internal-use only.

Author

I wasn't 100% sure on this one, because get_name is already a function. When I did a usage search for GetName, the usages looked to be all within this file. When I searched for get_name, it seemed like other modules were using it.

Being that the existing get_name function delegates to this one, I think the functionality is still available for external use.

Does that seem appropriate to you? What are your thoughts?

Contributor

I am a bit conflicted on this. I see that this is the only usage, but I am unsure how people use it.
@rdblue, can you check this out and let us know whether making a function internal-use only in a minor release is okay?
Otherwise, +1.

Contributor

If I understand correctly, the convention is for downstream users to consider methods that do not start with _ to be part of the public API and safe to use. But all of the renames in the PR are breaking changes, so we could just bump the version to indicate what happened.

Author

ssaamm commented Nov 1, 2016

Hi, are there any updates on this PR?

Contributor

rdblue commented Nov 1, 2016

@ssaamm, how different are the py and py3 implementations? Is this something we could address by updating the python version to work with py3?

Author

ssaamm commented Nov 2, 2016

I think unifying the py and py3 implementations would be ideal (added side benefit: you wouldn't need separate avro and avro-python3 PyPI packages).

From a cursory 2to3 run, it looks like it would be possible to make the py version work on Python 3 as well.

Contributor

rdblue commented Nov 2, 2016

Are they mostly similar right now? What is the main difference between the existing implementations?

@spacharya
Contributor

One solution I can think of is to keep methods with the old names. Each would emit a warning saying it has been deprecated and pointing to the new method, then call the new method.
In the next release we remove them. That gives us some backward compatibility in 1.9.0, with the old names removed in the next 1.10.X release.
This will give people time to move to the new structure and not break everything when they upgrade.

For example, this is what pyOpenSSL does if your binding is older:
/lib/python2.6/site-packages/cryptography/hazmat/bindings/openssl/binding.py:212: DeprecationWarning: OpenSSL versions less than 1.0.1 are no longer supported by the OpenSSL project, please upgrade. A future version of cryptography will drop support for these versions. DeprecationWarning
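
The deprecation approach described above could be sketched roughly as follows. This is a minimal illustration, not the project's code: the `Names` class here is a toy stand-in, and the assumption that `GetName` would become a warning alias that forwards to `get_name` is made only for the example.

```python
import warnings


class Names(object):
    """Toy stand-in for a name registry; NOT the real avro class."""

    def __init__(self):
        # Maps a fully qualified name to whatever was registered under it.
        self._names = {}

    def get_name(self, name, namespace=None):
        """New snake_case method that holds the actual lookup logic."""
        key = name if namespace is None else "%s.%s" % (namespace, name)
        return self._names.get(key)

    def GetName(self, name, namespace=None):
        """Deprecated alias (hypothetical): warn, then delegate."""
        warnings.warn(
            "GetName() is deprecated; use get_name() instead.",
            DeprecationWarning,
            stacklevel=2,  # attribute the warning to the caller's line
        )
        return self.get_name(name, namespace)
```

Old call sites keep working for one release while the warning tells users what to change. Note that Python hides `DeprecationWarning` in many contexts by default, so users may need `-W default` or a test runner to see it.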

Author

ssaamm commented Nov 9, 2016

@rdblue, As a spot check, I've looked through a diff of schema.py between the Python 2 and 3 implementations, and they look very similar. Main differences look (to me) like (1) slightly different implementations of methods/classes that have either the same or very similar names and (2) the Python 3 version uses the @property decorator.

Do any current maintainers have any insight into why the Python 2 and 3 implementations are separate?
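
To make the second difference concrete, here is a small sketch contrasting an explicit accessor method with the `@property` decorator, plus one way a single class could offer both spellings. All class and method names are hypothetical and chosen for illustration; they are not taken from either avro implementation.

```python
class GetterStyleField(object):
    """Exposes its name through an explicit getter method."""

    def __init__(self, name):
        self._name = name

    def get_name(self):
        return self._name


class PropertyStyleField(object):
    """Exposes its name as a read-only attribute via @property."""

    def __init__(self, name):
        self._name = name

    @property
    def name(self):
        return self._name


class BridgedField(PropertyStyleField):
    """Offers both spellings so callers of either style keep working."""

    def get_name(self):
        # Delegate to the property so there is one source of truth.
        return self.name
```

A caller written against either style (`field.get_name()` or `field.name`) would work unchanged with `BridgedField`, which is one way to narrow this kind of gap without breaking existing users.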

@manu-chroma

@ssaamm what can be done to get this PR merged or reviewed by the maintainers?

Author

ssaamm commented Mar 20, 2017

Unsure. Perhaps @spacharya or @rdblue could comment on what next steps would be?

Contributor

rdblue commented Mar 20, 2017

I'm reluctant to move forward with this because it makes breaking changes to the python3 API without fixing the underlying problem -- that we have two python APIs. I'd much rather make py work with python 3 and remove py3 entirely. Another option is to add both APIs to py3 and deprecate py, but that seems like more work in the long term to maintain both APIs moving forward.

As it stands, I don't think we should change the py3 API to match the py API.

@takluyver

Have you looked at using python-modernize on the Python 2 code? That's an automated code changing tool like 2to3, but it aims to produce code that works on both Python 3 and 2. You'll probably still need some manual changes afterwards, but it can deal with a lot of the time-consuming routine changes that need to be done.
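
As a rough idea of what code that runs on both Python 2 and 3 can look like, here is a hand-written sketch. It is not output from python-modernize (which typically relies on the `six` library and `__future__` imports); the helper names `to_text` and `sorted_items` are made up for the example, and only the standard library is used.

```python
import sys

# In a real module, `from __future__ import ...` lines would usually sit
# at the very top; they are omitted here to keep the sketch minimal.
PY2 = sys.version_info[0] == 2

# `unicode` only exists on Python 2; the conditional expression never
# evaluates that name on Python 3.
text_type = unicode if PY2 else str  # noqa: F821


def to_text(value, encoding="utf-8"):
    """Return `value` as text on either major Python version."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return text_type(value)


def sorted_items(mapping):
    """Sorted (key, value) pairs; avoids the Python 2-only iteritems()."""
    return sorted(mapping.items())
```

The manual changes mentioned above tend to be of this kind: bytes versus text handling and dictionary iteration are areas where automated tools often cannot fully infer the intent.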


mr-c commented Jun 27, 2017

I concur with @takluyver: the python-modernize approach would solve this issue and keep a single codebase.

@takluyver

I had a go at this in #234, but had trouble running the tests with Python 3.

@iemejia iemejia added the Python label Nov 29, 2018
iemejia referenced this pull request in iemejia/avro May 24, 2021
This also includes a bump for sha2 library. As it stands, the trait changed,
meaning this will actually be a breaking change for anyone using the fingerprint
functionality of `avro-rs`.
7 participants