AVRO-2656: Python3 Support for lang/py #744
d164d04 to dda2d4c
Avro lang/py passes tests on all these Python versions when I run it locally:
It's quite fast once tox caches each version's pip-installed packages. Installing the packages fresh adds about a minute for each one.
2ac1716 to 20dbf70
    """Mixin for methods common to both reading and writing."""

    block_count = 0
    _meta = None
Should these be class variables? There's a self.block_count in one of the subclasses as well.
You mean versus instance variables? It doesn't really make a difference for most usage.
>>> class Foo(object):
...     bar = 123
...     def m(self):
...         self.bar = 'abc'
...
>>> f = Foo()
>>> f.bar
123
>>> g = Foo()
>>> g.bar
123
>>> f.m()
>>> f.bar
'abc'
>>> g.bar
123
Here, setting block_count as a constant class variable just makes it a default for the instance. Things do get gnarly if you use a mutable type here, but mutable default types are to be avoided in any case.
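To illustrate the caveat about mutable types, here is a minimal sketch. The class names `SharedDefault` and `PerInstance` are hypothetical and do not come from the Avro code; they only contrast a mutable class-level default with an immutable one.

```python
class SharedDefault(object):
    # Hypothetical class, for illustration only: a mutable class attribute.
    items = []

    def add(self, item):
        # append() mutates the class-level list in place. No instance
        # attribute is created, so every instance sees the change.
        self.items.append(item)


class PerInstance(object):
    # Hypothetical class: None is an immutable class-level default.
    items = None

    def add(self, item):
        if self.items is None:
            # Assignment creates an instance attribute that shadows
            # the class-level default for this instance only.
            self.items = []
        self.items.append(item)


a, b = SharedDefault(), SharedDefault()
a.add(1)
print(b.items)  # [1], because the list is shared through the class

c, d = PerInstance(), PerInstance()
c.add(1)
print(d.items)  # None, because d still sees the untouched class default
```

An immutable default such as `0` or `None` behaves like the `Foo.bar` session above: assigning through `self` affects only that instance.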
20dbf70 to 0e9880d
Python 2 has officially reached end of life, so the Avro project has to decide as a whole how much effort to put into continued support if a bug comes up that is mainly due to differences in Python 2/3 string handling.

My main goal here was to consolidate support into a single lang/py, the one with the more consistent, longer-tenured, and Pythonic API. For most changes, the "ruling" underlying API is that of Python's own json library. A schema string has to be Unicode because that's what json.loads requires. I don't think that will result in any surprises for the lingering Python 2 users, and this change will be the first time Python 3 users can use Avro at all from this API, so they shouldn't have any held-over expectations to violate (at least as far as string types).

I would really love some help testing this code against real-world data in Python 3. Please open bugs and report your findings.

As for the release process, I need to discuss that with @cutting, @nielsbasjes, and others, since this is a significant change.
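On the point about schema strings being Unicode, here is a minimal sketch of what a caller might do when a schema arrives as raw bytes. It assumes the bytes are UTF-8 encoded. The helper `load_schema_json` and the `User` schema are hypothetical, made up for illustration, and are not part of the Avro API. Decoding explicitly hands `json.loads` the same text type on both Python 2 and Python 3.

```python
import json

# Hypothetical schema, used only for illustration.
RAW_SCHEMA = (
    b'{"type": "record", "name": "User", '
    b'"fields": [{"name": "name", "type": "string"}]}'
)


def load_schema_json(raw):
    """Decode bytes to text if needed (assumes UTF-8), then parse the JSON."""
    if isinstance(raw, bytes):
        raw = raw.decode("utf-8")
    return json.loads(raw)


schema = load_schema_json(RAW_SCHEMA)
print(schema["name"])  # User
```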
Great work @kojiromike! I've just filed a few minor issues I've found:
* AVRO-2656: Python3 Support
* AVRO-2656: Consolidate DataFile Code
* AVRO-2656: Must Seek after Truncate
* AVRO-2656: Fix Protocol Unicode
* AVRO-2656: Add Additional Trove Classifiers
Looking forward to having this available! @kojiromike, any updates to share on releasing these changes?
@cutting, @nielsbasjes, since you were mentioned as being involved in the discussion of the release process for these changes: are there any updates, or an ETA, for their release?
@julianjk I am not sure. These changes are in master now, but they did not go out with 1.9.2, as they are too much for a point release. Therefore I surmise they will go out with 1.10.0.
1.10.0 is planned around May 2020: https://lists.apache.org/thread.html/rb9693e90a8141b2c9f0f9c901c488a079fa6245b2e4d475e022ab1e8%40%3Cdev.avro.apache.org%3E